Mathematical and statistical methods
A number of mathematical functions that calculate statistics over an entire array or over the data along an axis are accessible as methods of the array class. So you can use aggregations such as sum, mean and standard deviation by either calling the array instance method or using the top-level NumPy function.
Below I generate some random data and calculate some aggregated statistics:
[1]:
import numpy as np
data = np.random.randn(7, 3)
data
[1]:
array([[ 0.52892401, -0.82705139, -0.13426779],
       [-0.43476595,  0.15431376, -0.15927356],
       [ 0.5437757 , -0.27273503, -0.74511308],
       [ 0.41921053,  0.78804831, -1.39898524],
       [-0.08745354,  0.24346498,  0.5995653 ],
       [ 2.18987033,  0.07709088,  0.81486999],
       [ 0.42570339,  1.23702332,  1.12807273]])
[2]:
data.mean()
[2]:
0.24239465071821545
[3]:
np.mean(data)
[3]:
0.24239465071821545
[4]:
data.sum()
[4]:
5.090287665082524
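Because the data above is random, the exact numbers change on every run. The following minimal sketch uses a small hand-made array (an assumption for illustration, not the data from the text) to show that the method form and the top-level function form give the same result:

```python
import numpy as np

# Hypothetical fixed example data, chosen so the results are easy to check by hand
arr = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])

# Method form versus top-level function form
method_mean = arr.mean()
function_mean = np.mean(arr)
method_sum = arr.sum()
function_sum = np.sum(arr)

print(method_mean, function_mean)  # 3.5 3.5
print(method_sum, function_sum)    # 21.0 21.0
```

Which form to use is largely a matter of style; the function form also accepts array-like inputs such as plain Python lists.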
Functions like mean and sum accept an optional axis argument that calculates the statistic over the specified axis, resulting in an array with one less dimension:
[5]:
data.mean(axis=0)
[5]:
array([0.51218064, 0.20002212, 0.01498119])
[6]:
data.sum(axis=0)
[6]:
array([3.58526448, 1.40015484, 0.10486835])
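data.mean(0), which is the same as data.mean(axis=0), calculates the mean down the rows and so returns one value per column; data.sum(0) likewise calculates the sum down the rows. To aggregate across the columns instead and get one value per row, pass axis=1.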
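The axis convention is easy to mix up, so here is a small sketch on a hand-made 2 x 3 array (hypothetical example data) that shows which dimension each axis value removes:

```python
import numpy as np

# Hypothetical example data with shape (2, 3): 2 rows, 3 columns
arr = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])

col_means = arr.mean(axis=0)  # collapses the rows: one value per column
row_means = arr.mean(axis=1)  # collapses the columns: one value per row

print(col_means, col_means.shape)  # [2.5 3.5 4.5] (3,)
print(row_means, row_means.shape)  # [2. 5.] (2,)
```

A helpful way to remember this: the axis you pass is the one that disappears from the shape of the result.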
Other methods like cumsum and cumprod, however, do not aggregate but create a new array with the intermediate results.
In multidimensional arrays, accumulation functions such as cumsum and cumprod return an array of the same size, with the partial aggregates calculated along the specified axis. When no axis is given, as in the calls below, the array is flattened first and the result is one-dimensional:
[7]:
data.cumsum()
[7]:
array([ 0.52892401, -0.29812737, -0.43239516, -0.86716111, -0.71284735,
       -0.87212091, -0.32834522, -0.60108025, -1.34619332, -0.92698279,
       -0.13893449, -1.53791972, -1.62537326, -1.38190829, -0.78234299,
        1.40752735,  1.48461823,  2.29948822,  2.72519162,  3.96221494,
        5.09028767])
[8]:
data.cumprod()
[8]:
array([ 5.28924012e-01, -4.37447338e-01,  5.87350864e-02, -2.55360156e-02,
       -3.94055863e-03,  6.27626816e-04,  3.41288209e-04, -9.30812494e-05,
        6.93560562e-05,  2.90747892e-05,  2.29123384e-05, -3.20540232e-05,
        2.80323775e-06,  6.82490215e-07,  4.09197451e-07,  8.96089358e-07,
        6.90803200e-08,  5.62914796e-08,  2.39634740e-08,  2.96433762e-08,
        3.34398842e-08])
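The calls above pass no axis, so they accumulate over the flattened array. As a sketch of the axis-wise behaviour, the following uses a small integer array (hypothetical example data) where the partial results can be verified by hand:

```python
import numpy as np

# Hypothetical example data: a 3 x 3 array holding 0..8
arr = np.array([[0, 1, 2],
                [3, 4, 5],
                [6, 7, 8]])

flat_cumsum = arr.cumsum()        # no axis: flattened, 1-D result of length 9
down_rows = arr.cumsum(axis=0)    # running sum down each column
across_cols = arr.cumprod(axis=1) # running product along each row

print(flat_cumsum)  # [ 0  1  3  6 10 15 21 28 36]
print(down_rows)    # rows: [0 1 2], [3 5 7], [9 12 15]
print(across_cols)  # rows: [0 0 0], [3 12 60], [6 42 336]
```

With an axis given, the result keeps the shape of the input; only the flattened form changes the number of dimensions.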
Basic statistical methods for arrays are:
| Method | Description |
|---|---|
| sum | Sum of all elements in the array or along an axis. |
| mean | Arithmetic mean; for arrays with length zero, NaN is returned. |
| std, var | Standard deviation and variance respectively |
| min, max | Minimum and maximum |
| argmin, argmax | Indices of the minimum and maximum elements respectively |
| cumsum | Cumulative sum of the elements, starting with 0 |
| cumprod | Cumulative product of the elements, starting with 1 |
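The remaining statistics described above (standard deviation, variance, minimum, maximum and their indices) can be tried out in the same way. This is a minimal sketch on hypothetical example data; note that std and var use the population formula (ddof=0) unless told otherwise:

```python
import numpy as np

# Hypothetical example data
arr = np.array([[1.0, 2.0, 3.0],
                [4.0, 5.0, 6.0]])

variance = arr.var()   # population variance (ddof=0 by default)
std_dev = arr.std()    # square root of the variance
smallest = arr.min()
largest = arr.max()

# Without an axis, argmin/argmax return an index into the flattened array
idx_min = arr.argmin()
idx_max = arr.argmax()

# With an axis, they return one index per row or column
idx_max_per_row = arr.argmax(axis=1)

print(variance, std_dev)
print(smallest, largest)   # 1.0 6.0
print(idx_min, idx_max)    # 0 5
print(idx_max_per_row)     # [2 2]
```

For a sample estimate instead of the population value, pass ddof=1 to std or var.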