Array-oriented programming – vectorisation

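Using NumPy arrays allows you to express many kinds of data processing tasks as concise array expressions that would otherwise require writing for-loops. This practice of replacing explicit loops with array expressions is called vectorisation. Vectorised array operations are typically much faster than their pure Python equivalents, because the loop over the elements runs in compiled code rather than in the Python interpreter; in the measurements in this section the difference is roughly a factor of one hundred.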
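To make the idea concrete, here is a minimal sketch that computes the same result once with an explicit loop and once with a single array expression. The arrays xs and ys and the result names are made up for this illustration.

```python
import math

import numpy as np

# Hypothetical sample data: x and y components of three vectors.
xs = np.array([3.0, 6.0, 5.0])
ys = np.array([4.0, 8.0, 12.0])

# Loop version: visit every pair of elements in Python.
lengths_loop = []
for x, y in zip(xs, ys):
    lengths_loop.append(math.sqrt(x * x + y * y))

# Vectorised version: one expression that operates on whole arrays.
lengths_vec = np.sqrt(xs ** 2 + ys ** 2)

print(lengths_loop)
print(lengths_vec)
```

Both versions yield the lengths 5, 10 and 13; the vectorised one states the formula once and leaves the iteration to NumPy.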

import numpy as np

First we create a NumPy array with one hundred thousand integers:

myarray = np.arange(100000)

Then we square all the elements in this array with numpy.square:

%time np.square(myarray)
CPU times: user 559 µs, sys: 2.55 ms, total: 3.11 ms
Wall time: 269 µs
array([         0,          1,          4, ..., 9999400009, 9999600004,
       9999800001])

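For comparison, we measure the time of Python’s power operator **, which NumPy also applies element-wise to the whole array. Note that the statement is repeated ten times here, so the reported time covers ten evaluations: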

%time for _ in range(10): myarray2 = myarray ** 2
CPU times: user 807 µs, sys: 4.07 ms, total: 4.87 ms
Wall time: 440 µs
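The following sketch checks that, for a NumPy array, numpy.square and the ** operator are two spellings of the same element-wise operation and should return identical values. The name values and the explicit int64 dtype are assumptions made for this illustration; they are not part of the example above.

```python
import numpy as np

# Explicit 64-bit integers (an assumption) so that the largest square,
# 99999 ** 2, fits without overflow on every platform.
values = np.arange(100_000, dtype=np.int64)

squared_func = np.square(values)  # universal function call
squared_op = values ** 2          # power operator, applied element-wise
squared_mul = values * values     # plain multiplication gives the same values

same = (np.array_equal(squared_func, squared_op)
        and np.array_equal(squared_op, squared_mul))
print(same)
```

Because all three forms run inside NumPy, any timing difference between them is small compared with the gap to a pure Python loop.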

Finally, we compare this with squaring every value of a plain Python list in a list comprehension, again repeated ten times:

mylist = list(range(100000))
%time for _ in range(10): mylist2 = [x ** 2 for x in mylist]
CPU times: user 115 ms, sys: 390 ms, total: 505 ms
Wall time: 46.7 ms
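The %time magic is only available in IPython and Jupyter. As a rough sketch of how the same comparison could be run in a plain Python script, the standard library's timeit module can time each variant with the same number of repetitions. The variable names, the explicit int64 dtype and the repetition count below are assumptions chosen for this illustration, and the measured times will differ from machine to machine.

```python
import timeit

import numpy as np

n = 100_000
myarray = np.arange(n, dtype=np.int64)  # int64 is an assumption, avoids overflow
mylist = list(range(n))

repeats = 10  # same number of repetitions for every variant

# timeit.timeit returns the total time in seconds for `number` calls.
t_square = timeit.timeit(lambda: np.square(myarray), number=repeats)
t_power = timeit.timeit(lambda: myarray ** 2, number=repeats)
t_list = timeit.timeit(lambda: [x ** 2 for x in mylist], number=repeats)

for label, total in [
    ("np.square(array)", t_square),
    ("array ** 2", t_power),
    ("list comprehension", t_list),
]:
    # Report the average time per call in microseconds.
    print(f"{label:<20} {total / repeats * 1e6:10.1f} µs per call")
```

Timing every variant with the same repetition count makes the per-call figures directly comparable.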