Array-oriented programming – vectorisation
Using NumPy arrays allows you to express many kinds of data processing tasks as concise array expressions that would otherwise require writing for-loops. This practice of replacing explicit loops with array expressions is called vectorisation. Vectorised array operations are generally much faster than their pure Python equivalents because the loop over the elements runs in compiled code inside NumPy rather than in the Python interpreter.
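As a minimal sketch of what this replacement looks like, the following example squares a small array twice, once with an explicit loop and once with a single array expression. The names values, squared_loop and squared_vec are made up for this illustration and do not appear elsewhere in this section.

```python
import numpy as np

# A small illustrative array (hypothetical name, not used elsewhere)
values = np.arange(5)

# Explicit loop: every element is squared individually in Python
squared_loop = np.empty_like(values)
for i in range(len(values)):
    squared_loop[i] = values[i] ** 2

# Vectorised: one array expression, the loop runs inside NumPy
squared_vec = values ** 2

print(squared_loop)  # [ 0  1  4  9 16]
print(squared_vec)   # [ 0  1  4  9 16]
```

Both variants produce the same result; the vectorised form is shorter and leaves the iteration to NumPy.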
[1]:
import numpy as np
First we create a NumPy array with one hundred thousand integers:
[2]:
myarray = np.arange(100000)
Then we square all the elements in this array with numpy.square:
[3]:
%time np.square(myarray)
CPU times: user 559 µs, sys: 2.55 ms, total: 3.11 ms
Wall time: 269 µs
[3]:
array([ 0, 1, 4, ..., 9999400009, 9999600004,
9999800001])
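For comparison, we now time the ** operator on the same array, which is also a vectorised NumPy operation. The loop repeats the operation ten times, so the reported time covers all ten runs: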
[4]:
%time for _ in range(10): myarray2 = myarray ** 2
CPU times: user 807 µs, sys: 4.07 ms, total: 4.87 ms
Wall time: 440 µs
Finally, we compare this with squaring every value of a plain Python list in a list comprehension, again run ten times:
[5]:
mylist = list(range(100000))
%time for _ in range(10): mylist2 = [x ** 2 for x in mylist]
CPU times: user 115 ms, sys: 390 ms, total: 505 ms
Wall time: 46.7 ms
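Single %time measurements vary from run to run, and the cells above do not all use the same number of repetitions. The following sketch shows one possible way to compare the three variants under equal conditions with the standard library timeit module. The helper bench, its repeat and number defaults, and the explicit dtype=np.int64 are assumptions made for this example; the dtype is chosen so that the squares also fit on platforms where the default integer type is 32 bits wide. No timings are shown because they depend on the machine.

```python
import timeit

import numpy as np

n = 100_000
myarray = np.arange(n, dtype=np.int64)  # assumption: explicit 64-bit integers
mylist = list(range(n))


def bench(func, repeat=5, number=10):
    """Return the best average time per call in seconds (hypothetical helper)."""
    # timeit.repeat returns one total time per repetition of `number` calls
    return min(timeit.repeat(func, repeat=repeat, number=number)) / number


t_square = bench(lambda: np.square(myarray))
t_power = bench(lambda: myarray ** 2)
t_list = bench(lambda: [x ** 2 for x in mylist])

print(f"np.square:          {t_square * 1e6:8.0f} µs per call")
print(f"array ** 2:         {t_power * 1e6:8.0f} µs per call")
print(f"list comprehension: {t_list * 1e6:8.0f} µs per call")
```

Taking the minimum over several repetitions reduces the influence of other processes on the measurement; in a notebook, the %timeit magic serves a similar purpose.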