IPython Profiler

IPython provides access to a wide range of functions for measuring times and creating profiles. The following IPython magic commands are explained here:
| Command | Description |
|---|---|
| %time | Time to execute a single statement |
| %timeit | Average time it took to execute a single statement repeatedly |
| %prun | Run code with the profiler |
| %lprun | Run code with the line-by-line profiler |
| %memit | Measure the memory usage of a single statement |
| %mprun | Run code with the line-by-line memory profiler |
The last four commands are not contained in IPython itself but are provided by the line_profiler and memory_profiler modules.
%timeit and %time
We saw the %timeit line magic and the %%timeit cell magic in the introduction to magic functions in IPython magic commands. They can be used to measure the timing of the repeated execution of code snippets:
[1]:
%timeit sum(range(100))
321 ns ± 1.6 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
Note that %timeit executes the statement multiple times in a loop. If the number of loops is not specified with -n, %timeit automatically adjusts the number so that sufficient measurement accuracy is achieved:
[2]:
%%timeit
total = 0
for i in range(1000):
for j in range(1000):
total += i * (-1) ** j
99.1 ms ± 310 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
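Outside of IPython, the same kind of measurement can be made with the standard-library timeit module, which the %timeit magic builds on. A minimal sketch:

```python
import timeit

# Repeat the measurement five times; each run executes the statement
# 100,000 times. timeit disables garbage collection while timing.
times = timeit.repeat("sum(range(100))", number=100_000, repeat=5)

# The minimum over the repeats is the most robust per-loop estimate.
best_ns = min(times) / 100_000 * 1e9
print(f"{best_ns:.0f} ns per loop")
```

Taking the minimum rather than the mean suppresses one-off delays caused by other processes on the machine.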
Sometimes repeating an operation is not the best option, for example when we have a list that we want to sort. Here we may be misled by the repeated operation: sorting a pre-sorted list is much faster than sorting an unsorted list, so the repetition distorts the result:
[3]:
import random
L = [random.random() for i in range(100000)]
%timeit L.sort()
205 µs ± 4.34 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Then the %time magic might be a better choice. %time should also be the better choice for long-running commands, when short system-related delays are unlikely to affect the result:
[4]:
import random
L = [random.random() for i in range(100000)]
%time L.sort()
CPU times: user 10.4 ms, sys: 302 µs, total: 10.7 ms
Wall time: 10.6 ms
Sorting an already sorted list:
[5]:
%time L.sort()
CPU times: user 373 µs, sys: 5 µs, total: 378 µs
Wall time: 379 µs
Note how much faster the pre-sorted list is sorted, but also note how much longer the measurement with %time takes compared to %timeit, even for the pre-sorted list. This is because %timeit does some clever things under the hood to keep system calls from interfering with the timing: for example, it prevents the garbage collection of Python objects that are no longer used and could otherwise affect the measurement. Because of this, the %timeit results are usually noticeably faster than the %time results.
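The same one-shot wall-time measurement can be reproduced without magics, using time.perf_counter from the standard library; a sketch:

```python
import random
import time

L = [random.random() for _ in range(100_000)]

# First sort: unsorted input.
start = time.perf_counter()
L.sort()
t_first = time.perf_counter() - start

# Second sort: the list is already sorted, so Timsort finishes quickly.
start = time.perf_counter()
L.sort()
t_second = time.perf_counter() - start

print(f"first sort:  {t_first:.6f} s")
print(f"second sort: {t_second:.6f} s")
```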
Profiling for scripts: %prun

A program is made up of many individual statements, and sometimes it is more important to measure these statements in context than to measure them on their own. Python includes a built-in code profiler, cProfile. However, IPython offers a much more convenient way to use this profiler in the form of the magic function %prun.
As an example, let’s define a simple function that does some calculations:
[6]:
def sum_of_lists(N):
total = 0
for i in range(5):
L = [j ^ (j >> i) for j in range(N)]
total += sum(L)
return total
[7]:
%prun sum_of_lists(1000000)
In the notebook the output looks something like this:
14 function calls in 9.597 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
5 8.121 1.624 8.121 1.624 <ipython-input-15-f105717832a2>:4(<listcomp>)
5 0.747 0.149 0.747 0.149 {built-in method builtins.sum}
1 0.665 0.665 9.533 9.533 <ipython-input-15-f105717832a2>:1(sum_of_lists)
1 0.065 0.065 9.597 9.597 <string>:1(<module>)
1 0.000 0.000 9.597 9.597 {built-in method builtins.exec}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
The result is a table that shows the execution time for each function call, sorted by total time. In this case, most of the time is consumed by the list comprehension within sum_of_lists. This gives us clues as to where we could improve the efficiency of the algorithm.
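The same profile can be produced outside IPython with the built-in cProfile and pstats modules, which %prun wraps; a minimal sketch:

```python
import cProfile
import io
import pstats

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

profiler = cProfile.Profile()
profiler.enable()
sum_of_lists(100_000)
profiler.disable()

# Sort by total time spent in each function and show the top entries.
stream = io.StringIO()
pstats.Stats(profiler, stream=stream).sort_stats("tottime").print_stats(5)
print(stream.getvalue())
```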
Profiling line by line: %lprun
Profiling with %prun
is useful, but sometimes a line-by-line profile report is more insightful. This isn’t built into Python or IPython, but there is a package available, line_profiler, that enables this. This can be provided in your kernel with
$ spack env activate python-374
$ spack install py-line-profiler ^python@3.7.4%gcc@9.1.0
Alternatively, you can install line-profiler with other package managers, for example
$ pipenv install line_profiler
Now you can load the line_profiler extension in IPython:
[8]:
%load_ext line_profiler
The %lprun command profiles each function line by line. In this case, you must explicitly specify which functions should be profiled:
[9]:
%lprun -f sum_of_lists sum_of_lists(5000)
The result looks something like this:
Timer unit: 1e-06 s
Total time: 0.015145 s
File: <ipython-input-6-f105717832a2>
Function: sum_of_lists at line 1
Line # Hits Time Per Hit % Time Line Contents
==============================================================
1 def sum_of_lists(N):
2 1 1.0 1.0 0.0 total = 0
3 6 11.0 1.8 0.1 for i in range(5):
4 5 14804.0 2960.8 97.7 L = [j ^ (j >> i) for j in range(N)]
5 5 329.0 65.8 2.2 total += sum(L)
6 1 0.0 0.0 0.0 return total
The time is given in microseconds and we can see which line the function spends most of its time on. We may then be able to modify the script in such a way that the efficiency of the function can be increased.
More information about %lprun and its available options can be found in the IPython help function %lprun?.
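To illustrate the idea behind line_profiler, here is a rough sketch of a line-by-line timer built on the standard-library sys.settrace hook. The helper name line_times is our own invention; line_profiler itself uses a C extension and is far more precise:

```python
import sys
import time
from collections import defaultdict

def line_times(func, *args, **kwargs):
    """Rough per-line wall-time accounting for one function (illustrative only)."""
    timings = defaultdict(float)
    last = {"lineno": None, "t": None}
    code = func.__code__

    def tracer(frame, event, arg):
        # Only account line events inside the target function.
        if frame.f_code is code and event == "line":
            now = time.perf_counter()
            if last["lineno"] is not None:
                timings[last["lineno"]] += now - last["t"]
            last["lineno"], last["t"] = frame.f_lineno, now
        return tracer

    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(None)
    return result, dict(timings)

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

result, timings = line_times(sum_of_lists, 5000)
for lineno, t in sorted(timings.items()):
    print(f"line {lineno}: {t * 1e6:.1f} µs")
```

The tracing overhead here is substantial, which is exactly why line_profiler implements this bookkeeping in C.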
Creating a memory profile: %memit and %mprun
Another aspect of profiling is the amount of memory that an operation uses. This can be evaluated with another IPython extension, memory_profiler. This can be provided in your kernel with
$ spack env activate python-311
$ spack install py-memory-profiler
Alternatively you can install memory-profiler with other package managers, for example
$ pipenv install memory_profiler
[10]:
%load_ext memory_profiler
[11]:
%memit sum_of_lists(1000000)
peak memory: 176.50 MiB, increment: 76.33 MiB
We see that this function call requires about 76 MiB of additional memory (the increment), with a peak of about 177 MiB.
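A comparable measurement can be made without the extension using the standard-library tracemalloc module. Note that tracemalloc tracks only memory allocated through Python, not the total process size that %memit reports:

```python
import tracemalloc

def sum_of_lists(N):
    total = 0
    for i in range(5):
        L = [j ^ (j >> i) for j in range(N)]
        total += sum(L)
    return total

tracemalloc.start()
sum_of_lists(1_000_000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

print(f"peak memory: {peak / 2**20:.1f} MiB")
```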
For a line-by-line description of memory usage, we can use the %mprun magic. Unfortunately, this magic only works for functions that are defined in separate modules and not in the notebook itself. So we first use the %%file magic to create a simple module called mprun_demo.py that contains a function decorated with @profile:
[12]:
%%file mprun_demo.py
from memory_profiler import profile
@profile
def my_func():
a = [1] * (10 ** 6)
b = [2] * (2 * 10 ** 7)
del b
return a
Writing mprun_demo.py
[13]:
from mprun_demo import my_func
%mprun -f my_func my_func()
Filename: /Users/veit/cusy/trn/Python4DataScience/docs/performance/mprun_demo.py
Line # Mem usage Increment Occurrences Line Contents
=============================================================
3 123.8 MiB 123.8 MiB 1 @profile
4 def my_func():
5 131.5 MiB 7.7 MiB 1 a = [1] * (10 ** 6)
6 284.1 MiB 152.6 MiB 1 b = [2] * (2 * 10 ** 7)
7 284.1 MiB 0.0 MiB 1 del b
8 284.1 MiB 0.0 MiB 1 return a
Here the Increment column shows how much each line affects the total memory consumption: note that when we compute b, we need about 150 MiB of additional memory; however, the process memory is not released again by deleting b.
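The same steps can be traced with the standard-library tracemalloc. Because tracemalloc follows Python-level allocations rather than process memory, it shows, in contrast to the %mprun output above, the roughly 150 MiB of b being accounted as freed after del b:

```python
import tracemalloc

tracemalloc.start()

a = [1] * (10 ** 6)
after_a = tracemalloc.get_traced_memory()[0]   # currently allocated bytes

b = [2] * (2 * 10 ** 7)
after_b = tracemalloc.get_traced_memory()[0]

del b
after_del = tracemalloc.get_traced_memory()[0]
tracemalloc.stop()

print(f"after a:     {after_a / 2**20:6.1f} MiB")
print(f"after b:     {after_b / 2**20:6.1f} MiB")
print(f"after del b: {after_del / 2**20:6.1f} MiB")
```

This contrast is worth keeping in mind: memory_profiler answers "how big is my process", tracemalloc answers "which Python objects are alive".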
More information about %memit and %mprun and their options can be found in the IPython help with %memit?.
pyheatmagic
pyheatmagic is an extension that allows the IPython magic command %%heat
to display Python code as a heatmap with Py-Heat.
It can be easily installed in the kernel with
$ pipenv install py-heat-magic
Loading the extension in IPython
[14]:
%load_ext heat
Displaying the heat map
[15]:
%%heat
def powfun(a, b):
"""Method to raise a to power b using pow() function."""
return pow(a, b)
def powop(a, b):
"""Method to raise a to power b using ** operator."""
return a**b
def powmodexp(a, b):
"""Method to raise a to power b using modular exponentiation."""
base = a
res = 1
while b > 0:
if b & 1:
res *= base
base *= base
b >>= 1
return res
def main():
"""Test function."""
a, b = 2377757, 773
pow_function = powfun(a, b)
pow_operator = powop(a, b)
pow_modular_exponentiation = powmodexp(a, b)
if __name__ == "__main__":
main()

Alternatively, the heatmap can also be saved as a file, for example with
%%heat -o pow-heatmap.png
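The powmodexp routine shown above can also be exercised on its own, outside the heat map. Note that despite its name there is no modulus involved; it is plain binary (square-and-multiply) exponentiation:

```python
def powmodexp(a, b):
    """Raise a to the power b by binary (square-and-multiply) exponentiation."""
    base = a
    res = 1
    while b > 0:
        if b & 1:        # multiply in the current bit's contribution
            res *= base
        base *= base     # square the base for the next bit
        b >>= 1
    return res

print(powmodexp(2, 16))  # → 65536
assert powmodexp(2377757, 773) == 2377757 ** 773
```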