Text
This is the fifth and last post in the GarlicSim module of the week series. This is a series of blog posts in which I explore different modules from the garlicsim.general_misc package. These are modules which are not specific to computer simulations; they can be relevant to any kind of Python program, and you are welcome to import them from garlicsim.general_misc and use them in your projects.
Part one: caching.cache: A caching decorator that understands arguments
Part two: caching.CachedType: A metaclass for sharing instances
Part three: address_tools: More powerful replacements for eval and repr
Part four: ContextManager and the manage_context method
Part five: cute_profile: Profile your Python code on the fly
This is the end of the GSMOTW series for now. I hope you enjoyed it as much as I did!
(Programmers experienced with profilers may skip this section.)
To “profile” a piece of code means to run it while checking how long it takes, and how long each function call inside the code takes. When you use a “profiler” to profile your program, you get a table of (a) all the functions calls that were made by the program, (b) how many times each function was called and (c) how long the function calls took.
A profiler is an indispensable programming tool, because it allows the programmer to understand which parts of his code take the longest. Usually, when using a profiler, you discover that only a few small parts of his code take most of the runtime of your program. And quite often, it’s not the parts of code that you thought were the slow ones.
Once you realize which parts of the program cause slowness, you can focus your efforts on those problematic parts only, optimizing them or possibly redesigning the way they work so they’re not slow anymore. Then the whole program becomes faster.
cute_profilePython supplies a module called cProfile in its standard library. cProfile is a good profiler, but its interface can be inconvenient to work with. The cute_profile module has a more flexible interface, and it uses cProfile under the hood to do the actual profiling.
Let’s profile an example program. Our example would be a function called get_perfects, which finds perfect numbers:
def get_divisors(x):
"""Get all the integer divisors of `x`."""
return [i for i in xrange(1, x) if (x % i == 0)]
def is_perfect(x):
"""Is the number `x` perfect?"""
return sum(get_divisors(x)) == x
def get_perfects(top):
"""Get all the perfect numbers up to the number `top`."""
return [i for i in xrange(1, top) if is_perfect(i)]
print(get_perfects(2000))
The result is [6, 28, 496]. However, this function takes a few seconds to run. That’s fairly long. Let’s use cute_profile to find out why this function is taking so long. We’ll add the cute_profile.profile_ready decorator around get_perfects:
from garlicsim.general_misc import cute_profile
@cute_profile.profile_ready()
def get_perfects(top):
"""Get all the perfect numbers up to the number `top`."""
return [i for i in xrange(1, top) if is_perfect(i)]
Now before we run get_perfects, we set it to profile:
get_perfects.profiling_on = True
And now we run it:
print(get_perfects(2000))
We still get the same result, but now a profiling table gets printed:
6000 function calls in 6.142 seconds
Ordered by: cumulative time
ncalls tottime percall cumtime percall filename:lineno(function)
1 0.000 0.000 6.142 6.142 :1()
1 0.025 0.025 6.142 6.142 perfect_numbers.py:9(get_perfects)
1999 0.045 0.000 6.117 0.003 perfect_numbers.py:6(is_perfect)
1999 6.060 0.003 6.060 0.003 perfect_numbers.py:3(get_divisors)
1999 0.012 0.000 0.012 0.000 {sum}
1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
This table shows how long each function took. If you want to understand exactly what each number says in this table, there’s an explanation here.
The tottime column says how much time was spent inside this function, across all calls, and without counting the time that was spent in sub-functions. See how the get_divisors function in our example has a very high tottime of 6.060. This means that get_divisors is what’s causing our program to run slow, and if we’ll want to optimize the program, we should try to come up with a smarter way of finding all of a number’s divisors than going one-by-one over all numbers.
profile_ready has a bunch of other options that I won’t explore there. In brief:
condition argument is something like a “breakpoint condition” in an IDE: It can be a function, usually a lambda, that takes the decorated function and any arguments and returns whether or not to profile it this time.off_after means whether you want the function to stop being profiled after being profiled one time. Default is True.sort is an integer saying by which column the final results table should be sorted.cute_profileSource for cute_profile, well-documented. Here are the tests.
There is also a Python 3 version of cute_profile, and here are its tests. It’s available with the Python 3 fork of GarlicSim.
I’ll be happy to get any opinions, critiques and code reviews on cute_profile!