System-wide Python profiling

I recently had to profile an embedded system that runs a lot of Python code. The goal was to improve start-up speed and identify bottlenecks. This is a large and complex system with many processes starting, stopping, and forking all over the place, so hooking the profiler into each and every process that gets launched looked like a difficult task. Instead, I came up with a solution: temporarily hook the profiler into the interpreter itself and dump the stats at exit using the atexit module.

It turns out Python has the nifty ability to run code at start-up via the sitecustomize.py file. This file goes in your Python's lib directory, in my case /usr/lib/python2.7/sitecustomize.py. Anything you add to this file gets run whenever the Python interpreter starts. For example, if your file contains only:

print "Hello World"

you will get this when starting Python:

Hello World
Python 2.7.2 (default, Jan 30 2013, 17:47:27) 
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>

So, to get the interpreter to fire up the profiler and dump the stats at exit, I just need to add the right code to this file. Here is what I used:

import cProfile
import atexit
import os

def ProfExit(p):
    # Stop profiling and dump the stats to a file named after this process's PID
    p.disable()
    prof_f_name = "%d" % os.getpid()
    p.dump_stats("/root/profiles/%s" % prof_f_name)

# Start profiling as soon as the interpreter comes up,
# and register the dump to run at interpreter exit
profile_hook = cProfile.Profile()
atexit.register(ProfExit, profile_hook)
profile_hook.enable()

So what does this do? It imports the profiler and the atexit module, creates an instance of the profiler, registers an atexit handler that stops the profiler and dumps the stats to a file named with that Python process's ID, and finally starts the profiler. Every Python process run on the system will now be profiled! Note that the stats won't get dumped until the process exits, so make sure you stop all of them.
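One thing worth noting: dump_stats just opens the target file for writing, so the hook will fail at interpreter exit if the output directory doesn't exist. A small safeguard you could add near the top of sitecustomize.py (assuming the same /root/profiles path as above):

import os

# Make sure the profile output directory exists before the atexit dump runs
if not os.path.isdir("/root/profiles"):
    os.makedirs("/root/profiles")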

What do I do with all of these stats files now that I have them? Well, you could go inspect them with the pstats module (there's a quick sketch of that below)… or you could use this other great tool I found, gprof2dot, to convert them into call graphs. You must also install Graphviz to convert the dot files into images (apt-get install graphviz). But the results are worth it! The image below shows an example of the output: each block is a function that was called, color coded by how much time was spent in that call. This makes it pretty easy to identify slow paths and easy optimizations. To generate call graphs for each stats file, I used something like this from the directory where my stats files were:

for i in *; do ../gprof2dot.py -f pstats $i | dot -Tsvg -o ../callgraphs/$i.svg; done;
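If you'd rather skim the numbers without generating graphs, the pstats module mentioned above can also merge and sort the per-PID files. Here's a minimal sketch, assuming the same /root/profiles directory:

import glob
import pstats

# Merge every per-PID stats file into a single report
stats = None
for path in glob.glob("/root/profiles/*"):
    if stats is None:
        stats = pstats.Stats(path)
    else:
        stats.add(path)

# Show the 20 functions with the most cumulative time
if stats is not None:
    stats.sort_stats("cumulative").print_stats(20)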

 

[Image: example call graph generated from a stats file]