- 1 Easier Python Debugging
- 1.1 Summary
- 1.2 Owner
- 1.3 Current status
- 1.4 Detailed Description
- 1.5 Benefit to Fedora
- 1.6 Scope
- 1.7 How To Test
- 1.8 User Experience
- 1.9 Dependencies
- 1.10 Contingency Plan
- 1.11 Documentation
- 1.12 Release Notes
- 1.13 Comments and Discussion
Easier Python Debugging
Summary
The gdb debugger has been extended so that it can report detailed information on the internals of the Python 2 and Python 3 runtimes. Backtraces involving Python will now by default show mixed C and Python-level information on what such processes are doing, without requiring expertise in the use of
We believe this ability is unique to Fedora, and will be valuable for Python developers seeking additional visibility into their CPython processes.
Owner
- Name: David Malcolm
- Email: <firstname.lastname@example.org>
Current status
- Targeted release: Fedora 13
- Last updated: 2010-02-04
- Percentage of completion: 50%
A working proof-of-concept can be seen at http://fedorapeople.org/gitweb?p=dmalcolm/public_git/libpython.git;a=summary
Still to be done: integrate the libpython hooks so that gdb runs them automatically.
I was stuck on this issue when getting at "PyFrameObject *f" from the current frame, but in my current implementation I've sidestepped this by simply writing a pretty-printer for PyFrameObject* which gdb successfully invokes during a backtrace.
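A minimal sketch of what such a pretty-printer might look like, using gdb's Python API. The class and function names are hypothetical, and this is not the code from the libpython repository linked above. It assumes the frame struct exposes an f_lineno field, as in CPython 2; that field can be stale when tracing is off, so a real implementation would likely need to derive the line from the bytecode offset instead. The import guard exists only so the formatting logic can be exercised outside gdb.

```python
try:
    import gdb  # only importable inside gdb's embedded Python interpreter
except ImportError:
    gdb = None  # allows the formatting logic to be exercised outside gdb


class PyFrameObjectPtrPrinter:
    """Hypothetical pretty-printer for a PyFrameObject* value."""

    def __init__(self, value):
        self.value = value

    def to_string(self):
        try:
            frame = self.value.dereference()
            # Assumption: f_lineno is present and readable in the inferior.
            lineno = int(frame["f_lineno"])
        except Exception:
            # Unreadable or corrupt inferior memory should not abort the backtrace.
            return "Frame (unable to read Python frame information)"
        return "Frame 0x%x, line %d" % (int(self.value), lineno)


def lookup_pyframe_printer(value):
    """Return a printer for PyFrameObject* values, or None for anything else."""
    if str(value.type.unqualified()) == "PyFrameObject *":
        return PyFrameObjectPtrPrinter(value)
    return None


if gdb is not None:
    # gdb consults each registered lookup function when it formats a value.
    gdb.pretty_printers.append(lookup_pyframe_printer)
```

Inside gdb, a lookup function registered this way is consulted whenever a value is formatted, which includes the function arguments shown in a backtrace.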
(I was also held up for a while by this now-fixed GCC issue.)
See also this bug: https://bugzilla.redhat.com/show_bug.cgi?id=552654
Detailed Description
We ship Python wrappers for numerous libraries implemented in C and C++. Bugs (either in the libraries themselves, or in the usage of those libraries) can lead to complicated backtraces from gdb, and it can be hard to figure out what's going on at the python level.
Walking through the stack frames, going up from the bottom (textually), or down from the top (numerically):
- frames 26 and below show a pygtk application starting up.
- An event comes in frame 24/25, and is dispatched into pulsecore (frames 23->18; pstream_packet_callback, pa_context_simple_ack_callback) which:
- calls a Python callback (down to frame 15),
- ...which invokes python code down to frame 3.
- ...where it calls back into native code; whereupon the segfault happens, calling Py_DecRef on some object pointer.
Note that as it stands, all we see from the backtrace is that python code was run: we have no way as-is of telling what that python code was.
In the above example, it happens that there is a bug in the application's Python code, which is sufficiently serious to cause a SIGSEGV error. This example uses the ctypes module, which is designed to expose machine-level details. It's fairly easy to write a one-liner of Python code using this module which causes the Python process to immediately fail with either a SIGSEGV or SIGABRT.
When using "native" C/C++ libraries, it's sadly common for bugs in the library to lead to SIGSEGV errors that immediately cause the whole Python process to terminate. Beyond that, poorly-designed error-handling in such libraries uses abort() at the C level, which immediately terminates the entire process. It's useful to be able to determine what was "really" going on when this happens.
A trickier problem is when a threading assertion fails: many libraries make assumptions about threads and locks, and allow the programmer to register callbacks, but impose conditions upon the kind of code run in those callbacks. When the threads and callback-registration hooks are wrapped at the Python level, these conditions continue to apply at the Python level, but mistakes here often lead to low-level error-handling that's difficult to debug.
For example, the GTK widget library requires that all communication with the X server happen within a GDK lock, to avoid garbling the single "conversation" between the process and the X server. The common way to implement this in a multi-threaded application is to restrict all calls to GTK to a single "primary" thread. See attachment 379251 to bug 543278 for an example of where a secondary thread in an application violates this, which leads to a low-level gdk_x_error() failure in the main thread: frames 16 to 28 of this backtrace are running Python code, but it's not at all clear from the backtrace _what_ said code is actually doing.
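The "single primary thread" rule can be illustrated without GTK at all. The following is only a sketch under that assumption: fake_toolkit_call is a hypothetical stand-in for any toolkit call, and a plain queue takes the place of whatever mechanism a real application would use to hand work to its main loop.

```python
import queue
import threading


def fake_toolkit_call(log, label):
    """Stand-in for a GTK call: records which thread it ran on."""
    log.append((threading.current_thread().name, label))


def worker(requests, n):
    # Secondary threads never call the toolkit; they only enqueue requests.
    requests.put("update from worker %d" % n)


def demo(worker_count=3):
    requests = queue.Queue()
    log = []
    threads = [threading.Thread(target=worker, args=(requests, n))
               for n in range(worker_count)]
    for t in threads:
        t.start()
    # The calling ("primary") thread is the only one that touches the toolkit.
    for _ in range(worker_count):
        fake_toolkit_call(log, requests.get())
    for t in threads:
        t.join()
    return log
```

The failure described above corresponds to a worker calling fake_toolkit_call directly instead of going through the queue.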
Current state-of-the-art for debugging CPython backtraces
Python already has a gdbinit file with plenty of domain-specific hooks for debugging CPython, and we ship it in our python-devel subpackage. If you copy this to ~/.gdbinit you can then use "pyframe" and other commands to debug things, and figure out where we are in Python code from gdb. I used it when deciphering the example backtraces referred to above.
- this script isn't very robust; if the data in the "inferior" process is corrupt, attempting to print it can lead to a SIGSEGV within that process
- you have to go into gdb manually and run these commands by hand, and it's hard to do this correctly; any mistakes when doing this will typically cause a SIGSEGV in the inferior process; see e.g. bug 532552
- the script is written in the gdb language and is thus hard to work with and extend
gdb should provide rich information on what's going on at the Python level automatically. I plan to hook this in using gdb-archer, and make it automatic:
- Biggest win: automatically display python frame information in PyEval_EvalFrameEx in gdb backtraces, including in ABRT:
- python source file, line number, and function names
- values of locals, if available
- name of function for wrapped C functions
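For orientation, the first two items in this list correspond to data that a live frame object already exposes inside Python itself. The sketch below gathers it from within a running interpreter; the gdb hooks would have to recover the same data from the PyFrameObject struct in the inferior process instead. The helper names are hypothetical, and the third item (names of wrapped C functions) is not covered here.

```python
import sys


def describe_frame(frame):
    """Collect the Python-level details a backtrace entry would show."""
    code = frame.f_code
    return {
        "filename": code.co_filename,
        "lineno": frame.f_lineno,
        "function": code.co_name,
        "locals": dict(frame.f_locals),
    }


def python_backtrace():
    """Walk outwards from the caller's frame, innermost frame first."""
    frame = sys._getframe(1)
    entries = []
    while frame is not None:
        entries.append(describe_frame(frame))
        frame = frame.f_back
    return entries


def example(widget_count):
    # The innermost entry describes this function and its locals.
    return python_backtrace()
```

Calling example(3) returns a list whose first entry names the function example and includes widget_count among its locals.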
See Alex's work: http://blogs.gnome.org/alexl/2008/11/18/gdb-is-dead-long-live-gdb/ and more recently: http://blogs.gnome.org/alexl/2009/09/21/archer-gdb-macros-for-glib/
I'd want to have the python backtrace work integrated with the glib backtrace work: pygtk regularly shows me backtraces with a mixture of both
Alex's work is in glib git: http://git.gnome.org/browse/glib/commit/?id=efe9169234e226f594b4254618f35a139338c35f which does a:
See http://tromey.com/blog/?p=522 for info on this.
This needs a more recent version of gdb than in F-12; I'll need to build a local copy of "archer-tromey-python" branch of gdb to work on this.
Archer upstream: http://sourceware.org/gdb/wiki/ProjectArcher
Benefit to Fedora
Backtraces from gdb (such as those from ABRT) that involve Python code will show what's going on at the Python level, as well as at the C level. This will make it much easier for developers to read backtraces when a library wrapped by Python encounters a bug (e.g. PyGTK).
For Python developers, it should be possible to attach to a running Python process using gdb, then run "thread apply all backtrace" to get an overview of all C and Python code running in all threads within that process. I believe this ability would be unique to Fedora, and be valuable for Python developers seeking additional visibility into their CPython processes.
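The attach-and-backtrace workflow can also be scripted. This is a sketch only: the helper names are hypothetical, and it assumes gdb is installed and the caller is permitted to attach to the target process. It uses gdb's -p, -batch and -ex command-line options.

```python
import subprocess


def gdb_backtrace_argv(pid):
    """Build a gdb command line that attaches, dumps all threads, and exits."""
    return ["gdb", "-p", str(pid), "-batch", "-ex", "thread apply all backtrace"]


def backtrace_all_threads(pid):
    """Run gdb against a live process and return whatever it printed."""
    result = subprocess.run(
        gdb_backtrace_argv(pid),
        capture_output=True,
        text=True,
        check=False,  # gdb may exit non-zero if attaching is refused
    )
    return result.stdout
```

For a process with ID 1234, backtrace_all_threads(1234) is equivalent to running gdb -p 1234 -batch -ex "thread apply all backtrace" by hand.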
Scope
This will require extensions to the python srpm, and analogous changes to the
It may well require co-ordination with the gdb srpm (such as API changes), and with the glib2 changes written by Alex referred to above.
How To Test
Ideas for test cases/coverage:
- try attaching to a running (multithreaded) python process and ensure that "thread apply all backtrace" generates meaningful results
- ensure it plays well with Alex's GLib/GTK work; debug a multithreaded pygtk app
- ensure it fails gracefully if python-debuginfo isn't installed
- ensure that it fails gracefully if the inferior process has corrupted data (e.g. overwrites on the heap)
- ensure that it fails gracefully if the inferior process has a corrupted stack
- ensure that it works well under ABRT. It's easy to write one-liner python scripts that abuse the ctypes module in such a way as to cause
  [david@brick ~]$ python -c "import ctypes; ctypes.string_at(0xffffffff)"
  Segmentation fault (core dumped)
  [david@brick ~]$ python -c "import ctypes; ctypes.string_at(0x0)"
  python: Objects/stringobject.c:115: PyString_FromString: Assertion `str != ((void *)0)' failed.
  Aborted (core dumped)
- repeat all of the above for
In each case, gdb should give you meaningful information at the Python level, as well as at the C level.
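A crashing child process for the ABRT test case can be produced from a small harness. This is a sketch under assumptions: the helper names are hypothetical, and it reuses the ctypes.string_at one-liner from the list above with address 0. How the child dies depends on the platform and on whether the interpreter is a debug build, so the harness only reports what happened rather than expecting a particular signal.

```python
import subprocess
import sys

# Reads a C string from address 0, which is not expected to be mapped.
CRASHING_SNIPPET = "import ctypes; ctypes.string_at(0)"


def run_snippet(snippet):
    """Run a one-liner in a separate interpreter so a crash stays contained."""
    return subprocess.run(
        [sys.executable, "-c", snippet],
        capture_output=True,
        text=True,
    )


def describe_exit(returncode):
    """Summarise how a child ended; on POSIX a negative code means a signal."""
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exited with status %d" % returncode


if __name__ == "__main__":
    print(describe_exit(run_snippet(CRASHING_SNIPPET).returncode))
```

Running the crash in a child keeps the test driver alive, and the resulting core dump is what gdb or ABRT would then be asked to interpret.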
(Once this is actually working, I'll post some before/after comparisons of "screenshots" in gdb i.e. textual dumps)
Dependencies
This feature will require coordination with, and possible changes in, the
Contingency Plan
The contingency plan would be to remove the additional .py files, deactivating the feature.
Documentation
See the "Detailed Description" section above; this feature page contains much information.
Release Notes
- Python: the gdb debugger has been extended so that it can report detailed information on the internals of the Python 2 and Python 3 runtimes. Backtraces involving Python will now by default show mixed C and Python-level information on what such processes are doing, without requiring expertise in the use of