How to write boost.python converters

Introduction

The boost.python library is powerful, sometimes subtle, and, when
interacting with python objects, tries to make it possible to write
C++ that “feels” like Python. This is as compared with the Python C
API, where the experience is very far removed from writing python
code.

Part of making C++ feel more like Python is allowing natural
assignment of C++ objects to Python variables. For instance, assigning
an STL string to a python object looks like this:

// Create a C++ string
std::string msg("Hello, Python");

// Assign it to a python object
boost::python::object py_msg = msg;

Likewise (though somewhat less naturally), it is also important to be
able to extract C++ objects from Python objects. Boost.python provides
the extract type for this:

boost::python::object obj = ... ;
std::string msg = boost::python::extract<std::string>(obj);

To allow this kind of natural assignment, boost.python provides a
system for registering converters between the languages. Unfortunately, the boost.python documentation does a pretty poor job of describing how to write them. A bit of searching on the internet will turn up a few links like these:

While these are fine (and, in truth, are the basis for what I know
about the conversion system), they are not as explicit as I would
like.

So, in an effort to clarify the conversion system both for myself and
(hopefully) others, I wrote this little primer. I’ll step through a
full example showing how to write converters for Qt’s QString class. In the end, you should have all the information you need to write and register your own converters.

Converting QString

A boost.python type converter consists of two major parts. The first
part, which is generally the simpler of the two, converts a C++ type
into a Python type. I’ll refer to this as the to-python converter. The
second part converts a Python object into a C++ type. I’ll refer to
this as the from-python converter.

In order to have your converters be used at runtime, the boost.python
framework requires you to register them. The boost.python API provides
separate methods for registering to-python and from-python
converters. Because of this, you are free to provide conversion in
only one direction for a type if you so choose.

Note that, for certain elements of what I’m about to describe, there
is more than one way to do things. For example, in some cases where I
choose to use static member functions, you could also use free
functions. I won’t point these out, but if you wear your C++
thinking-cap you should be able to see what is mandatory and what
isn’t.

to-python Converters

A to-python converter converts a C++ type to a Python object. From an
API perspective, a to-python converter is used any time that you
construct a boost::python::object from another C++ type. For
example:

// Construct object from an int
boost::python::object int_obj(42);

// Construct object from a string
boost::python::object str_obj = std::string("llama");

// Construct object from a user-defined type
Foo foo;
boost::python::object foo_obj(foo);

You implement a to-python converter using a struct with a static member
function named convert. convert takes the C++ object to be
converted as its argument, and it returns a PyObject*. A
to-python converter for QStrings looks like this:

/** to-python converter for QStrings */
struct QString_to_python_str
{
    static PyObject* convert(QString const& s)
      {
        return boost::python::incref(
          boost::python::object(
            s.toLatin1().constData()).ptr());
      }
};

The crux of what this does is as follows:

  1. Extract the QString’s underlying character data using
    toLatin1().constData()
  2. Construct a boost::python::object with the character data
  3. Retrieve the boost::python::object’s PyObject* with the ptr()
    function
  4. Increment the reference count on the PyObject* and return that
    pointer.

That last step bears a little explanation. Suppose that you didn’t
increment the ref-count on the returned pointer. As soon as the
function returned, the boost::python::object in the function would
destruct, thereby reducing the ref-count to zero. When a PyObject’s
ref-count goes to zero, Python considers the object dead and
deallocates it, meaning you would return a pointer to a deallocated
object from convert().
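
To make the ownership argument concrete, here is a toy model in plain C++ rather than boost.python itself. ToyPyObject, ToyHandle, and the two convert_* functions are hypothetical names I made up for this sketch; the handle plays the role of boost::python::object, holding one reference for as long as it lives.

```cpp
#include <cassert>

// Toy stand-in for a PyObject: a reference count plus a flag recording
// whether the object has been "deallocated". (Hypothetical type.)
struct ToyPyObject {
    int refcount = 0;
    bool alive = true;
};

inline void toy_incref(ToyPyObject* p) { ++p->refcount; }
inline void toy_decref(ToyPyObject* p) {
    if (--p->refcount == 0) p->alive = false;  // real Python would free here
}

// Toy stand-in for boost::python::object: takes one reference on
// construction and releases it in its destructor.
class ToyHandle {
public:
    explicit ToyHandle(ToyPyObject* p) : p_(p) { toy_incref(p_); }
    ~ToyHandle() { toy_decref(p_); }
    ToyHandle(const ToyHandle&) = delete;
    ToyHandle& operator=(const ToyHandle&) = delete;
    ToyPyObject* ptr() const { return p_; }
private:
    ToyPyObject* p_;
};

// Wrong: the local handle holds the only reference, so the object is
// already dead when the caller receives the pointer.
ToyPyObject* convert_without_incref(ToyPyObject* fresh) {
    ToyHandle h(fresh);
    return h.ptr();
}

// Right: take an extra reference on behalf of the caller before the
// local handle is destroyed.
ToyPyObject* convert_with_incref(ToyPyObject* fresh) {
    ToyHandle h(fresh);
    toy_incref(h.ptr());
    return h.ptr();
}
```

After convert_without_incref() the toy object's count is zero and it is flagged dead; after convert_with_incref() the count is one and the caller owns that reference.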

Once you’ve written the to-python converter for a type, you need to
register it with boost.python’s runtime. You do this with the
aptly-named to_python_converter template:

// register the QString-to-python converter
boost::python::to_python_converter<
  QString,
  QString_to_python_str>();

The first template parameter is the C++ type for which you’re
registering a converter. The second is the converter struct. Notice
that this registration process is done at runtime; you need to call
the registration functions before you try to do any custom type
converting.

from-python Converters

from-python converters are slightly more complex because, beyond simply
providing a function to convert from Python to C++, they also have to
provide a function that determines if a Python type can safely be
converted to the C++ type. Likewise, they often require more knowledge
of the Python C API.

from-python converters are used whenever boost.python’s extract
type is called. For example:

// get an int from a python object
int x = boost::python::extract<int>(int_obj);

// get an STL string from a python object
std::string s = boost::python::extract<std::string>(str_obj);

// get a user-defined type from a python object
Foo foo = boost::python::extract<Foo>(foo_obj);

The recipe I use for creating from-python converters is similar to
to-python converters: create a struct with some static methods and
register those with the boost.python runtime system.

The first method you’ll need to define is used to determine whether an
arbitrary Python object is convertible to the type you want to
extract. If the conversion is OK, this function should return the
PyObject*; otherwise, it should return NULL. So, for QStrings you would
write:

struct QString_from_python_str
{

  . . .

  // Determine if obj_ptr can be converted to a QString
  static void* convertible(PyObject* obj_ptr)
    {
      if (!PyString_Check(obj_ptr)) return 0;
      return obj_ptr;
    }

  . . .

};

This simply says that a PyObject* can be converted to a QString if
it is a Python string.

The second method you’ll need to write does the actual conversion. The primary
trick in this method is that boost.python will provide you with a
chunk of memory into which you must in-place construct your new C++
object. All of the funny “rvalue_from_python” stuff just has to do
with boost.python’s method for providing you with that memory chunk:

struct QString_from_python_str
{

  . . .

  // Convert obj_ptr into a QString
  static void construct(
    PyObject* obj_ptr,
    boost::python::converter::rvalue_from_python_stage1_data* data)
    {
      // Extract the character data from the python string
      const char* value = PyString_AsString(obj_ptr);

      // Verify that obj_ptr is a string (should be ensured by convertible())
      assert(value);

      // Grab pointer to memory into which to construct the new QString
      void* storage = (
        (boost::python::converter::rvalue_from_python_storage<QString>*)
        data)->storage.bytes;

      // in-place construct the new QString using the character data
      // extracted from the python object
      new (storage) QString(value);

      // Stash the memory chunk pointer for later use by boost.python
      data->convertible = storage;
    }

  . . .

};
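
The in-place construction step can be tried out in isolation with nothing but standard C++. In this sketch, ToyStorage is a hypothetical stand-in for rvalue_from_python_storage<T>, and construct_in_place and consume are names I made up; it only mimics the pattern and is not how boost.python is implemented internally.

```cpp
#include <cassert>
#include <new>
#include <string>

// Hypothetical stand-in for rvalue_from_python_storage<T>: a pointer that
// is set once an object has been constructed, plus raw, suitably aligned
// bytes to hold the object itself.
template <typename T>
struct ToyStorage {
    void* convertible = nullptr;
    alignas(T) unsigned char bytes[sizeof(T)];
};

// Mirrors the shape of construct(): build the object inside the buffer
// we were handed, then record where it lives.
void construct_in_place(const char* value, ToyStorage<std::string>* data) {
    void* storage = data->bytes;
    new (storage) std::string(value);  // placement new into the buffer
    data->convertible = storage;
}

// Plays the role of the framework: use the constructed object, then
// destroy it explicitly, since placement new has no matching delete.
template <typename T>
T consume(ToyStorage<T>* data) {
    T* obj = static_cast<T*>(data->convertible);
    T copy = *obj;
    obj->~T();
    data->convertible = nullptr;
    return copy;
}
```

Calling construct_in_place("llama", &data) followed by consume(&data) yields the string "llama". As far as I understand it, boost.python takes care of the destruction step itself; the explicit destructor call in consume() only stands in for that.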

The final step for from-python converters is, of course, to register
the converter. To do this, you use
boost::python::converter::registry::push_back(). The first
argument is a pointer to the function which tests for convertibility,
the second is a pointer to the conversion function, and the third is a
boost::python::type_id for the C++ type. In this case, we’ll put the
registration into the constructor for the struct we’ve been building
up:

struct QString_from_python_str
{
  QString_from_python_str()
    {
      boost::python::converter::registry::push_back(
        &convertible,
        &construct,
        boost::python::type_id<QString>());
    }

  . . .

};

Now, if you simply construct a single QString_from_python_str
object in your initialization code (just like how you called
to_python_converter() for the to-python registration), conversion
from Python strings to QString will be enabled.
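
As a mental model of what registration buys you, the sketch below imitates the two-step protocol in plain C++: each registered converter contributes a convertible check and a construct function, and extraction tries the checks in order until one accepts. Every name here (ToyValue, ToyEntry, ToyRegistry, String_from_toy_str, toy_extract) is hypothetical, and the real registry is keyed by C++ type and considerably more involved.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a Python object: either a string or an integer.
struct ToyValue {
    bool is_string;
    std::string text;
    long number;
};

// One registered converter: a check plus the function that does the work.
struct ToyEntry {
    void* (*convertible)(ToyValue*);
    void (*construct)(ToyValue*, std::string*);
};

// Hypothetical registry of converters that target std::string.
struct ToyRegistry {
    std::vector<ToyEntry> entries;
    void push_back(void* (*check)(ToyValue*),
                   void (*build)(ToyValue*, std::string*)) {
        entries.push_back(ToyEntry{check, build});
    }
};

// A converter in the same shape as QString_from_python_str.
struct String_from_toy_str {
    static void* convertible(ToyValue* v) { return v->is_string ? v : nullptr; }
    static void construct(ToyValue* v, std::string* out) { *out = v->text; }
};

// Step 1: find the first converter whose check accepts the value.
// Step 2: let that converter build the result.
std::string toy_extract(const ToyRegistry& registry, ToyValue* v) {
    for (const ToyEntry& e : registry.entries) {
        if (e.convertible(v)) {
            std::string result;
            e.construct(v, &result);
            return result;
        }
    }
    throw std::runtime_error("no registered converter accepts this value");
}
```

With String_from_toy_str registered, toy_extract() returns the text of a string value and throws for an integer value, because no converter's check accepts it.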

Taking a reference to the PyObject in construct()

One gotcha to be aware of in your construct() function is that the
PyObject argument is a ‘borrowed’ reference. That is, its reference
count has not already been incremented for you. If you plan to keep a
reference to that object, you must use boost.python’s borrowed
construct. For example:

class MyClass
{
public:
  MyClass(boost::python::object obj) : obj_ (obj) {}

private:
  boost::python::object obj_;
};

struct MyClass_from_python
{
  . . .

  static void construct(
    PyObject* obj_ptr,
    boost::python::converter::rvalue_from_python_stage1_data* data)
    {
      using namespace boost::python;

      void* storage = (
        (converter::rvalue_from_python_storage<MyClass>*)
        data)->storage.bytes;

      // Use borrowed to construct the object so that a reference
      // count will be properly handled.
      handle<> hndl(borrowed(obj_ptr));
      new (storage) MyClass(object(hndl));

      data->convertible = storage;
    }
};

Failing to use borrowed in this situation will generally lead to
memory corruption and/or garbage collection errors in the python
runtime.
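
The difference between adopting a pointer and borrowing it can be seen in a small plain-C++ model. ToyPyObject, ToyBorrowed, toy_borrowed, and ToyHandle are hypothetical names for this sketch; they only imitate the counting behaviour of handle<> with and without borrowed().

```cpp
#include <cassert>

// Toy object that starts with one owner: the caller of construct().
struct ToyPyObject { int refcount = 1; };

// Tag type in the spirit of boost::python::borrowed (hypothetical).
struct ToyBorrowed { ToyPyObject* p; };
inline ToyBorrowed toy_borrowed(ToyPyObject* p) { return ToyBorrowed{p}; }

class ToyHandle {
public:
    // Adopts a reference that someone already owns: no increment.
    explicit ToyHandle(ToyPyObject* p) : p_(p) {}
    // Borrowed pointer: take a reference of our own.
    explicit ToyHandle(ToyBorrowed b) : p_(b.p) { ++p_->refcount; }
    // Either way, one reference is released on destruction.
    ~ToyHandle() { --p_->refcount; }
    ToyHandle(const ToyHandle&) = delete;
    ToyHandle& operator=(const ToyHandle&) = delete;
private:
    ToyPyObject* p_;
};

// Wrong for a borrowed pointer: the handle releases a reference it never took.
int count_after_adopting(ToyPyObject* borrowed_ptr) {
    { ToyHandle h(borrowed_ptr); }
    return borrowed_ptr->refcount;
}

// Right: the increment and the decrement balance out.
int count_after_borrowing(ToyPyObject* borrowed_ptr) {
    { ToyHandle h(toy_borrowed(borrowed_ptr)); }
    return borrowed_ptr->refcount;
}
```

In the adopting case the caller's only reference has been released behind its back, which is the toy version of the corruption described above.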

For more information on boost.python objects, handles, and reference
counting, see the following:

When converters don’t exist

Finally, a cautionary note. The boost.python type-conversion system
works well, not only at the job of moving objects across the
C++/Python language barrier, but at making code easier to read and
understand. You must always keep in mind, though, that this comes at
the cost of very little compile-time checking.

That is, the boost::python::object constructor is templatized and
accepts any type without complaint. This means that your code will
compile just fine even if you’re constructing boost::python::objects
from types that have no registered converter. At runtime, these
constructors will find that they have no converter for the requested
type and this will result in exceptions.

These exceptions (boost::python::error_already_set) will tend to
happen in unexpected places, and you could spend quite a bit of time
trying to figure them out. I say all of this so that maybe, when
you encounter strange exceptions when using boost::python, you’ll
remember to check that your converters are registered first. Hopefully
it’ll save you some time.
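
The compile-time versus run-time gap is easy to reproduce in miniature. The sketch below is plain C++ with a hypothetical ToyToPythonRegistry (not a boost.python class): convert() compiles for any type, and a missing registration only shows up as an exception when the call actually runs.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <stdexcept>
#include <string>
#include <typeindex>
#include <typeinfo>

// Hypothetical registry mapping a C++ type to a function that renders a
// value of that type as text.
class ToyToPythonRegistry {
public:
    template <typename T>
    void add(std::string (*fn)(const T&)) {
        table_[std::type_index(typeid(T))] =
            [fn](const void* p) { return fn(*static_cast<const T*>(p)); };
    }

    // Compiles for every T; whether a converter exists is only known
    // at run time.
    template <typename T>
    std::string convert(const T& value) const {
        auto it = table_.find(std::type_index(typeid(T)));
        if (it == table_.end())
            throw std::runtime_error("no converter registered for this type");
        return it->second(&value);
    }

private:
    std::map<std::type_index, std::function<std::string(const void*)>> table_;
};

struct Registered { int n; };
struct Unregistered {};

std::string registered_to_text(const Registered& r) {
    return "Registered(" + std::to_string(r.n) + ")";
}
```

After reg.add<Registered>(&registered_to_text), the call reg.convert(Registered{7}) succeeds, while reg.convert(Unregistered{}) compiles just as happily and throws only when executed.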

Appendix: Full code for QString converter

struct QString_to_python_str
{
    static PyObject* convert(QString const& s)
      {
        return boost::python::incref(
          boost::python::object(
            s.toLatin1().constData()).ptr());
      }
};

struct QString_from_python_str
{
    QString_from_python_str()
    {
      boost::python::converter::registry::push_back(
        &convertible,
        &construct,
        boost::python::type_id<QString>());
    }

    // Determine if obj_ptr can be converted to a QString
    static void* convertible(PyObject* obj_ptr)
    {
      if (!PyString_Check(obj_ptr)) return 0;
      return obj_ptr;
    }

    // Convert obj_ptr into a QString
    static void construct(
      PyObject* obj_ptr,
      boost::python::converter::rvalue_from_python_stage1_data* data)
    {
      // Extract the character data from the python string
      const char* value = PyString_AsString(obj_ptr);

      // Verify that obj_ptr is a string (should be ensured by convertible())
      assert(value);

      // Grab pointer to memory into which to construct the new QString
      void* storage = (
        (boost::python::converter::rvalue_from_python_storage<QString>*)
        data)->storage.bytes;

      // in-place construct the new QString using the character data
      // extracted from the python object
      new (storage) QString(value);

      // Stash the memory chunk pointer for later use by boost.python
      data->convertible = storage;
    }
};

void initializeConverters()
{
  using namespace boost::python;

  // register the to-python converter
  to_python_converter<
    QString,
    QString_to_python_str>();

  // register the from-python converter
  QString_from_python_str();
}
  1. #1 by website design on 2009/09/27 - 16:51

    Yeah, its a good system but it can be daunting to do things that arent spelled out very clearly in the documentation.

  2. #2 by Kirit on 2009/09/27 - 17:54

    This also took me some time to work out. To augment your examples, here are links to another string converter and one to a more complex type (JSON).

    http://svn.felspar.com/public/fost-py/trunk/Cpp/fost-python/pystring.cpp
    http://svn.felspar.com/public/fost-py/trunk/Cpp/fost-python/pyjson.cpp

    The last part that this adds and you don’t mention is that you must ensure that the registration happens exactly once. Depending on what you’re doing this might not be so easy. The registration functions in the examples can be run from every extension DLL/so/dylib and the types will only be registered once.

    • #3 by abingham on 2009/10/01 - 08:06

      I hadn’t realized that one-time registration was required. What happens if registration happens more than once?

  3. #4 by Bhāskara II on 2009/10/01 - 04:16

    I suggest that when using boost::python, the key is to remember to check that your converters are registered first; otherwise, you’ll run into trouble. Just a suggestion.

    • #5 by abingham on 2009/10/01 - 08:11

      Yes, definitely. Did any part of my post suggest otherwise? If so, I’ll try to clean it up. The registration is, as you say, critical.

  4. #6 by DygraphicProgrammer on 2009/10/02 - 17:25

    Thank you for writing this. boost::python is really powerful, but the documentation blows. I would like to encourage you to write more along this line.

    • #7 by abingham on 2009/10/02 - 18:29

      I’d be happy to write more on this topic or boost.python in general. Is there anything in particular that you’d like written up?

  5. #8 by peter on 2010/12/28 - 15:53

    I have a question on how to convert arbitrary python type to c++ type. Here is my case:
    In upgrade.py VersionInfo is defined as

    class VersionInfo(object):
        def __init__(self, location_uri=''):
            self.location_uri = location_uri

    def func():
        return VersionInfo()

    In main.cpp
    PyRun_String("…\nresult=func()", …);

    My question is how to make a converter for VersionInfo, since in your QString converter, the construct function you defined use PyString_AsString to extract python string data, but how about those arbitrary python object, how to extract their data and construct corresponding c++ object?

    • #9 by abingham on 2011/01/01 - 22:50

      Peter, the main way to approach what you’re trying to do is to access the attributes on your python objects using the boost::python::object::attr() method. In the example you give, your construct method might look like this (it assumes that you have some C++ class called VersionInfo):

      static void construct(
        PyObject* obj_ptr,
        boost::python::converter::rvalue_from_python_stage1_data* data)
      {
        // (assumes: namespace bp = boost::python;)
        // First just create a bp::object around obj_ptr for convenience.
        // obj_ptr is a borrowed reference, so wrap it with bp::borrowed.
        bp::object obj(bp::handle<>(bp::borrowed(obj_ptr)));

        // Get the location_uri attribute from the object
        bp::str location_uri = bp::extract<bp::str>(obj.attr("location_uri"));

        // Grab pointer to memory into which to construct the new VersionInfo
        void* storage = (
          (boost::python::converter::rvalue_from_python_storage<VersionInfo>*)
          data)->storage.bytes;

        // in-place construct the new VersionInfo using the character data
        // extracted from the location_uri python object
        new (storage) VersionInfo(
          PyString_AsString(location_uri.ptr()));

        // Stash the memory chunk pointer for later use by boost.python
        data->convertible = storage;
      }

      Note that this is mostly just off the top of my head…it should be correct in spirit, though there may be issues with it!

      The main point is that bp::object::attr() gives you access to attributes on python objects. You can use the python C API to accomplish exactly the same thing, but I find boost.python to be generally simpler and clearer.

      I’d like to make some more posts on this and related topics. Maybe I’ll find time in the near future. In the meantime, let me know if you have any more questions.

  6. #10 by Faheem Mitha on 2012/02/08 - 05:03

    It is ironical that this is the best documentation for Boost Python converters, which are arguably a central feature, and certainly a killer feature, of this library, but yet are almost completely undocumented in the official documentation. Have you thought of adding this in a more official place in the Boost Python documentation? Maybe the BP wiki?

    • #11 by abingham on 2012/02/08 - 17:17

      Thanks! Yeah, I was very frustrated about the lack of documentation for this feature. It hadn’t occurred to me to try getting it into boost, but I’ll look into it.

  7. #12 by Andy on 2012/07/18 - 18:36

    +1 to Faheem’s comment. This is the best documentation for conversion, and it would be great to put it into the standard Boost docs.

    • #13 by abingham on 2012/07/19 - 07:00

      OK, I’ll try to make this a priority 🙂

  8. #14 by Jonatan on 2013/02/22 - 22:14

    Any chance you could update this post to Python 3?

    • #15 by abingham on 2013/02/24 - 20:37

      That’s a good idea. I’ll try to get to it soon.
