Embedding Python in C++ Applications with boost::python: Part 4

Posted on 06 January 2012 by Joseph

In Part 2 of this ongoing tutorial, I introduced code for parsing Python exceptions from C++. In Part 3, I implemented a simple configuration parsing class utilizing the Python ConfigParser module. As part of that implementation, I mentioned that for a project of any scale, one would want to catch and deal with Python exceptions within the class, so that clients of the class wouldn’t have to know about the details of Python. From the perspective of a caller, then, the class would be just like any other C++ class.

The obvious way of handling the Python exceptions would be to handle them in each function. For example, the get function of the C++ ConfigParser class we created would become:

std::string ConfigParser::get(const std::string &attr, const std::string &section){
    try{
        return py::extract<std::string>(conf_parser_.attr("get")(section, attr));
    }catch(boost::python::error_already_set const &){
        std::string perror_str = parse_python_exception();
        throw std::runtime_error("Error getting configuration option: " + perror_str);
    }
}

The error handling code remains the same, but now the main function becomes:

int main(){
    Py_Initialize();
    try{
        ConfigParser parser;
        parser.parse_file("conf_file.1.conf");
        ...
        // Will raise a NoOption exception
        cout << "Proxy host: " << parser.get("ProxyHost", "Network") << endl;
    }catch(exception &e){
        cout << "Here is the error, from a C++ exception: " << e.what() << endl;
    }
}

When the Python exception is raised, it will be parsed and repackaged as a std::runtime_error, which is caught at the caller and handled like a normal C++ exception (i.e. without having to go through the parse_python_exception rigmarole). For a project that only has a handful of functions or a class or two utilizing embedded Python, this will certainly work. For a larger project, though, one wants to avoid the large amount of duplicated code and the errors it will inevitably bring.

For my implementation, I wanted to always handle the errors in the same way, but I needed a way to call different functions with different signatures. I decided to leverage another powerful area of the boost library: the functors library, and specifically boost::bind and boost::function. boost::function provides functor class wrappers, and boost::bind (among other things) binds arguments to functions. The two together, then, enable the passing of functions and their arguments that can be called at a later time. Just what the doctor ordered!
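The deferred-call idea can be tried out without any Python involved. Here is a minimal sketch, assuming a C++11 compiler, where std::function and std::bind are the standard-library counterparts of boost::function and boost::bind. The names join_with_dash and make_deferred_call are made up for this illustration.

```cpp
#include <functional>
#include <string>

// Hypothetical function with a two-argument signature.
std::string join_with_dash(const std::string &left, const std::string &right) {
    return left + "-" + right;
}

// Bind both arguments now. The returned functor takes no arguments and
// runs join_with_dash only when it is eventually invoked.
std::function<std::string()> make_deferred_call() {
    return std::bind(&join_with_dash,
                     std::string("Network"), std::string("ProxyHost"));
}
```

Calling the functor returned by make_deferred_call() runs join_with_dash with the stored arguments. Whoever makes that call needs to know only the return type, not the original signature.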

To utilize the functor, the function needs to know about the return type. Since we're wrapping functions with different signatures, a function template does the trick nicely:

template <class return_type>
return_type call_python_func(boost::function<return_type ()> to_call, const std::string &error_pre){
    std::string error_str(error_pre);

    try{
        return to_call();
    }catch(boost::python::error_already_set const &){
        error_str = error_str + parse_python_exception();
        throw std::runtime_error(error_str);
    }
}

This function takes the functor object for a function calling boost::python functions. Each function that calls boost::python code will now be split into two functions: the private core function that uses the Python functionality and a public wrapper function that uses the call_python_func function. Here is the updated get function and its partner:

string ConfigParser::get(const string &attr, const string &section){
    return call_python_func<string>(boost::bind(&ConfigParser::get_py, this, attr, section),
                                    "Error getting configuration option: ");
}

string ConfigParser::get_py(const string &attr, const string &section){
    return py::extract<string>(conf_parser_.attr("get")(section, attr));
}

The get function binds the passed-in arguments, along with the implicit this pointer, to the get_py function, which in turn calls the boost::python functions necessary to perform the action. Simple and effective.

Of course, there is a tradeoff associated here. Instead of the repeated code of the try...catch blocks and Python error handling, there are double the number of functions declared per class. For my purposes, I prefer the second form, as it more effectively utilizes the compiler to find errors, but mileage may vary. The most important point is to handle Python errors at a level of code that understands Python. If your entire application needs to understand Python, you should consider rewriting in Python rather than embedding, perhaps with some C++ modules as needed.
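If a C++11 compiler is available, a lambda is one possible way to soften this tradeoff, since the wrapped call can be written inline instead of in a partner function. The following is only a sketch under that assumption, not the code from the repo. fake_python_error stands in for boost::python::error_already_set so the example compiles without Boost, and call_wrapped, lookup, and get_option are hypothetical names.

```cpp
#include <functional>
#include <stdexcept>
#include <string>

// Stand-in for boost::python::error_already_set so the sketch compiles
// without Boost or Python. It is NOT part of any real library.
struct fake_python_error {};

// Same shape as call_python_func, but built on std::function.
template <class return_type>
return_type call_wrapped(std::function<return_type()> to_call,
                         const std::string &error_pre) {
    try {
        return to_call();
    } catch (fake_python_error const &) {
        // The real code would append parse_python_exception() here.
        throw std::runtime_error(error_pre + "details from the interpreter");
    }
}

// Hypothetical lookup that fails for any option other than "Directory".
std::string lookup(const std::string &attr) {
    if (attr != "Directory")
        throw fake_python_error();
    return "/tmp";
}

// Public wrapper: the lambda takes the place of boost::bind and of a
// separate partner function such as get_py.
std::string get_option(const std::string &attr) {
    return call_wrapped<std::string>([&attr]() { return lookup(attr); },
                                     "Error getting configuration option: ");
}
```

In the real class, the lambda would capture this and its body would be the py::extract line from get_py. The cost is that the Python-specific code sits inside the public function again, although the try...catch logic still exists in only one place.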

As always, you can follow along with the tutorial by cloning the github repo.

Embedding Python in C++ Applications with boost::python: Part 3

Posted on 05 January 2012 by Joseph

In Part 2 of this tutorial, I covered a methodology for handling exceptions thrown from embedded Python code from within the C++ part of your application. This is crucial for debugging your embedded Python code. In this tutorial, we will create a simple C++ class that leverages Python functionality to handle an often-irritating part of developing real applications: configuration parsing.

In an attempt to not draw ire from the C++ elites, I am going to say this in a diplomatic way: I suck at complex string manipulations in C++. STL strings and stringstreams greatly simplify the task, but performing application-level tasks, and performing them in a robust way, always results in me writing more code than I would really like. As a result, I recently rewrote the configuration parsing mechanism from Granola Connect (the daemon in Granola Enterprise that handles communication with the Granola REST API) using embedded Python and specifically the ConfigParser module.

Of course, string manipulations and configuration parsing are just an example. For Part 3, I could have chosen any number of tasks that are difficult in C++ and easy in Python (web connectivity, for instance), but the configuration parsing class is a simple yet complete example of embedding Python for something of actual use. Grab the code from the Github repo for this tutorial to play along.

First, let’s create a class definition that covers very basic configuration parsing: read and parse INI-style files, extract string values given a name and a section, and set string values for a given section. Here is the class declaration:

class ConfigParser{
    private:
        boost::python::object conf_parser_;

        void init();
    public:
        ConfigParser();

        bool parse_file(const std::string &filename);
        std::string get(const std::string &attr,
                        const std::string &section = "DEFAULT");
        void set(const std::string &attr,
                 const std::string &value,
                 const std::string &section = "DEFAULT");
};

The ConfigParser module offers far more features than we will cover in this tutorial, but the subset we implement here should serve as a template for implementing more complex functionality. The implementation of the class is fairly simple; first, the constructor loads the main module, extracts the dictionary, imports the ConfigParser module into the namespace, and creates a boost::python::object member variable holding a RawConfigParser object:

ConfigParser::ConfigParser(){
    py::object mm = py::import("__main__");
    py::object mn = mm.attr("__dict__");
    py::exec("import ConfigParser", mn);
    conf_parser_ = py::eval("ConfigParser.RawConfigParser()", mn);
}

The file parsing and the getting and setting of values are performed using this conf_parser_ object:

bool ConfigParser::parse_file(const std::string &filename){
    return py::len(conf_parser_.attr("read")(filename)) == 1;
}

std::string ConfigParser::get(const std::string &attr, const std::string &section){
    return py::extract<std::string>(conf_parser_.attr("get")(section, attr));
}

void ConfigParser::set(const std::string &attr, const std::string &value, const std::string &section){
    conf_parser_.attr("set")(section, attr, value);
}

In this simple example, for the sake of brevity exceptions are allowed to propagate. In a more complex environment, you will almost certainly want to have the C++ class handle and repackage the Python exceptions as C++ exceptions. This way you could later create a pure C++ class if performance or some other concern became an issue.
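To make the "pure C++ class later" point concrete, here is a rough sketch of a stand-in with the same get and set calls. InMemoryConfig is a hypothetical name, file parsing is left out entirely, and unlike the Python module it does not fall back to the DEFAULT section for missing options. Those simplifications are mine, not part of the tutorial code.

```cpp
#include <map>
#include <stdexcept>
#include <string>

// Hypothetical pure C++ stand-in exposing the same get/set calls as the
// Python-backed ConfigParser. File parsing is deliberately omitted.
class InMemoryConfig {
    // section name -> (option name -> value)
    std::map<std::string, std::map<std::string, std::string> > sections_;

public:
    void set(const std::string &attr, const std::string &value,
             const std::string &section = "DEFAULT") {
        sections_[section][attr] = value;
    }

    std::string get(const std::string &attr,
                    const std::string &section = "DEFAULT") const {
        auto sec = sections_.find(section);
        if (sec != sections_.end()) {
            auto opt = sec->second.find(attr);
            if (opt != sec->second.end())
                return opt->second;
        }
        // Callers see only a standard C++ exception, never a Python one.
        throw std::runtime_error("No option '" + attr +
                                 "' in section '" + section + "'");
    }
};
```

A caller that catches only std::exception would work unchanged with either class, which is the practical payoff of repackaging the Python exceptions.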

To use the class, calling code can simply treat it as a normal C++ class:

int main(){
    Py_Initialize();
    try{
        ConfigParser parser;
        parser.parse_file("conf_file.1.conf");
        cout << "Directory (file 1): " << parser.get("Directory", "DEFAULT") << endl;
        parser.parse_file("conf_file.2.conf");
        cout << "Directory (file 2): " << parser.get("Directory", "DEFAULT") << endl;
        cout << "Username: " << parser.get("Username", "Auth") << endl;
        cout << "Password: " << parser.get("Password", "Auth") << endl;
        parser.set("Directory", "values can be arbitrary strings", "DEFAULT");
        cout << "Directory (force set by application): " << parser.get("Directory") << endl;
        // Will raise a NoOption exception 
        // cout << "Proxy host: " << parser.get("ProxyHost", "Network") << endl;
    }catch(boost::python::error_already_set const &){
        string perror_str = parse_python_exception();
        cout << "Error during configuration parsing: " << perror_str << endl;
    }
}

And that's that: a key-value configuration parser with sections and comments in under 50 lines of code. This is just the tip of the iceberg too. In almost the same length of code, you can do all sorts of things that would be at best painful and at worst error-prone and time-consuming in C++: configuration parsing, list and set operations, web connectivity, file format operations (think XML/JSON), and myriad other tasks are already implemented in the Python standard library.

In Part 4, I'll take a look at how to more robustly and generically call Python code using functors and a Python namespace class.

Embedding Python in C++ Applications with boost::python: Part 2

Posted on 04 January 2012 by Joseph

In Part 1, we took a look at embedding Python in C++ applications, including several ways of calling Python code from your application. Though I earlier promised a full implementation of a configuration parser in Part 2, I think it’s more constructive to take a look at error parsing. Once we have a good way to handle errors in Python code, I’ll create the promised configuration parser in Part 3. Let’s jump in!

If you got yourself a copy of the git repo for the tutorial and were playing around with it, you may have experienced the way boost::python handles Python errors – the error_already_set exception type. If not, the following code will generate the exception:

    namespace py = boost::python;
    ...
    Py_Initialize();
    ...
    py::object rand_mod = py::import("fake_module");

…which outputs the not-so-helpful:

terminate called after throwing an instance of 'boost::python::error_already_set'
Aborted

In short, any errors that occur in the Python code that boost::python handles will cause the library to raise this exception; unfortunately, the exception does not encapsulate any of the information about the error itself. To extract information about the error, we’re going to have to resort to using the Python C API and some Python itself. First, catch the error:

    try{
        Py_Initialize();
        py::object rand_mod = py::import("fake_module");
    }catch(boost::python::error_already_set const &){
        std::string perror_str = parse_python_exception();
        std::cout << "Error in Python: " << perror_str << std::endl;
    }

Above, we've called the parse_python_exception function to extract the error string and print it. As this suggests, the exception data is stored statically in the Python library and not encapsulated in the exception itself. The first step in the parse_python_exception function, then, is to extract that data using the PyErr_Fetch Python C API function:

std::string parse_python_exception(){
    PyObject *type_ptr = NULL, *value_ptr = NULL, *traceback_ptr = NULL;
    PyErr_Fetch(&type_ptr, &value_ptr, &traceback_ptr);
    std::string ret("Unfetchable Python error");
    ...

As there may be all, some, or none of the exception data available, we set up the returned string with a fallback value. Next, we try to extract and stringify the type data from the exception information:

    ...
    if(type_ptr != NULL){
        py::handle<> h_type(type_ptr);
        py::str type_pstr(h_type);
        py::extract<std::string> e_type_pstr(type_pstr);
        if(e_type_pstr.check())
            ret = e_type_pstr();
        else
            ret = "Unknown exception type";
    }
    ...

In this block, we first check that there is actually a valid pointer to the type data. If there is, we construct a boost::python::handle to the data from which we then create a str object. This conversion should ensure that a valid string extraction is possible, but to double check we create an extract object, check the object, and then perform the extraction if it is valid. Otherwise, we use a fallback string for the type information.

Next, we perform a very similar set of steps on the exception value:

    ...
    if(value_ptr != NULL){
        py::handle<> h_val(value_ptr);
        py::str a(h_val);
        py::extract<std::string> returned(a);
        if(returned.check())
            ret +=  ": " + returned();
        else
            ret += std::string(": Unparseable Python error: ");
    }
    ...

We append the value string to the existing error string. The value string is, for most built-in exception types, the readable string describing the error.

Finally, we extract the traceback data:

    if(traceback_ptr != NULL){
        py::handle<> h_tb(traceback_ptr);
        py::object tb(py::import("traceback"));
        py::object fmt_tb(tb.attr("format_tb"));
        py::object tb_list(fmt_tb(h_tb));
        py::object tb_str(py::str("\n").join(tb_list));
        py::extract<std::string> returned(tb_str);
        if(returned.check())
            ret += ": " + returned();
        else
            ret += std::string(": Unparseable Python traceback");
    }
    return ret;
}

The traceback goes similarly to the type and value extractions, except for the extra step of formatting the traceback object as a string. For that, we import the traceback module. From traceback, we then extract the format_tb function and call it with the handle to the traceback object. This generates a list of traceback strings which we then join into a single string. Not the prettiest printing, perhaps, but it gets the job done. Finally, we extract the C++ string type as above and append it to the returned error string and return the entire result.

In the context of the earlier error, the application now generates the following output:

Error in Python: <type 'exceptions.ImportError'>: No module named fake_module

Generally speaking, this function will make it much easier to get to the root cause of problems in your embedded Python code. One caveat: if you are configuring a custom Python environment (especially module paths) for your embedded interpreter, the parse_python_exception function may itself throw a boost::python::error_already_set when it attempts to load the traceback module, so you may want to wrap the call to the function in a try...catch block and parse only the type and value pointers out of the result.

As I mentioned above, in Part 3 I will walk through the implementation of a configuration parser built on top of the ConfigParser Python module. Assuming, of course, that I don't get waylaid again.

Embedding Python in C++ Applications with boost::python: Part 1

Posted on 03 January 2012 by Joseph

In the Introduction to this tutorial series, I took a look at the motivation for integrating Python code into the Granola code base. In short, it allows me to leverage all the benefits of the Python language and the Python standard library when approaching tasks that are normally painful or awkward in C++. The underlying subtext, of course, is that I didn’t have to port any of the existing C++ code to do so.

Today, I’d like to take a look at some first steps at using boost::python to embed Python in C++ and interact with Python objects. I’ve put all the code from this section in a github repo, so feel free to check the code out and play along.

At its core, embedding Python is very simple, and requires no C++ code whatsoever – the libraries provided with a Python distribution include C bindings. I’m going to skip over all that though, and jump straight into using Python in C++ via boost::python, which provides class wrappers and polymorphic behavior much more consistent with actual Python code than the C bindings would allow. In the later parts of this tutorial, we’ll cover a few things that you can’t do with boost::python (notably, multithreading and error handling).

So anyway, to get started you need to download and build boost, or retrieve a copy from your package manager. If you choose to build it, you can build just the boost::python library (it is unfortunately not header-only), though I would suggest getting familiar with the entire set of libraries if you do a lot of C++ programming. If you are following along with the git repo, make sure you change the path in the Makefile to point to your boost installation directory. And thus concludes the exposition. Let’s dive in!

First, we need to be able to build an application with Python embedded. With gcc this isn’t too difficult; it is simply a matter of including boost::python and libpython as either static or shared libraries. Depending on how you build boost, you may have trouble mixing and matching. In the tutorial code on github, we will use the static boost::python library (libboost_python.a) and the dynamic version of the Python library (libpython.so).

One of the soft requirements I had for my development efforts at MiserWare was to make the environment consistent across all of our supported operating systems: several versions of Windows and an ever-changing list of Linux distros. As a result, Granola links against a pinned version of Python and the installation packages include the Python library files required to run our code. Not ideal, perhaps, but it results in an environment where I am positive our code will run across all supported operating systems.

Let’s get some code running. You’ll need to include the correct headers, as you might imagine.

    Py_Initialize();
    py::object main_module = py::import("__main__");
    py::object main_namespace = main_module.attr("__dict__");

Note that you must initialize the Python interpreter directly (line 1). While boost::python greatly eases the task of embedding Python, it does not handle everything you need to do. As I mentioned above, we’ll see some more shortcomings in future sections of the tutorial. After initializing, the __main__ module is imported and the namespace is extracted. This results in a blank canvas upon which we can then call Python code, adding modules and variables.

    boost::python::exec("print 'Hello, world'", main_namespace);
    boost::python::exec("print 'Hello, world'[3:5]", main_namespace);
    boost::python::exec("print '.'.join(['1','2','3'])", main_namespace);

The exec function runs the arbitrary code in the string parameter within the specified namespace. All of the normal, non-imported code is available. Of course, this isn’t very useful without being able to import modules and extract values.

    boost::python::exec("import random", main_namespace);
    boost::python::object rand = boost::python::eval("random.random()", main_namespace);
    std::cout << py::extract<double>(rand) << std::endl;

Here we’ve imported the random module by executing the corresponding Python statement within the __main__ namespace, bringing the module into the namespace. After the module is available, we can use functions, objects, and variables within the namespace. In this example, we use the eval function, which returns the result of the passed-in Python statement, to create a boost::python object containing a random value as returned by the random() function in the random module. Finally, we extract the value as a C++ double type and print it.

This may seem a bit... soft. Calling Python by passing formatted Python strings into C++ functions? Not a very object-oriented way of dealing with things. Fortunately, there is a better way.

    boost::python::object rand_mod = boost::python::import("random");
    boost::python::object rand_func = rand_mod.attr("random");
    boost::python::object rand2 = rand_func();
    std::cout << boost::python::extract<double>(rand2) << std::endl;

In this final example, we import the random module, but this time using the boost::python import function, which loads the module into a boost Python object. Next, the random function object is extracted from the random module and stored in a boost::python object. The function is called, returning a Python object containing the random number. Finally, the double value is extracted and printed. In general, all Python objects can be handled in this way – functions, classes, built-in types.

It really starts getting interesting when you start holding complex standard library objects and instances of user-defined classes. In the next tutorial, I’ll take a full class through its paces and build a bona fide configuration parsing class around the ConfigParser module and discuss the details of parsing Python exceptions from C++ code.

Embedding Python in C++ Applications with boost::python: Introduction

Posted on 02 January 2012 by Joseph

About a year ago, we at MiserWare decided to augment the core power management function of Granola with web connectivity, allowing users to track the savings of all of their machines (and soon, to configure and apply policies and schedules) from a single location – the Granola Dash.

The problem was, though, that our codebase was entirely in C++. I examined several options. Ultimately, I decided that writing the web connectivity code in Python and embedding it in Granola would give me the best agility for my buck. I found boost::python and used it as the (excellent) basis of my implementation.

As the months have gone on, I have improved my understanding and implementation of embedded Python in this context, and I have increasingly reached for it to solve all sorts of problems that are painful in C++ and painless in Python – configuration parsing, complex data structures marshaled in JSON, automatic updating, and basically anything else that isn’t core algorithms (for performance reasons) or system interaction (for compatibility).

Here were my initial requirements:

  • instantiate Python objects and interact with them in a natural way
  • pass data into Python functions
  • extract data from Python functions and objects
  • handle errors from within the Python code

After the code started getting more sophisticated, I realized the following were also important topics:

  • call Python code from multiple (actual) threads of execution
  • parse Python exceptions into usable data structures

This series of tutorials is my attempt to document my experiences and help out others who want to take advantage of Python in their C++ applications. In Part 1, I’ll cover the basics of embedding Python and using boost::python, and outline a simple C++/Python application. Afterwards, I’ll cover the topics above and provide some code to solve a lot of the problems that I struggled with initially.


Copyright © 2016 Joseph Turner