Embedding Python in C++ Applications with boost::python: Part 3

Posted on 05 January 2012 by Joseph

In Part 2 of this tutorial, I covered a methodology for handling exceptions thrown from embedded Python code from within the C++ part of your application. This is crucial for debugging your embedded Python code. In this tutorial, we will create a simple C++ class that leverages Python functionality to handle an often-irritating part of developing real applications: configuration parsing.

In an attempt to not draw ire from the C++ elites, I am going to say this in a diplomatic way: I suck at complex string manipulations in C++. STL strings and stringstreams greatly simplify the task, but performing application-level tasks, and performing them in a robust way, always results in me writing more code than I would really like. As a result, I recently rewrote the configuration parsing mechanism from Granola Connect (the daemon in Granola Enterprise that handles communication with the Granola REST API) using embedded Python and specifically the ConfigParser module.

Of course, string manipulations and configuration parsing are just an example. For Part 3, I could have chosen any number of tasks that are difficult in C++ and easy in Python (web connectivity, for instance), but the configuration parsing class is a simple yet complete example of embedding Python for something of actual use. Grab the code from the Github repo for this tutorial to play along.

First, let’s create a class definition that covers very basic configuration parsing: read and parse INI-style files, extract string values given a name and a section, and set string values for a given section. Here is the class declaration:

class ConfigParser{
    private:
        boost::python::object conf_parser_;

        void init();
    public:
        ConfigParser();

        bool parse_file(const std::string &filename);
        std::string get(const std::string &attr,
                        const std::string &section = "DEFAULT");
        void set(const std::string &attr,
                 const std::string &value,
                 const std::string &section = "DEFAULT");
};

The ConfigParser module offers far more features than we will cover in this tutorial, but the subset we implement here should serve as a template for implementing more complex functionality. The implementation of the class is fairly simple; first, the constructor loads the main module, extracts the dictionary, imports the ConfigParser module into the namespace, and creates a boost::python::object member variable holding a RawConfigParser object:

ConfigParser::ConfigParser(){
    py::object mm = py::import("__main__");
    py::object mn = mm.attr("__dict__");
    py::exec("import ConfigParser", mn);
    conf_parser_ = py::eval("ConfigParser.RawConfigParser()", mn);
}

The file parsing and the getting and setting of values are performed using this conf_parser_ object:

bool ConfigParser::parse_file(const std::string &filename){
    return py::len(conf_parser_.attr("read")(filename)) == 1;
}

std::string ConfigParser::get(const std::string &attr, const std::string &section){
    return py::extract<std::string>(conf_parser_.attr("get")(section, attr));
}

void ConfigParser::set(const std::string &attr, const std::string &value, const std::string &section){
    conf_parser_.attr("set")(section, attr, value);
}
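
For reference, here is roughly what those three methods ask the interpreter to do, written as plain Python. This is a sketch rather than code from the repo: it targets Python 3, where the module is spelled configparser (the embedded Python 2 in this tutorial calls it ConfigParser), and the sample text, the temporary file, and the function name are made up for illustration.

```python
import configparser  # named ConfigParser in Python 2
import os
import tempfile

# Hypothetical file contents, invented for this sketch.
SAMPLE = "[DEFAULT]\nDirectory = /var/lib/example\n"


def parse_get_set(text=SAMPLE):
    """Mirror parse_file(), get() and set() from the C++ class."""
    parser = configparser.RawConfigParser()  # the object conf_parser_ holds
    with tempfile.NamedTemporaryFile("w", suffix=".conf", delete=False) as handle:
        handle.write(text)
    try:
        files_read = parser.read(handle.name)  # list of files that were parsed
        parsed_ok = len(files_read) == 1       # the check parse_file() performs
        before = parser.get("DEFAULT", "Directory")  # section first, then option
        parser.set("DEFAULT", "Directory", "overridden")
        after = parser.get("DEFAULT", "Directory")
    finally:
        os.remove(handle.name)
    return parsed_ok, before, after
```

Two details are worth noticing. The Python calls take the section first and the option second, which is the reverse of the C++ wrapper's get(attr, section). And read() skips files it cannot open and returns only the names it actually parsed, which is why parse_file() compares the length of the result to 1 rather than relying on an exception.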

In this simple example, for the sake of brevity exceptions are allowed to propagate. In a more complex environment, you will almost certainly want to have the C++ class handle and repackage the Python exceptions as C++ exceptions. This way you could later create a pure C++ class if performance or some other concern became an issue.

To use the class, calling code can simply treat it as a normal C++ class:

int main(){
    Py_Initialize();
    try{
        ConfigParser parser;
        parser.parse_file("conf_file.1.conf");
        cout << "Directory (file 1): " << parser.get("Directory", "DEFAULT") << endl;
        parser.parse_file("conf_file.2.conf");
        cout << "Directory (file 2): " << parser.get("Directory", "DEFAULT") << endl;
        cout << "Username: " << parser.get("Username", "Auth") << endl;
        cout << "Password: " << parser.get("Password", "Auth") << endl;
        parser.set("Directory", "values can be arbitrary strings", "DEFAULT");
        cout << "Directory (force set by application): " << parser.get("Directory") << endl;
        // Will raise a NoOptionError (or a NoSectionError if the section is missing)
        // cout << "Proxy host: " << parser.get("ProxyHost", "Network") << endl;
    }catch(boost::python::error_already_set const &){
        string perror_str = parse_python_exception();
        cout << "Error during configuration parsing: " << perror_str << endl;
    }
}
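
The two configuration files are not shown here, so the following Python sketch uses invented contents to illustrate the two things the main function relies on: a second read layers on top of the first, and a failed lookup raises an exception. It is written for Python 3 (configparser and read_string); the file contents, names, and values are assumptions, not the ones in the repo.

```python
import configparser

# Invented stand-ins for conf_file.1.conf and conf_file.2.conf.
FILE_1 = """
[DEFAULT]
Directory = /opt/first
"""

FILE_2 = """
[DEFAULT]
Directory = /opt/second

[Auth]
Username = example_user
Password = example_password
"""


def layered_lookup():
    """Read both texts into one parser, as main() does with two files."""
    parser = configparser.RawConfigParser()
    parser.read_string(FILE_1)
    first = parser.get("DEFAULT", "Directory")
    parser.read_string(FILE_2)  # values from the second text override the first
    second = parser.get("DEFAULT", "Directory")
    return first, second, parser.get("Auth", "Username")


def failed_lookup(section, option):
    """Return the name of the exception raised by a missing section or option."""
    parser = configparser.RawConfigParser()
    parser.read_string(FILE_2)
    try:
        parser.get(section, option)
    except (configparser.NoSectionError, configparser.NoOptionError) as error:
        return type(error).__name__
    return None
```

With these contents, failed_lookup("Network", "ProxyHost") reports a NoSectionError, while failed_lookup("Auth", "ProxyHost") reports a NoOptionError. From the C++ side both arrive as the same error_already_set, so the distinction is only visible in the parsed error string.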

And that's that: a key-value configuration parser with sections and comments in under 50 lines of code. This is just the tip of the iceberg too. In almost the same length of code, you can do all sorts of things that would be at best painful and at worst error-prone and time-consuming in C++: configuration parsing, list and set operations, web connectivity, file format operations (think XML/JSON), and myriad other tasks are already implemented in the Python standard library.

In Part 4, I'll take a look at how to more robustly and generically call Python code using functors and a Python namespace class.

Embedding Python in C++ Applications with boost::python: Part 2

Posted on 04 January 2012 by Joseph

In Part 1, we took a look at embedding Python in C++ applications, including several ways of calling Python code from your application. Though I earlier promised a full implementation of a configuration parser in Part 2, I think it’s more constructive to take a look at error parsing. Once we have a good way to handle errors in Python code, I’ll create the promised configuration parser in Part 3. Let’s jump in!

If you got yourself a copy of the git repo for the tutorial and were playing around with it, you may have experienced the way boost::python handles Python errors – the error_already_set exception type. If not, the following code will generate the exception:

    namespace py = boost::python;
    ...
    Py_Initialize();
    ...
    py::object rand_mod = py::import("fake_module");

…which outputs the not-so-helpful:

terminate called after throwing an instance of 'boost::python::error_already_set'
Aborted

In short, any errors that occur in the Python code that boost::python handles will cause the library to raise this exception; unfortunately, the exception does not encapsulate any of the information about the error itself. To extract information about the error, we’re going to have to resort to using the Python C API and some Python itself. First, catch the error:

    try{
        Py_Initialize();
        py::object rand_mod = py::import("fake_module");
    }catch(boost::python::error_already_set const &){
        std::string perror_str = parse_python_exception();
        std::cout << "Error in Python: " << perror_str << std::endl;
    }

Above, we've called the parse_python_exception function to extract the error string and print it. As this suggests, the exception data is stored statically in the Python library and not encapsulated in the exception itself. The first step in the parse_python_exception function, then, is to extract that data using the PyErr_Fetch Python C API function:

std::string parse_python_exception(){
    PyObject *type_ptr = NULL, *value_ptr = NULL, *traceback_ptr = NULL;
    PyErr_Fetch(&type_ptr, &value_ptr, &traceback_ptr);
    std::string ret("Unfetchable Python error");
    ...

As there may be all, some, or none of the exception data available, we set up the returned string with a fallback value. Next, we try to extract and stringify the type data from the exception information:

    ...
    if(type_ptr != NULL){
        py::handle<> h_type(type_ptr);
        py::str type_pstr(h_type);
        py::extract<std::string> e_type_pstr(type_pstr);
        if(e_type_pstr.check())
            ret = e_type_pstr();
        else
            ret = "Unknown exception type";
    }
    ...

In this block, we first check that there is actually a valid pointer to the type data. If there is, we construct a boost::python::handle to the data from which we then create a str object. This conversion should ensure that a valid string extraction is possible, but to double check we create an extract object, check the object, and then perform the extraction if it is valid. Otherwise, we use a fallback string for the type information.

Next, we perform a very similar set of steps on the exception value:

    ...
    if(value_ptr != NULL){
        py::handle<> h_val(value_ptr);
        py::str a(h_val);
        py::extract<std::string> returned(a);
        if(returned.check())
            ret +=  ": " + returned();
        else
            ret += std::string(": Unparseable Python error: ");
    }
    ...

We append the value string to the existing error string. The value string is, for most built-in exception types, the readable string describing the error.

Finally, we extract the traceback data:

    if(traceback_ptr != NULL){
        py::handle<> h_tb(traceback_ptr);
        py::object tb(py::import("traceback"));
        py::object fmt_tb(tb.attr("format_tb"));
        py::object tb_list(fmt_tb(h_tb));
        py::object tb_str(py::str("\n").join(tb_list));
        py::extract<std::string> returned(tb_str);
        if(returned.check())
            ret += ": " + returned();
        else
            ret += std::string(": Unparseable Python traceback");
    }
    return ret;
}

The traceback goes similarly to the type and value extractions, except for the extra step of formatting the traceback object as a string. For that, we import the traceback module. From traceback, we then extract the format_tb function and call it with the handle to the traceback object. This generates a list of traceback strings which we then join into a single string. Not the prettiest printing, perhaps, but it gets the job done. Finally, we extract the C++ string type as above and append it to the returned error string and return the entire result.
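
If it helps to see the same logic without the C++ plumbing, here is a pure-Python sketch of what the function assembles. It is an analogue, not a translation: sys.exc_info() reports the exception currently being handled, which is the closest Python-level counterpart to what PyErr_Fetch retrieves, and the function names are my own. It assumes Python 3, so the type and message are formatted slightly differently than under the Python 2 used in this tutorial.

```python
import importlib
import sys
import traceback


def describe_current_exception():
    """Build a type/value/traceback string from inside an except block."""
    exc_type, exc_value, exc_tb = sys.exc_info()  # exception being handled
    result = "Unfetchable Python error"
    if exc_type is not None:
        result = str(exc_type)
    if exc_value is not None:
        result += ": " + str(exc_value)
    if exc_tb is not None:
        # format_tb returns a list of strings, one per stack frame
        result += ": " + "\n".join(traceback.format_tb(exc_tb))
    return result


def import_missing_module():
    """Trigger the same kind of failure as py::import("fake_module")."""
    try:
        importlib.import_module("fake_module")
    except ImportError:
        return describe_current_exception()
    return None
```

Each string that format_tb returns already ends in a newline, so joining them with "\n" leaves a blank line between frames. That is the source of the less-than-pretty printing mentioned above.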

In the context of the earlier error, the application now generates the following output:

Error in Python: <type 'exceptions.ImportError'>: No module named fake_module

Generally speaking, this function will make it much easier to get to the root cause of problems in your embedded Python code. One caveat: if you are configuring a custom Python environment (especially module paths) for your embedded interpreter, the parse_python_exception function may itself throw a boost::python::error_already_set when it attempts to load the traceback module, so you may want to wrap the call to the function in a try...catch block and parse only the type and value pointers out of the result.

As I mentioned above, in Part 3 I will walk through the implementation of a configuration parser built on top of the ConfigParser Python module. Assuming, of course, that I don't get waylaid again.

Embedding Python in C++ Applications with boost::python: Part 1

Posted on 03 January 2012 by Joseph

In the Introduction to this tutorial series, I took a look at the motivation for integrating Python code into the Granola code base. In short, it allows me to leverage all the benefits of the Python language and the Python standard library when approaching tasks that are normally painful or awkward in C++. The underlying subtext, of course, is that I didn’t have to port any of the existing C++ code to do so.

Today, I’d like to take a look at some first steps at using boost::python to embed Python in C++ and interact with Python objects. I’ve put all the code from this section in a github repo, so feel free to check the code out and play along.

At its core, embedding Python is very simple, and requires no C++ code whatsoever – the libraries provided with a Python distribution include C bindings. I’m going to skip over all that though, and jump straight into using Python in C++ via boost::python, which provides class wrappers and polymorphic behavior much more consistent with actual Python code than the C bindings would allow. In the later parts of this tutorial, we’ll cover a few things that you can’t do with boost::python (notably, multithreading and error handling).

So anyway, to get started you need to download and build boost, or retrieve a copy from your package manager. If you choose to build it, you can build just the boost::python library (it is unfortunately not header-only), though I would suggest getting familiar with the entire set of libraries if you do a lot of C++ programming. If you are following along with the git repo, make sure you change the path in the Makefile to point to your boost installation directory. And thus concludes the exposition. Let’s dive in!

First, we need to be able to build an application with Python embedded. With gcc this isn’t too difficult; it is simply a matter of including boost::python and libpython as either static or shared libraries. Depending on how you build boost, you may have trouble mixing and matching. In the tutorial code on github, we will use the static boost::python library (libboost_python.a) and the dynamic version of the Python library (libpython.so).

One of the soft requirements I had for my development efforts at MiserWare was to make the environment consistent across all of our supported operating systems: several Windows versions and an ever-changing list of Linux distros. As a result, Granola links against a pinned version of Python and the installation packages include the Python library files required to run our code. Not ideal, perhaps, but it results in an environment where I am positive our code will run across all supported operating systems.

Let’s get some code running. You’ll need to include the correct headers, as you might imagine.

    #include <boost/python.hpp>
    namespace py = boost::python;
    ...
    Py_Initialize();
    py::object main_module = py::import("__main__");
    py::object main_namespace = main_module.attr("__dict__");

Note that you must initialize the Python interpreter yourself by calling Py_Initialize() before anything else. While boost::python greatly eases the task of embedding Python, it does not handle everything you need to do. As I mentioned above, we’ll see some more shortcomings in future sections of the tutorial. After initializing, the __main__ module is imported and the namespace is extracted. This results in a blank canvas upon which we can then call Python code, adding modules and variables.

    boost::python::exec("print 'Hello, world'", main_namespace);
    boost::python::exec("print 'Hello, world'[3:5]", main_namespace);
    boost::python::exec("print '.'.join(['1','2','3'])", main_namespace);

The exec function runs the arbitrary code in the string parameter within the specified namespace. All of the normal, non-imported code is available. Of course, this isn’t very useful without being able to import modules and extract values.

    boost::python::exec("import random", main_namespace);
    boost::python::object rand = boost::python::eval("random.random()", main_namespace);
    std::cout << py::extract<double>(rand) << std::endl;

Here we’ve imported the random module by executing the corresponding Python statement within the __main__ namespace, bringing the module into the namespace. After the module is available, we can use functions, objects, and variables within the namespace. In this example, we use the eval function, which returns the result of the passed-in Python expression, to create a boost::python object containing a random value as returned by the random() function in the random module. Finally, we extract the value as a C++ double type and print it.
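
Seen from the Python side, the namespace being passed around is just a dictionary. The sketch below shows the same exec-then-eval sequence in plain Python 3; it uses a fresh dictionary instead of the __main__ module's __dict__, and the function name is only for illustration.

```python
def run_snippets():
    """Execute a statement and evaluate an expression in one shared namespace."""
    namespace = {}                    # plays the role of main_namespace
    exec("import random", namespace)  # statements modify the namespace
    value = eval("random.random()", namespace)  # expressions produce a value
    return namespace, value
```

After the call, the dictionary contains a "random" entry holding the imported module, which is why the later eval can find it. The split also explains why the C++ code needs two functions: exec runs statements and gives nothing useful back, while eval accepts only an expression and returns its result.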

This may seem a bit... soft. Calling Python by passing formatted Python strings into C++ functions? Not a very object-oriented way of dealing with things. Fortunately, there is a better way.

    boost::python::object rand_mod = boost::python::import("random");
    boost::python::object rand_func = rand_mod.attr("random");
    boost::python::object rand2 = rand_func();
    std::cout << boost::python::extract<double>(rand2) << std::endl;

In this final example, we import the random module, but this time using the boost::python import function, which loads the module into a boost Python object. Next, the random function object is extracted from the random module and stored in a boost::python object. The function is called, returning a Python object containing the random number. Finally, the double value is extracted and printed. In general, all Python objects can be handled in this way – functions, classes, built-in types.
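
In Python terms, the three steps are an import, an attribute lookup, and a call. Here is a small sketch of that sequence (Python 3; the helper name and its parameters are hypothetical, chosen to line up with the C++ calls):

```python
import importlib


def call_by_name(module_name, function_name, *args):
    """Import a module, look up one of its attributes, and call it."""
    module = importlib.import_module(module_name)  # like py::import(...)
    function = getattr(module, function_name)      # like .attr(...)
    return function(*args)                         # like calling the object
```

For instance, call_by_name("random", "random") does what the C++ snippet above does, and call_by_name("math", "sqrt", 9.0) shows the same pattern with an argument.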

It really starts getting interesting when you start holding complex standard library objects and instances of user-defined classes. In the next tutorial, I’ll take a full class through its paces and build a bona fide configuration parsing class around the ConfigParser module, and discuss the details of parsing Python exceptions from C++ code.

Embedding Python in C++ Applications with boost::python: Introduction

Posted on 02 January 2012 by Joseph

About a year ago, we at MiserWare decided to augment the core power management function of Granola with web connectivity, allowing users to track the savings of all of their machines (and soon, to configure and apply policies and schedules) from a single location – the Granola Dash.

The problem was, though, that our codebase was entirely in C++. I examined several options. Ultimately, I decided that writing the web connectivity code in Python and embedding it in Granola would give me the best agility for my buck. I found boost::python and used it as the (excellent) basis of my implementation.

As the months have gone on, I have improved my understanding and implementation of embedded Python in this context, and I have increasingly reached for it to solve all sorts of problems that are painful in C++ and painless in Python – configuration parsing, complex data structures marshaled in JSON, automatic updating, and basically anything else that isn’t core algorithms (for performance reasons) or system interaction (for compatibility).

Here were my initial requirements:

  • instantiate Python objects and interact with them in a natural way
  • pass data into Python functions
  • extract data from Python functions and objects
  • handle errors from within the Python code

After the code started getting more sophisticated, I realized the following were also important topics:

  • call Python code from multiple (actual) threads of execution
  • parse Python exceptions into usable data structures

This series of tutorials is my attempt to document my experiences and help out others who want to take advantage of Python in their C++ applications. In Part 1, I’ll cover the basics of embedding Python and using boost::python, and outline a simple C++/Python application. Afterwards, I’ll cover the topics above and provide some code to solve a lot of the problems that I struggled with initially.

Brief Bio

Posted on 01 January 2012 by Joseph

I am currently the VP of Product at Mobile System 7, where I work with customers, sales, engineering, and business leadership to design and build Interlock, the most powerful and comprehensive Identity Analytics and Adaptive Access Control platform in the world. I gather requirements, prioritize development resources, help structure marketing, and work with the sales team to improve our processes and our product. I also lend a hand with development.

Previously, I co-founded and led the technical side of MiserWare from 2007 until 2013. At MiserWare, we produced Granola, the only commercially available software that offered dramatic reduction in computer power consumption within a hard performance constraint. We extended this technological core to create an enterprise scale software suite for managing the energy of IT resources. Unfortunately, in 2013 MiserWare shut down due to lack of funding.

Many years ago, I was a PhD student and a Cunningham Fellow at Virginia Tech, though I left the program to found MiserWare. I graduated Magna Cum Laude from the University of South Carolina in December 2008. I was a member of the SCAPE Lab from Fall 2004, first at USC, then at Virginia Tech, prior to leaving in 2007.

In my spare time, I rock climb whenever weather and time permit; take poor photographs; write, play, and record music; and do a whole bunch of recreational math and programming.

Professional Statement

My general interests lie with software entrepreneurship. Creating engaging products and communicating their value to customers has never been easier or more bootstrappable than it is right now. Since I was a kid, I have been thrilled by computers and computing; combining that with the challenges and thrills (not to mention the sometimes brutal economics) of a business has been a fascinating path of discovery for me. Times aren’t always easy, but they are rarely boring.

In my role at Mobile System 7, I am striving to discover the most efficient way to create market fit within the bounds of high-tech, resource intensive software development and often slow enterprise sales cycles. I have been experimenting with ideas from The Lean Startup and the micro ISV community, coupling the fast feedback loops they recommend with lessons learned about enterprise sales at MiserWare to rapidly improve not only our product but also our sales process.

In addition to my role in founding and helping shape MiserWare, I either was or led the engineering side of the company from its inception, and this has given me the opportunity to work on more aspects of software and computing than I can count. Roughly in chronological order, and leaving out the small stuff: designing and implementing core algorithms, designing software architectures, test-driven development, interviewing and hiring, choosing and building a team, agile development, project management, benchmarking, creating datacenter-scale technologies, timeline and release management, creating user interfaces, creating and facilitating cross-platform software, software maintenance and updating, and creating a disruptive next-generation have all been on my plate at one point or another. Being in the driver's seat for a nascent company is a sure way to get your hands dirty with all kinds of different dirt.


Copyright © 2018 Joseph Turner