The Perils of Embedding Python 2.7

Posted on 01 April 2013 by Joseph

TL;DR: A change to the ‘import site’ mechanism between Python 2.6 and 2.7 can mean silent application exit for applications with embedded Python.

As you may know from some of my other posts, we at MiserWare leverage an embedded Python interpreter in our Granola product, with success despite the pitfalls and sometimes cryptic (or missing) documentation. This week, I upgraded us from Python 2.6.4 to Python 2.7.2. Everything was going smoothly, until I began testing installation packages for software with an embedded interpreter on our clean (i.e. non-development environment) test systems.

Since the packaging, in this case MSI files, hadn’t changed significantly for the upcoming release, and since my tests on my development machine of the software itself had gone smoothly, I expected the installation tests to be what they usually are: tedious, boring, and successful. I fired up the installer, went through the brief UI sequence, hit Finish and… the software didn’t start. Hmm. I cleaned the system completely, re-installed, same problem. I locate the file on disk, double click, same problem. I run the program from a command prompt, same problem.

At this point, I figure that the problem is with the packaging itself, so I double check my WiX files, but everything seems to be fine. By this point, it still hadn’t occurred to me that the Python upgrade could be the problem, so I drop in a few debugging MessageBox calls, rebuild, and re-install. After a couple rounds of that, I discover that the software is failing inside the Py_Initialize call. Hmm.

I fire up an old development environment that hadn’t upgraded to Python 2.7 yet and load the project in the Visual Studio debugger. Running the program causes program exit with return value 1. What? All of our return value error codes are negative numbers. Was the embedded Python library causing the application itself to exit?

It turns out, that is exactly what was happening. After some step-debugging through the disassembly of python27.dll with the Python source tree open in a separate console, I finally locate the source of the exit in Python/pythonrun.c:

static void
initsite(void)
{
    PyObject *m;
    m = PyImport_ImportModule("site");
    if (m == NULL) {
        PyErr_Print();
        Py_Finalize();
        exit(1);
    }
    else {
        Py_DECREF(m);
    }
}

Yikes! Previously (at least, in 2.6.4), ‘import site’ failure would at worst print an error message to the console if there was an attached console… in the case of Granola, a Windows GUI application, there wasn’t, so all I had to work with was an exit code raised from within a linked library. If I were going to be diplomatic, I’d say that the decision to accept that patch was made with the interpreter executable and not embedding applications in mind. And it gets better.

Now armed with the location of the failure, it’s time to fix the application. It turns out that the call to initsite() occurs prior to the parsing of the environment variables. What that means is that PYTHONPATH, PYTHONHOME, and other ways to tell the interpreter where to find modules don’t have an impact on the search for the site module, though that is not documented. [N.B., the reason it worked on my development machine was that it DOES honor a Windows registry setting that the Python installer creates, pointing to its own lib path.]

Without detailing the pain I went through to determine this, I’ll tell you the answer. You basically have one option (in Windows at least): you need to lay out your Python modules relative to your embedding application just as they appear in the normal Python installation tree, which is to say putting them all in Lib/ and DLLs/ folders at the same directory level as your application. You can also optionally call Py_SetPythonHome to specify a different directory under which to search for Lib/ and DLLs/. Oh, and just in case you go looking for it, that layout isn’t documented; you’ll only see the preferred layout for Unix systems which won’t work in Windows.
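To make that layout concrete, here is a minimal sketch of a pre-flight check an installer test could run against the application directory. The function names (check_embedded_layout, demo) are hypothetical, and it assumes site.py ships as a plain source file inside Lib/ rather than compiled-only or zipped; it is an illustration of the layout described above, not something Python itself provides.

```python
import os
import tempfile


def check_embedded_layout(app_dir):
    """Return a list of problems with the Lib/ and DLLs/ layout under app_dir.

    An empty list means the directory matches the layout described in the
    post: Lib/ (containing site.py) and DLLs/ next to the application.
    """
    problems = []
    lib_dir = os.path.join(app_dir, "Lib")
    dlls_dir = os.path.join(app_dir, "DLLs")
    if not os.path.isdir(lib_dir):
        problems.append("missing Lib/ directory")
    elif not os.path.isfile(os.path.join(lib_dir, "site.py")):
        # Assumption: site.py is shipped as source, not only as .pyc or in a zip.
        problems.append("Lib/ has no site.py, so 'import site' would fail")
    if not os.path.isdir(dlls_dir):
        problems.append("missing DLLs/ directory")
    return problems


def demo():
    """Check an empty directory, then the same directory with the layout added."""
    root = tempfile.mkdtemp()
    before = check_embedded_layout(root)
    os.makedirs(os.path.join(root, "Lib"))
    os.makedirs(os.path.join(root, "DLLs"))
    open(os.path.join(root, "Lib", "site.py"), "w").close()
    after = check_embedded_layout(root)
    return before, after
```

Running a check like this on the clean test machine, before launching the application, would turn a silent exit(1) into a readable message.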

What’s the takeaway? For me, it just means one thing: don’t expect consistency out of future versions of Python, at least with regard to embedding applications.

2012 Roundup

Posted on 18 February 2013 by Joseph

Inspired by a blog post I read via Hacker News, I thought I’d create a list of things I did and didn’t do in 2012. You know, for posterity.

Computing

Work

  • Granola Enterprise. A lot changed in MiserWare’s flagship product Granola Enterprise in 2012. In February we debuted a new queue runner-driven data model for our backend, allowing us to do much more with the data coming from clients. In April, we rolled out what we intended to do with that: a new historical view of organizational energy consumption, complete with insights into areas of wasted energy and proactive tips for eliminating it.

    We also changed our preferred payment mode to a monthly subscription plan, allowing users to pay at the end of the month out of savings they have already earned, and only pay for the machines that were active during that interval. The end goal of this was to transition to a more organic model. We are still in the process of improving our funnel.

  • FatBatt. With the core product for Granola Enterprise maturing, we began to look for new areas to direct our development efforts. For me, as I’m sure for many, battery life in my laptop has always been a thorn in my side. I began to examine ways to improve battery life, during which I noticed how little information we are actually given about our battery life. As an example, how long was your last battery discharge? How long does it take to fully charge your battery? I couldn’t answer these questions, and I thought this would be a good place to begin to approach a solution.

    The result is FatBatt, a program intended to help you make intelligent decisions about your energy consumption on your mobile devices. It offers insight into your past discharges as well as providing statistically accurate estimates about the future given your current setup. This lets you make decisions like what the appropriate tradeoff for monitor brightness versus battery life is.

“Major” side projects

  • Sprout! Someone introduced me to the pen-and-paper game Sprouts, and I thought it would be fun and illuminating to try to create a version of this game using browser technology. I made it as far as the one-computer interface before I put it aside. Never released outside of my own development environment.

  • Brackcity. The purchase of a ping-pong table for the office spurred a huge amount of ping-pong competition between my coworkers and me. I created an interface for doing algebraically-correct rankings based on historical gameplay. Needless to say, the main result has been a further intensification of the competitiveness. Never released outside of the office.

  • Silvi. A couple of months ago I set out to scratch an itch of mine by creating a web service to better organize my thoughts. The result is Silvi, a way to create trees out of your information. The original intent was to develop a better way to collaborate on documentation, but using it myself along the way illuminated lots of other uses for it: organizing class notes for school, documenting a new programming language I was learning in a structured way, organizing documentation for products and services, and more. This is the furthest I have taken a personal side project, and I am very excited to see where it goes.

Languages, tools, frameworks

Python, JavaScript, CoffeeScript, LESS, C, C++, PHP, SQL.

Node, Flask, Pywin32, py2exe, OpenCV, numpy, sqlalchemy, WMI, MailChimp API.

Books

I read about 25 books this year. As in 2011, my focus was on postmodern fiction, but this year I branched out and read some of my old flames and some nonfiction. Highlights:

  • Thinking, Fast and Slow – Daniel Kahneman
  • 1Q84 – Haruki Murakami
  • Freedom – Jonathan Franzen
  • Machine Learning – Tom Mitchell
  • The Book of Laughter and Forgetting – Milan Kundera
  • (Assorted) – Edward Tufte
  • The Unbearable Lightness of Being – Milan Kundera

Climbing

The opening of Crimpers climbing gym in Christiansburg fueled a lot of my growth in climbing, both via meeting new climbers and improving my technique and fitness level. Both contributed to me sending my first (and second) V5 boulder problem this year.

I climbed at several new areas this year.

  • Grayson Highlands in southern VA was largely developed by my friend Aaron Parlier, and as luck would have it, we bumped into him on our first trip down. He pointed out a bunch of the hotspots, and the rest has been tip-shredding history.

  • Though I had been down there once before, the release of the guidebook for bouldering at Moore’s Wall in NC led to me spending several weekends down there. The style of bouldering, with big compression moves, fits me well and there are a number of projects I’m eager to send there.

  • With neither guide nor guidebook, it was hard to find the highlights, but a recent trip to The Hill in Peaks of Otter, VA showed some really interesting problems on interesting (if not entirely clean) rock. I have high hopes for this area in 2013, particularly if I can get some guidance.

Personal

I traveled to (and had a booth at) CES in Las Vegas. My first taste of Sin City was surprisingly delicious: I ate at quite a few good restaurants and had a generally large time. I also visited San Jose and San Francisco, and spent some time in Santa Barbara implementing a case study. I also spent quite a bit of time in Chicago in the early part of the year visiting my (now ex-) girlfriend Natalia.

I ate at some great restaurants, including Picasso and Nob Hill in Las Vegas, Sutro’s in San Francisco, and (actually this was late December 2011) Next in Chicago. In the upcoming year, I’d really like to go to minibar by José Andrés in Washington, DC.

I sold my house in Christiansburg during a decidedly down market. The house had been on the market for over a year, and it was such a relief to sell it before the winter. As part of my exodus, I rid myself of a huge number of belongings, greatly simplifying my life.

I bought a truck, my first entirely autonomous car purchase. The process, going through a dealer, certainly has a greasy feel to it, but in the end I am happy with my purchase. I outfitted the truck with a very, very cheap camper shell and built a sleeping platform: it is ready to act as my home for camping and climbing trips.

2013 goals

I didn’t explicitly establish goals for myself for 2012, though my goals certainly included:

  • Sell my house (check!)
  • Travel more (FAIL)
  • Climb more (check!)
  • Work on side projects (check!)
  • Get back into good health (check!)
  • Create an organic software sales model for MiserWare (ongoing, not a win yet)

For 2013, I’ve decided to be more ambitious. My goals include:

  • Squat 2x BW
  • Deadlift 2x BW
  • Clean and jerk 1.4x BW
  • Snatch 1.1x BW
  • Press BW
  • Run a 5:50 mile
  • Climb V6 outdoors
  • No injuries
  • Break even on a side project
  • Make 10x investment in FatBatt
  • Read 50+ books, including the entire Terry Pratchett oeuvre
  • Write and record 5 songs I don’t hate

Windows XP Sleep Criteria

Posted on 01 February 2013 by Joseph

Working today to debug a problem with Granola that had been reported by a couple of different users, I got the opportunity yet again to get down and dirty with Windows XP power management. Windows APIs in general can range from robust and well-documented to quirky and confusing. Power management definitely falls into the latter category. It is a less-used API, which means that there are few forums online discussing anything but the most straightforward uses. Add to that the fact that most of the functionality was brand new in XP and was completely rewritten for Vista and you have a set of functions that can be difficult to use and understand, top to bottom. It’s almost as if Microsoft never intended for this API to be used.

The problem I tackled today (and attempted to tackle several other times this week) seemed straightforward: some users were reporting that running Granola disabled or made erratic their screensaver coming on, monitor powering down, and computer going to sleep. My first thought was that Granola wasn’t updating the internal view of the power scheme as the user changed it, but no, that worked fine. My second thought was that perhaps the I/O of logging and communicating over named pipes was causing the machine to stay awake, but that wasn’t it either. I tried one thing after another, only to be shut down again and again.

Windows uses a fairly sophisticated set of criteria for determining an appropriate time to put the monitor and system to sleep. According to the documentation, “[a]s long as the system determines that there is user or application activity, it will not enter sleep.” That encompasses the obvious: user interaction is keyboard and mouse activity, application activity is processor utilization, memory activity, or I/O such as network activity. I checked all of these things in turn, only to find that none of them applied. I was obviously not touching the mouse or keyboard; the application itself uses almost no processor, memory, disk, or network. So what was going on?

I’ll cut the story short here. Calls to the power management API were being considered user interaction. The issue causing my confusion was actually twofold: first, these sleep criteria are not really as clear as they seem; second, the power management API was never really intended to be used like Granola uses it. I’ll speak to the first issue later, but as to the second, clearly the developers of Windows XP thought that only users would be changing the power settings. How they thought the user could be changing them without using the keyboard or mouse is an even more interesting question; perhaps these settings are intrinsically linked to the same structures that monitor user interaction with the console.

Finally, I’d reached the end of the road. I uploaded a new version of the software that eliminated the power management functionality and indeed the monitor shut down. I waited 2 minutes more for the system to sleep, but no luck. I continued to wait, and the system never slept. Exit the application, and the system sleeps. Start it up again and the system becomes the computer version of New York. Oh noes! Not again!!

To cut the story short again: the system would not sleep while running unvetted software. “Unvetted software” in this instance meant software that hadn’t been installed by the Windows installer. In Windows XP and later versions of Windows, running software that wasn’t put on the system by an installer produces a UI alert asking the user if it is OK to run the software, even if the software was signed with a valid signature. Allowing the software to run apparently puts the system into such a state that it cannot sleep, even though it CAN power down the monitor. This is well outside of the documented interface.

And this is the murkiness of the criteria that I was speaking of earlier. Again, “[a]s long as the system determines that there is user or application activity, it will not enter sleep.” HOW the system determines this is what is unclear. With a closed-source system like Windows, this statement isn’t really helpful from an API-specification standpoint. It may as well say “the system makes arbitrary decisions that you as an application developer can’t know about.” Ultimately, the power management API as it existed in XP was never intended to be used beyond its basic functionality. Why else would the specification be so unclear and sparse in detail?

SSH Tunneling and Apache vhosts

Posted on 03 January 2013 by Joseph

For better or worse, our web development workflow begins on in-house servers that run the same software stack as our development webservers, particularly for new features that may change the data model. I’m currently working on a new feature for the new-and-improved Granola Enterprise that adds interesting and actionable aggregate data at the group and installation level, a perfect feature to work on in our cloistered environment. Each developer maintains their own Apache name-based virtual host to track their feature branch and any different data they need to track.

Today, I’m taking a cross-country flight to California to set up a case study of the new energy footprint generation capabilities of Granola Enterprise. The flight is long (>5 hours) and has the double advantage of both plenty of room (seat next to me is empty) and in-flight Internets, so I figured I’d get some work done. Getting to the development environment is a piece of cake: we have an Internet-facing ssh server. Setting things up so I can load my vhost in a browser is slightly more complicated, but is ultimately pretty easy using ssh port forwarding.

For a single-host Apache instance, it is really, really easy. Just ssh into your server and forward a local port to port 80 on the internal development machine. If your ssh server is ssh.example.com, your username is example, and your internal development machine is developmentmachine, you could forward local port 8800 like this:

ssh -L 8800:developmentmachine:80 example@ssh.example.com

To get to your webpage, then, just go to http://localhost:8800 in your browser. Simple. It can be even simpler if you forward local port 80 instead of a non-privileged port, but in that case you need to run the command as root (or with sudo).

With virtual hosts, it’s only a bit trickier. Name-based virtual hosts work by looking at the hostname in the HTTP headers, so that information must be right to wind up in the right place. The solution is to give your own machine the same name as your target vhost in your /etc/hosts file. Using the example above, you’d add this line:

127.0.0.1   developmentmachine

Now, instead of going to localhost in your browser, go to the normal name of your development vhost (http://developmentmachine:8800), and tada! you’re in. Bonus points: if you use port 80 (again as root) all your bookmarks work.
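If the reason the /etc/hosts trick works isn’t obvious, the following sketch may help. It is not Apache: it is a toy server, assuming Python 3, that picks a page purely from the Host header, the way a name-based vhost lookup does. The VHOSTS table, the page bodies, and the helper names (VhostHandler, fetch, demo) are all made up for illustration.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical vhost names and page bodies, for illustration only.
VHOSTS = {
    "developmentmachine": b"feature-branch site",
    "localhost": b"default site",
}


class VhostHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Drop the port: the lookup matches on the host name alone.
        host = self.headers.get("Host", "").split(":")[0]
        body = VHOSTS.get(host, VHOSTS["localhost"])
        self.send_response(200)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def fetch(port, host_header):
    """GET / from 127.0.0.1:port, presenting host_header as the Host."""
    conn = http.client.HTTPConnection("127.0.0.1", port)
    conn.request("GET", "/", headers={"Host": host_header})
    body = conn.getresponse().read()
    conn.close()
    return body


def demo():
    """Hit the same socket twice, changing only the Host header."""
    server = HTTPServer(("127.0.0.1", 0), VhostHandler)  # port 0: any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        return (fetch(port, "localhost:%d" % port),
                fetch(port, "developmentmachine:%d" % port))
    finally:
        server.shutdown()
        server.server_close()
```

Both requests land on the same address and port; only the Host header differs, and that alone decides which site comes back. The browser builds that header from the name in the URL, which is why the name has to resolve to 127.0.0.1 on your end of the tunnel.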

Now to do some real work instead of writing blog posts! :)

Select discontinuous items or ranges from a Python list

Posted on 02 January 2013 by Joseph

If you need to select several discontinuous items (and/or ranges) from a Python list, you can use the operator module’s itemgetter higher-order function. In the realm of lists, it accepts arguments as either integers or slice objects and returns a function object that when called on a list returns the elements specified.

What? Like this:

>>> from operator import itemgetter
>>> get_items = itemgetter(1, 4, 6, slice(8, 12))
>>> get_items
<operator.itemgetter object at 0x02160D70>
>>> get_items(range(20))
(1, 4, 6, [8, 9, 10, 11])

I’ll leave it as an exercise to the reader to figure out how to flatten the resulting tuple. If it proves challenging, I’d suggest trying some or all of the 99 Prolog Problems (but a list ain’t one?), in Python of course :)
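For the impatient, here is one possible answer to the exercise, offered as a sketch rather than the canonical solution. The helper name flatten_selection is made up, and the code assumes Python 3, where range(20) has to be wrapped in list() to reproduce the output shown above (slicing a bare range gives back a range, which the helper also handles).

```python
from operator import itemgetter


def flatten_selection(selected):
    """Flatten an itemgetter result that mixes single items and slices."""
    flat = []
    for part in selected:
        # Slices come back as sequences (a list, or a range on Python 3);
        # integer indexes come back as bare items.
        if isinstance(part, (list, tuple, range)):
            flat.extend(part)
        else:
            flat.append(part)
    return flat


get_items = itemgetter(1, 4, 6, slice(8, 12))
result = flatten_selection(get_items(list(range(20))))
print(result)  # [1, 4, 6, 8, 9, 10, 11]
```

One caveat: the helper cannot tell a slice result from a single element that happens to be a list itself, so it only behaves as intended when the source list holds non-sequence items.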


Copyright © 2018 Joseph Turner