Simplifying branches with virtualenv

In my last post I linked to a review of virtualenv, a tool for creating “isolated Python environments”. In a nutshell, virtualenv builds sandbox installations of Python to which you can “switch” in order to isolate yourself from other installations on your machine. Each sandbox has its own Python executables, libraries, and headers, so, for example, packages installed in a sandbox won’t interfere with your primary system installation (or with other sandboxes). This has many uses, and I’ve been using it to help manage code branches for Python projects.

Sometimes the easiest way to work with Python code is simply to install it into site-packages. Once installed, it’s trivially available to Python and requires no changes to PYTHONPATH. It also means you’re generally dealing with your code in a more “natural” state, so you avoid many potential configuration issues.

Of course, if you’re working with multiple branches, you can really confuse yourself if you keep reinstalling as you switch between them. Unless you’re really diligent, you can end up with mismatches between libraries, scripts, unit tests, etc. The problem can be compounded by needing to keep track of different versions of the libraries you depend on. Plus, constantly double-checking yourself and your environment is a real pain.

This is where virtualenv can really save you a lot of trouble. By creating a virtualenv sandbox for each branch, you make all of the above problems go away for (almost) free. You get an independent installation of each branch’s code, along with its associated libraries, scripts, and so forth, which will never get polluted with code from the other branches.

The formula I follow is pretty simple. Suppose I have a directory structure like this:

project_name/
  branch_name/
    src/
    unittests/
    resources/

To create a virtualenv sandbox, I would do something like this:

cd project_name/branch_name
virtualenv python

This will create a new directory structure like this:

project_name/
  branch_name/
    src/
    unittests/
    resources/
    python/
      bin/
        {activate, easy_install, python, etc.}
      include/
        pythonX.Y/
      lib/
        pythonX.Y/
          site-packages/

That new structure should be familiar to any Python developer: virtualenv has created a whole new Python installation for you, copied from your main installation (or from whichever Python executable was used to run virtualenv).

Notice the file python/bin/activate. This file contains the code needed to bootstrap you into that sandbox. To use it, do something like this:

cd project_name/branch_name/python/bin
source activate
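
If you switch branches often, the cd-and-source step can be wrapped in a small shell function. The following is only a sketch: activate_branch is a made-up name, not something virtualenv provides, and it assumes the layout used in this post, where each branch keeps its sandbox in a subdirectory called python.

```shell
# Hypothetical helper (not part of virtualenv): activate the sandbox
# that lives in <branch_dir>/python, following this post's layout.
activate_branch() {
    script="$1/python/bin/activate"
    if [ -f "$script" ]; then
        # Source the script so its changes apply to the current shell.
        . "$script"
    else
        echo "no sandbox found at $1/python" >&2
        return 1
    fi
}
```

With that defined in your shell profile, running activate_branch project_name/branch_name would stand in for the two commands above.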

When you do this, activate does two main things:

  • Updates your PATH so that the sandbox’s bin directory is searched first
  • Updates your prompt to show the sandbox directory’s name

By updating your PATH, this script effectively makes this sandbox installation your active Python installation. Libraries you install will go there, its site-packages directory will be the one Python uses, and so on. This is the primary magic of virtualenv: by giving you a simple way to create and activate sandbox environments, it lets you manage multiple, possibly conflicting code bases very naturally.
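
To make the PATH change concrete, here is a heavily simplified sketch of the kind of thing activate does. It is not the real script, which also records your old settings so they can be restored later, and the sandbox path is an illustrative assumption.

```shell
# Illustrative sandbox location (an assumption); use your own branch path.
VIRTUAL_ENV="/home/user/project_name/branch_name/python"
export VIRTUAL_ENV

# Put the sandbox's bin directory first so its python and
# easy_install are found before the system ones.
PATH="$VIRTUAL_ENV/bin:$PATH"
export PATH

# Prefix the prompt with the sandbox directory's name, here "python".
PS1="($(basename "$VIRTUAL_ENV"))${PS1-}"
```

Once lines like these have run in your current shell, typing python finds the sandbox’s interpreter first, because the shell searches PATH from left to right.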

The prompt updating is a good idea, but it could use some polishing. It prepends to your prompt the name of the directory you supplied to the virtualenv command. So, if your original prompt was

user@host%

based on the scheme I’m describing here it would get changed to

(python)user@host%

While this at least lets you know that you’re in a sandbox, because of how I structure things (i.e. by putting all of my sandboxes in a directory called ‘python’) all of my sandbox prompts look the same. I would rather they told me the name of the branch. You can do this by editing activate. Look for a line near the end of the file like this:

PS1="(`basename \"$VIRTUAL_ENV\"`)$PS1"

and change it to this, substituting the name of your branch for BRANCH_NAME:

PS1="(BRANCH_NAME)$PS1"

Now, when you source the bootstrap file, you’ll get a prompt like this:

(BRANCH_NAME)user@host%
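
Typing the branch name by hand has to be repeated for every sandbox. A possible variation, assuming the layout above where each sandbox sits at branch_name/python, is to let the shell work out the name from the sandbox’s parent directory. The VIRTUAL_ENV and PS1 values below are stand-ins for what activate and your shell would already have set.

```shell
# Stand-in values for illustration only.
VIRTUAL_ENV="/home/user/project_name/branch_name/python"
PS1="user@host% "

# dirname drops the trailing "/python"; basename keeps the last
# component of what is left, which is the branch directory's name.
PS1="($(basename "$(dirname "$VIRTUAL_ENV")"))$PS1"

echo "$PS1"   # prints: (branch_name)user@host%
```

In activate itself, only the final PS1 assignment would take the place of the original line.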

There’s more to virtualenv than I’ve covered here, so it’s worth checking out if what I’ve described tickles your fancy. virtualenv is a good idea, well executed, and I hope you find it as useful as I have.

  1. #1 by Hugo Ferreira on 2010/04/03 - 23:05

    Been fiddling just today with a similar matter and ended up solving it using bash parameter expansion rules.

    You could probably get away without having to change the “activate” script by putting this generic prompt in your bash profile:

    --8<--------

    # Disable virtualenv default prompt
    export VIRTUAL_ENV_DISABLE_PROMPT=1

    # Place the following anywhere on your PS1:
    # ${VIRTUAL_ENV:+(`basename ${VIRTUAL_ENV%/*}`)}

    --8<--------

    For example, the following PS1,
    PS1='\h:\W \u${VIRTUAL_ENV:+ (`basename ${VIRTUAL_ENV%/*}`)}\$ '

    … would give you a prompt like this when a virtualenv is activated,
    hostname:~ user (branch_name)$

    … and one like this when none is activated:
    hostname:~ user$

    A quick explanation of the expansions:

    ${VIRTUAL_ENV%/*}
    — this cuts away the "/python" part

    `basename …`
    — gives you just the "branch_name"

    ${VIRTUAL_ENV:+(…)}
    — uses the alternate value of "(branch_name)" whenever $VIRTUAL_ENV has a value
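
    To see those three steps in isolation, here is a small trace with a made-up sandbox path. The path and the variable names parent, name, label and no_label are only for illustration.

    ```shell
    VIRTUAL_ENV="/home/user/project_name/branch_name/python"

    # ${VAR%/*} removes the shortest trailing match of "/*".
    parent="${VIRTUAL_ENV%/*}"          # /home/user/project_name/branch_name

    # basename keeps only the final path component.
    name="$(basename "$parent")"        # branch_name

    # ${VAR:+word} expands to word only when VAR is set and non-empty.
    label="${VIRTUAL_ENV:+($name)}"     # (branch_name)

    # With no sandbox active, the same expansion yields nothing.
    unset VIRTUAL_ENV
    no_label="${VIRTUAL_ENV:+($name)}"  # empty string
    ```

    That last case is why the prompt shows no parentheses at all when no virtualenv is activated.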

    • #2 by Nathan Duran on 2012/01/13 - 21:39

      I like that solution better. Much more portable, and it doesn’t get lost when you have to blow away your virtualenv directory because something got hosed.
