Yesterday I released an alpha version of a project I've named JiffyLab. In this post I expand on the rationale from the README, and talk a bit about developing with Docker.
Python is a wonderful first language, but introducing people to Python often gets bogged down in making sure that everyone has a usable development environment. This can set a tone of frustration for beginners, and it drains instructors' and assistants' time, instead of letting everyone "dive in".
There are other advantages to having a standardized environment:
- If the instructor is projecting the same environment that the students see, students are less likely to be thrown off by inconsequential details such as differences in the shell prompt (> vs $, etc.), different syntax highlighting colors, or the use of some tool or feature not installed on their machines.
- When all students use the exact same setup, they are better able to help their neighbors: if they got something working on their own screen, they can probably get it working on their neighbor's, and peers and instructors can more effectively visually "diff" what might be different.
- Even if the setup of a student's machine goes smoothly at a technical level, it can still take time, especially if a significant amount of material needs to be downloaded over slow links, or requires significant build time.
Messing around with tools and your machine's setup is of course part of being a developer - learning how to manage your system and Python 'PATH', learning an editor, virtualenvs, pip, etc. all needs to happen. But not in your first hours as a new developer.
Another reason to have students work through the challenge of getting a working dev environment on their own machines is so that they can continue to teach themselves and learn on their own once past the introduction. This is very important, and should be built into any worthwhile instruction; it just doesn't need to happen at the beginning. Once students have learned some material, they will have much more context for understanding what it is they are setting up, and will potentially have greater motivation for getting it all working. So in the end I believe this trade-off is a bit of a red herring: it is not about "either or", but about "which comes first".
Finally, there is some advantage to working in a "tunneled" environment if the primary way you will interact with tools or data is remotely on a server. When you have only ever learned with local GUI tools at hand, the transition to doing things remotely can be very difficult.
The original inspiration for this project came from reading Danny's writings from several years ago about the problems of getting a roomful of people set up for Python instruction. At the time I was thinking of something along the lines of user creation on a server with SSH + FTP, but I never did much with it. Then at PyCon 2013, Docker was shown in a lightning talk; this tweet came to mind, and I realized that Docker would be perfect for creating safe but full-featured learning environments. It was around this time that I first really appreciated how useful IPython notebooks are. The final affirmation came while helping out at a Software Carpentry workshop in early June: while many attendees were technically set up, some of the other issues noted above were apparent, and this became a project worth building out.
Who is it for
As pointed out in the tradeoffs section, this is mostly geared toward intro classes where people bring their own machines, or toward the first part of a longer course. If you are providing machines for a class, you can probably get them all into a suitable shape for what you need. And for an extended or more in-depth class, you ultimately want people to be able to do what they need on their own machines.
Developing the project
Docker is a relatively new and fast-moving project, and many things about it just make sense. While a VM may take many minutes to start up, a container starts in about a second (or less). If you are familiar with chroot, Docker is a little like chroot on steroids.
So Docker is great at lightweight isolation of processes. VMs are still better when you want to set firmer limits and isolation in terms of resources like memory or disk space.
Docker uses LXC to handle the isolation of processes, and uses a stack of read-only and read-write filesystem layers so that changes made in a running container don't alter the files of the image from which the container was launched. See the well-illustrated example on the Docker site.
Like port forwarding on a firewall, Docker has a way to expose a TCP port in use inside the container as a port on the host system. So, for example, if you run a webserver in a container on port 80 and tell Docker to expose that port, it will assign a high-numbered port on the host machine that routes to the process running in the container. Docker chooses that host port only when the container is started.
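Since the host port is only known after start, anything that hands out links has to look it up from the container's inspect data. Here is a minimal sketch of that lookup, assuming the "NetworkSettings"/"Ports" JSON shape returned by the remote API's inspect endpoint (GET /containers/&lt;id&gt;/json); check the exact field names against your Docker version.

```python
# Sketch: find the high-numbered host port Docker assigned to a
# container port, given inspect-style JSON (field names are an
# assumption based on the remote API's inspect response).

def host_port(inspect_data, container_port="80/tcp"):
    """Return the host port mapped to container_port, or None."""
    ports = inspect_data.get("NetworkSettings", {}).get("Ports") or {}
    mappings = ports.get(container_port) or []
    return mappings[0]["HostPort"] if mappings else None

# Example inspect payload: container port 80 mapped to host port 49153.
sample = {"NetworkSettings": {"Ports": {
    "80/tcp": [{"HostIp": "0.0.0.0", "HostPort": "49153"}]}}}
print(host_port(sample))  # -> 49153
```

A link for the user can then be built as `http://<host>:<host_port>/`, repeated once per exposed service.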
A relatively simple Flask webapp uses the Docker remote API to create containers on the fly, and then presents users with links to the port-mapped services, such as IPython and the web-based shell (currently ShellInABox).
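The container-creation step might look roughly like the sketch below. This is not JiffyLab's actual code: the API address (port 4243), the create-endpoint field names ("Image", "ExposedPorts"), and the image name `jiffylab/user` are all assumptions for illustration, and it uses Python 3's stdlib for brevity.

```python
import json
import urllib.request

DOCKER_API = "http://127.0.0.1:4243"  # assumed remote API address


def create_payload(image, ports):
    """Build a JSON body for POST /containers/create, exposing the given
    container ports so Docker will map each to a high-numbered host port."""
    return {
        "Image": image,
        "ExposedPorts": {"%d/tcp" % p: {} for p in ports},
    }


def launch_container(image, ports):
    """Create a container via the remote API (illustrative only; this
    call requires a running Docker daemon)."""
    body = json.dumps(create_payload(image, ports)).encode()
    req = urllib.request.Request(
        DOCKER_API + "/containers/create", data=body,
        headers={"Content-Type": "application/json"})
    return json.load(urllib.request.urlopen(req))["Id"]


# Payload for a hypothetical per-user image exposing the IPython
# notebook (8888) and ShellInABox (4200):
print(create_payload("jiffylab/user", [8888, 4200]))
```

The Flask app would call something like `launch_container` once per new user, then use the inspect data to turn the exposed ports into clickable links.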
While it is not in any particular priority order, here is what I want to work on next as I find time.