Getting involved with Software Carpentry
I recently finished helping with my first Software Carpentry bootcamp, which was held at the Salk Institute in La Jolla. Since this was my first experience with SWC, I can't compare it to other bootcamps. One of the positive aspects of SWC culture is that people write up their experiences to share with others who might help out at future workshops. There were three instructors: Dhavide Aruliah led off with the shell, Jessica Kerr followed with sessions on Git and SQL, and Bill Punch led Python, which comprised most of day 2.
We didn't have any pre-workshop survey data, but anecdotal evidence from conversations suggests most of the participants were postdocs and all were biologists, many involved in either genomics or neuroscience. The group seemed to be remarkably in sync most of the time: at any given moment, 90% or more were either getting it or not. We didn't have what sounds like the typical problem of a three-way split between the bored, the just-right, and the completely lost.
As a helper, I realized that having a relatively thorough understanding of various platform setups is pretty important - helping beginners get set up requires its own special knowledge. I have done a fair amount of this at various Python sprints, but the Windows setup was still a bit foreign to me. Here are some of the challenges that came up along the way for helpers:
The Windows users were split between Cygwin and VirtualBox. Far more problems were encountered with Cygwin - a number of machines seemed to have multiple installs of Cygwin, many of them incomplete. Reinstalling helped in some situations, but you need to scan all the optional install items and make sure everything you need is checked; checking "all" doesn't seem to do that.
We had some sample files to download as a zip - some people moved these into "My Documents" and then had to deal with spaces in filenames in the shell very early in their learning curve.
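To illustrate why a space in a path trips people up, here is a minimal Python sketch using the standard `shlex` module, which mimics POSIX shell word splitting. The path `My Documents/samples.zip` is a hypothetical stand-in, not the actual workshop download.

```python
import shlex

# Hypothetical path; stands in for files unzipped into "My Documents".
path = "My Documents/samples.zip"

# Unquoted, the shell splits on the space and sees two separate arguments.
unquoted = shlex.split("ls " + path)

# Quoted, the path stays together as a single argument.
quoted = shlex.split("ls " + shlex.quote(path))

print(unquoted)
print(quoted)
```

In the unquoted case the command receives `My` and `Documents/samples.zip` as two different names, which is why quoting the path or tab-completing it is worth showing early.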
Some people instinctively logged into an SSH server they had used on campus, as this is what they associated with the command line. While this does work for some things, it makes it hard to work with any local editor application, and getting remote X11 set up for IDLE was way outside the scope of system setup.
A couple of other challenges were related to using IDLE for Python:
- IDLE wasn't on the VM
- Macs no longer have X11 by default, so IDLE needed an XQuartz install, which does not document the restart required to get the $DISPLAY environment variable set.
- IDLE's default font of Courier (on Mac at least) does not display the '_' character, making variable names look like they had spaces.
- Copy and paste issues between X11, native, and VirtualBox VMs.
- The lack of displayed line numbers made debugging less clear when IDLE did not highlight the error location.
Other minor challenges as a helper were:
- Helping people with non-English keyboard layouts (Korean, French)
- People assuming that downloading the Anaconda install .sh file was all they needed to do
As a helper, I tried as best I could to help people solve the content problems with clues and breadcrumbs only, instead of just telling them what their mistake was and how to fix it - this sometimes took up more of their time, but in the end I think it is more valuable.
Helping someone while the instruction continued often resulted in a bit of a struggle to get them caught up, as you had to keep one ear and eye on where the group was going. In these cases I would often take over and speed-explain what they had missed while typing in the missed steps, trying to get them back to where the instructor was on screen as best I could - not always easy to do, and I think it left a few people feeling chronically behind. The alternative to this catch-up via rushing would basically lock the helper into being a time-delayed instructor echo, covering everything with that one person at the same pace, just delayed by some minutes.
Fluent speakers will often know what a fellow speaker meant when they misspeak. A non-native English speaker will often misunderstand something said in slightly incorrect English, where a native speaker will use the context to "know what you meant". The same is true of some aspects of technology. As a somewhat humorous example (which led to a bit of a creative scramble to fix), one of the instructors had a shell prompt consisting of just the character '>', so the instruction on screen appeared as a '>' followed by the command to type.
Even most of the beginners deduced from the lines above that instruction that the '>' was the prompt, but one person typed the whole line in literally. To the shell, a leading '>' directs <nothing> into the named file, which converted the idle binary to a zero-byte file!
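The accident comes from how a POSIX shell treats a line that starts with `>`: with no command in front of it, the redirect still opens the target for writing and empties it. Here is a minimal sketch of that behavior on a throwaway file, assuming a POSIX `sh` is on the PATH. The file name `pretend_program` is made up for illustration and is not the command from the workshop.

```python
import os
import subprocess
import tempfile


def demo_truncation():
    """Return a file's size before and after running `> file` in sh."""
    # Work on a throwaway file, never on a real program.
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "pretend_program")  # hypothetical name
        with open(target, "w") as handle:
            handle.write("important contents\n")
        size_before = os.path.getsize(target)

        # The file name is passed as $1 so it needs no manual escaping.
        # There is no command before the redirect, so the shell just
        # opens the file for writing, which truncates it.
        subprocess.run(["sh", "-c", '> "$1"', "sh", target], check=True)
        size_after = os.path.getsize(target)
    return size_before, size_after


size_before, size_after = demo_truncation()
print(size_before, size_after)
```

The second number printed is 0: the file still exists, but its contents are gone. A prompt that ends in `$` or `%` avoids this particular trap when commands are shown on screen.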
Some of the themes we successfully hit on repeatedly were:
- Incremental development
- Recognizing conventions
- Knowing how to find answers on your own is a critical skill to start early
- Recognizing that tools and concepts are different things (i.e. IDLE, IPython, Terminal vs "the Python interpreter")
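One way to make the last point concrete is to ask the interpreter about itself from inside each tool. The sketch below uses only the standard `sys` module; the helper name `describe_interpreter` is made up for illustration. Run from IDLE, IPython, or a terminal session, it reports which interpreter is actually executing the code, and the answers may well match if the tools share one Python installation.

```python
import sys


def describe_interpreter():
    """Collect basic facts about the interpreter running this code."""
    return {
        "executable": sys.executable,        # path of the interpreter program
        "version": sys.version.split()[0],   # just the version number
    }


info = describe_interpreter()
for key, value in info.items():
    print(key, "=", value)
```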
One of the hard compromises to sort out is how much to standardize the environment for the bootcamp to make the learning more consistent, vs making sure that the users' machines are set up so that they can continue with these tools and materials after the workshop. I feel that given how beginner-oriented the material is, a more standardized learning environment has more pros than cons. I think there was plenty of time during the lunch break, and it could be extended, so that on the second day we could work with people to get their own machines set up - by that time they would know a bit more about what they were installing and why. This is a tradeoff no matter what, and others may disagree.
Some of the struggles I think the participants had were:
- For non-native English speakers, keeping up with fast-talking instructors and their use of English idioms and metaphors.
- The inevitable slips of instructors using terminology not yet introduced.
- We introduced Git too early and without enough context; Git should probably come last, when it can be used with Python source files.
- Similar to teenagers in programming classes who want to write Halo on day one, some participants had very ambitious goals, and will have to be patient with their own learning curves.
For instructors, I think the main struggle with the participants was the lack of honest feedback to the question "is everyone getting this?"
Other things that I perceived as challenging for instructors were:
- Dealing with the expected lowering of IQ and proficiency while at the podium
- Dealing with the balance between valued interjection and interruption from other instructors and helpers (we are all guilty of being eager to share)
- Dealing with the nigh impossible challenge of fitting heaps of materials into 2 days
There was not much in the way of eager questions from the audience. One person asked maybe 80% of the verbal questions, and while there were a couple of times he was a bit persistent, the truth is he was probably doing a huge service to the others who might also have been confused but were too intimidated to ask.
During the workshop there were varying degrees of success in breaking the material up with exercises - I felt like participants and instructors alike did better when the exercise breaks came regularly. This means giving up time for new material, but I think it is a worthwhile trade-off. I'd even think that it would be great to end day 2 with an extended exercise that tries to combine all the parts, even if that means a full 45-60 minute block for them to work through it.
Some other things that I think are tough nuts to crack for the 2-day format:
- How to point participants not only to resources for continuing to learn the basics, which are often well documented, but also to science-specific domain tools, which often are not.
- How to introduce, as early as practical, techniques for working with larger amounts of data and associated memory constraints.
- How to structure, online or in person, a way for the participant group to continue to learn collectively with each other.
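On the memory-constraint point in the list above, one small technique that could be shown early is processing a file one line at a time instead of reading it all into a list first. This is only a sketch: the function name `running_mean` and the sample numbers are made up, and `io.StringIO` stands in for a real file opened with `open()`.

```python
import io


def running_mean(lines):
    """Average one number per line without storing the whole input."""
    total = 0.0
    count = 0
    for line in lines:       # a file object yields one line at a time
        line = line.strip()
        if not line:         # skip blank lines
            continue
        total += float(line)
        count += 1
    return total / count if count else 0.0


# Made-up data; io.StringIO behaves like an open text file.
fake_file = io.StringIO("1.0\n2.0\n\n6.0\n")
mean = running_mean(fake_file)
print(mean)
```

Only one line is held in memory at a time, so the same loop works whether the file has four lines or forty million.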
All in all it was a great experience. You could tell by how engaged the participants were that there is a big demand for this material. I hope to participate in more bootcamps in the future.