4  Epilogue

4.1 A note on dependency and reproducibility

Before we conclude our introduction to package managers and version control, we would like to share some thoughts on reproducibility.

In our world of modern infrastructure we depend on a lot of other projects, and quite often we do not even realize which tools our little program needs in the background to run. A good example: do you have any idea which programming languages the commands git, python, or pdm are written in, and what system dependencies they have?

This is exquisitely illustrated in the following xkcd comic.

Figure 4.1: The xkcd comic #2347 discussing dependency, explainxkcd. Click the image to get to the original image source https://imgs.xkcd.com/comics/dependency.png.

It only takes one little misplaced pebble and everything collapses.

In industry and academia it is often a problem that some computation cannot be reproduced. It might be an experiment that you need to rerun two years after your initial run, or some computations you need to regenerate for an annual report. Or it might be the more trivial matter of running your code on a different machine (a colleague's, or your own after you got a new computer). Such situations require your program to provide the same output for the same input. Quite often we do not think about this during the development of our code, but if we stick to some basic concepts we are already closer to reproducibility. Trust us when we say: sooner or later you will need this!

In the last sections we discussed two tools that will help us in this quest:

  • package management
  • version control

Our code will have dependencies, and keeping a correct list of them and pinning their versions helps us greatly. The dependencies also undergo development and do not stand still, so we need to be aware that we run on a specific version. The same is true for our own code, and version control helps us keep track of what we did, why, and how. This helps us correct errors, return to an old but working version, and figure out who to blame for an error.
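To illustrate what "being aware that we run on a specific version" can look like in practice, here is a minimal sketch that records the Python version and the installed versions of a few packages. It assumes Python 3.8 or newer, where importlib.metadata is part of the standard library. The function name snapshot_environment and the package names in the usage lines are our own choices for illustration and not part of any tool discussed here.

```python
from importlib import metadata
import platform


def snapshot_environment(packages):
    """Record the Python version and the installed version of each package.

    Packages that are not installed are recorded as None instead of
    raising an error, so the snapshot is always complete.
    """
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return {"python": platform.python_version(), "packages": versions}


if __name__ == "__main__":
    # Hypothetical package list; replace it with your own dependencies.
    print(snapshot_environment(["numpy", "requests"]))
```

Storing such a snapshot next to our results makes it easier to tell later which versions produced them. It does not replace the pinned dependency list of the package manager; it merely documents what was actually installed at run time.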

Besides these two tools we should also write our code in such a way that it is easy to follow, and we should keep it as loosely tied to our current work environment as possible, in particular with respect to file paths. That way we can avoid a situation like the following:

Sure I can share my code but I am glad it runs on my machine. I have no idea how to get it to run on yours.
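One way to loosen the tie to our own machine is to avoid hard-coded absolute paths. The following is a minimal sketch, assuming that all data files live below one project folder. The helper name project_path and the file names in the usage lines are hypothetical and only serve as an illustration.

```python
from pathlib import Path


def project_path(*parts, root=None):
    """Build a path below the project root instead of hard-coding it.

    If no root is given, the current working directory is used.
    """
    base = Path(root) if root is not None else Path.cwd()
    return base.joinpath(*parts)


if __name__ == "__main__":
    # Instead of something like "C:/Users/me/project/data/input.csv"
    # we only state where the file sits relative to the project.
    print(project_path("data", "input.csv"))
```

In a script, a common choice for root is the folder that contains the script itself, Path(__file__).resolve().parent, so that the code finds its files no matter from where or on whose machine it is started.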

We know that this is a bit much for beginners but if we stick to it from the start we have less to change later on.

Anyway, let's end with:

Figure 4.2: The xkcd comic #1513 discussing code quality, explainxkcd. Click the image to get to the original image source https://imgs.xkcd.com/comics/code_quality.png.

and the Python style guide, PEP 8.