Open Software

Overview

I have authored and continue to maintain a number of open source projects, primarily R code that compiles open access data sets. My open access software repository with CERN can be accessed either via the convenient code.seanfobbe.com or through a direct link. Active development occurs on GitHub.

You can also view and download my Linux configuration (e.g. dot files, package lists, install scripts) for Fedora and Debian in a continuously updated GitHub repository.

Principles

When I engage in data science, I am serious about the science part. That is why I am an avid proponent of open source software and strive to make my publications based on computational results fully reproducible. I endorse the famous admonition of Buckheit and Donoho, as worded by Donoho (2010: 385):

An article about computational results is advertising, not scholarship. The actual scholarship is the full software environment, code and data, that produced the result.[1]

Naturally, this is easier said than done. As a matter of course, I try to release all data sets I create as open access, publish the full source code (including version numbers of all dependencies), and make my computational results available with stable identifiers in long-term storage on Zenodo, the scientific repository of the high-energy physics research organization CERN.

Favorite Open Source Tools

My research is made possible only by the countless open source tools that I have had the good fortune to download and use for free. I would like to mention a few of my favorites as a way of saying ‘thank you’, and in case someone else finds them as useful as I do.

Note

  1. The original sentiment, paraphrasing an idea of geophysicist Jon Claerbout, was published in Buckheit and Donoho 1995, although the exact quote is from Donoho 2010. See: Buckheit, Jonathan B and David L Donoho. 1995. ‘WaveLab and Reproducible Research’. In Wavelets and Statistics, edited by Anestis Antoniadis and Georges Oppenheim, 55–81. New York: Springer. See also: Donoho, David L. 2010. ‘An Invitation to Reproducible Computational Research’. Biostatistics 11 (3): 385–388.