Good programming practices in R

I write sloppy R scripts. This is partly a byproduct of working with a high-level language that lets you quickly write code on the fly (see this post for a nice description of the problem in Python code) and partly the result of my limited formal training in computer programming. The lack of formal training makes scientists self-conscious about the bits of code that they cobble together to solve research problems, but a professional software engineer reassuringly points out that most software runs on messy code. Although sharing sloppy code is better for research progress than not sharing any code at all, you can make the code-sharing experience better by picking up some good programming habits. Even if you don’t intend to share your code (a choice that is arguably bad for science), adopting good programming habits should improve your workflow by making old bits of code easier to understand and reuse.

The Department of Biostatistics at Vanderbilt University provides a nice list of programming tips for statisticians. For R-specific recommendations, see Google’s R Style Guide. Hadley Wickham provides his own recommendations, which are generally, but not always, aligned with Google’s R Style Guide. I used to think that the best way to improve code was to add comments; it had not occurred to me that copious comments can actually make code less readable. I intend to adopt nearly all of the recommendations in Google’s R Style Guide. I will adopt the style guidelines immediately for any code that makes a public appearance on this blog, but changes to my everyday programming habits will likely be more gradual.
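To make the point about comments concrete, here is a minimal sketch of my own; it is not taken from either style guide. The function names, the snake_case naming, and the data are all hypothetical, and the two guides differ on naming details. The two versions compute the same thing, but the second relies on descriptive names and keeps a single comment for the reasoning that the code cannot express on its own.

```r
# Before: terse names, so the comments have to do all the explaining.
# f computes the standard error of x
f <- function(x) {
  # remove missing values
  x <- x[!is.na(x)]
  # divide the standard deviation by the square root of n
  sd(x) / sqrt(length(x))
}

# After: descriptive names carry the meaning; the one comment explains "why".
standard_error <- function(measurements) {
  # Drop missing values first so they do not inflate the sample size.
  observed <- measurements[!is.na(measurements)]
  sd(observed) / sqrt(length(observed))
}

# Hypothetical data, for illustration only.
wing_lengths <- c(10.2, 11.5, NA, 9.8, 10.9)
standard_error(wing_lengths)
```

Both functions return the same value for the same input; only the readability differs.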