LibGuides: Code Sharing: Best practices in writing and sharing code

Writing good code

Beyond the style guides for your programming language of choice, there are many resources on writing clean, well-documented, reusable code that are tailored to researchers who are not professionally trained software engineers.

Five Recommendations for FAIR Software
The Good Research Code Handbook
Research Software Engineering with Python
Writing Clean Scientific Software
BES Guides to Better Science: Reproducible Code
R Code Review Checklist: This checklist is designed to serve as an issue template to assist in the code review process for data wrangling/analysis projects developed in R. The focus of the checklist is not R package development and review; rather, it is aimed at teams of data scientists and/or data analysts who write scripts to generate tables, listings, figures, or any other analytic output. The checklist follows principles set forth in the tidyverse style guide and Good enough practices in scientific computing, and adheres to deodorizing strategies from Code smells and feels.

Simple Software Engineering Best Practices

Use clear variable names

Variable names should be clear and distinct. Concise variable names may be unclear to a future reader of your code (including you) and having many similar short variable names introduces a risk of using the wrong variable unintentionally.
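As a small illustration of the point above, the sketch below contrasts a terse function with a more descriptive one. The function names, variable names, and temperature readings are all hypothetical and chosen only for this example.

```python
# Hard to follow: short, similar names (t1, t2) are easy to swap by accident.
def f(d, t1, t2):
    return [x for x in d if t1 <= x <= t2]


# Clearer: each name states what the value means and which unit it uses.
def filter_temperatures_in_range(temperatures_celsius, min_celsius, max_celsius):
    """Return the readings that fall within [min_celsius, max_celsius]."""
    return [
        reading
        for reading in temperatures_celsius
        if min_celsius <= reading <= max_celsius
    ]


# Hypothetical data, for illustration only.
readings_celsius = [18.5, 21.0, 35.2, -4.0]
plausible_readings = filter_temperatures_in_range(readings_celsius, 0.0, 30.0)
print(plausible_readings)
```

Both functions behave identically; only the second tells a future reader what the arguments are without having to trace the call site.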

Don't repeat yourself (DRY)

In essence, anything done repeatedly should be made into a function. Duplicating code makes it harder to maintain, because any change must be replicated everywhere the code has been copied. Seeing the same code block repeatedly also makes the code harder to read and makes it harder to tell what any one part is supposed to do.
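A minimal sketch of this idea, assuming a hypothetical analysis that standardizes several columns of measurements: the repeated arithmetic moves into one function, so a later change (for example, switching to a population standard deviation) happens in one place. The data and names are invented for illustration.

```python
from statistics import mean, stdev

# Repetitive version (shown as comments): the same steps copied per variable.
#   heights_mean = mean(heights_cm)
#   heights_sd = stdev(heights_cm)
#   standardized_heights = [(h - heights_mean) / heights_sd for h in heights_cm]
#   weights_mean = mean(weights_kg)
#   weights_sd = stdev(weights_kg)
#   standardized_weights = [(w - weights_mean) / weights_sd for w in weights_kg]


def standardize(values):
    """Rescale values to zero mean and unit sample standard deviation."""
    center = mean(values)
    spread = stdev(values)
    return [(value - center) / spread for value in values]


# Hypothetical data, for illustration only.
heights_cm = [150.0, 160.0, 170.0]
weights_kg = [50.0, 60.0, 70.0]

standardized_heights = standardize(heights_cm)
standardized_weights = standardize(weights_kg)
```

Each call site now reads as a single statement of intent, and the calculation itself is defined and documented once.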

Levels of abstraction & the stepdown rule

Levels of abstraction, familiar to formally trained software engineers, may be a new concept for many people who write code for their research.

A function should do only one thing, and do it at a single level of abstraction. A code file should also begin with its highest level of abstraction and work downwards from there. One metaphor likens it to reading a newspaper article: the headline and lede tell you what the article is about, and you find out more and more details as you read down the page.
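The stepdown rule can be sketched as follows, assuming a hypothetical script that summarizes numeric measurements read from text lines. The top-level function comes first and reads like a headline; each function below it fills in one more level of detail. All function names and the input format are assumptions made for this example.

```python
def summarize_measurements(raw_lines):
    """Highest level: the whole analysis in three steps."""
    measurements = parse_measurements(raw_lines)
    valid_measurements = drop_missing(measurements)
    return compute_summary(valid_measurements)


def parse_measurements(raw_lines):
    """One level down: convert every line to a number or None."""
    return [parse_line(line) for line in raw_lines]


def parse_line(line):
    """Lowest level: the details of reading a single line."""
    text = line.strip()
    return float(text) if text else None


def drop_missing(measurements):
    """Remove entries that could not be read (represented as None)."""
    return [value for value in measurements if value is not None]


def compute_summary(measurements):
    """Return the count and mean; the mean is None when there is no data."""
    count = len(measurements)
    average = sum(measurements) / count if count else None
    return {"count": count, "mean": average}


# Hypothetical input, for illustration only: the blank line is a missing value.
summary = summarize_measurements(["1.0\n", "\n", "3.0\n"])
print(summary)
```

A reader who only needs the overall flow can stop after the first function; the parsing details sit further down for those who need them.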

Reading Code from Top to Bottom: The Stepdown Rule

Metadata and citability

Code documentation and metadata range from simple to complex and vary significantly by discipline. There are some efforts to standardize them, and best practices, such as those provided by the FAIR Biomedical Research Software Guidelines project, are to use at a minimum:

A human-readable README.txt file, with basic information about functionality, versioning, citation, and provenance.
A human-readable CHANGELOG.txt file, if the software is a new version of an existing object.
A machine-readable codemeta.json file, a simple metadata format supported by Zenodo, GitHub, DataCite, Figshare, and more.
- See the online CodeMeta generator for a fast and easy way to make these files.
A machine-readable CITATION.cff file, which integrates with GitHub, Zotero, Zenodo, and more to accurately display citation information for your project on the websites and repositories hosting it.
- See the online cffinit generator for a fast and easy way to make these files.
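To give a sense of what the two machine-readable files contain, here is a minimal sketch that builds the text of a small codemeta.json and CITATION.cff. The project name, version, description, and author are hypothetical placeholders, only a handful of fields are shown, and the quoting is deliberately naive (values containing double quotes are not escaped). The online generators mentioned above remain the more reliable way to produce complete, validated files.

```python
import json


def build_codemeta(name, version, description, given_name, family_name):
    """Return a minimal CodeMeta 2.0 record as a Python dictionary."""
    return {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
        "name": name,
        "version": version,
        "description": description,
        "author": [
            {
                "@type": "Person",
                "givenName": given_name,
                "familyName": family_name,
            }
        ],
    }


def build_citation_cff(title, version, given_name, family_name):
    """Return the text of a minimal CITATION.cff file (CFF version 1.2.0)."""
    lines = [
        "cff-version: 1.2.0",
        'message: "If you use this software, please cite it as below."',
        f'title: "{title}"',
        f'version: "{version}"',
        "authors:",
        f'  - family-names: "{family_name}"',
        f'    given-names: "{given_name}"',
    ]
    return "\n".join(lines) + "\n"


# Hypothetical project details, for illustration only.
codemeta_text = json.dumps(
    build_codemeta(
        "example-analysis", "0.1.0", "Scripts for an example analysis.",
        "Ada", "Researcher",
    ),
    indent=2,
)
citation_text = build_citation_cff("example-analysis", "0.1.0", "Ada", "Researcher")

print(codemeta_text)
print(citation_text)
```

Saving these two strings as codemeta.json and CITATION.cff in the top-level directory of a repository is the usual placement for such files.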