LibGuides: Digital Repositories at Chapman University: Documentation

Documentation

All data sets published on Chapman Figshare must be accompanied by documentation. At minimum, your data documentation should address:

Where and how data was obtained/generated/collected/compiled.
- include citations to other data, tools, software, etc.
How files are organized and named, and what they contain.
How the data is structured and how its variables are defined.
What was done to the data and files (if not sharing raw data).
- e.g., steps or code used to clean or standardize data

Readme files

The simplest and most flexible way to document your data is through a README file - a text document that acts as a 'user manual' for your dataset. README files are most often found as plain text (.txt, .md) or PDF files. It's possible to insert a codebook or data dictionary into a README, but this is not always practical.
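
As a rough illustration, a README skeleton covering the minimum documentation points listed at the top of this page could be drafted with a short script. The Python sketch below is only an example under stated assumptions, not an official Chapman Figshare or DataShare template: the function name `build_readme`, the section headings, and the placeholder title are all hypothetical.

```python
# Hypothetical helper: builds a plain-text README skeleton.
# The headings paraphrase the minimum documentation points in this guide;
# they are illustrative and not an official template.

SECTIONS = [
    "How the data was obtained, generated, collected, or compiled",
    "Citations to other data, tools, and software",
    "File organization, naming, and contents",
    "Data structure and variable definitions",
    "Processing steps applied to the data and files",
]


def build_readme(title: str, sections: list[str] = SECTIONS) -> str:
    """Return README text with one underlined, empty section per heading."""
    lines = [title, "=" * len(title), ""]
    for heading in sections:
        lines += [heading, "-" * len(heading), "TODO: describe.", ""]
    return "\n".join(lines)


if __name__ == "__main__":
    # Print the skeleton; save it as README.txt and fill in each section.
    print(build_readme("Example dataset (placeholder title)"))
```

The output is plain text, so it can be saved as a .txt or .md file and edited in any text editor.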

Readme templates

Markdown DataShare readme template
Editable in a text editor, Markdown editor, or in an online editor such as Dillinger.io.
Export file as text (.txt), Markdown (.md), or PDF file.

Google Doc DataShare readme template
Editable in MS Word or Google Docs.
Export file as text (.txt), Markdown (.md), or PDF file.

Data dictionaries and codebooks

Traditionally, a codebook defines the variables of a data set, while a data dictionary defines variables and provides extra details such as information on the origin of the data, its relationship to other data, and how to use it. However, the two terms are fairly interchangeable, so use whichever you like.

These documents are important because they explain attributes that are not evident from the data itself, such as what each column of a spreadsheet represents, how null and zero values differ, and what encoded data means. For example, a column titled "date" tells you only that the value is a date, not why the date matters, and a value of "1" could be a quantity, or it could mean "yes", "question 1", or a variety of other things.
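
To make the distinction between empty and zero values concrete, the following Python sketch drafts the starting rows of a codebook from a CSV file. It is a minimal example under stated assumptions: the function name `draft_codebook`, the sample columns, and the rule that an empty cell means "missing" are all hypothetical. The script can only count values; the meaning of each variable still has to be written by the person who knows the data.

```python
import csv
import io

# Hypothetical sample data: "attended" is an encoded yes/no column,
# "guests" is a count. Empty cells are assumed to mean "missing".
SAMPLE = (
    "date,attended,guests\n"
    "2021-03-01,1,0\n"
    "2021-03-02,,2\n"
    "2021-03-03,0,\n"
)


def draft_codebook(csv_text: str, max_values: int = 5) -> list[dict]:
    """Summarize each column so a person can then write its definition."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    codebook = []
    for name in reader.fieldnames or []:
        # Short rows yield None for absent cells; treat those as empty.
        values = [(row[name] or "").strip() for row in rows]
        codebook.append({
            "variable": name,
            "missing": sum(1 for v in values if v == ""),  # empty cells
            "zeros": sum(1 for v in values if v == "0"),   # literal zeros
            "observed_values": sorted({v for v in values if v})[:max_values],
            "definition": "",  # to be filled in by the data author
        })
    return codebook


if __name__ == "__main__":
    for entry in draft_codebook(SAMPLE):
        print(entry)
```

Counting empty cells and zeros separately shows where the two occur in the same column, which is exactly where a codebook entry needs to say what each one means.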

DataShare codebook template (Google Sheets)
- Row 2 contains instructions; rows 3-5 contain examples.
Example codebook within a readme file
- A very simple data set and codebook about superhero movies.
What is a codebook? - ICPSR
- While aimed at social science research, especially survey data, this page provides examples from a wide variety of sources.

Examples on this page were created for DataShare, Iowa State University's equivalent Figshare instance, and were kindly provided by Iowa State University. Reused with permission.