All data sets published on Chapman Figshare must be accompanied by documentation. At minimum, your data documentation should address:
Depending on the data, we may require additional documentation. Human subjects data should be accompanied by a copy of the relevant consent form. If your data came from Indigenous communities, was obtained under a data use agreement (DUA) with an outside entity, or may otherwise be wholly or partly someone else's intellectual property, we may ask you to add documentation showing that you have permission to share the data and that others may reuse it.
The simplest and most flexible way to document your data is through a README file - a text document that acts as a 'user manual' for your dataset. README files are most often plain text (.txt, .md) or PDF files. It's possible to insert a codebook or data dictionary into a README, but it may not always be practical.
Editable in a text editor, Markdown editor, or in an online editor such as Dillinger.io.
Export file as text (.txt), Markdown (.md), or PDF file.
Editable in MS Word or Google Docs.
Export file as text (.txt), Markdown (.md), or PDF file.
Traditionally, a codebook defines the variables of a data set, while a data dictionary defines the variables and also provides extra details such as the origin of the data, its relationship to other data, and how to use it. In practice the two terms are largely interchangeable, so use whichever you like.
These documents are important because they explain attributes that are not evident from the data itself, such as what each column of a spreadsheet represents, how null and zero values differ, and what encoded values mean. For example, a column titled "date" tells you that the values are dates but not why the date matters. Likewise, a value of "1" could be a quantity, or it could mean "yes", "question 1", or a variety of other things.
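To make this concrete, here is a minimal Python sketch of a data dictionary being used to interpret a small table. Everything in it is hypothetical and chosen only for illustration: the column names (`date`, `consented`, `visits`), the codes, the sample rows, and the `decode_row` helper are assumptions, not part of any Figshare requirement or template. It shows how a written definition lets a reader tell an encoded "1" from a count and a blank cell from a true zero.

```python
import csv
import io

# Hypothetical data dictionary: one entry per column of the dataset.
# "codes" maps stored values to their meaning; "missing" is the value
# that means "not recorded" (here, an empty cell).
DATA_DICTIONARY = {
    "date": {
        "description": "Date the survey response was submitted (YYYY-MM-DD)",
        "missing": "",
    },
    "consented": {
        "description": "Whether the participant agreed to follow-up contact",
        "codes": {"1": "yes", "0": "no"},
        "missing": "",
    },
    "visits": {
        "description": "Number of visits in the past year (a count)",
        "missing": "",  # blank = not recorded; "0" = recorded, no visits
    },
}


def decode_row(row, dictionary):
    """Translate one raw row into readable values using the dictionary."""
    decoded = {}
    for column, value in row.items():
        entry = dictionary[column]
        if value == entry["missing"]:
            decoded[column] = None  # null: the value was never recorded
        else:
            # Use the code's meaning if one is defined, else keep the value.
            decoded[column] = entry.get("codes", {}).get(value, value)
    return decoded


# Made-up sample data standing in for a real spreadsheet export.
SAMPLE_CSV = "date,consented,visits\n2023-04-01,1,0\n2023-04-02,0,\n"

rows = [
    decode_row(raw, DATA_DICTIONARY)
    for raw in csv.DictReader(io.StringIO(SAMPLE_CSV))
]
for row in rows:
    print(row)
```

In the first sample row, the "1" under `consented` is decoded as "yes" while the "0" under `visits` stays a count of zero. In the second row, the blank `visits` cell becomes a null rather than a zero. Without the dictionary, a reader would have to guess at each of these.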
Examples on this page were created for DataShare, Iowa State University's equivalent Figshare instance, and are reused with permission from Iowa State University.