Research Data Management Self-Assessment

A self-assessment and introductory guide to research data management for researchers

Organizing Data

Organizing data involves ensuring that you can find your data and other research materials (including documentation, code, and physical samples) when you need them, and that data and materials that go together are connected in a meaningful way.


What does it mean to organize data?

Organizing data means arranging your data and other research materials so they can be found – by yourself and by others – as needed. Here are four factors to consider when organizing data. Remember, you can’t use data you can’t find.

Names

Data should be labelled using a consistent and descriptive file naming system. Your system should allow you to immediately and uniquely identify the contents of your files.
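
One way to keep names consistent is to build them from the same fields in the same order every time. The sketch below is illustrative only: the field order (project, description, date, version), the separators, and the function name build_filename are assumptions, not a required standard. Adapt them to the conventions of your research community.

```python
import re
from datetime import date


def build_filename(project: str, description: str, when: date, version: int, ext: str) -> str:
    """Assemble a file name from fixed fields in a fixed order (hypothetical convention)."""

    def clean(text: str) -> str:
        # Lowercase the text and replace runs of non-alphanumeric characters with a hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    # ISO 8601 dates (YYYY-MM-DD) sort chronologically when sorted alphabetically.
    stamp = when.isoformat()
    # Zero-padded versions (v01, v02, ...) also sort correctly as text.
    return f"{clean(project)}_{clean(description)}_{stamp}_v{version:02d}.{ext.lstrip('.')}"


print(build_filename("Sleep Study", "Survey Responses", date(2024, 3, 5), 2, "csv"))
# prints: sleep-study_survey-responses_2024-03-05_v02.csv
```

Underscores separate the fields and hyphens separate words within a field, so a name can be split back into its parts later.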

Structures

Data should be organized within a consistent and easy-to-navigate file structure. Maintaining such a structure can help reduce the risk of data loss and unnecessary duplication.
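
A consistent structure is easier to maintain if every project starts from the same skeleton. The following is a minimal sketch of that idea. The folder names in PROJECT_LAYOUT and the function names are hypothetical examples, not a prescribed layout.

```python
from pathlib import Path

# Hypothetical layout: these folder names are illustrative, not a required standard.
PROJECT_LAYOUT = [
    "data/raw",        # original files, never edited in place
    "data/processed",  # cleaned or derived files
    "code",            # scripts used to process and analyze the data
    "docs",            # ReadMe files, data dictionaries, notes
    "outputs",         # figures, tables, reports
]


def planned_dirs(root: str) -> list[Path]:
    """Return the path of every folder in the layout without creating anything."""
    return [Path(root) / sub for sub in PROJECT_LAYOUT]


def create_structure(root: str) -> list[Path]:
    """Create any missing folders under root and return their paths."""
    dirs = planned_dirs(root)
    for folder in dirs:
        # exist_ok=True makes the call safe to repeat on an existing project.
        folder.mkdir(parents=True, exist_ok=True)
    return dirs
```

Calling create_structure("my_project") would create the five folders under my_project, leaving any that already exist untouched.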

Connections

Connections give context. Data and other materials should be organized in a manner that emphasizes the links between them. These links may be between different versions of the same file or between different files related to the same aim or project.

Documentation

You should document how you organize your data and other research materials, and refer back to and update that documentation often. When thinking through how to organize your files, make sure you also consider where you will keep all of the related description and documentation (e.g. notes, data dictionaries, metadata).


Requirements and How to Meet Them

There are specific requirements about how some human subjects data can be organized. Under most circumstances, data containing sensitive or potentially identifying information should be stored separately from data that does not. However, you should apply the same organizational principles to both.


Things to Think About

  • There may or may not be standard organizational schemes that fit your data. Whenever possible, you should try to adopt the standards of your research community. For assistance in identifying the right organization scheme for your data, contact the LRDS team at LRDS@chapman.edu.
  • You should document your file naming and structuring schemes. Such documentation may take the form of a data dictionary or ReadMe file and should enable somebody other than you to understand how your research materials are organized.
  • The size and content of your data will determine how much flexibility you have in organizing it. Your organizational scheme is unlikely to be perfect, and there may be times when you’ll need to rearrange your files.
  • Versioning your data may be a good way of keeping it organized, as long as it is done in a consistent and descriptive manner. Data_v2.csv may be informative; Data_NewEdits is less so.
  • These principles (naming, hierarchies, linking, documentation) also apply within data files. For example, variable names within a file should be consistent and descriptive, and you should maintain documentation about what they refer to.
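
To illustrate the point about versioning, the sketch below shows why a consistent suffix such as _v2 is useful: a short script can read it, while a name like Data_NewEdits carries no version it can recognize. The _v<number> pattern and the function names are assumptions made for this example, and the pattern handles only names with a single extension.

```python
import re

# Assumed convention: <stem>_v<number>, optionally followed by one extension.
VERSION_PATTERN = re.compile(r"^(?P<stem>.+)_v(?P<version>\d+)(?P<ext>\.[^.]+)?$")


def parse_version(filename: str):
    """Return (stem, version) if the name follows the convention, else None."""
    match = VERSION_PATTERN.match(filename)
    if match is None:
        return None
    return match.group("stem"), int(match.group("version"))


def latest_versions(filenames):
    """Map each stem to the file name carrying its highest version number."""
    latest = {}
    for name in filenames:
        parsed = parse_version(name)
        if parsed is None:
            continue  # names without a recognizable version are skipped
        stem, version = parsed
        # Compare versions as integers so that v10 ranks above v2.
        if stem not in latest or version > latest[stem][0]:
            latest[stem] = (version, name)
    return {stem: name for stem, (_, name) in latest.items()}


print(parse_version("Data_v2.csv"))    # prints: ('Data', 2)
print(parse_version("Data_NewEdits"))  # prints: None
print(latest_versions(["Data_v2.csv", "Data_v10.csv", "Data_NewEdits"]))
# prints: {'Data': 'Data_v10.csv'}
```

Comparing the version as a number avoids a common pitfall: sorted as plain text, Data_v10.csv comes before Data_v2.csv.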