Organizing Data
Organizing data involves making sure that you can find your data and other research materials (including documentation, code, and physical samples) when you need them, and that data and materials that belong together are connected in a meaningful way.
What does it mean to organize data?
Organizing data means arranging your data and other research materials so they can be found – by yourself and by others – as needed. Here are four factors to consider when organizing data. Remember, you can’t use data you can’t find.
Names
Data should be labelled using a consistent and descriptive file naming system. Your system should allow you to immediately and uniquely identify the contents of your files.
Structures
Data should be organized within a consistent, easy-to-navigate file structure. Maintaining such a structure can help reduce the risk of data loss and unnecessary duplication.
Connections
Connections give context. Data and other materials should be organized in a manner that emphasizes the links between them. These links may be between different versions of the same file or between different files related to the same aim or project.
Documentation
You should document how you organize your data and other research materials, then refer back to that documentation and update it often. When thinking through how to organize your files, also consider where you will keep all of the related description and documentation (e.g. notes, data dictionaries, metadata).
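As one illustration of a consistent and descriptive naming system, the sketch below builds file names from the same elements in the same order every time. The elements chosen here (project, description, date, version) and the function name `build_filename` are assumptions made for the example, not a required convention; adapt them to whatever identifies your own files.

```python
import re
from datetime import date


def build_filename(project, description, when, version, extension):
    """Compose a name such as 'soilstudy_ph-readings_2024-03-05_v02.csv'.

    The pattern project_description_date_version is a hypothetical
    convention used only for illustration.
    """
    def clean(text):
        # Lowercase the text and collapse anything that is not a letter
        # or digit into a single hyphen, so names stay easy to read and sort.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return (
        f"{clean(project)}_{clean(description)}_"
        f"{when.isoformat()}_v{version:02d}.{extension.lstrip('.')}"
    )


print(build_filename("SoilStudy", "pH readings", date(2024, 3, 5), 2, "csv"))
# prints: soilstudy_ph-readings_2024-03-05_v02.csv
```

Writing the date as year-month-day and padding the version number means that files sort chronologically and by version in an ordinary directory listing, which also helps make the connections between versions of the same file visible.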
Requirements and How to Meet Them
Some human subjects data must be organized according to specific requirements. Under most circumstances, data containing sensitive or potentially identifying information should be stored separately from data that does not. However, you should apply the same organizational principles to both.
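A minimal sketch of the separation described above: each record is split into an identifying part and a de-identified part, and a shared study ID keeps the two connected. The field names, the `participant_id` key, and the sample records are all hypothetical, and which fields count as identifying depends on the requirements that apply to your study.

```python
def split_identifiers(records, identifying_fields, key="participant_id"):
    """Split each record into an identifying part and a de-identified part.

    Both parts keep the linking key so they can be reconnected later.
    `key` and `identifying_fields` are assumptions for this example.
    """
    identifying, deidentified = [], []
    for record in records:
        # Identifying part: the linking key plus the identifying fields.
        identifying.append(
            {k: v for k, v in record.items() if k == key or k in identifying_fields}
        )
        # De-identified part: everything except the identifying fields.
        deidentified.append(
            {k: v for k, v in record.items() if k not in identifying_fields}
        )
    return identifying, deidentified


# Made-up records, for illustration only.
records = [
    {"participant_id": "P001", "name": "Example Person", "email": "p001@example.org", "score": 42},
    {"participant_id": "P002", "name": "Another Person", "email": "p002@example.org", "score": 37},
]

private_part, shareable_part = split_identifiers(records, {"name", "email"})
print(shareable_part[0])
```

The two resulting lists can then be written to separate locations, each following the same naming and folder conventions as the rest of the project.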
Things to Think About