Data Management

Learn how to manage your data to ensure it stays available and readable, how to write a data management plan, and how to comply with funding agency data mandates.

Research data is "the recorded factual material commonly accepted in the scientific community as necessary to validate research findings" (OMB Circular A-110). However, this definition can be expanded beyond the traditional sciences and applied to the humanities and social sciences as well. Research data can encompass many different types of information and be recorded in a variety of formats, both analog and digital.

Preserving data allows researchers:

  • To ensure data is available in the long term.
  • To prevent loss of data due to malfunction or destruction of personal storage systems, format obsolescence, or other reasons.

Astrophysicist Rachel Ainsworth said in Nature: “Your primary collaborator is yourself six months from now, and your past self doesn’t answer e-mails.”

Sharing data allows researchers:

  • To have their data used widely, increasing knowledge production while boosting the researcher’s visibility, prestige, and citations.
  • To promote new research, allowing others to test new hypotheses, analyze and verify existing findings, and potentially make new discoveries from the data.
  • To prevent duplication of scientific studies that have already been conducted.
  • To direct requests for data to a database, rather than having to send files to individual researchers.
  • To enhance collaboration and community-building within their disciplines.

A data management plan (DMP) is a document that details how research data will be handled during and after a research project. It specifies:

  • what data will be gathered
  • how it will be acquired and processed
  • whether and how research data will be preserved and shared upon the completion of the research. 

It will likely also include information about:

  • formatting
  • quality control
  • privacy concerns
  • accessibility
  • analysis
  • metadata generation
  • intellectual property rights and conditions for use.
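For researchers who keep project notes alongside their scripts, the elements listed above can be tracked as a simple checklist. The following is only a minimal sketch in Python: the class name, field names, and sample entries are hypothetical illustrations, not part of any funder's template or required DMP format.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical checklist; one text field per DMP element listed above."""
    data_gathered: str = ""
    acquisition_and_processing: str = ""
    preservation_and_sharing: str = ""
    formatting: str = ""
    quality_control: str = ""
    privacy_concerns: str = ""
    accessibility: str = ""
    analysis: str = ""
    metadata_generation: str = ""
    intellectual_property: str = ""


def missing_sections(plan: DataManagementPlan) -> list[str]:
    """Return the names of the sections that are still blank."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Made-up draft with only two sections filled in (illustrative values).
draft = DataManagementPlan(
    data_gathered="Survey responses stored as CSV files",
    acquisition_and_processing="Online questionnaire, cleaned with a script",
)

# Print the sections that still need to be written.
for name in missing_sections(draft):
    print("Still to write:", name)
```

A checklist like this does not replace the written plan a funder asks for; it only helps confirm that no element has been overlooked before drafting.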

A growing number of funders require DMPs as a condition of funding, both to get researchers thinking about how they will manage their data as early in the process as possible and to make preservation and sharing more efficient. Preparing a DMP helps ensure that researchers format their data correctly and consistently, organize it well, and plan ahead where they will archive it for preservation and sharing (if applicable).

Examples of research data include:

  • Spreadsheets
  • Text documents
  • Laboratory or field notebooks
  • Questionnaire, survey, or test responses
  • Transcripts
  • Audio or video files
  • Image files
  • Slides, artifacts, specimens, or samples
  • Models, algorithms, and scripts
  • Log files or input/output files for analysis or simulation software
  • Operating procedures, methodologies, and workflows

 

Research records may also be important to consider when planning how to manage data. These can include:

  • Correspondence files
  • Project files
  • Grant applications
  • IRB or other ethics applications
  • Technical and research reports
  • Master lists
  • Signed consent forms

 

Some research data may not be sharable due to ethics and privacy concerns or regulations, but researchers may still be called upon to explain how they will manage it. Examples include:

  • Preliminary analyses
  • Drafts of papers
  • Plans for future research
  • Peer reviews
  • Communications between colleagues

 

In addition, the following types of data should not be shared:

  • Trade secrets, commercial information, materials falling under a confidentiality clause, or other information protected by law.
  • Personal and medical information whose disclosure would constitute an unwarranted invasion of privacy, such as information that would allow a particular participant in a research study to be identified.