
Data Management Planning Checklist

The following are questions that can assist in developing a strong data management plan. Not all questions may apply to your project.

Describing Your Expected Data:

  • What datasets are expected to be produced from the project? Include both raw data and processed data and any anticipated derivative applications or models.
  • What form(s) and format(s) are the data in?
  • What consistent naming methods are being used for data files or folders?
  • Has any data in the set been collected from other sources (third parties)? If so, have you cleared any copyright concerns about using and re-publishing this data?
  • What specific tools or software are needed to view, process, or visualize the data? Is any proprietary software needed?
  • Who owns the data?
  • Who is responsible for managing the data?
  • Who will have access to the data during the project?
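
A consistent naming method is easiest to keep when it can be checked automatically. The sketch below assumes a hypothetical convention of `<project>_<YYYYMMDD>_<description>_v<NN>.<ext>`; the pattern, the function names, and the sample file names are illustrative only and should be replaced with whatever convention your project adopts.

```python
import re

# Hypothetical convention: <project>_<YYYYMMDD>_<description>_v<NN>.<ext>
# Lowercase letters, digits, and hyphens only; no spaces.
NAME_PATTERN = re.compile(
    r"^(?P<project>[a-z0-9]+)_"
    r"(?P<date>\d{8})_"            # eight digits; calendar validity is not checked
    r"(?P<description>[a-z0-9-]+)_"
    r"v(?P<version>\d{2})"
    r"\.(?P<ext>[a-z0-9]+)$"
)


def check_name(filename: str) -> bool:
    """Return True if the file name matches the hypothetical convention."""
    return NAME_PATTERN.match(filename) is not None


def nonconforming(filenames):
    """Return the file names that do not match the convention, in input order."""
    return [name for name in filenames if not check_name(name)]


if __name__ == "__main__":
    sample = ["soilsurvey_20240115_raw-counts_v01.csv", "Final data (2).xlsx"]
    print(nonconforming(sample))
```

Running a check like this before each deposit or backup is one way to catch ad hoc names early.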

Enabling Discovery of Your Data:

  • Who will be responsible for documenting the data (creating the metadata)?
  • What standards will be used for the metadata (e.g. XML-based standards such as EML or Dublin Core)?
  • Have you used any formal specialized vocabularies, code lists, thesauri, or taxonomies (e.g. phylogenetic taxonomies, ISO topics)?
  • Have you used any customized abbreviations or shorthand? Are they explained in full in the data documentation?
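
To make the metadata questions above concrete, here is a minimal sketch of a Dublin Core record serialized as XML with Python's standard library. The element names come from the 15-element Dublin Core set; the `metadata` wrapper element, the function name, and all field values are assumptions made for illustration, and a real repository may require a different wrapper or schema.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The 15 elements of the simple Dublin Core element set.
DC_ELEMENTS = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def dublin_core_record(fields: dict) -> str:
    """Serialize a dict of Dublin Core fields to an XML string.

    A value may be a string or a list of strings (repeated element).
    The <metadata> wrapper is a hypothetical container, not a standard.
    """
    root = ET.Element("metadata")
    for name, value in fields.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {name}")
        values = value if isinstance(value, list) else [value]
        for item in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = item
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(dublin_core_record({
        "title": "Example soil survey",
        "creator": ["Rivera, A.", "Chen, B."],
        "format": "text/csv",
    }))
```

Rejecting unknown element names is a small design choice that keeps customized abbreviations from slipping into the record undocumented.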

Enabling Long-term Storage of Your Data:

  • Where will the data be stored during the project?
  • What backup measures will be implemented?
  • Where will the data be archived for long-term storage?
  • Do you expect to alter or update archived data, or will the data be final once archived?
  • How long should the data be stored by an archive or repository?
  • How large is the dataset, and if relevant, what is its anticipated rate of growth? (e.g. MB/year)
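
The size and growth question can be answered with simple arithmetic. The sketch below assumes linear growth at a fixed rate in MB/year and a fixed number of stored copies (for example, a primary copy plus backups); the function name and the figures in the usage example are hypothetical.

```python
def projected_size_mb(current_mb: float, growth_mb_per_year: float,
                      years: float, copies: int = 1) -> float:
    """Estimate total storage in MB after `years`, assuming linear growth.

    `copies` counts every stored copy, including backups.
    """
    if current_mb < 0 or growth_mb_per_year < 0 or years < 0:
        raise ValueError("sizes, growth rate, and years must be non-negative")
    if copies < 1:
        raise ValueError("at least one copy is required")
    return (current_mb + growth_mb_per_year * years) * copies


if __name__ == "__main__":
    # Hypothetical project: 500 MB today, growing 200 MB/year, kept for 5 years
    # with three copies in total.
    print(projected_size_mb(500, 200, 5, copies=3))
```

If growth is expected to accelerate, a linear estimate will understate the need, so treat the result as a lower bound when discussing storage with an archive or repository.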

Enabling Sharing of Your Data:

  • How should the data be made accessible?
  • Who are the potential audiences for the data?
  • How could the data be re-used and re-purposed?
  • Does the dataset include any sensitive information subject to confidentiality concerns?
  • If applicable, describe any special measures required by funders, your lab, or an IRB.
  • Will the data be collected in the United States or elsewhere? If elsewhere, where?
  • Do the data have unique identifiers?

Enabling Citation of Your Data:

  • What publications, discoveries, or further datasets have resulted from the data?
  • Who will receive credit for authoring the data? In what order, if any, should the authors be listed?
  • What organizational name should be referenced in citing the data?
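
Once author order and the organizational name are settled, they can be recorded in a form that produces the same citation string every time. The sketch below assumes a simple "Authors (Year). Title. Organization. Identifier." layout, which is only one possible pattern; the function name and all values are placeholders, and your repository or publisher may prescribe a different format.

```python
def format_citation(authors, year, title, organization, identifier=None):
    """Build a data citation string, keeping authors in the order given.

    The layout is an assumed example, not a prescribed standard.
    """
    if not authors:
        raise ValueError("at least one author is required")
    parts = [f"{'; '.join(authors)} ({year})", title, organization]
    if identifier:
        parts.append(identifier)
    return ". ".join(parts) + "."


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(format_citation(
        ["Rivera, A.", "Chen, B."],
        2024,
        "Example soil survey",
        "Example University Library",
        "hypothetical-id-0001",
    ))
```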