Data Management Planning Checklist

The following questions can help you develop a strong data management plan. Not every question will apply to your project.

    Describing Your Expected Data:

  • What datasets are expected to be produced from the project? Include both “raw” and processed data, as well as any anticipated derivative applications or models.
  • What form(s) and format(s) are the data in?
  • What consistent naming methods are being used for data files or folders?
  • Have any of the data in the set been collected from other sources (third parties)? If so, have you resolved any copyright concerns about using and republishing them?
  • Which specific tools or software are needed to view, process, or visualize the data, especially any proprietary software?
  • Who owns the data?
  • Who is responsible for managing the data?
  • Who will have access to the data during the project?
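
As an illustration of the file-naming question above, here is a minimal Python sketch of one possible convention. The pattern (project_description_YYYYMMDD_vNN.ext), the function name build_filename, and the example values are all assumptions made for illustration, not a recommended standard.

```python
import re
from datetime import date


def build_filename(project: str, description: str, collected: date,
                   version: int, ext: str) -> str:
    """Compose a name as project_description_YYYYMMDD_vNN.ext.

    This pattern is a hypothetical convention; adapt the parts and
    their order to whatever your project agrees on.
    """
    def slug(text: str) -> str:
        # Lowercase, then collapse every run of other characters to one hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    extension = ext.lstrip(".").lower()
    return (f"{slug(project)}_{slug(description)}_"
            f"{collected:%Y%m%d}_v{version:02d}.{extension}")


# Example with made-up values:
print(build_filename("Lake Survey", "Water Temp (raw)", date(2024, 3, 5), 1, "CSV"))
# prints "lake-survey_water-temp-raw_20240305_v01.csv"
```

Putting the date in YYYYMMDD form and zero-padding the version keeps files in chronological and version order when sorted by name.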

    Enabling Discovery of Your Data:

  • Who will be responsible for documenting the data (creating the metadata)?
  • What standards will be used for the metadata (e.g. XML-based standards such as EML or Dublin Core)?
  • Have you used any formal specialized vocabularies, code lists, thesauri, or taxonomies (e.g. phylogenetic taxonomies, ISO topics)?
  • Have you used any customized abbreviations or shorthand? Are they explained in full in the data documentation?
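
To make the metadata-standards question more concrete, the following Python sketch writes a few Dublin Core elements as XML using only the standard library. The wrapper element name "metadata", the function name dublin_core_record, and all field values are placeholders assumed for illustration; a real repository may require a different wrapper or additional elements.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)


def dublin_core_record(fields: dict[str, list[str]]) -> str:
    """Serialize {element name: [values]} as a small Dublin Core XML record.

    The root element name "metadata" is an assumption for this sketch.
    """
    root = ET.Element("metadata")
    for name, values in fields.items():
        for value in values:
            # Repeatable elements (e.g. several creators) become siblings.
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = value
    return ET.tostring(root, encoding="unicode")


# Placeholder values only:
record = dublin_core_record({
    "title": ["Example dataset title"],
    "creator": ["First Author", "Second Author"],
    "date": ["2024-03-05"],
    "description": ["Short description, with abbreviations spelled out."],
})
print(record)
```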

    Enabling Long-term Storage of Your Data:

  • Where will the data be stored during the project?
  • What backup measures will be implemented?
  • Where will the data be archived for long-term storage?
  • Do you expect to alter or update archived data, or are the data final once archived?
  • How long should the data be stored by an archive or repository?
  • How large is the dataset and, if relevant, what is its anticipated rate of growth (e.g. MB/year)?
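
The size and growth-rate question above lends itself to a quick estimate. The Python sketch below assumes linear growth at a fixed MB/year rate and multiplies by the number of stored copies (for example, backups). The function name, the parameters, and the linear model are assumptions for illustration; real growth may be uneven.

```python
def projected_size_mb(current_mb: float, growth_mb_per_year: float,
                      years: float, copies: int = 1) -> float:
    """Estimate total storage in MB after `years`, assuming linear growth.

    `copies` counts every stored copy, including the original.
    """
    return (current_mb + growth_mb_per_year * years) * copies


# Made-up figures: 500 MB today, growing 200 MB/year, kept for 5 years.
print(projected_size_mb(500, 200, 5))            # prints 1500
print(projected_size_mb(500, 200, 5, copies=3))  # prints 4500
```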

    Enabling Sharing of Your Data:

  • How should the data be made accessible?
  • Who are the potential audiences for the data?
  • How could the data be reused and repurposed?

    Copyright, Security, Privacy Concerns:

  • Does the dataset include any sensitive information subject to confidentiality concerns?
  • If possible, describe any special measures required by your funders, lab, or IRB.
  • Will the data be collected in the United States or elsewhere? If elsewhere, where?
  • Do the data have unique identifiers?

    Enabling Citation of Your Data:

  • What publications, discoveries, or further datasets have resulted from the data?
  • Who will receive credit for authoring the data? In what order, if any, should the authors be given?
  • What organizational name should be referenced in citing the data?