Deposit data FAQs
This document gives general information for those who wish to archive data with the NERC EDS.
Contents
- Can I deposit derived data?
- When should I deposit data?
- What checks will be made on my deposit?
- Why is metadata important?
- What metadata should I provide for my data?
- How do I get a DOI for my data?
- Can I deposit with a non-NERC Data Centre/request a DOI from another organisation?
- Who owns my data after I’ve deposited it with the data centre?
- What if I need to change my data and I already have a DOI?
- Can my datasets be linked to other related datasets held in other NERC Data Centres?
- What format should I use for my data?
- What about personal and/or sensitive data?
- How do I cite data in my research publications?
Can I deposit derived data?
The NERC Data Centres store data for the long term, but will need to know the terms and conditions under which the original data were obtained in order to ascertain whether, and how, the derived data can be made accessible with appropriate licensing. Early identification of these datasets is essential, as licensing can be complicated and time-consuming.
When should I deposit data?
NERC Data Centres will seek to bring in a whole dataset as soon as it is 'completed', meaning the dataset is stable and will not change. Typically this will be the point at which analysis could be done. Depositing at this point allows the necessary arrangements to be put in place for the dataset's future management, and allows a DOI to be obtained for referencing the data in publications. An embargo period can be applied, meaning data does not have to be made accessible straight away.
What checks will be made on my deposit?
Deposit with a NERC Data Centre will involve the following checks:
- Authenticity: The data is what it purports to be, is created or sent by the purported person, and at the purported time. This is shown, for example, in the provenance of data and preservation metadata.
- Integrity: The data is complete and unaltered. This is achieved by appropriate ingestion processes and using digital signatures, fixity checks, and persistent identifiers for the data.
- Reliability: The data accurately reflects the original context of data creation and is trustworthy. This is achieved by documenting and capturing the contextual metadata and ensuring its completeness.
- Usability: The data can be located, retrieved, presented and interpreted, through use of preferred and open file formats, monitoring of obsolescence (software, hardware), persistent data identifiers, and digital rights management as part of metadata capture.
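As an illustration of the fixity checks mentioned under Integrity, the sketch below records a checksum for a file and later confirms that the file has not changed. It is a minimal example only: the choice of SHA-256 and the function names `sha256_of_file` and `fixity_ok` are assumptions made for illustration, not a description of the process a NERC Data Centre actually uses.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large data files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fixity_ok(path, expected_digest):
    """Return True if the file's current digest matches the recorded one."""
    return sha256_of_file(path) == expected_digest.strip().lower()
```

A depositor could record the digest of each file before transfer and compare it again afterwards; a mismatch would suggest the file was altered or corrupted along the way.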
Why is metadata important?
Metadata is information about the data itself. It is used for data discovery and for explaining the specifics of the data. The archived data may be re-used in the future by parties not involved with the project and could help support their science - so it is essential that data producers provide as much context with their data as possible.
Metadata should answer these questions: who, what, why, where, when and how your data was produced.
What metadata should I provide for my data?
NERC Data Centres will require two types of metadata: discovery metadata to populate a catalogue record enabling users to discover and access the data, and supporting documents that will facilitate re-use of the data.
Metadata is only useful if it’s universally understandable by both people and software, and so metadata standards are used to enable consistency. Quality research metadata will provide key descriptors of the data, including: format and volume; why the research was undertaken, including references to associated publications and projects; where and when the data were collected; who created the dataset and who funded the work; how the data were created (instrumentation and/or software), how they were analysed, and how they were quality checked; and whether there are any access or usage constraints.
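The descriptors listed above can be treated as a simple checklist. The sketch below reports which descriptors are still missing from a draft record before it is sent to a Data Centre. The field names in `REQUIRED_FIELDS` are hypothetical labels chosen for this example; they paraphrase the list above and are not a NERC metadata standard or schema.

```python
# Hypothetical field names paraphrasing the descriptors listed above.
# They are illustrative only and do not correspond to a real metadata standard.
REQUIRED_FIELDS = (
    "format",
    "volume",
    "purpose",
    "where_collected",
    "when_collected",
    "creator",
    "funder",
    "methods",
    "quality_checks",
    "access_constraints",
)


def missing_fields(record):
    """Return the required fields that are absent or empty in a draft record."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = record.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


draft = {
    "format": "CSV",
    "volume": "12 MB",
    "creator": "A. Researcher",
    "funder": "",
}
print(missing_fields(draft))
```

Here the draft supplies a format, volume and creator, so the function reports the remaining descriptors, including the funder, which was left blank.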
How do I get a DOI for my data?
Datasets must be deposited and quality checked at a NERC Data Centre before a DOI can be minted, and this process takes varying amounts of time depending on the data and scientific domain. Researchers requesting a DOI should be aware that NERC have certain quality requirements for data to be assigned a DOI, and that it is the responsibility of the depositor to ensure the data meets the required level of quality. Details of these requirements are provided in the Guidelines for Scientists; NERC Data Centres will provide any further information required on a case-by-case basis.
Can I deposit with a non-NERC Data Centre/request a DOI from another organisation?
NERC Data Policy mandates you to offer a copy of your data to a NERC Data Centre, although this is on a non-exclusive basis. Citations for data generated from NERC-funded research must be referenced through a Digital Object Identifier (DOI) issued by a NERC Data Centre, unless alternative arrangements have been agreed as part of the data management plan. Certain specialist types of data are sometimes permitted to be deposited with non-NERC repositories for their primary curation, but this should always be agreed with a NERC Data Centre prior to deposit. NERC maintains a list of approved repositories that can be used.
Who owns my data after I’ve deposited it with the data centre?
The holder of intellectual property rights (IPR) for the data that a researcher generates depends on who they work for and on their contract of employment. It is normally the employer of the researcher that owns the IPR. If you work for a university, the majority of the time the IPR will belong to the university, but this does depend on your contract of employment. The requirement to deposit data with a NERC data centre does not affect intellectual property rights.
What if I need to change my data and I already have a DOI?
You will need to upload the revised/new data and we will mint you a new version of the DOI. We are unable to change data linked to an existing DOI.
Can my datasets be linked to other related datasets held in other NERC Data Centres?
We can link deposited data to other related datasets using persistent identifiers, such as DOIs, assigned to each subset. This will allow your data to be accessible as one collection, despite different datasets potentially being curated by different NERC Data Centres.
What format should I use for my data?
Data should be deposited in an open, non-proprietary format in common usage by the research community wherever possible. Open formats can be read by more than one application, which makes the data available to the widest possible audience and gives them the best chance of remaining readable in the future. Proprietary formats are often used by only one particular program, or even one version of it, and so present problems for future reuse: those without a licence for the software may not be able to access the data. Data stored in a proprietary format should be converted to an open format before being deposited in a Data Centre. For example, we ask depositors to convert proprietary Microsoft Excel spreadsheets (.xlsx) to an open format such as comma-separated values (.csv).
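Before depositing, it can help to scan a set of files for formats that will need converting. The sketch below flags files by extension and suggests an open alternative. The mapping `OPEN_ALTERNATIVES` contains only the .xlsx to .csv example given above; any further entries would be assumptions and should be agreed with the relevant Data Centre. The function name is hypothetical.

```python
from pathlib import Path

# Only the example from the text above is included; extend by agreement
# with the relevant Data Centre.
OPEN_ALTERNATIVES = {
    ".xlsx": ".csv",
}


def files_needing_conversion(paths):
    """Map each file in a proprietary format to a suggested open format."""
    flagged = {}
    for path in paths:
        suffix = Path(path).suffix.lower()  # compare extensions case-insensitively
        if suffix in OPEN_ALTERNATIVES:
            flagged[str(path)] = OPEN_ALTERNATIVES[suffix]
    return flagged


print(files_needing_conversion(["survey/site_counts.XLSX", "readme.txt", "obs.csv"]))
```

The check looks only at file extensions, so it is a first pass rather than a guarantee that the remaining files are in a suitable format.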
What about personal and/or sensitive data?
Personal data (from which a living individual can be identified) cannot be openly shared without explicit permission from the individual(s). Location data, e.g. of a field site or Red Data Book species, may be deemed too sensitive to share openly. In both cases, anonymised or summary data should be provided for deposit.
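One possible way to prepare such data is sketched below: fields that identify a person are dropped and coordinates are rounded to a coarser precision. This is an illustration under stated assumptions only. The field names (`recorder_name`, `latitude`, `longitude`), the function name and the default of one decimal place are all hypothetical, and the appropriate level of generalisation depends on the data and is not specified here.

```python
def summarise_record(record, decimals=1, drop=("recorder_name",)):
    """Return a copy of a record without personal fields and with coarser coordinates.

    The field names and default precision are hypothetical examples.
    """
    # Copy the record, leaving out any field that identifies a person.
    cleaned = {key: value for key, value in record.items() if key not in drop}
    # Reduce coordinate precision so the exact location is not disclosed.
    for key in ("latitude", "longitude"):
        if key in cleaned:
            cleaned[key] = round(float(cleaned[key]), decimals)
    return cleaned


original = {
    "recorder_name": "A. Example",
    "latitude": 54.04631,
    "longitude": -2.79912,
    "count": 3,
}
print(summarise_record(original))
```

The original record is left unchanged, so the full-precision data can still be held securely by the researcher where that is permitted.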
How do I cite data in my research publications?
When data has been assigned a DOI, an email will be sent to the requester confirming the DOI and the form of words for the formal citation for the dataset. It is the researcher’s responsibility to ensure that any data resource referred to in subsequent publications is cited formally, using the form of words suggested by the NERC EDS. Researchers should also encourage others to similarly cite the data.