Model Data Management
The NERC Environmental Data Centres can archive and publish high-volume datasets, including model outputs, that are reusable and of long-term value to the wider community. Model data submissions should be the output of exemplar simulation runs that address a particular research or experimental need for a particular set of physical circumstances. The outcomes generated by the simulation(s) should be of publishable quality.
The NERC Environmental Data Centres will not take output from simulations carried out as part of model development work, or from test runs carried out to identify good parameterisations for a future exemplar run. The Data Centres generally do not archive high-volume outputs that an average end user could easily reproduce. Some exceptions can be made, and we recommend contacting your data centre to ask about its size limits.
Format description
We encourage data to be supplied in Climate and Forecast (CF) compliant netCDF format. The Data Centres also encourage the use of standard internal compression of netCDF files. Other formats are acceptable provided they are among our accepted file formats. Please contact us if you need any assistance or information.
Metadata submission
All outputs must be accompanied by adequate metadata and a comprehensive abstract for the dataset. If you are using netCDF files, you should also provide details of the metadata you have placed in the CF header of the files. If you have any queries, please contact the Data Centres.
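As one way to summarise the header metadata before submission, the sketch below checks a dictionary of global attributes against the descriptive attributes recommended by the CF conventions (title, institution, source, history, references, comment) together with Conventions. The function name and the example values are hypothetical, and reading the attributes out of the file (for example with ncdump -h or a netCDF library) is left out. The Data Centres may ask for further details beyond this list.

```python
# Sketch only: the attribute names follow the CF conventions' recommended
# global attributes; the function name and example values are hypothetical.
CF_RECOMMENDED_GLOBAL_ATTRS = (
    "Conventions",
    "title",
    "institution",
    "source",
    "history",
    "references",
    "comment",
)


def missing_global_attrs(attrs):
    """Return the recommended attributes that are absent or blank in attrs."""
    missing = []
    for name in CF_RECOMMENDED_GLOBAL_ATTRS:
        value = attrs.get(name)
        if value is None or not str(value).strip():
            missing.append(name)
    return missing


# Example header contents, as they might be copied from a file header.
example_attrs = {
    "Conventions": "CF-1.8",
    "title": "Example exemplar simulation output",
    "institution": "Example University",
    "source": "Example model v1.0",
    "history": "",
}

print(missing_global_attrs(example_attrs))  # ['history', 'references', 'comment']
```

An empty list means every recommended attribute has a non-blank value. It does not show that the file is fully CF compliant.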
Any documentation relating to the simulations being submitted is also useful and should be supplied wherever possible. This includes journal references, where available.
Finally, a description of the simulations in non-technical language must be provided so that the simulation set-up conditions can be made available to end users of differing technical experience. The description must provide enough detail for users to rely on your text with confidence as a source of information on the model simulations.
Model code
The NERC Environmental Data Centres recommend storing code in GitHub for long-term preservation. The tagged GitHub release should be published with a DOI (generally using Zenodo) and referenced in any publications based on the code and/or corresponding model outputs. All the model metadata, including forcing and configuration parameters and the source of any large forcing data fields, should also be placed into GitHub. Where possible, code should be well commented and modular, contain no local file paths, execute without errors and ideally run without human intervention using a pipeline script. The FAIR Principles for Research Software (FAIR4RS) can be applied to help ensure best practice.
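The recommendations on local file paths and pipeline scripts can be illustrated with a minimal Python sketch. It assumes a hypothetical model whose configuration is held in a JSON file. The stage functions, option names and file names are all placeholders, not part of any Data Centre tooling. Every path is passed in as an argument, so nothing in the code depends on one user's machine.

```python
# Sketch only: the stage functions, option names and file names below are
# hypothetical placeholders for a real model workflow.
import argparse
import json
from pathlib import Path


def load_config(config_path):
    """Read configuration parameters (e.g. forcing settings) from JSON."""
    with open(config_path, encoding="utf-8") as handle:
        return json.load(handle)


def run_model(config):
    """Placeholder for the model run; returns a small result record."""
    return {"run_id": config["run_id"], "status": "complete"}


def write_output(result, output_dir):
    """Write the result record into output_dir, creating it if needed."""
    output_dir.mkdir(parents=True, exist_ok=True)
    out_path = output_dir / f"{result['run_id']}.json"
    out_path.write_text(json.dumps(result, indent=2), encoding="utf-8")
    return out_path


def main(argv=None):
    """Run every stage in order with no prompts or hard-coded paths."""
    parser = argparse.ArgumentParser(description="Run the pipeline end to end.")
    parser.add_argument("--config", type=Path, required=True)
    parser.add_argument("--output-dir", type=Path, default=Path("output"))
    args = parser.parse_args(argv)

    config = load_config(args.config)
    result = run_model(config)
    return write_output(result, args.output_dir)
```

Calling main(["--config", "config/run.json"]), or calling main() from a script entry point so that the options come from the command line, runs all stages in sequence. Keeping the configuration file in the repository alongside the code means the tagged release records the parameters used for the run.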
Useful references for further information
Barker, M., Chue Hong, N.P., Katz, D.S. et al. Introducing the FAIR Principles for research software. Sci Data 9, 622 (2022). https://doi.org/10.1038/s41597-022-01710-x
