Data integrity
What is meant by data integrity?
Data integrity refers to the accuracy, consistency and validity of a dataset over its entire lifecycle. Maintaining data integrity is a core aim of data management planning. Research data integrity can be ensured through good data management, data quality, and data security practices. Here, data security refers to the protection of data against unauthorised access or corruption, while data quality refers to the quality assurance checks applied to the data itself to make it reusable, reliable, and accurate.
How can I ensure data integrity throughout the research lifecycle?
Data integrity can be compromised in several ways. Each time a dataset is replicated or transferred, it may be altered, either through unintentional errors in the procedure or through malicious action. Error-checking methods and validation procedures should be put in place to ensure the integrity of all data that are transferred or reproduced without any intended alteration.
In other words, data integrity is the result of good data security practice and good data handling practice. Whether a compromise is malicious or accidental, data security plays an important role in maintaining data integrity. The sections below describe some common ways your data can be compromised, and some protocols that help mitigate these issues.
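As an illustration of the error checking described above, the following is a minimal sketch of a copy that is verified with a checksum, assuming the data are ordinary files on disk. The function names `sha256_of` and `copy_and_verify` are hypothetical and chosen for this example, and SHA-256 is only one of several suitable hash functions.

```python
import hashlib
import shutil


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in fixed-size chunks so large files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copy_and_verify(source, destination):
    """Copy a file, then confirm the copy has the same checksum as the source.

    Hypothetical helper: raises OSError if the two checksums differ.
    """
    expected = sha256_of(source)
    shutil.copy2(source, destination)  # copies content and file metadata
    actual = sha256_of(destination)
    if actual != expected:
        raise OSError(f"Checksum mismatch after copying {source} to {destination}")
    return actual
```

A mismatch means the copy differs from the source and should not be trusted. Note that a matching checksum only shows that the copy equals the source at the time of copying; it does not show that the source itself was correct.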
How might data integrity be compromised?
When data are moved, whether they are in digital form or as written material, there is a risk of loss or other errors. These could include:
- Human action, whether deliberate tampering or unintentional error.
- Transfer errors, including unintended alterations or data corruption during transfer from one device or medium to another.
- Theft, bugs, viruses/malware, hacking and other cyber threats.
- Compromised hardware, such as a device or disk crash.
- Physical compromise to devices.
What steps can be put in place to ensure data integrity?
- Keep a master file of the data on a separate device in another location, or in the cloud.
- Use checksums to validate the content of files.
- Assign responsibility for master files, where possible, to an individual member of the project team.
- Restrict write access to master versions to specific members of the project team.
- Create a formal procedure for the destruction of master files.
- Record all changes to master files.
- Maintain old master files in case later ones contain errors.
- Archive copies of master files at regular intervals.
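The checksum step in the list above could be sketched as follows: record a checksum for every master file in a manifest, then re-check the files against that manifest later. This is an illustrative sketch only. The function names, the JSON manifest format, and the choice of SHA-256 are assumptions made for this example, not a prescribed procedure.

```python
import hashlib
import json
from pathlib import Path


def file_checksum(path):
    """Return the SHA-256 hex digest of a file."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(master_dir, manifest_path):
    """Record a checksum for every file under master_dir (hypothetical helper).

    Store the manifest outside master_dir so that it is not listed in itself.
    """
    master_dir = Path(master_dir)
    manifest = {
        p.relative_to(master_dir).as_posix(): file_checksum(p)
        for p in sorted(master_dir.rglob("*"))
        if p.is_file()
    }
    Path(manifest_path).write_text(json.dumps(manifest, indent=2))
    return manifest


def find_changed_files(master_dir, manifest_path):
    """Return names of recorded files that are now missing or altered."""
    master_dir = Path(master_dir)
    recorded = json.loads(Path(manifest_path).read_text())
    changed = []
    for name, checksum in recorded.items():
        target = master_dir / name
        if not target.is_file() or file_checksum(target) != checksum:
            changed.append(name)
    return changed
```

Running `find_changed_files` before each archive or transfer gives an early warning that a master file has been altered or lost. Files added after the manifest was written are not reported, so the manifest would need to be regenerated whenever an intended change is recorded.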