Data Transformation
What is meant by data transformation?
In most research, original or primary data will be 'worked up', or transformed, from its original state into a different form, structure or format ready for analysis, visualisation or access. Transformation is a fundamental part of most data management tasks, and can range from a single simple step to a complex sequence of many steps.
The choice of tools used to carry out data transformation can have a big impact on reproducibility, repeatability and efficiency, and on the risk of reduced data quality or even data loss.
Familiar 'point and click' tools for transforming data, e.g., Microsoft Excel, offer a simple-to-use interface with a multitude of functions. However, they have drawbacks, especially when working with large and complex datasets, which limit their suitability for high quality research.
Transformation software and tools such as FME, R, MATLAB, SAS and Python need some effort to adopt, but they support audit trails and the re-running of tasks. They therefore offer greater reproducibility, repeatability and efficiency, and reduce the risk to quality.
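As an illustration of what an audit trail and re-running can look like in a scripted tool, the following is a minimal sketch in Python using only the standard library. The dataset, the column names (site, temp_f, temp_c) and the function names are all hypothetical and chosen purely for the example; it is not a recommended or definitive workflow. Each transformation is a named step, the raw data is never edited in place, and the script records how many rows went into and came out of every step.

```python
import csv
import io

# Hypothetical raw data, standing in for an original data file
# that is read but never modified.
RAW_CSV = """site,temp_f
A,68.0
B,
C,50.0
"""


def drop_missing(rows):
    """Remove records that have no temperature reading."""
    return [r for r in rows if r["temp_f"].strip() != ""]


def fahrenheit_to_celsius(rows):
    """Add a temp_c column derived from temp_f, rounded to 1 decimal place."""
    return [
        {**r, "temp_c": round((float(r["temp_f"]) - 32) * 5 / 9, 1)}
        for r in rows
    ]


# The ordered list of steps is itself a record of what was done to the data.
STEPS = [drop_missing, fahrenheit_to_celsius]


def transform(raw_text):
    """Apply each step in order and log row counts before and after it."""
    rows = list(csv.DictReader(io.StringIO(raw_text)))
    audit = []
    for step in STEPS:
        before = len(rows)
        rows = step(rows)
        audit.append((step.__name__, before, len(rows)))
    return rows, audit


rows, audit = transform(RAW_CSV)
for name, before, after in audit:
    print(f"{name}: {before} rows in, {after} rows out")
```

Because the steps are written down rather than clicked through, running the script again on the same raw data gives the same result, and the log shows where records were removed.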