Version control

The importance of version control

Because digital research data can so easily be copied, over-written or changed, researchers need to take steps to protect its authenticity. Research time is wasted and valuable data put at risk if researchers work with outdated versions of files.

Version control can prevent this. It is particularly important when data is used by multiple members of a research team, or when research files are shared across different locations.

A regime to synchronise different copies or versions of files will improve research efficiency and help guarantee the authenticity of the data. Good practice generally involves keeping a single master file, in which all changes are recorded. Version control mechanisms should be established and documented before any data is collected or generated.

Sample version control table

Title: Research Data Management 2011 Survey
File Name: RDM_Results_2011.xlsx
Description: A survey of data management practices and procedures at the University.
Created By: JT
Maintained By: JT
Created: 24/06/2011
Last Modified: 18/10/2011
Based on: RDM_Results_2010.xlsx

Version  Responsible  Notes         Last amended
1.3      AP           cleaned data  18/10/2011
1.2      TH           cleaned data  07/10/2011
1.1      RA           added data    04/09/2011
1.0      JP           added data    13/08/2011
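
A version history like the one in the sample table can also be kept in a machine-readable form alongside the master file. The following is a minimal sketch, not a prescribed method: the `VersionEntry` class and the `append_entry` and `to_csv` helpers are hypothetical names introduced for illustration, and the CSV layout is an assumption that simply mirrors the columns of the sample table.

```python
import csv
import io
from dataclasses import astuple, dataclass, fields


@dataclass
class VersionEntry:
    """One row of the version history (hypothetical structure)."""
    version: str
    responsible: str
    notes: str
    last_amended: str  # DD/MM/YYYY, as in the sample table


def append_entry(log, entry):
    """Return a new log with the entry first (newest first).

    Raises ValueError if the version number is already recorded.
    """
    if any(existing.version == entry.version for existing in log):
        raise ValueError(f"version {entry.version} is already in the log")
    return [entry] + log


def to_csv(log):
    """Render the log as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([field.name for field in fields(VersionEntry)])
    for entry in log:
        writer.writerow(astuple(entry))
    return buffer.getvalue()


log = []
log = append_entry(log, VersionEntry("1.0", "JP", "added data", "13/08/2011"))
log = append_entry(log, VersionEntry("1.1", "RA", "added data", "04/09/2011"))
print(to_csv(log))
```

Rejecting a repeated version number is one simple way to make sure that every change to the master file is recorded as its own entry.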

File naming methods to assist versioning

File naming conventions can support version control, for example:

Using dates
- RDM Survey_10-11-2011
Using file version numbering
- RDM Survey_1-01
Using file version descriptions
- RDM Survey_draft_1
- RDM Survey_final_1
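
The three naming patterns above can be generated consistently with a few small helpers, which reduces the chance of team members typing names slightly differently. This is an illustrative sketch only: the function names are hypothetical, and the date is assumed to be day-first (DD-MM-YYYY), matching the dates in the sample version control table.

```python
from datetime import date


def dated_name(stem, day):
    """File name with a day-first date suffix (assumed DD-MM-YYYY)."""
    return f"{stem}_{day.strftime('%d-%m-%Y')}"


def numbered_name(stem, major, minor):
    """File name with a version number; the minor part is zero-padded."""
    return f"{stem}_{major}-{minor:02d}"


def described_name(stem, status, revision):
    """File name with a status description such as 'draft' or 'final'."""
    return f"{stem}_{status}_{revision}"


print(dated_name("RDM Survey", date(2011, 11, 10)))
print(numbered_name("RDM Survey", 1, 1))
print(described_name("RDM Survey", "draft", 1))
```

If chronological sorting by file name matters, a year-first date such as `date.isoformat()` produces may be a better choice, since names then sort in date order.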

The Australian Code for the Responsible Conduct of Research sets out your responsibilities for managing data. The Code expects researchers to:

2.6.1 Keep clear and accurate records of the research methods and data sources ...

2.6.2 Ensure that research data and primary materials are kept in safe and secure storage provided, even when not in current use.

2.6.3 Provide the same level of care and protection to primary research records, such as laboratory notebooks, as to the analysed research data.

Version control can be managed via:

  • the use of versioning software, such as Apache Subversion (SVN)
  • the use of file sharing services such as CloudStor, or the ARCS Data Transfer service
  • strictly controlling who has rights to add or edit data
  • manual merging by a designated person of any additions or edits
Tip

Best practice

The UK Data Archive has best practice advice on how to achieve data authenticity, e.g. by keeping a single master file of data, and by regularly archiving copies of master files.
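
Archiving a copy of the master file can be scripted so that each copy is checked against the original. The sketch below is one possible approach under stated assumptions, not the UK Data Archive's own procedure: `sha256_of` and `archive_master` are hypothetical helpers, the archive folder and label are chosen by the caller, and a SHA-256 checksum is used here as a simple way to confirm that the copy matches the master.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 checksum of a file as a hex string."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_master(master, archive_dir, label):
    """Copy the master file into archive_dir under a labelled name.

    Refuses to overwrite an existing archived copy, and verifies that
    the copy is byte-for-byte identical to the master.
    """
    master = Path(master)
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{master.stem}_{label}{master.suffix}"
    if target.exists():
        raise FileExistsError(f"{target} already exists; archived copies are never overwritten")
    shutil.copy2(master, target)
    if sha256_of(master) != sha256_of(target):
        raise OSError("archived copy does not match the master file")
    return target
```

For example, `archive_master("RDM_Results_2011.xlsx", "archive", "1-03")` would create `archive/RDM_Results_2011_1-03.xlsx`, assuming the master file sits in the current folder.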
