Research Data Management @ UNSW


Doing Research Data Management

Posted on June 8th, 2011 · Support & Resources


This image is licensed under an Attribution-NonCommercial 2.0 Generic Licence.

There are a number of benefits to managing and sharing research data, including facilitating collaboration between researchers and accelerating new developments by building on results.

Below are a couple of interesting examples of researchers and research groups managing and reusing data:

Repository of Antibiotic resistance Cassettes

Here at UNSW, the Repository of Antibiotic resistance Cassettes (RAC) is an archive of the elements responsible for spreading antibiotic resistance genes. The aim of the system is to make this collection of gene cassettes widely available to other researchers. The system allows researchers to browse the published gene cassettes held in the repository, annotate sequences and add new cassettes not yet in the repository.

Berkeley Earth Surface Temperature

The Berkeley Earth Surface Temperature project aims to provide a new assessment of global temperature change, and to resolve current criticism of it, by analysing the surface temperature record. This record includes data collected by 39,000 unique stations around the world. The project also aims to make the resulting dataset (initially comprising 1.6 billion temperature reports from 10 pre-existing data archives) freely available on its website.

Sample, I 2011, Can a group of scientists in California end the war on climate change?, Guardian, 27 February.

Finding Research Data

Posted on May 10th, 2011 · Support & Resources


This image is licensed under an Attribution-NonCommercial-ShareAlike 2.0 Generic Licence.

Research data that is organised and clearly documented is easier for others to find and reuse. Data organisation includes, for example, using a consistent method for naming files and directories. Data documentation describes the data and allows researchers to retrieve more accurate results and gain a greater understanding of the data.
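As a rough illustration of a consistent naming method, the sketch below builds file names from the same parts in the same order every time. The pattern itself (project, description, ISO date, two-digit version) and the function name build_filename are assumptions made for this example, not a UNSW or RDA convention.

```python
import re
from datetime import date


def build_filename(project: str, description: str, collected: date,
                   version: int, extension: str) -> str:
    """Return a name such as 'reef-survey_water-temp_2011-05-10_v01.csv'.

    The pattern is a hypothetical convention chosen for illustration.
    """
    def slug(text: str) -> str:
        # Lowercase the text and replace every run of characters that is
        # not a letter or digit with a single hyphen.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lower().lstrip(".")
    # ISO 8601 dates (YYYY-MM-DD) sort chronologically when sorted by name.
    return (f"{slug(project)}_{slug(description)}_"
            f"{collected.isoformat()}_v{version:02d}.{ext}")


print(build_filename("Reef Survey", "Water Temp", date(2011, 5, 10), 1, "CSV"))
```

Keeping the date in year-month-day order and padding the version number means that a plain alphabetical listing of a directory also shows the files in date and version order.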

Researchers may want to reuse data from another researcher or group to avoid spending time replicating existing results, to build on existing research or to foster new collaborations.

In addition to Research Data Australia (RDA), there are a number of discipline-specific research data repositories where researchers may find data. Some examples include:

Research Data Australia

Posted on April 17th, 2011 · Support & Resources

Zoubin Ghahramani - 'Internet search queries'

This image is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 2.0 Generic Licence.

Research Data Australia (RDA) is an online registry which holds descriptions of research data collections produced by or relevant to Australian researchers. One of its aims is to make research data collections more visible and easier to discover in web search engines.

While RDA also aims to encourage access and reuse, providing and controlling access to the data is left up to the researcher. The record may include a link to the data if it is publicly available on another website or it may include the contact information for a person associated with the data who may be able to provide additional information and, in some cases, grant access.

As part of the Australian National Data Service (ANDS)-funded Seeding the Commons program, one of our aims is to gain exposure for UNSW research data by providing records of these collections to RDA.

Questions & Suggestions

Posted on March 29th, 2011 · Questions & Suggestions


This image is licensed under a Creative Commons Attribution 2.0 Generic Licence.

Research data management is becoming increasingly important as the quantity of data gathered and created increases. Most researchers here at UNSW will be involved in data management at some level.

Any general feedback or comments would be appreciated. And if you have a question about managing research data, add it to the comments box below and we will do our best to answer it!

File formats and digital preservation

Posted on March 22nd, 2011 · Support & Resources

Analog Computer

This image is licensed under a Creative Commons Attribution 2.0 Generic Licence.

File formats play an important role in how accessible, readable and meaningful digital data will be in the future. Choosing a suitable format will help avoid loss, deterioration and obsolescence. There are a number of questions to consider when choosing a file format with good preservation potential, including:

  • Is it non-proprietary?
  • Is it an open, publicly available standard?
  • Is it commonly used within the relevant research community?

The Australian National Data Service (ANDS) provides detailed information on file formats on its website, including a checklist for choosing an appropriate format. ANDS advises that the decision on which file format to use should be made before data collection begins, to lessen the costs associated with migrating to another format further down the line.
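The three questions listed above could be recorded as a simple checklist for each candidate format. The sketch below is only an illustration: the class name FormatCheck, its field names and the example answers are assumptions made here, and it is not the ANDS checklist.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FormatCheck:
    """Answers to the three preservation questions for one file format."""
    name: str
    non_proprietary: bool
    open_standard: bool
    common_in_community: bool

    def unmet(self) -> List[str]:
        """Return the questions that were answered 'no'."""
        answers = [
            ("Is it non-proprietary?", self.non_proprietary),
            ("Is it an open, publicly available standard?", self.open_standard),
            ("Is it commonly used within the relevant research community?",
             self.common_in_community),
        ]
        return [question for question, ok in answers if not ok]


# Hypothetical format with made-up answers, for illustration only.
candidate = FormatCheck("format-a", non_proprietary=True,
                        open_standard=False, common_in_community=True)
for question in candidate.unmet():
    print(f"{candidate.name}: needs review ({question})")
```

A format with no unmet questions is not automatically the right choice, but a format with several is worth a closer look before data collection begins.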

The National Library of Australia discusses the need for digital preservation and gives guidance on best practices on their website.

The Library of Congress explores the issues around digital preservation in this video:

And Digital Preservation Europe gives us Team Digital Preservation! In the short video below, the team highlight the need for good data documentation (or representation information) as they take on their nemesis at the Opera House.

Appraisal and selection of data for preservation

Posted on March 13th, 2011 · Support & Resources


This image is licensed under a Creative Commons Attribution-NoDerivs 2.0 Generic Licence.

There are a number of reasons for using an appraisal and selection method for preserving digital data:

  • As the amount of digital content expands, it may not be cost-effective in terms of storage space and backup systems to keep it all forever.
  • Taking time to create and manage rich data descriptions is necessary to ensure the data are findable and understandable over time.
  • The more data there are, the harder they are to find. Discovery will require more effort on the part of the searcher to narrow down the vast number of results.

The Digital Curation Centre (DCC) recommends considering a number of criteria when thinking about appraisal and selection, including the relevance, historical value and uniqueness of the data. The potential for redistribution should also be considered as should the feasibility of replicating the data. The costs associated with managing and preserving the data should be analysed and the data description must be comprehensive in order to aid in discovery and reuse.

For more detailed information, see the DCC’s Appraise & Select Research Data for Curation guide. The National Library of Australia also has guidance on Selection on their website.

Data Management Plans now required for NSF proposals

Posted on February 28th, 2011 · News & Announcements

I love clutter

This image is licensed under a Creative Commons Attribution-ShareAlike 2.0 Generic Licence.

Since 18 January 2011, the National Science Foundation (NSF) in the United States has required a Data Management Plan of no more than two pages to be attached to all proposals. This plan should describe how the research results will be shared and disseminated, in line with NSF policy.

The Data Management Plan for NSF proposals may include details such as:

  • types of data to be produced
  • data and metadata standards to be used
  • access and sharing policies
  • re-use and re-distribution policies
  • plans for data archival

See the NSF’s Grant Proposal Guide (GPG) Chapter II.C.2.j for more details.

ERIM Project analyses data management plans across five organisations

Posted on February 23rd, 2011 · News & Announcements


This image is licensed under a Creative Commons Attribution 2.0 Generic Licence.

As part of the ERIM Project, the IdMRC at the University of Bath has analysed the guidance provided by five organisations, including the Australian National University and the Digital Curation Centre, on writing data management plans.

This analysis, Thematic Analysis of Data Management Plan Tools and Exemplars, aims to extract the most significant data management planning requirements, those which ensure that data can be understood, re-used and re-purposed.

NHMRC signs joint statement of purpose regarding research data

Posted on January 26th, 2011 · News & Announcements

Murray, re-nibbed

This image is licensed under a Creative Commons Attribution 2.0 Generic Licence.

On 10 January 2011, the Wellcome Trust announced a joint statement of purpose regarding the sharing of research data to accelerate advances in public health. The National Health and Medical Research Council (NHMRC) is one signatory in the group of major international public health research funding agencies:

Sharing research data to improve public health: joint statement of purpose

Also see the comment piece in The Lancet by Mark Walport, the Wellcome Trust director, and Paul Brest, the Hewlett Foundation president:

Sharing research data to improve public health