Category: Data sharing
Figshare announced this week that they have partnered with PLOS to host the supplemental data for all seven PLOS journals. To make the data easier to access, PLOS will provide a widget that lets users view the data alongside the article content.
"PLOS believes in making data as visible and useful as possible," said Kristen Ratan, Chief Publishing and Product Officer at PLOS. "Partnering with figshare is an important step in increasing the accessibility of the data associated with our research articles."
Figshare have invited members of their community to become advisors. In return for presenting figshare to your colleagues at a lab meeting or journal club, Figshare are offering:
- figshare goodies such as hoodies, t-shirts, mugs, stickers, pens, etc.
- Early access to new features before they're public.
- A figshare Advisor badge on your figshare profile.
- Travel expenses paid by figshare when you give presentations outside of your area.
- Looks great on your C.V.
The National Academies Press has published a compendium of academic thinking about sharing academic data, "The Future of Scientific Knowledge Discovery in Open Networked Environments: Summary of a Workshop." It is available to download free as a PDF.
The report is largely an exercise in envisioning the possible opportunities for scientific discovery presented by extensive data sharing, based on large-scale, present-day examples. Read the full article here.
Europeana is an online library, endorsed by the European Commission, of millions of books, paintings, films, museum objects and archival records that have been digitised throughout Europe. Europeana has recently opened its dataset of over 20 million cultural objects for free re-use under the Creative Commons CC0 Public Domain Dedication, meaning that anyone can use the data for any purpose - creative, educational, commercial - with no restrictions.
Neelie Kroes, Vice-President of the European Commission with responsibility for the Digital Agenda for Europe, said: "Open data is such a powerful idea, and Europeana is such a cultural asset, that only good things can result from the marriage of the two. People often speak about closing the digital divide and opening up culture to new audiences but very few can claim such a big contribution to those efforts as Europeana's shift to creative commons." (Europeana's huge cultural dataset opens for re-use, Press Release - The Hague, 12 September 2012)
The Denton Declaration: An Open Data Manifesto is the latest contribution to the growing debate on open data.
The declaration includes:
- Open access to research data is critical for advancing science, scholarship, and society.
- Research data, when repurposed, has an accretive value.
- Publicly funded research should be publicly available for public good.
- Transparency in research is essential to sustain the public trust.
- The validation of research data by the peer community is an essential function of the responsible conduct of research.
- Managing research data is the responsibility of a broad community of stakeholders including researchers, funders, institutions, libraries, archivists, and the public.
To read further, go to Open Access@UNT.
The National Database for Autism Research (NDAR) provides the infrastructure to store, search across, and analyse various types of data. In addition, NDAR provides longitudinal storage of a research participant's information generated by one or more research studies.
In other words, NDAR is able to associate a single research participant's genetic, imaging, clinical assessment and other information even if the data were collected at different locations or through different studies.
By doing so, NDAR gives researchers access to more data than they can collect on their own and provides robust tools to analyse the information, making it easier and faster for researchers to gather, evaluate, and share autism research information from a variety of sources.
Generally, NDAR provides the following capabilities:
• Standards to enable cross-site meta-analysis and data comparisons across bioinformatics systems.
• Deployment of useful bioinformatics tools for researcher use.
• Promotion of the sharing of quality research data with the autism research community.
• Query access to a repository of phenotypic, genomic, imaging and pedigree research data.
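The idea of associating one participant's data across studies can be illustrated with a minimal sketch. This is not NDAR's actual schema or tooling: the participant identifiers, field names and sample records below are all hypothetical, and the only assumption is that each record carries a shared participant identifier.

```python
from collections import defaultdict

# Hypothetical records from three separate studies. All field names and
# values are invented for illustration and are not NDAR's real schema.
RECORDS = [
    {"participant_id": "P001", "study": "site_a_genetics", "type": "genomic"},
    {"participant_id": "P002", "study": "site_a_genetics", "type": "genomic"},
    {"participant_id": "P001", "study": "site_b_imaging", "type": "imaging"},
    {"participant_id": "P001", "study": "site_c_clinic", "type": "clinical"},
]


def link_by_participant(records):
    """Group records from any study under the participant they describe."""
    linked = defaultdict(list)
    for record in records:
        linked[record["participant_id"]].append(record)
    return dict(linked)


def data_types_for(linked, participant_id):
    """Return the sorted data types held for one participant."""
    return sorted({r["type"] for r in linked.get(participant_id, [])})


if __name__ == "__main__":
    linked = link_by_participant(RECORDS)
    for pid in sorted(linked):
        print(pid, data_types_for(linked, pid))
```

Once records are grouped this way, a researcher can see every type of data held for a participant regardless of which site or study collected it.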
In The Conversation on 26 September 2012, Alex O. Holcombe and Matthew Todd published an open letter to the Australian Research Council on why scientific data should be shared. They state in the letter: "It may be only through open science, with massively collaborative efforts, that urgent problems of the world can be solved." To read the article and letter, visit The Conversation: "Scientific data should be shared: an open letter to the ARC".
The European Commission's recent announcement on access to scientific data specifically mentions not only scientists and research institutions, but also members of the public as potential users of scientific data.
Jonathan Gray from the Open Knowledge Foundation writes: "While the benefits of open scientific data for scientists and research institutions are reasonably well documented - the Human Genome Project is probably the best known exemplar - one wonders what innovations we might see from non-experts and non-scientists, and what more open policies might mean for the public understanding of science."
Read the full article here.
Figshare goes from strength to strength in its offerings to researchers who want to put their data 'out there'.
You can now specify the 'type' of research output you upload. The range includes papers, posters, figures, datasets, media and filesets. Users who have uploaded data in the past can go back and specify a type for those older files.
Following the shutdown of Nature Precedings, figshare is Nature Publishing Group's preferred alternative for future pre-print publications. Documents can be uploaded and made public quickly. The service is free.
You can also now list current and former colleagues in your profile. You can read more about this feature here.
Christine Borgman, of the UCLA Department of Information Studies, has written an interesting in-depth discussion paper on "the conundrum of sharing research data":
Researchers are producing an unprecedented deluge of data by using new methods and instrumentation. Others may wish to mine these data for new discoveries and innovations. However, research data are not readily available as sharing is common in only a few fields such as astronomy and genomics. Data sharing practices in other fields vary widely. Moreover, research data take many forms, are handled in many ways, using many approaches, and often are difficult to interpret once removed from their initial context. Data sharing is thus a conundrum. Four rationales for sharing data are examined, drawing examples from the sciences, social sciences, and humanities: (1) to reproduce or to verify research, (2) to make results of publicly funded research available to the public, (3) to enable others to ask new questions of extant data, and (4) to advance the state of research and innovation. These rationales differ by the arguments for sharing, by beneficiaries, and by the motivations and incentives of the many stakeholders involved. The challenges are to understand which data might be shared, by whom, with whom, under what conditions, why, and to what effects. Answers will inform data policy and practice.
Borgman, C. L. (2012), The conundrum of sharing research data. J. Am. Soc. Inf. Sci., 63: 1059-1078. doi: 10.1002/asi.22634
The Terrestrial Ecosystem Research Network (TERN) will soon be launching a data discovery portal for ecosystem scientists to share their knowledge and data. TERN is a collaborative venture of Australian science facilities which aims to integrate and share their information and knowledge. Today's world is facing complex environmental problems and TERN's mission is "to link the science and scientists both within and across disciplines" through a data portal which will collect, store and distribute important data. TERN is funded by the Australian Government through the National Collaborative Research Infrastructure Strategy and the Super Science Initiative. Its work is best explained by the video TERNing Australia's environment around.
With the current global emphasis on sharing research data with the public, you might wonder - what can the public actually do with data? How can they access it, understand it, or apply it? Why might it be of interest to them, or you?
The term 'data' refers to an item of information, or to items of information considered collectively for reference or analysis (OED). 'Data' applies across disciplines and could refer to statistics on the publication of comic books, DNA sequencing, or marine life in the Arctic - it can refer to countless sets of information.
The purpose of this post then is to enlighten readers about interesting and engaging ways that data are currently being presented and utilised on the web to inform the public about current issues, and other information available to them. For starters, did you know that everyone contributes to the growing wealth of digital data, whether you work in research or not? Take a look at this infographic from Mashable - every owner of a mobile phone, email address or iTunes account produces data in the digital age. You can also check out the impact of real-time tweets across the world via A World of Tweets.
The UK newspaper The Guardian frequently experiments with public data and uses it to support current news stories. It has a dedicated 'Datablog' with the sole purpose of transforming data into useful and easily understandable formats about key issues important to the UK and globally. Some examples include:
- Water leakages: which company is the worst?
- The world's top 100 airports: listed, ranked and mapped
- Freedom of Information request 2011: how many were there and which ones were turned down?
- What does 15 years of baby name data tell us about modern Britain?
The Guardian cites all sources of data and makes data freely available to every reader.
Other sites and services, and particularly research centers, make data available for download to use in your own way, or create visual representations for the reader or researcher. Visualisation and infographics are the terms generally used to describe this process and there are many tools available online that allow you to work with data in this way. A few examples of such tools include: Piktochart, Gephi, Tableau Public and Tagxedo.
So where can you get public data?
Apart from researchers being increasingly required to share data, many governments are also opening up data for the public (see http://data.gov.uk, http://www.data.gov/ and http://data.gov.au). You can also try datacatalogs.org - a list of open data throughout the world; The Data Hub - where you can find, share and collaborate on data; Google Public Data; or Freebase. There are many other places to acquire data if you do a simple online search or investigate your university's academic research centers and faculty websites.
Who is talking about data in the public realm?
Beyond the academic sphere, there are large communities online discussing data and its use in the public, as well as foundations geared towards data investigation. For example, the Knight Foundation in the US has organised the Civic Data Challenge whereby citizens of the US (aged thirteen and above) are invited to access, analyse, interpret and visualise data from Civic Health CPS datasets. In addition, there are many interested individuals proactively investigating, sharing and blogging about data. Here are a few sites worth checking and some blogs worth following: Visual.ly; Well-formed Data; Daily Infographic; Visualising Data; Visualising.org; The Guardian's Datablog; and a personal favourite - Information is Beautiful.
In this video, David McCandless of Information is Beautiful illustrates the importance (and perhaps playfulness) of contextualising research data and information through creative visual means.
The National Center for Biotechnology Information (NCBI) is supportive of open data and sharing data to further collaboration and research in the biosciences.
A challenge that NCBI faces today is to transform the wealth of data emerging from laboratories worldwide into knowledge which will "lead to a better understanding of biological processes underlying both health and disease."
NCBI disseminates its resources to research and medical communities with a view to integrating data and shaping more meaningful views of this information. This challenge has been met through the development of a large number of databases and shared data available from the NCBI site.
Two datasets of note include GenBank and dbGaP:
The GenBank database is maintained by the National Institutes of Health and made available through NCBI. The database stores all known public DNA sequences. Data are submitted to GenBank from individual scientists and science centres involved with the Human Genome Project, and are also annotated and labelled by NCBI investigators.
dbGaP is the Database of Genotypes and Phenotypes. It was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. dbGaP has two levels of access - open and controlled. The open-access data can be browsed online or downloaded.
NCBI also provides a variety of tools to use and explore the data, as well as a range of educational materials, how-to guides and training resources.
Despite the inevitable funding cuts that austerity budgets bring, it's really not a bad time to be a UK researcher - lots of organisations want to help you manage and share your data.
First up, there's the new open access world opening up, courtesy of a UK government mandate.
If you are a manager of a repository or the curator of a digital collection, there are tools for archiving and preserving information packages, managing and administering repositories and depositing and ingesting digital objects.