Category: Open Data
Repository of the Week - the Data Hub
The Data Hub describes itself as "the easy way to get, use and share data".
The Data Hub is a community-driven catalogue of datasets on the Internet. It runs on CKAN, an open-source data cataloguing package, which gives each dataset record fields for descriptions, formats, ownership, access and subject areas, among others.
Most of the data indexed is open data, meaning it is openly licensed and free to use.
On the site, you can:
- Find data - the Hub contains 3840 datasets that can be viewed or downloaded
- Share data - sign up to add your own datasets
Datasets can also be browsed by group, such as Linking Open Data (81 datasets) and Bibliographic Data (77 datasets).
In some cases, the Data Hub can also provide data storage and basic visualisation tools.
Visit the Data Hub at http://thedatahub.org/
Repository of the Week - The Atlas of Living Australia
The Atlas of Living Australia (Atlas) contains information on all the known species in Australia, aggregated from a wide range of data providers: museums, herbaria, community groups, government departments, individuals and universities. The Atlas is the Australian node of the Global Biodiversity Information Facility (GBIF), which since 2001 has been encouraging free and open access to biodiversity data through global online networks.
The Atlas of Living Australia can be used to:
- access information pages for each species containing photos, descriptions, maps and observations
- access scientific and common names
- explore the flora and fauna reported around your neighbourhood
- learn about Australia's biodiversity collections at museums, herbaria and other institutions
- learn about citizen science projects
- map, analyse and visualise biodiversity and environmental data and trends
- access tools to help track changes in biodiversity and the environment
- download and use open source tools
- download biodiversity data
- access images, literature and genetic information through Australian nodes of international data repositories
- volunteer for digitisation projects
- upload datasets
There are 370 datasets available in the Atlas, and the licensing tags make it clear which data can be used and how. The site also provides extensive explanatory information and help pages, including overviews on how data are integrated and described.
Search for records in the Atlas or browse the site today at http://www.ala.org.au/.
The Atlas of Living Australia is an Australian Government Initiative and is licensed under the Creative Commons Attribution 3.0 Australia License.
Data - it's out there
With the current global emphasis on sharing research data with the public, you might wonder - what can the public actually do with data? How can they access it, understand it, or apply it? Why might it be of interest to them, or you?
The term 'data' refers to an item of information, or to items of information considered collectively for reference or analysis (OED). 'Data' applies across disciplines and could refer to statistics on the publication of comic books, DNA sequencing or marine life in the Arctic; it can refer to countless sets of information.
The purpose of this post, then, is to show readers some interesting and engaging ways that data are currently being presented and used on the web to inform the public about current issues, and to point to other information available to them. For starters, did you know that everyone contributes to the growing wealth of digital data, whether they work in research or not? Take a look at this infographic from Mashable - every owner of a mobile phone, email address or iTunes account produces data in the digital age. You can also check out the impact of real-time tweets across the world via A World of Tweets.
The UK newspaper The Guardian frequently experiments with public data and uses it to support current news stories. It has a dedicated 'Datablog' with the sole purpose of transforming data about key issues in the UK and globally into useful and easily understandable formats. Some examples include:
- Water leakages: which company is the worst?
- The world's top 100 airports: listed, ranked and mapped
- Freedom of Information request 2011: how many were there and which ones were turned down?
- What does 15 years of baby name data tell us about modern Britain?
The Guardian cites all sources of data and makes the data freely available to every reader.
Other sites and services, particularly research centres, make data available for download to use in your own way, or create visual representations for the reader or researcher. Visualisation and infographics are the terms generally used to describe this process, and there are many tools available online that allow you to work with data in this way. A few examples of such tools include Piktochart, Gephi, Tableau Public and Tagxedo.
So where can you get public data?
Apart from researchers being increasingly required to share data, many governments are also opening up data for the public (see http://data.gov.uk, http://www.data.gov/ and http://data.gov.au). You can also try datacatalogs.org - a list of open data catalogues from around the world; The Data Hub - where you can find, share and collaborate on data; Google Public Data; or Freebase. There are many other places to acquire data if you do a simple online search or investigate your university's academic research centres and faculty websites.
Who is talking about data in the public realm?
Beyond the academic sphere, there are large communities online discussing data and its use in the public realm, as well as foundations geared towards data investigation. For example, the Knight Foundation in the US has organised the Civic Data Challenge, in which citizens of the US (aged thirteen and above) are invited to access, analyse, interpret and visualise data from Civic Health CPS datasets. In addition, many interested individuals are proactively investigating, sharing and blogging about data. Here are a few sites worth checking and some blogs worth following: Visual.ly; Well-formed Data; Daily Infographic; Visualising Data; Visualising.org; The Guardian's Datablog; and a personal favourite - Information is Beautiful.
In this video, David McCandless of Information is Beautiful illustrates the importance (and perhaps playfulness) of contextualising research data and information through creative visual means.
NCBI - Meeting the challenge
The National Center for Biotechnology Information (NCBI) is supportive of open data and sharing data to further collaboration and research in the biosciences.
A challenge NCBI faces today is to transform the wealth of data emerging from laboratories worldwide into knowledge which will "lead to a better understanding of biological processes underlying both health and disease."
NCBI disseminates its resources to research and medical communities with a view to integrating data and shaping more meaningful views of this information. It has met this challenge by developing a large number of databases and shared datasets, all available from the NCBI site.
Two databases of note are GenBank and dbGaP:
The GenBank database is maintained by the National Institutes of Health and made available through NCBI. The database stores all known public DNA sequences. Data are submitted to GenBank by individual scientists and by science centres involved with the Human Genome Project, and are also annotated and labelled by NCBI investigators.
dbGaP is the database of Genotypes and Phenotypes. It was developed to archive and distribute the results of studies that have investigated the interaction of genotype and phenotype. dbGaP has two levels of access - open and controlled. The open-access data can be browsed online or downloaded.
NCBI also provides a variety of tools to use and explore the data, as well as a range of educational materials, how-to guides and training resources.