When New York City’s Taxi and Limousine Commission made 20 GB of trip and fare logs publicly available, many welcomed the vast trove of open data. Unfortunately, the personally identifiable information had not been properly anonymized before the data was widely shared. Vijay Pandurangan describes the structure of the data, what went wrong with its release, how easy it is to de-anonymize certain […]
Category: open data
The Outing of the Medical Profession: Data marathons to open clinical research gates to frontline service providers.
Could greater data transparency across the medical field solve the problem of unreliable evidence? Dr. Leo Anthony Celi charts the efforts to improve the publicly available MIMIC database, a creation of the public-private partnership between MIT, Beth Israel Deaconess Medical Center and Philips Healthcare, through a series of data marathons. Data scientists, nurses, clinicians and doctors are coming together to collaborate and answer clinically relevant […]
Fostering open science
Training for EC project officers on open access and open data in Horizon 2020
We ran four half-day workshops at the end of June as part of the FOSTER project. FOSTER aims to facilitate open science by training researchers about open access a…
The Philosophy of Data Science (series) – Rob Kitchin: “Big data should complement small data, not replace them.”
Over the coming weeks we will be featuring a series of interviews conducted by Mark Carrigan on the nature of ‘big data’ and the opportunities and challenges presented for scholarship with its growing influence. In this first interview, Rob Kitchin elaborates on the specific characteristics of big data, the hype and hubris surrounding its advent, and the distinction between data-driven science and empiricism. What […]
All that Big Data Is Not Going to Manage Itself: Part One
On February 26, 2003 the National Institutes of Health released the “Final NIH Statement on Sharing Research Data.” As you’ll be reminded when you visit that link, 2003 was eons ago in “internet time.” Yet the vision NIH had for the expanded sharing of research data couldn’t have been more prescient. As the Open Government […]
Thomas Piketty’s Capital changed the global discussion about inequality because of its great data – now make it open.
The rich data informing Thomas Piketty’s landmark research in Capital in the Twenty-First Century has been instrumental to its success. Ulrich Atz argues it is highly commendable that Piketty has made attempts to share the data files. But none of this data is explicitly open for reuse, and it is not available in machine-readable formats. Without an open licence it is not clear whether […]
Feedback Wanted: Publishers & Data Access
This post is co-authored with Jennifer Lin, PLOS. Short Version: We need your help! We have generated a set of recommendations for publishers to help increase access to data in partnership with libraries, funders, information technologists, and other stakeholders. Please read and comment on the report (Google Doc), and help us to identify concrete action items for each of the recommendations […]
It’s the Neoliberalism, Stupid: Why instrumentalist arguments for Open Access, Open Data, and Open Science are not enough.
The Open Movement has made impressive strides in the past year, but do these strides stand for reform or are they just symptomatic of the further expansion and entrenchment of neoliberalism? Eric Kansa argues that it is time for the movement to broaden …
Four critiques of open data initiatives
Open data initiatives may hold much promise and value, but more attention is needed on how these projects are developing as complex socio-technical systems. Rob Kitchin elaborates on four specific areas that have yet to be fully interrogated. These critiques affirm …
Researchers, publishers, libraries and data centres all have a role in promoting and encouraging data citation.
The key to verifying and validating research is being able to identify and access datasets. But cultural and behavioural barriers to sharing data are still widespread. Rachael Kotarski, the Content Expert for scientific datasets at the British Library, explains why citing data, …