Category: Data science

Digital Object Identifiers: Stability for citations and referencing, but not proxies for quality

What exactly is a Digital Object Identifier (DOI) and how does it help in the management and long-term preservation of research? Laurence Horton explains the basic structure and purpose of a DOI and also points to some limitations. DOIs are not the only way of providing fixed, persisting references to objects, but they have emerged as the leading system. A DOI is a Digital […]

A clear distinction is needed between replication tests and the evaluation of robustness in social science literature

Confusion over the meaning of replication is harming social science, argues Michael Clemens. There has been a profound evolution in methods and concepts, particularly with the rise of empirical social science, but our terminology has not yet caught up. The meaning of replication must be standardized so that researchers can easily distinguish between replication efforts and the evaluation of robustness. In Milan Kundera’s The Unbearable […]

Studies in social data: how industry uses social media for communications and research.

A series of meetups has been arranged for those interested in the use and applications of social data. Farida Vis provides a brief overview of the latest event on business uses of social data. Speakers reflected on principles for handling data, the need to collaborate externally, and how to look more closely at the full lifecycle of social data. Sometimes social data […]

Why the ban on P-Values? Understanding sampling error is key to improving the quality of research.

The weight placed on p-values and significance testing has come under increasing criticism, with one social psychology journal banning their use entirely. Nicole Radziwill argues that many of the issues come down to sampling errors. Inferential statistics is good because it lets us make decisions about a whole population based on one sample. But inferential statistics is bad if your sample size is too […]

The researcher’s guide to literature: Visualising crowd-sourced overviews of knowledge domains.

Given the enormous amount of new knowledge produced every day, keeping up-to-date on all the literature is increasingly difficult. Peter Kraker argues that visualizations could serve as universal guides to knowledge domains. He and colleagues have come up with an interactive way of automating the visualisations of entire knowledge domains and relevant articles within fields. Through similarity measures identified in a Mendeley-powered data set, an interested […]

Philosophy of Data Science – Emma Uprichard: Most big data is social data – the analytics need serious interrogation

In the final interview in our Philosophy of Data Science series, Emma Uprichard, in conversation with Mark Carrigan, emphasises that big data has serious repercussions for the kinds of social futures we are shaping and that those who support big data developments need to be held accountable. This means we should also take stock of the methodological harm present in many big data […]

Introduction to Open Science: Why data versioning and data care practices are key for science and social science.

A significant shift in how researchers approach their data is needed if transparent and reproducible research practices are to be broadly advanced. Carly Strasser has put together a useful guide to embracing open science, pitched largely at graduate students. But the tips shared will be of interest far beyond the completion of a PhD. If time is spent up front thinking about file organization, sample […]

Philosophy of Data Science series – Sabina Leonelli: “What constitutes trustworthy data changes across time and space”

The next installment of the Philosophy of Data Science series is with Sabina Leonelli, Principal Investigator of the ERC project, The Epistemology of Data-Intensive Science. Last year she completed a monograph titled “Life in the Digital Age: A Philosophical Study of Data-Centric Biology”, currently under review with University of Chicago Press. Here she discusses with Mark Carrigan the history of data-centric science and […]