Category: Best practices

CC BY and data: Not always a good fit

This post was originally published on the University of California Office of Scholarly Communication blog. In my last post, I wrote about data ownership, and how focusing on “ownership” might drive you nuts without actually answering important questions about what can be done with data. In that context, I mentioned a couple of times that you (or […]

Data Science meets Academia

(guest post by Johannes Otterbach) First Big Data and Data Science, then Data Driven and Data Informed. Even before I changed job titles—from Physicist to Data Scientist—I spent a good bit of time pondering what makes everyone so excited about these things, and whether they have a place in the academy. Data Science is an […]

Data (Curation) Viz.

Data management and data curation are related concepts, but they do not refer to precisely the same things. I use these terms so often now that the distinctions between them, fuzzy as they are, sometimes disappear altogether. When this happens I return to visual abstractions to clarify, in my own mind, what I mean by one vs. the […]

Science Boot Camp West

Last week Stanford Libraries hosted the third annual Science Boot Camp West (SBCW 2015), “… building on the great Science Boot Camp events held at the University of Colorado, Boulder in 2013 and at the University of Washington, Seattle in 2014. Started in Massachusetts and spreading throughout the USA, science boot camps for librarians are […]

The 10 Things Every New Grad Student Should Do

It’s now mid-October, and I’m guessing that first-year graduate students are knee-deep in courses, barely considering their potential thesis project. But for those who can multi-task, I have compiled this list of 10 things that you should undertake in your first year as a grad student. These aren’t just any 10 things… they are 10 […]

Git/GitHub: A Primer for Researchers

I might be what a guy named Everett Rogers would call an “early adopter”. Rogers wrote a book back in 1962 called Diffusion of Innovations, wherein he explains how and why technology spreads through cultures. The “adoption curve” from his book has been widely used to visualize the point at which a piece of technology or […]

Abandon all hope, ye who enter dates in Excel

Big thanks to Kara Woo of Washington State University for this guest blog post! Update: The XLConnect package has been updated to fix the problem described below; however, other R packages for interfacing with Excel may import dates incorrectly. One should still use caution when storing data in Excel. Like anyone who works with a lot of […]

Software Carpentry and Data Management

About a year ago, I started hearing about Software Carpentry. I wasn’t sure exactly what it was, but I envisioned tech-types showing up at your house with routers, hard drives, and wireless mice to repair whatever software was damaged by careless fumblings. Of course, this is completely wrong. I now know that it is actually […]

Software for Reproducibility Part 2: The Tools

Last week I wrote about the workshop I attended (Workshop on Software Infrastructure for Reproducibility in Science), held in Brooklyn at the new Center for Urban Science and Progress, NYU. This workshop was made possible by the Alfred P. Sloan Foundation and brought together heavy-hitters from the reproducibility world who work on software for workflows. I provided some broad-strokes overviews last […]

Software for Reproducibility

Last week I thought a lot about one of the foundational tenets of science: reproducibility. I attended the Workshop on Software Infrastructure for Reproducibility in Science, held in Brooklyn at the new Center for Urban Science and Progress, NYU. This workshop was made possible by the Alfred P. Sloan Foundation and brought together heavy-hitters from the reproducibility world who […]