Data Management Threshold Concepts

We’ve been going through the new ACRL “Framework for Information Literacy for Higher Education” recently at work. This document discusses ways to teach students how to search and understand information resources, framing critical skills as “threshold concepts”. While the Framework itself is interesting, I’m really intrigued by the idea of a threshold concept and wonder if there are any threshold concepts for data management.

For those unfamiliar with the term, a “threshold concept” is an idea that, once understood, completely reframes the way you view a topic. It’s like seeing a hidden image in that it’s very difficult to un-see the image afterward. Threshold concepts are so fundamental that you have to understand them in order to progress in the field.

Let’s look at the ACRL Framework to better understand how such concepts work. The six concepts are:

  • Authority is Constructed and Contextual
  • Information Creation as a Process
  • Information Has Value
  • Research as Inquiry
  • Scholarship as Conversation
  • Searching as Strategic Exploration

If you understand these concepts, you’ll easily see, for example, why a scholarly article may be an appropriate source for one research project while a blog post would be better for a different one, depending on the topic. You’ll also see why searching doesn’t always turn up the content you are looking for on the first try.

This blog post is not about the new Framework, but rather how the Framework challenged me to think about what the threshold concepts are for data management. Taking a stab at it (directly cribbing from the Framework), I have three ideas:

  • Data is Contextual
  • Data Management is a Process
  • Data Has Value

Let’s look at these individually to get into what I mean in each case.

First, data is contextual. That means that data never exists independently of the information about how it was acquired and processed. Just as a chemist records notes about her data in a lab notebook, any dataset should come with enough documentation to be understood by someone who is not the dataset creator. Without this extra information, the data is practically useless.
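To make that a little more concrete, here is a minimal sketch in Python of keeping a dataset's context in a small "sidecar" file next to the data. Everything in it is an assumption made for illustration: the field names in `REQUIRED_FIELDS`, the `.meta.json` naming, and the helper functions are hypothetical, not an established metadata standard.

```python
import json

# Hypothetical minimum context for a dataset; illustrative only, not a standard.
REQUIRED_FIELDS = ("creator", "collected_on", "method", "processing", "units")


def missing_context(metadata):
    """Return the required fields that are absent or blank in `metadata`."""
    gaps = []
    for field in REQUIRED_FIELDS:
        value = metadata.get(field)
        if value is None or not str(value).strip():
            gaps.append(field)
    return gaps


def write_sidecar(data_path, metadata):
    """Write `metadata` to '<data_path>.meta.json', refusing if context is missing."""
    gaps = missing_context(metadata)
    if gaps:
        raise ValueError("dataset lacks context: " + ", ".join(gaps))
    sidecar_path = data_path + ".meta.json"
    with open(sidecar_path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle, indent=2)
    return sidecar_path
```

Calling `missing_context({"creator": "A. Chemist"})` would list the four other fields as gaps, which is roughly the question a stranger opening the file would ask: how was this collected, how was it processed, and in what units?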

Second, data management is a process. It’s not something that you do once and are done with forever. It’s a process by which you take better care of your data continually over time. That doesn’t mean that it’s incredibly difficult. Rather, it’s like doing regular preventative maintenance to avoid disaster.

Third, data has value. This is something that many researchers are currently grappling with due to new data sharing requirements. If you can understand that your data has value, you can see how published data adds richness to an article, why data should be preserved after the end of a project, and why other researchers might want to use your data (hint: it’s valuable!).

These three ideas are by no means the final say on threshold concepts in data management, only my initial ideas. I’m still mulling them over (for example, I’m wondering if “data is contextual” and “data has value” are truly independent concepts) and trying to figure out if there are more concepts in this field.

I would love to hear other people’s ideas about threshold concepts in data management. Has anyone had an “aha!” moment about something that really affected the way they think about data management? Let me know!