Dating Your Data (or How I Learned to Stop Worrying and Love the Standard)

I’m going to come right out an admit something terribly nerdy: I have a favorite standard. It’s ISO 8601. My having a favorite standard probably doesn’t surprise you, as I am a person who writes a blog on data management for fun. Why wouldn’t I have a favorite standard? But my having a favorite standard isn’t something for me alone (though I do use the standard often), it’s because ISO 8601 is incredibly useful for data management. Therefore, I want to make my favorite standard your favorite standard too.

The standard ISO 8601 concerns dates, a common type of information used for data and documentation. To understand why this standard is important, consider the following dates:

  • March 5, 2014
  • 2014-03-05
  • 3/5/14
  • 05/03/2014
  • 5 Mar 2014

All of these represent the same date but are expressed in different formats. The problem is that if I use all of these formats in my notes, how will I ever find everything that happened on March 5th? It’s simply too much work to search for all the possible variations. The answer to this problem is ISO 8601.

ISO dictates that all dates should use the format “YYYYMMDD” or “YYYY-MM-DD”. So the example above becomes “20140305” or “2014-03-05”. This provides you with a consistent format for all of your dates. Such consistency allows you to more easily find and organize your data, the hallmark of good data management.

ISO 8601’s consistency is nice in and of itself, but here’s where things get really awesome: when you use ISO 8601 dates at the beginning of file names. This is because dates using this standard sort chronologically by year, by month, and then by date. So if you date all of your file names using ISO 8601, you suddenly have a super easy way to find and sort through information.

Let me give you an example to show you how wonderful this is. I recently cleaned up over 10 years of files for a committee that I am currently on. The committee’s membership changes each calendar year and it was hard to find specific files from previous committees. My solution was to make all the file names start with a date. This makes everything super easy to find and I can now simply ignore content from years I don’t need.

The other great thing about using the “YYYY-MM-DD” format is that you can mix and match how specific your dates are. For example, all of my presentation files live in folders labeled by date. One-off presentations are given an exact date, eg. “2014-04-30_DataManagementWebinar”, while presentations that I give multiple times are only given a year, eg. “2013_CreatingADMP”. I also have files that end up with just a year and a month, eg. “2012-09_Website”. No matter the specificity of the date, all of these files sort chronologically. It’s a beautiful thing.

I highly recommend using ISO 8601 as the way you write dates in your research. It’s a trivially small change, but can have a huge impact in terms of how easy it is to find and use your content. That is data management at its very best.