The following is a guest post by Carl Fleischhauer, a Digital Initiatives Project Manager at the Library of Congress.
During December 2015, the Library’s Format Sustainability website added descriptions of eleven members of the Open Document Format family, aka OpenDocument and ODF. These eleven join a number of other format descriptions mounted in 2015, many of which are also carried in the Library’s Recommended Format Statement, first published in 2014 and revised in early 2015.
These two complementary websites support the Library’s ever-increasing acquisition of born-digital content. The Recommended Format Statement is designed to inform staff and external content creators about preferred and acceptable formats to acquire for the Library’s holdings. These formats are ones for which the Library believes that the provision of access and long-term preservation management will be feasible.
Meanwhile, the Format Sustainability website provides technical descriptions of formats of all types, both candidates for the recommended list and those that may be deemed unsuitable for acquisition. This information is intended to aid staff when they assess new content offerings and when they revise and refine the Recommended Format Statement.
In the Recommended Format Statement, ODF is listed as an acceptable type in the “text” category, in the same bullet with OOXML, the XML expression of Microsoft’s family of Office formats. This section of the Recommended Format Statement carries a parenthetical comment that features the term “electronic books.” (For more on OOXML, see my post from February 3, 2015.)
The truth is, however, that examples of ODF and OOXML will be most frequently encountered as born-digital segments within collections of personal papers and organizational records, the types of unpublished materials that are acquired by the Library’s special collection divisions. (In contrast, “e-publications” will most often be acquired in formats like ePub; other publisher-favored, schema-governed XML formats; and as PDF files.)
Like many other complex format families, ODF exists in several versions and “parts.” ODF is developed and maintained under the auspices of the Organization for the Advancement of Structured Information Standards (OASIS), headquartered in Massachusetts. ODF has also been approved as an international standard through the ISO/IEC joint technical committee JTC1, a collaborative effort of the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC), both headquartered in Switzerland. Our descriptions focus on the versions approved by both standards bodies, with an emphasis on ODF version 1.2. Where version 1.1 differs substantially from version 1.2, separate descriptions have been produced.
Here’s the list of ODF-related formats added to the sustainability website last month:
- ODF_Family, ODF (OpenDocument Format) Family, OASIS and ISO/IEC 26300
- ODF_package_1_1, OpenDocument Package Format, ODF 1.1, ISO/IEC 26300:2006
- ODF_package_1_2, OpenDocument Package Format, ODF 1.2 part 3; ISO/IEC 26300-3:2015
- ODF_text_1_1, OpenDocument Text Format (ODT), ODF 1.1, ISO/IEC 26300:2006
- ODF_text_1_2, OpenDocument Text Format (ODT), ODF 1.2, ISO/IEC 26300-1:2015
- ODF_chart_1_2, OpenDocument Chart Format (ODC), ODF 1.2, ISO/IEC 26300-1:2015
- ODF_draw_1_2, OpenDocument Drawing Format (ODG), ODF 1.2, ISO/IEC 26300-1:2015
- ODF_spreadsheet_1_1, OpenDocument Spreadsheet Format (ODS), ODF 1.1, ISO/IEC 26300:2006
- ODF_spreadsheet_1_2, OpenDocument Spreadsheet Format (ODS), ODF 1.2, ISO/IEC 26300:2015
- ODF_dbfront_1_2, OpenDocument Database Front End Document Format (ODB), ODF 1.2, ISO/IEC 26300-1:2015
- ODF_presentation_1_2, OpenDocument Presentation Document Format (ODP), ODF 1.2, ISO/IEC 26300-1:2015
When sorting out the taxonomy and history of digital formats, the sustainability team has often been intrigued by the intricacy and nuances of a given format’s history. ODF’s ancestry takes us back to the 1980s, but the narrative line sharpens in the early 2000s, when the format was being refined and two factors influenced the development team. First, they drew inspiration from the movement for open government, arguably dating from the eighteenth-century Enlightenment but taking on fresh intensity in the Internet age. The introduction to a 2006 ODF white paper (PDF) states, “In the case of public [governmental] documents . . . no resident should be excluded from data access [and/or] . . . forced to buy software from one particular vendor or for one particular operating system platform.”
A second motivation for the format’s developers was the need–felt by memory institutions in many nations–to preserve documents for the long term, an outcome that was seen as threatened by the widespread use of commercial office software applications and by the proprietary binary document formats they produced at that time. In the words of the white paper cited above, the use of an open-source format “guarantees long-term access to data even if companies cease to operate, change their strategies, or dramatically raise their prices.”
The sustainability team’s principal investigator for ODF, Caroline Arms, has prepared an outline of the format’s history. Her main findings are presented in the ODF family description. The story begins at Sun Microsystems, a private company founded in 1982 and acquired by Oracle in 2010. In 1999, Sun acquired the German StarOffice software suite (first released in 1985) and quickly made it available as a free download. This edition of StarOffice produced binary files but within a year or two, the Sun team had modified the tool to produce output files in XML and, like StarOffice, made this application available at no cost.
In 2002, after more elaborations, the output format–by then referred to as the OpenOffice.org 1.0 format–was submitted for standardization to OASIS. Sun was joined in this standardization effort by Boeing, Stellent, Arbortext, the National Archives of Australia, and the Society of Biblical Literature. To buttress the standard and to encourage wider acceptance, the OASIS ODF technical committee also moved the specification to ISO/IEC JTC1, where it was published as ISO/IEC 26300 in 2006, designated as version 1.0, amended in 2012 to align with version 1.1. In 2015, ISO/IEC published version 1.2 in three parts and, at this writing, OASIS is developing what will be version 1.3.
The development of software to support ODF changed course after Oracle’s 2010 purchase of Sun. Oracle was not interested in continuing the activity and multiple independent efforts soon emerged. Two important examples are the LibreOffice effort, formally coordinated through a German non-profit doing business as The Document Foundation, and the Apache OpenOffice (AOO) project, running under the auspices of the U.S.-based Apache Software Foundation. These two efforts involve a worldwide community of volunteer coders. Their codebases have grown apart since 2010, and the ODF family format description summarizes a number of perspectives on the implications of this sometimes confusing circumstance.
The ODF family format description also identifies some of the organizations, including government bodies, that have adopted the ODF family of formats as mandatory or recommended for documents that must be editable in order to support collaboration within the government or between the government and the public. Success in this area represents a payoff for the creators’ initial goal of supporting open government. Here are a few selected examples:
- Brazil, 2008. The ePING (Standards for Interoperability for Electronic Government) includes ODF 1.2 and ISO/IEC 26300:2008 as the only editable formats for office documents.
- Norway, 2009. Norway adopted a new set of obligatory information technology standards, mandating ODF as the only editable format for exchanging documents between the government and users by email. See announcement and summary in English.
- Germany, 2011. Version 5.0 of the German Standards and Architecture for e-Government Applications (SAGA) includes ODF and OOXML among its formats under observation.
- Portugal, 2012. Regulation incorporating a list of mandatory formats. The only editable format for documents listed was ODF 1.1.
- The United Kingdom, 2014. ODF 1.2 is mandated for sharing or collaborating on government documents.
- Denmark, 2014. eGovernment recommendation v.16.0 indicates that Denmark continues to accept editable documents in “all common formats (including OpenDocument Format – ODF and Office Open XML – OOXML).”
- EU, 2014. Statement from the Vice-President of the European Commission Maroš Šefčovič recommended, “For revisable documents, all European institutions are recommended to support as a minimum two ISO standards, the Open Document Format (ODT) and Office Open XML (OOXML).”
- United States, 2014. ODF 1.0 is recommended in the National Archives transfer guidance statement.
- Canada, 2015. Library and Archives Canada’s Guidelines on File Formats for Transferring Information Resources of Enduring Value lists ODF 1.0 as a preferred format.