During Preservation Week 2013, I gave a webinar about personal digital archiving. Over 600 people participated and, during the post-presentation question section, 91 people submitted questions online. I had time to answer about a dozen or so. After the webinar, the hosts from the Association for Library Collections and Technical Services sent me the complete list of questions and I’m gradually responding to all of them. Questions are always good because they help us to improve and expand our information resources.
The questions covered a variety of topics — email preservation, file naming, digital video, file migration, scanning and digital asset management — but the most striking fact is that two-thirds of all the questions could be grouped into just two main topics: digital photos and storage.
Interest in digital photos is not surprising. Most of the questions we get at NDIIPP personal-digital-archiving presentations are related to digital photos. The webinar questions about storage were also not surprising; with the variety of available digital storage options and the uncertainty about their reliability, storage can be a perplexing topic.
I’d like to share a few of the webinar questions in this post. There’s not enough space to cover both topics today so I will just do the digital photo ones. I will post the digital storage questions in a future column.
Photographer David Riecks, of photometadata.org, helped answer the more difficult questions. Since many of the questions were variations on the same theme, I combined some of the more representative ones.
Which is better for preservation, JPEG or TIFF? I have heard that TIFF is better because of degrading. Do JPEGs deteriorate?
TIFF is a lossless format, and newer versions of photo-processing applications such as Photoshop can also save TIFF files with various forms of lossless compression. A lossless file format is especially good if you plan to return to the file to make tone or color changes, or to retouch the photo. When you finish with the file and save it, no image data is discarded.
TIFF files require more storage space than JPEGs because they keep all of the image data, so some photographic organizations use a form of lossless file compression called LZW. It does take a bit of time to pack the file, and each time you open the file it may take a bit of time to expand it. But no data is thrown away and the image does not degrade over time.
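To see why lossless compression is safe for archiving, here is a minimal sketch of a textbook LZW round trip in Python. It is an illustration only: the LZW variant used inside TIFF files packs its codes differently, and the function names here are made up for this example. The point is simply that the expanded data is byte-for-byte identical to what went in.

```python
def lzw_compress(data: bytes) -> list:
    """Textbook LZW: replace repeated byte sequences with integer codes."""
    table = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in table:
            current = candidate
        else:
            codes.append(table[current])
            table[candidate] = next_code
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(table[current])
    return codes


def lzw_decompress(codes: list) -> bytes:
    """Rebuild the original bytes by re-growing the same code table."""
    if not codes:
        return b""
    table = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = table[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        elif code == next_code:
            # The code refers to the sequence currently being defined.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        table[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)


if __name__ == "__main__":
    # A made-up "scan line": long runs of identical values compress well.
    row = bytes([10] * 500 + [200] * 500)
    packed = lzw_compress(row)
    print(len(row), "bytes in,", len(packed), "codes out")
    print("identical after expanding:", lzw_decompress(packed) == row)
```

Packing and expanding cost some processing time, which matches the small delay described above, but nothing is discarded along the way.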
If you scan a photo, it is a good practice to save the scan as a TIFF, rather than as a JPEG or PDF, because of the TIFF’s losslessness. In addition, if you want the maximum quality, you can even capture and save up to 16 bits per channel in an RGB TIFF; JPEG only allows for 8 bits per channel.
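As a rough illustration of what the extra bits buy you, the short sketch below works out how many tonal levels each color channel can hold and how many bytes an uncompressed RGB pixel needs. The function names are just for this example.

```python
def levels_per_channel(bits: int) -> int:
    """Number of distinct tonal values one channel can record."""
    return 2 ** bits


def bytes_per_rgb_pixel(bits: int) -> int:
    """Uncompressed storage for one pixel with three channels (R, G, B)."""
    return 3 * bits // 8


if __name__ == "__main__":
    for bits in (8, 16):
        print(bits, "bits per channel:",
              levels_per_channel(bits), "levels,",
              bytes_per_rgb_pixel(bits), "bytes per pixel")
```

A 16-bit scan records far finer tonal steps, at the cost of twice the uncompressed storage of an 8-bit scan.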
If you want to share a digital photo that is in a TIFF file format, saving or exporting a copy of it as a JPEG is a fine option. A JPEG can be viewed in a web browser and it takes less bandwidth to transmit or download. Always keep the original TIFF though.
If your original digital photo file is a JPEG and you don’t intend to modify it, you can archive it as it is. There is no benefit to converting it to a TIFF if you are not going to modify it. The “lossy” aspect of JPEG becomes an issue when you modify the JPEG and save it — and consequently compress it.
JPEG compression of image data results in some loss of image information, which is why it is referred to as lossy. Compression is not inherently bad; light compression reduces a file size and the lost image information is barely visible. But the more you compress a file, the more information you lose and the worse the photo looks. Once that digital information is lost, you can never get it back.
If you take a TIFF file and save it as a high quality JPEG with a low compression setting, the JPEG may occupy a fraction of the disk space that the TIFF would have occupied. However, if you were to open the JPEG again, make tone or color changes and then re-save it, you would subject it to another round of compression; after multiple rounds of modification and re-compression you would begin to see degradation in the image file.
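The effect of repeated edit-and-save cycles can be sketched with a deliberately simplified toy model. This is not real JPEG compression, which works on frequency data in small blocks of pixels. Here a "lossy save" just snaps each pixel value down to a multiple of 4, and the names and numbers are assumptions made for illustration. The sketch counts how many distinct tones survive in a smooth gradient.

```python
def lossy_save(pixels, step=4):
    """Toy stand-in for a lossy save: snap values down to a multiple of step."""
    return [(p // step) * step for p in pixels]


def darken(pixels):
    """A simple tone edit: scale every value to 90 percent."""
    return [p * 9 // 10 for p in pixels]


def lighten(pixels):
    """The reverse tone edit, clipped to the 0-255 range."""
    return [min(255, p * 10 // 9) for p in pixels]


def distinct_tones(pixels):
    return len(set(pixels))


def simulate():
    gradient = list(range(256))  # 256 distinct tones to start with

    # Path 1: make both edits, never passing through a lossy save.
    lossless = lighten(darken(gradient))

    # Path 2: lossy save first, then a lossy save after every edit.
    lossy = lossy_save(gradient)
    history = [distinct_tones(lossy)]
    for edit in (darken, lighten):
        lossy = lossy_save(edit(lossy))
        history.append(distinct_tones(lossy))

    return distinct_tones(lossless), history


if __name__ == "__main__":
    lossless_count, lossy_history = simulate()
    print("tones left without lossy saves:", lossless_count)
    print("tones left after each lossy save:", lossy_history)
```

In this toy model the tone count never recovers once it drops, which is the same one-way loss described above.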
The amount and quality of compression applied to a JPEG file is an important factor in its quality. In Photoshop, there are two means of creating a JPEG. One uses a quality scale of 1 to 12, with 12 being the least compression or “maximum quality” and it results in the largest file size. Quality equals size. The higher the quality, the larger the file size; the lower the quality, the greater the data loss and the smaller the file size.
The type of JPEG compression applied in a camera will be different from that used in Photoshop. Some of the newer cameras have several settings, ranging from a “Basic” JPEG to a “Superfine” JPEG. These settings probably have a rough equivalent setting to Photoshop but they are not exactly the same.
When modifying digital photos, never modify the original. Always make a copy and modify the copy. You can compress copies for upload or delete copies if you are not happy with the results. Be careful to save the copy with a different name than the original; otherwise it will overwrite and replace the original.
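If you are comfortable with a little scripting, the copy-first habit can be automated. The sketch below uses Python's standard library; the function name and the "_edit" suffix are only illustrative choices. It refuses to overwrite an existing file, so neither the original nor an earlier working copy can be replaced by accident.

```python
import shutil
from pathlib import Path


def make_working_copy(original: Path, suffix: str = "_edit") -> Path:
    """Copy a photo to a new name next to the original and return the new path."""
    copy_path = original.with_name(f"{original.stem}{suffix}{original.suffix}")
    if copy_path.exists():
        raise FileExistsError(f"{copy_path} already exists; choose another suffix")
    shutil.copy2(original, copy_path)  # copy2 also keeps file timestamps
    return copy_path
```

For example, `make_working_copy(Path("beach.tif"))` would create `beach_edit.tif` in the same folder and leave `beach.tif` untouched.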
The JPEG 2000 format has both a lossless and a lossy means of compression. Like TIFF, JPEG 2000 can store files with more than 8 bits per channel, though it requires less storage space than a TIFF. Note that while you can substantially reduce a JPEG 2000 file size, there are fewer applications that can create and open this file format compared to a TIFF. If you are considering converting your files to JPEG 2000, do some tests first.
Here’s a tip: if you open a JPEG image in a photo-processing application, modify it and save the retouched image as a TIFF (with or without LZW compression), then this TIFF image will not be any further degraded or compressed than the original. However, if you apply curves or levels to the image, then you will more than likely introduce some loss of data, since both these ways of modifying the tonal distribution of the image do so by squishing or stretching out the original data.
Does adding metadata affect the photo file? If you add descriptive information using particular software, will any other software enable you to view that information or is it all proprietary? Are there any open-source options for adding metadata?
You can modify the metadata about the image (such as caption, description and keywords) with a number of programs. Most of these will only modify the file header information, not the image pixels. [See "An Easy Way to Add Descriptions to Digital Photos," part 1 and part 2.] Adding metadata to a photo file does not subject the image to compression, so the quality of the image will not change. Since the metadata text does take up a little bit of space, the file size will increase slightly.
Information written to the file header of JPEG images can be read by many applications and, in newer computers, even the operating system itself. For instance, in Windows Vista and Windows 7/8, the WIC (Windows Imaging Component) allows you to see this information simply by “right-clicking” and viewing the image properties. With Macs, from OS X 10.5 forward, the information is visible by using “Preview” and Command + I (view info).
If you add metadata to TIFF files, much is the same as with JPEGs, though not all programs will work. Other special and proprietary file formats like Photoshop files (PSD) and camera RAW files (NEF, CR2) are even more problematic in terms of image metadata and review by other programs.
Most software uses the IPTC or XMP standards to store embedded photo metadata. Picasa uses the older IPTC standard. Photoshop uses XMP for storing metadata: this includes the IPTC Core, IPTC Extension, PLUS and more. Information entered with Picasa can be read by Photoshop. The reverse is not always true.
You can find a list of photometadata resources at controlledvocabulary.com.
Does frequently opening digital photos, JPEGs, degrade the quality or is that due to compression?
Moving a JPEG from one location to another will not degrade the image, but if the file is corrupted in transit (due to, say, a virus), it will probably not open.
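One way to confirm that a copy arrived intact is to compare checksums of the two files. The sketch below is one possible approach using Python's standard library; the function names are made up for this example. If even a single byte differs between the source and the copy, the checksums will not match.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(source: Path, destination: Path) -> bool:
    """True when both files contain exactly the same bytes."""
    return sha256_of(source) == sha256_of(destination)
```

A matching checksum says the bytes are unchanged; it says nothing about whether the photo was a good one to begin with.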
It’s important to understand that while compression is used in saving the JPEG file, and the JPEG image has to be decompressed before you can view it, there is no change to the image just through the act of opening the file. Re-compressing the file changes it.
If you “Save” the opened JPEG file, rather than just close the open file (exit without saving), you can cause the file to degrade over time with each “open/save” action. Typically the only time you would be asked to save the file is after modifying the image pixels, such as changing the tone or color, or retouching, cropping or removing red-eye.
You might consider making pixel changes to your JPEG and saving the digital photo as a (lossless) TIFF file.
You mentioned scanning at 300 dpi for the standard photograph sizes. Would you use a different dpi if you were scanning a color photograph versus a black and white photograph?
You could scan a b&w photo using the “grayscale” option rather than the RGB color option, but you’d want at least 300 dpi/ppi regardless.
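To get a feel for the numbers, the sketch below works out the pixel dimensions of a scan and its uncompressed size in grayscale versus RGB. The 4 x 6 inch print is just an example, and the function names are illustrative.

```python
def scan_pixels(width_in: float, height_in: float, dpi: int):
    """Pixel dimensions of a scan: inches multiplied by dots per inch."""
    return round(width_in * dpi), round(height_in * dpi)


def uncompressed_bytes(width_px: int, height_px: int,
                       channels: int, bits_per_channel: int = 8) -> int:
    """Storage needed before any compression is applied."""
    return width_px * height_px * channels * bits_per_channel // 8


if __name__ == "__main__":
    width, height = scan_pixels(4, 6, 300)
    print("pixels:", width, "x", height)
    print("grayscale bytes:", uncompressed_bytes(width, height, channels=1))
    print("RGB bytes:", uncompressed_bytes(width, height, channels=3))
```

At the same dpi, a grayscale scan has one channel instead of three, so it needs about a third of the uncompressed space of an RGB scan.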