This Document Is Superseded by: Western States Digital Imaging Best Practices
General Guidelines for Scanning
CDP Scanning Working Group
Technical Writer: Erin Rhodes
NOTE: Libraries, archives, museums, and educational institutions may freely use these guidelines, with appropriate credit given to the Colorado Digitization Program. Others should contact the Project Director for permission to copy or make other use.
The goal of the Colorado Digitization Program is to provide the people of Colorado access to Colorado's unique and special collections in digital format through the collaborative effort of Colorado archives, libraries, museums, and historical societies. By providing these standards, the Colorado Digitization Program is not advocating that every organization buy its own equipment and scan in-house. Rather, we suggest that organizations investigate the many commercial vendors that provide high-quality imaging services to libraries, archives, and museums; or that organizations look to regional or inter-institutional cooperative ventures before beginning a digital imaging project or investing in equipment.
Although many digitization projects in Colorado are scanning for purposes of increased access to their collections, preservation is often a natural by-product of any digitization project. The Colorado Digitization Program advocates that organizations approach digitization projects with a "preservation mindset." This mindset implies:
Although this document suggests the minimum standards, organizations should not just "do the minimum." Scanning at a higher resolution than this document recommends is encouraged.
The purpose of this document is to offer guidance and to provide minimum scanning recommendations to Colorado institutions that are planning for or involved in digitization projects. It is aimed at institutions that have, or are setting up, the equipment and expertise to scan in-house. This document addresses the more standard formats of text, photographs, maps, and graphic materials. If you are planning to scan primarily oversize materials, bound materials, or materials in non-standard formats and sizes, you might consider outsourcing these materials to imaging vendors. A list of vendors is supplied in Appendix C.
These recommendations have been developed in order to: (1) ensure a consistent, high level of image quality across collections; (2) encourage widespread and convenient access to digital images by supporting the use of standard or industry formats that are widely accepted; and (3) decrease the likelihood of rescanning in the future by promoting best practices for conversion of materials into digital format. The recommendations we make in this document are purposely broad enough to apply to a variety of institutions and collections, and attempt to synthesize different recommendations previously made for specific institutions or projects.
What is addressed in this document:
What is not addressed in this document:
Guidelines for workflow, metadata standards, and selection for digitization are addressed in other, complementary documents on the Colorado Digitization Program website.
Because technology and industry standards and preferences are constantly
improving and changing, we view this as a continually evolving document.
We welcome your comments and suggestions.
Every organization should carefully consider the pros and cons of outsourcing digitization projects or conducting them in-house. Following are some points to consider for both strategies:
(See Kenney and Chapman, "Digital Imaging for Libraries and Archives"
for further information).
There are many excellent models and resources available in print and online that have articulated different standards for scanning a variety of materials. The Colorado Digitization Program drew upon the following resources in developing these recommendations. We suggest that institutions review the following resources for further information on scanning practices and procedures prior to implementing a digitization project.
Additionally, we recommend Howard Besser and Jennifer Trant's excellent online tutorial, "Introduction to Imaging," as a primer on the vocabulary and technology of an imaging project and the issues surrounding the construction of an image database. This tutorial can be found at: http://www.getty.edu/redirect/redirect_gri_intromages.html
Besser, Howard. Best Practices for Image Capture.
Besser, Howard. Procedures and Practices for Scanning.
California Digital Library: Digital Object Standard: Metadata, Content
and Encoding (May 18, 2001)
California Digital Library: Digital Image Format Standards (July 9, 2001)
Fleischauer, Carl. Digital Formats for Content Reproductions.
Library of Congress, July 13, 1998.
Kenney, Anne and Steven Chapman. Digital Imaging for Libraries and Archives. Ithaca, New York: Department of Preservation and Conservation, Cornell University Library, June 1996.
Macklin, Lisa A. and Sarah L. Lockmiller. Digital Imaging of Photographs: A Practical Approach to Workflow Design and Project Management. Chicago, Illinois: American Library Association, 1999.
Puglia, Steven and Barry Roginski. NARA Guidelines for Digitizing
Archival Materials for Electronic Access. National Archives and Records
Administration, January 1998.
Technical Advisory Services for Images. An Introduction to Making
Digital Image Archives. University of Bristol, United Kingdom.
International Imaging Industry Association (I3A) is a consortium
of image and imaging companies engaged in the development of digital imaging
standards and technologies for the consumer market. The membership of
more than 80 participants includes Eastman Kodak, Digimarc, Fuji, and
Polaroid. According to their white paper, their metadata standards are
based on XML formats and are designed to support the evolution of digital
imaging technology over the next 5-10 years. Because of their near-future,
consumer home-use focus, these standards are not appropriate for
libraries, museums, archives, and historical societies.
The goals of any digitization project will have an influence on the equipment needed to accomplish the project. Scanning, organizing, tracking, editing, describing and indexing, displaying, and storing the images all require different equipment, as well as different levels of staff expertise. Choice of equipment depends primarily on the types of materials you plan to scan as well as the intended uses of the digital images. Several sources, such as computing magazines and equipment manufacturers, offer informative reviews of hardware, scanners, and software. We recommend that prior to purchasing system components, you review the current literature available both online and in print for the latest developments and reviews. Several such sources are listed at the end of this section.
Some things to consider when choosing specific components for your system:
A powerful computer is the basis of any digitization project. A dedicated computer should be used specifically for the imaging project. Below is a list of suggested basic equipment:
Below, we suggest a set of ideal and minimum hardware specifications for a small-scale, in-house scanning operation. Note that we are not recommending specific brands of equipment, but rather, suggested configurations.
Suggested minimum configuration as of this writing:
Suggested ideal configuration as of this writing:
A second PC/Mac station for image editing functions is recommended. A separate PC/Mac exclusively for scanning will increase production. The two stations can be configured as a network (using Windows NT or AppleTalk (Mac), for example). Other networked configurations are possible.
The scanning computer is critical to any digitization project. A computer with the highest processing speed available will have a tremendous impact on workflow and will facilitate faster production rates. Since image files are so large, an appropriate amount of RAM is also necessary for image scanning and editing. Ideally, saving image files to a server before or after editing might be preferable in order to avoid running out of file space for organizing and storing images in production. When images have undergone final editing and quality control, they can then be uploaded to a server and/or downloaded to a CD-recorder.
A good monitor is critical to image manipulation and editing. It should have a large screen size, high resolution, a high refresh rate, be flicker-free, and it should support adequate video RAM in order to produce images that are a close representation of the original item scanned. A high quality PC and monitor are well worth the money they will require; don't go cheap at the expense of quality.
If you plan on saving the images to CD (recommended), you will need an internal or external CD-recordable drive. It should be ISO 9660 compliant and should support a variety of formats. CD recorders are defined by recording and playback speed. An acceptable option at this time is 4x record by 6x playback. In addition, we recommend that you use CD-ROM recorders instead of DVD recorders and readers at this point in time, until DVD technology matures and becomes more widely accepted and supported. DVD technology promises backward-compatibility with CD-ROMs. Make sure that the CD recorder, as well as peripherals such as the scanner and backup mechanisms, are compatible with your computer.
We recommend that you back up image files before transferring them to CD or storing them on a network server. Acceptable backup options for the short term include internal or external tape (DAT) drives or a Zip® or Jaz® drive. In case of disaster, we recommend that you also keep a second, secure hard copy of the image files (e.g., on CD or tape) offsite, especially if your only other copies are on the server or on a CD or tape copy onsite.
If you plan to print out quality copies of the images, you will want
to consider a color and/or black and white printer capable of printing
at least 1200 dpi, particularly if you plan to sell prints of digitized
images (in which case, you may want to invest in a printer that can print
on glossy paper).
There are many types of scanners available, in many different sizes and levels of quality. Choosing a scanner depends on your project goals and on the format, type, and size of the media you intend to scan. Other factors to consider when choosing a scanner include optical resolution, bit depth, scan area size, scan time, and functionality for scanning different formats. These factors will affect the quality of the digital image and must be taken into consideration when purchasing a scanner.
The resolution at which you scan is one of the factors that will determine the quality of your images. Resolution is often expressed as an array, meaning the number of pixels across both dimensions of an image (or, more simply, as a single figure such as 3000 pixels across the long side); as dpi (dots per inch); or as ppi (pixels per inch). Higher dpi settings will generally yield a better digital image, because they place more pixels (and therefore more information) in an inch than do the lower dpi settings. However, the higher the dpi, the larger your file size. You must take into account your server or computer storage capacity when determining resolution settings, and balance that against the goals of your project. Scanning at a high resolution is recommended if you are planning to convert an important collection into digital form to increase access and to build a virtual archive, generate "archival" images, or make prints of the digital image on a good printer. There is a threshold to resolution, however. After a certain point, increasing resolution will not yield a better image.
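The trade-off between resolution and file size described above can be estimated with simple arithmetic. The following is a minimal sketch in Python, assuming an uncompressed image and ignoring file headers; the function name and the 8 x 10 inch example are illustrative assumptions, not part of these guidelines.

```python
def uncompressed_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the uncompressed size of a scan in bytes.

    The pixel count is (width * dpi) x (height * dpi), and each pixel
    takes bits_per_pixel bits. File headers are ignored.
    """
    pixels_wide = round(width_in * dpi)
    pixels_high = round(height_in * dpi)
    total_bits = pixels_wide * pixels_high * bits_per_pixel
    return total_bits // 8


if __name__ == "__main__":
    # Hypothetical example: an 8 x 10 inch print scanned in 24-bit color.
    for dpi in (300, 600):
        size = uncompressed_size_bytes(8, 10, dpi, 24)
        print(f"{dpi} dpi: {size / 1_000_000:.1f} MB")
```

Under these assumptions, doubling the dpi quadruples the file size, because the pixel count grows in both dimensions.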
There are two different types of resolution: optical and interpolated. Optical resolution is the inherent resolution of the scanner, and is usually expressed as a pixel array (e.g., 1000 x 2000). The first number refers to the limit of the CCD array on your scanner (the short dimension), and the second number refers to a number determined by the movement of the CCD array across the long dimension of your scanner.
Interpolated resolution is calculated by software from a lower resolution image file. This is often performed during or after scanning. A higher optical resolution in a scanner is better than interpolated resolution. The specifications for the resolution at which you scan should represent actual optical resolution rather than values achieved by interpolation.
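To illustrate why interpolated values are not a substitute for optical resolution, the sketch below doubles the sampling of a single row of pixel values by simple linear interpolation. This is a simplified stand-in written for illustration; actual scanner software may use other interpolation methods, and the function name is an assumption.

```python
def interpolate_row(row):
    """Double the sampling of one pixel row by linear interpolation.

    A new value is inserted between each pair of neighbors as their
    average. No new measurement of the original is taken.
    """
    if not row:
        return []
    out = [row[0]]
    for left, right in zip(row, row[1:]):
        out.append((left + right) / 2)  # computed from neighbors, not scanned
        out.append(right)
    return out


if __name__ == "__main__":
    print(interpolate_row([10, 20, 40]))
```

Every inserted value is derived entirely from pixels that were already captured, so the file gets larger without recording any additional detail from the original.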
Bit depth also has an effect on image quality. Bit depth is the number of bits of data representing each pixel in the image. A bit can have two values, 0 or 1. If an object is scanned at a bit depth of 8, each pixel can take one of 256 possible colors or shades of gray. A bit depth of 24 produces over 16 million colors. Bit depth also has an effect on file size: as bit depth increases, the file size increases in direct proportion. Scanners generally sample at a higher bit depth and then the final image output is sampled down to a lower bit depth. Sampling at a higher bit depth aids in reducing noise, extends the possible tonal range of the image, and allows the scanner to capture a larger density range without loss of detail.
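The relationship between bit depth and the number of available values is a power of two. A minimal sketch in Python (the function name is an illustrative assumption):

```python
def tonal_values(bit_depth):
    """Return how many distinct values a pixel can take at a given bit depth."""
    return 2 ** bit_depth


if __name__ == "__main__":
    for depth in (1, 8, 24):
        print(f"{depth}-bit: {tonal_values(depth):,} possible values per pixel")
```

A 1-bit (bitonal) pixel has 2 possible values, an 8-bit pixel has 256, and a 24-bit pixel has 16,777,216.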
Common Scanner Types:
Flatbed scanners are one of the most popular scanners used in libraries and archives and are suitable for scanning papers, flat photographs, and other printed materials. Flatbeds can be purchased with an optional attachment called a transparent media adapter, which allows you to scan directly from slides or negatives. However, transparency adapters do not always produce as high a quality of image as a slide or film scanner. If you plan to scan predominantly transparent materials that are smaller than 4 x 5, you may want to consider a slide or a film scanner (there are some slide/film scanners that can handle larger transparent formats). Scanners that combine flatbed scanner capabilities and 35mm slide capabilities are also on the market. Some slide scanners can deliver a better dynamic range than flatbeds; however, the resolution may not be sufficient to create digital masters or meet the resolution requirements of some users.
If your collection contains predominantly oversized materials, you may want to consider outsourcing the scanning to an imaging vendor or purchasing a high-end digital camera that can capture oversize materials, which works much like a copystand setup. There are also flatbed scanners that handle originals that are 12" x 17", and some flatbed scanners can accommodate even larger sizes, although they tend to take up considerable space and produce enormous file sizes.
Some participants in the CDP Project have asked about drum scanners. In general, the CDP does not recommend them for formats of significant value or that are fragile or brittle in any way, as drum scanners can cause a great deal of stress to the document. The original is also taped to the rotating cylinder, so consider how this may also affect the document. Drum scanners are designed for the graphic arts community and, as such, provide an extremely high level of resolution. Drum scanners can scan transparent as well as reflective media, in grayscale and color.
Not all scanners take the same amount of time to scan the same image at the same resolution. If high production levels are important, it will be necessary to look at the time it takes for both preview and full scan images of materials similar to what you intend to scan. In general, flatbed/slide scanners accommodate a higher production rate than digital cameras, but they are also limited in the size and type of media formats they are able to scan.
All electronic devices suffer from "noise," which often appears on scans
as blotchy or matte-like areas in the dark shadow parts of an image when
these areas are lightened or have their contrast range increased. Noise
can obscure details in the shadows. Higher quality scanners, with higher
bit depths, will give better results, as they tend to use higher quality
(lower noise) components.
At this point in time, we feel that commercially available, hand-held digital cameras are not suitable for archival scanning, excepting the high-end digital cameras (Kontron, Zeutschel, Leica) used by several larger institutions and imaging vendors. High-end digital cameras have no scanning limitations when it comes to size and shape, and can scan at an extremely high resolution (up to 15,000 pixels across the long dimension). They do have specific lighting requirements and demand a high level of operator skill. However, if you can afford one, a high-end overhead digital camera offers great potential for scanning oversize materials, media in all formats, and bound materials (with the aid of a book cradle), and presents a lower risk to fragile materials by allowing face-up, contact-free scanning.
Digital Camera Reviews http://www.steves-digicams.com/digresources.html#reviews
There are two types of software that you will need for most digital imaging projects. The first is the scanning software that comes with the scanner. The second type of software is the image editing software, normally applied to the image after it has been scanned. Some software, such as Adobe Photoshop®, can serve as both the scanning software and the image editing software. The scanning software is usually limited in its functionality. You should choose scanning software that is at least capable of saving image files into standard formats such as TIFF, JPG, GIF, etc. This functionality will help production and also ensure a wide range of image delivery options. Software that converts image files from one format to another may also be useful.
To produce images of acceptable quality, it is important to invest in image editing software, which is normally used for "cleaning up" an image (removing dust spots, for example) and for correction (adjusting the level of brightness and contrast, for example). Image editing software should come with the capability to crop, deskew, and rotate; adjust brightness and contrast levels; sharpen (if needed); zoom in and out; accommodate different file formats; provide controls for gamma, black and white, and color (RGB); provide a histogram and look-up table; support compression types; and possess the capability for the user to create and save customized settings, among other functions.
The choice of image editing software is based on the level of image manipulation desired for your project and the level of expertise of staff. Some image editing software, such as Adobe Photoshop®, is very advanced, and may require some time and training to learn. Other software is more basic and allows for only limited operations, such as cropping and rotating, and is not difficult to master. Consider the range of operations you will normally need to perform. The cost of this software can range from free (freeware) to several hundreds of dollars. When considering cost, think about not only the cost of the product, but also how easy it is to use—and factor in additional costs for training, accordingly.
In addition to considering the capability and usability of image editing software, make sure that your current technology can support the software. Do you have the appropriate amount of memory, hard drive space, processor power, and display capabilities (a 24-bit color display card is recommended for image editing work)?
The amount of image editing performed on the images should be defined in your project goals, possibly decided in consultation with the collection curator or an archivist or librarian who is knowledgeable about the materials being scanned. Some digitizing projects aim to create a "pleasing image" that may require a great deal of editing. Other projects may be more concerned with the fidelity of the digital image to the original (this may be important to scholars), and may require very minimal editing. Do you intend to match the digital image as closely as possible to the original? Are you more concerned with the photographer's/creator's intent when editing the digital image (i.e., high contrast; scanner operator makes decisions about tone and color values of the digital image)? Or are you more concerned with reconstructing the appearance of the original as it would have existed when first created (to digitally reconstruct deteriorated originals)? What constitutes a "good image" for the purposes of your project--a faithful reproduction or a pleasing image--should be defined prior to scanning.
Batch processing software, such as Equilibrium's DeBabelizer Pro http://www.equilibrium.com/
or ThumbsPlus http://www.cerious.com/featuresv4.shtml,
takes a large set of files and automatically performs the same process
on them. This type of software is useful for the generation of thumbnails
and access images (making JPEGs and/or GIFs from TIFFs, for example),
converting from one file format to another, or compressing files. Some
image editing software includes batch processing capability; if not, this
software might be worth considering.
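The idea behind batch processing, applying one rule to every file in a set, can be sketched in a few lines of Python. The example below only plans the output file names for access and thumbnail derivatives; the conversion step itself is left out because it depends on the imaging software you choose. The function name and the "_access" and "_thumb" naming pattern are hypothetical, not a CDP convention.

```python
from pathlib import PurePath


def plan_derivatives(master_names):
    """Map each master TIFF name to hypothetical access and thumbnail names."""
    plan = {}
    for name in master_names:
        path = PurePath(name)
        if path.suffix.lower() not in (".tif", ".tiff"):
            continue  # skip anything that is not a TIFF master
        plan[name] = {
            "access": f"{path.stem}_access.jpg",
            "thumbnail": f"{path.stem}_thumb.jpg",
        }
    return plan


if __name__ == "__main__":
    for master, outputs in plan_derivatives(["photo001.tif", "notes.txt"]).items():
        print(master, "->", outputs["access"], "and", outputs["thumbnail"])
```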
The following is not an exhaustive list of the hardware and software available as of this writing. By listing the following sources, we do not imply endorsement of any of the products or publications. They are listed simply as a guide to possible options.
Some General Sources for Hardware and Scanner Reviews
PC Magazine Online http://www.pcmag.com
Adobe Photoshop http://www.adobe.com/
Adobe Photodeluxe http://www.adobe.com/
Ulead PhotoImpact http://www.ulead.com/
JASC Paint Shop Pro http://www.jasc.com/
Micrografx Picture Publisher http://www.micrografx.com/
Corel Photo-Paint http://www.corel.com/
Macromedia xRes http://www.macromedia.com/
Although these guidelines do not focus on database options, there are many types of databases available that are suitable for digitization projects, with varying degrees of complexity, cost, and power. Some options include:
Many organizations around the country offer workshops and training on digital imaging, and many conferences are held each year addressing imaging issues.
The Northeast Document Conservation Center http://www.nedcc.org/ hosts a "School for Scanning" several times a year for digital project managers. Information on the school can be found on their website.
AMIGOS Bibliographic Council http://www.amigos.org/ gives several workshops a year on digital imaging to institutions in the Southwest and at conferences. Workshops can be requested on certain topics and in Colorado as well.
Museums and the Web Annual Conference: Hosted by Archives and Museum Informatics http://www.archimuse.com/.
In reality, many digital imaging projects will not have dedicated staff working on the project, but will utilize existing staff from other areas in the organization, student assistants, or volunteers. It may benefit the project to look at "transferable skills" that staff members may already possess that would be useful in any digitization project. Sufficient time for training, and opportunities to receive further education and training, should also be provided.
Digitization projects require a combination of skills from a variety of staff with different areas of expertise. The following areas and skills may be important to any digitization project:
Digitization projects, by nature, require a team approach, and bring together different sets of skills from different areas of the library perhaps more than any other project. Administration, technical services staff, cataloging staff, the information technology department, subject specialists, curators, librarians, preservation and conservation staff, faculty, and others may all be involved.
This document provides the minimum scanning recommendations we feel are necessary for responsible conversion and for achieving an acceptable level of image quality. These minimum recommendations are reliant upon technical capabilities and limitations at this current point in time. We recognize that all collections differ in the ways they are used and accessed and that institutions have differing purposes and clientele, which will likely have an impact on how and for what purposes and reasons collections are digitized. These are not hard and fast recommendations for every collection and every institution. As a rule, the key to quality scanning is not to scan at the highest resolution possible but to scan at a level that matches the informational content of the original.
Decisions on image quality and resolution should be based on the needs of users, how the images will be used, and the nature of the materials you are scanning (dimensions, color, tonal range, format, material type, etc.). The quality of the original (such as the quality of the shooting or processing technique in the case of photographs) also has an impact on the resolution at which you scan and the resulting quality of the digital image.
Master Image File
Many digital imaging projects scan a high-quality "master" or archival image and then derive multiple versions in smaller sizes or alternative formats for a variety of uses. There are compelling preservation, access, and economic reasons for creating an archival-quality digital master image: it provides an information-rich, unedited, research quality surrogate, and ensures rescanning will not be necessary in the future. A high-quality master image will make the investment in the image capture process worthwhile. Since user expectations and technology change over time, a digital master must be available and rich enough to accommodate future needs and applications. The master image should be the highest quality you can afford; it should not be edited or processed for any specific output; and it should be uncompressed. Intensive quality control should be applied in creating master image files.
Derivative Image Files
Derivative files are created from the master digital image, and are used in place of it, usually for general Web access. Derivative files typically include an access image, which is sized to fit within the screen of an average monitor; and a thumbnail image, which is usually quite small and hotlinked to the larger access image. Derivative files are usually stored online. Master image files are very large and expensive to store online. Consider whether you have the server space to store these files. An alternative is to store these files on CD or DVD.
We recommend that three versions of an image be created: a master image, an access image, and a thumbnail image. A higher resolution access image may be created depending on the need to detect detail in the image.
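When deriving access and thumbnail images from a master, the pixel dimensions are usually scaled down to fit a target box while keeping the original proportions. The following Python sketch shows the arithmetic only. The 800 x 600 and 150 x 150 boxes are hypothetical values chosen for illustration, not recommendations from this document.

```python
def fit_within(width, height, max_width, max_height):
    """Scale (width, height) down to fit a bounding box, keeping aspect ratio.

    Images that already fit are returned unchanged (never scaled up).
    """
    scale = min(max_width / width, max_height / height, 1.0)
    return max(1, round(width * scale)), max(1, round(height * scale))


if __name__ == "__main__":
    master = (2400, 3000)  # hypothetical master image, in pixels
    print("access:", fit_within(*master, 800, 600))
    print("thumbnail:", fit_within(*master, 150, 150))
```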
There are three modes of scanning:
Bitonal – One bit per pixel representing black and white. Bitonal scanning is best suited to high-contrast documents such as printed text.
Grayscale – Multiple bits per pixel representing shades of gray. Grayscale is suited to continuous tone documents, such as black and white photographs.
Color – Multiple bits per pixel representing color. Color scanning is suited to documents with color information.
These three modes of scanning also require some subjective decisions.
For example, a black and white typed document may have annotations in
red ink. Although bitonal scanning is often used for typed documents,
scanning in color may be preferable in this case, depending on how the
image will be used. Manuscripts, older printed matter, and sheet music
may be better served by scanning as continuous tone in grayscale or color
to bring out the shade and condition of the paper and the marks inscribed on it.
Recommendations have been developed for the following formats:
We have included recommendations for tonal depth, file format, compression,
and spatial resolution. Tonal depth will be determined by the nature of
the material you are scanning as well as the functionality of the scanner/camera
you are using. File formats suggested are non-proprietary and meet the
industry standard; however, some alternative file formats are briefly
addressed. Compression, a process that reduces the size of image files prior to storage
and transmission in order to save space and time, may be either lossy or
lossless, depending on the file format. Lossless compression results in
a file from which the original image can be recovered exactly, with no loss of information. In
lossy compression, a certain amount of information is discarded during
the compression process. Although the discarded information may be invisible
to the human eye, a loss of quality occurs. Compression levels may vary
from project to project. In general, we suggest that master image files
remain uncompressed. Finally, we have included recommendations for spatial
resolution. These recommendations are the minimum (or, in some cases,
middle of the road) for resolution, and are expressed as either dpi (dots
per inch) or as a pixel dimension. For all of the following formats, we
cannot justify scanning at a resolution lower than 300 dpi, as storage
has become so inexpensive.
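The difference between lossless and lossy compression can be demonstrated with a short Python sketch. It uses the standard zlib module as an example of a lossless method, and a crude rounding of pixel values as a toy stand-in for a lossy one. The rounding step is an assumption made for illustration only; it is not how JPEG or any particular image format works.

```python
import zlib


def lossless_round_trip(data):
    """Compress and then decompress with zlib, a lossless method."""
    return zlib.decompress(zlib.compress(data))


def lossy_quantize(data, step=16):
    """Discard fine tonal detail by rounding each byte down to a multiple of step."""
    return bytes((value // step) * step for value in data)


if __name__ == "__main__":
    pixels = bytes(range(256))  # stand-in for a row of 8-bit grayscale values
    print("lossless copy identical:", lossless_round_trip(pixels) == pixels)
    print("lossy copy identical:", lossy_quantize(pixels) == pixels)
```

After lossless compression the original values come back exactly; after the lossy step, the discarded detail cannot be recovered from the file.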
Alternative format: PDF (Portable Document Format) from Adobe is an alternative file format for creating and displaying text-based files on the web. See http://www.adobe.com for further information. Adobe Acrobat software is required to create and manipulate PDF files. The Adobe Acrobat viewer is free to download so users can view documents on their computers.
Other considerations: Consider providing a transcription of textual materials, and, especially, of handwritten manuscripts that may be difficult to read. Transcriptions can be of tremendous help to researchers looking at a text. Transcribed text, especially when it is encoded with markup language, can greatly facilitate the researcher's ability to navigate and search long documents. There are several ways of presenting digital images of text. You may want to consider providing the text as an image file linked to a transcribed file, especially since the accuracy of OCR is not high. However, rekeying is labor-intensive. One source of OCR software is the Caere Corporation at http://www.caere.com/. If you choose to keep the text as page images only, you could also create a table of contents in HTML and have it link to the individual page images, for easier navigation. Another option is to encode the text using a markup language, such as SGML, to make the text document searchable. Participants in the American Memory Project at the Library of Congress, for example, use SGML in a DTD (Document Type Definition) based on the TEI (Text Encoding Initiative) Guidelines. Since SGML viewers are not yet freely available for viewing SGML over the Internet, an HTML version can be derived from the SGML version for widespread viewing online.
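The HTML table of contents mentioned above can be generated rather than typed by hand. The following Python sketch builds a simple page that links each entry to its page image. The function name, the title, and the file names are hypothetical examples.

```python
from html import escape


def build_toc(title, pages):
    """Return an HTML page linking each label to its page image.

    pages is a list of (label, image_filename) pairs.
    """
    items = "\n".join(
        f'    <li><a href="{escape(filename, quote=True)}">{escape(label)}</a></li>'
        for label, filename in pages
    )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"  <h1>{escape(title)}</h1>\n"
        "  <ol>\n"
        f"{items}\n"
        "  </ol>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical page images for a scanned diary.
    print(build_toc("Diary, 1885", [("Page 1", "p001.jpg"), ("Page 2", "p002.jpg")]))
```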
Alternative formats: Many imaging projects are using the proprietary Kodak PhotoCD format for storing their photographic images. More information on Kodak PhotoCD can be found at Kodak's site http://www.kodak.com/US/en/digital/products/photoCD.shtml and in an article in RLG Diginews, "Using Kodak PhotoCD for Preservation and Access," at http://www.rlg.org/preserv/diginews/diginews23.html#feature. Consider how to reprocess the image files into GIF/JPEG for direct Web access to the images.
Other emerging file formats include the Flashpix format, http://www.I3A.org/, sponsored by the International Imaging Industry Association. Flashpix is a technology that provides a multi-resolution, tiled file format that allows images to be stored at different resolutions for different purposes, such as editing or printing, in one file. To use Flashpix, you need the Openpix technology.
PNG (Portable Network Graphics), http://www.w3.org/Graphics/PNG/, is an image format designed to replace the GIF format. It offers a smaller file size than GIF but does not lose any information to compression. It is not yet widely supported.
Other considerations: Photographs can present many scanning challenges. We recommend scanning from the negative (or the earliest generation of the photograph) to yield a higher-quality image. Another consideration is whether to scan sepia-tone photographs as color or black and white images. We recommend scanning them as color images to create a better image, although this will greatly increase the file size.
Another consideration with photographs is whether to scan the backs of photographs as separate image files if there is significant information on the back of the photo (which may be of interest to users) that may not be included elsewhere. If a scanned image of the verso of the photograph is available, the digital image may serve as a more successful surrogate for the original.
Alternative formats: The MrSID (Multiresolution Seamless Image Database) format by LizardTech, Inc. http://www.lizardtech.com/products/mrsid/ allows for the compression, storage, and retrieval of large digital images. Files are stored in proprietary .sid format. The files are compressed with a "wavelet" compression algorithm that also provides a "zoom in" capability in the browser software, and provides little loss in image quality. LizardTech provides viewers to those who wish to download and manipulate .sid images, but the technology can be used to deliver a portion of the image requested as a standard JPEG, with no viewers required.
Standards for artwork are not well defined. Usually artwork imaging projects involve scanning from photographic surrogates such as 35mm slides, in which case recommendations for transparent photographs should be followed. For large format artwork, outsourcing to a vendor with an overhead digital camera or large flatbed scanner suitable for scanning large documents is recommended.
If you do choose to distribute master images over the web for access by users, you may want to consider digital watermarking or some kind of copyright/ownership mark, possibly embedded in the image itself, as master image files are of a quality that can be used for commercial reproduction. The access and thumbnail files are for web display only, and are not of a quality suitable for reproduction.
Some links to digital watermarking information:
A quality control program should be conducted throughout all phases of the digital conversion process. Inspection of final digital image files should be incorporated into your project workflow. Typically, master image files are inspected via CD batch or online for a variety of defects. Depending on your project, you may want to inspect 100% of the master images, or 10% of the files randomly, for example. We do recommend that quality control procedures are implemented and documented and that you have clearly defined the specific defects that you find unacceptable in an image. Images should be inspected while viewing at a 1:1 pixel ratio or at 100% magnification or higher.
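If you inspect only a portion of the master images, the selection should be genuinely random rather than "the first few in each batch." The following Python sketch shows one way to draw such a sample. The function name, the file names, and the choice to round the sample size up are assumptions made for illustration, not part of these guidelines.

```python
import math
import random


def select_for_inspection(file_names, fraction=0.10, seed=None):
    """Pick a random subset of master files to inspect.

    fraction=1.0 selects every file; at least one file is chosen
    whenever the batch is not empty.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be greater than 0 and at most 1")
    files = sorted(file_names)
    if not files:
        return []
    # Round up so that small batches are never left uninspected.
    count = max(1, math.ceil(len(files) * fraction))
    rng = random.Random(seed)
    return sorted(rng.sample(files, count))


if __name__ == "__main__":
    # Hypothetical batch of 50 master files.
    batch = [f"img{n:04d}.tif" for n in range(1, 51)]
    for name in select_for_inspection(batch, fraction=0.10):
        print(name)
```

Recording the seed along with the list of inspected files makes it possible to document exactly which images were checked for a given batch.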
Quality is evaluated both subjectively by project staff (scanner operator, image editors, etc.) through visual inspection and objectively in the imaging software (such as using targets). The viewing environment for visual inspection of images is also important: monitors should be calibrated, and the room should be dark or at least free from bright lighting, sunlight, or glare.
Things to look for during visual inspection may include:
CDs should be inspected to make sure image files open and display properly and that the correct batch has been recorded on the CD.
It is important to label each CD for ease of retrieval. If you use a felt-tip pen to label the CD, make sure it is water-based and does not contain alcohol, which can damage the protective layer of the disc. It is best to write information on the innermost, clear ring. Special adhesive labels are also available for labeling CDs, but the adhesive may have adverse effects on the CD over time. It is best to label only the jewel case or create an insert for it; however, it is easy for the CD to become separated from its case.
You may want to include on the CD or jewel case information such as: Name of your institution, name of collection, name of project or grant, a unique number for the disc, the beginning and ending file name on the disc, the file formats on the disc, the date the disc was created, the speed, brand, and model of the CD recorder, and relevant scanning information, such as the software used to scan the images, the brand and model of scanner used, and the resolution used to scan the images.
Keep an inventory of CDs and the files each one contains!
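An inventory can be as simple as a spreadsheet or a comma-separated file with one row per image file. The Python sketch below illustrates the idea. The column names, disc numbers, and file names are hypothetical; choose the fields that match the label information your project records.

```python
import csv
import io

# Hypothetical columns; adapt them to the information on your disc labels.
FIELDS = ["disc_number", "collection", "file_name", "file_format", "date_created"]


def write_inventory(rows, stream):
    """Write one CSV row per image file, keyed by the disc that holds it."""
    writer = csv.DictWriter(stream, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)


def files_on_disc(rows, disc_number):
    """Return the file names recorded for one disc."""
    return [row["file_name"] for row in rows if row["disc_number"] == disc_number]


if __name__ == "__main__":
    inventory = [
        {"disc_number": "CD-0001", "collection": "Sample Collection",
         "file_name": "whp00001.tif", "file_format": "TIFF",
         "date_created": "2001-05-14"},
        {"disc_number": "CD-0002", "collection": "Sample Collection",
         "file_name": "whp00002.tif", "file_format": "TIFF",
         "date_created": "2001-05-15"},
    ]
    buffer = io.StringIO()
    write_inventory(inventory, buffer)
    print(buffer.getvalue())
    print(files_on_disc(inventory, "CD-0001"))
```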
You will need to decide on a file naming convention before starting the project. Each file name must be unique so that it identifies exactly one image. The file name may include the name of the collection or institution as well as the image number, plus the appropriate extension (.gif, .jpg, .tif). File names should be no longer than 8 characters and should not include spaces or symbols such as ?, /, or #.
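File naming rules like these are easy to check automatically before a batch is written to disc. The Python sketch below is one possible check, under two assumptions that go beyond the text: the 8-character limit is applied to the base name only (not counting the extension), and the base name is restricted to letters and digits, which is stricter than banning spaces and a few symbols. The function names and sample file names are hypothetical.

```python
import re

# Assumption: 1 to 8 letters or digits, then one of the three extensions.
_PATTERN = re.compile(r"[A-Za-z0-9]{1,8}\.(gif|jpg|tif)")


def is_valid_file_name(name):
    """Return True if the name fits the assumed naming convention."""
    return _PATTERN.fullmatch(name) is not None


def find_duplicates(names):
    """Return names that occur more than once, ignoring letter case."""
    seen = set()
    duplicates = set()
    for name in names:
        key = name.lower()
        if key in seen:
            duplicates.add(key)
        seen.add(key)
    return sorted(duplicates)


if __name__ == "__main__":
    candidates = ["whp00012.tif", "photo 12.tif", "map#1.jpg", "whp000123.tif"]
    for name in candidates:
        status = "ok" if is_valid_file_name(name) else "rename"
        print(f"{name}: {status}")
```

Duplicates are compared without regard to case because some file systems treat "A1.TIF" and "a1.tif" as the same file.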
Most scanner and image editing software provides a function for calibrating the scanner and/or monitor (including monitor brightness, contrast, and control of gamma settings). Scanner software is often used to match the tonal scale of what is being scanned, which may include black and white or color calibration. In general, scanner calibration should occur every time you scan a new media format or a new media size. Computer monitors can misrepresent the scanned image if not properly calibrated. Image characteristics such as moire, wavy lines, dark or light spots, and inaccurate resolution may be introduced if the monitor is not calibrated.
Suggested Monitor Settings:
Monitors should be calibrated regularly. There is specific software you can purchase to calibrate your monitor.
The Western History Photodigitization Project at the Denver Public Library has a web page that describes how to calibrate a monitor for image viewing (on the user end). This will help users adjust the brightness and contrast of their computer monitor so that digital images will look their best. If computer monitors are adjusted to a target, the digital images (if scanned properly) should provide a reasonably accurate depiction of the originals when viewed on the "average" computer monitor.
Targets are used to verify tone and color reproduction and to measure system resolution. They test the scanning system and its ability to reproduce correct tonal values, not the materials that you are scanning. Tone reproduction refers to the degree to which a digital image conveys the luminance ranges of the original. The ideal in tone reproduction is to match the brightnesses in the original with the brightnesses in the digital reproduction. This is not often achieved, since the digital image is different from the original, and viewing conditions are also different. What can be achieved, however, is an acceptable subjective tone reproduction that can give an approximation of the luminance ranges in the original.
Targets provide a means of controlling tone reproduction. They are a way of predicting image quality, and help ensure that the scanning system you are using is producing the best quality image it can and is operating at a consistent level of quality over time. Different targets exist for prints and transparencies. Targets must consist of the same material as the media being scanned (paper, film, etc.), and quality assessments should be performed on targets each time the scanner is calibrated. Targets usually contain patches of color, black and white, or shades of gray for verifying tone reproduction. Some examples of targets include the Kodak Color Separation Guide, Grayscale Control Bar, AIIM Scanner Test Chart, IEEE Standard Facsimile Test Chart, and the RIT Alphanumeric Resolution Test Object target. To ensure color fidelity from scanner to monitor, the use of color targets and proper calibration of the monitor is recommended. Some color targets are the Macbeth Color Checker Rendition Chart and the PostScript IT8 Color Output Target.
Some digitization projects are also scanning a color bar along with the original, to be included in the final digital image, to aid users in verifying accuracy in color reproduction.
Please see NARA's Guidelines for Digitizing Archival Materials
and Kenney and Chapman's Digital Imaging for Libraries and Archives
for a discussion of calibration and targets in more detail.
Proper storage will help ensure access to and long-term maintenance of image collections. Storage media consist of the materials on which the digital images are written as well as the devices that record, read, and process the information. Choices for the storage of your images will depend on the technical infrastructure you have in place; however, careful consideration of storage choices will help make the investment in image capture and equipment worth the cost, time, and labor.
The CDP recommends that you consider multiple storage media for your digital collections, including adequate backup storage (which may also include offsite storage in case of disaster). Other considerations for storage media and systems include: capacity of the medium (how much it can store); speed (how quickly images can be written, read, retrieved); reliability (stability and longevity of the media); security (risks of the medium, safeguards built into the medium to protect data); scalability (planned growth rates); and costs (purchase costs, housing costs, training, maintenance, costs of access, cost of migration, etc.).
There are several types of storage media available for online, offline, near-line, and archiving purposes:
Online storage: Refers to media that is access-ready. Retrieval is fast, often in seconds. Reliable medium for accessing information. Multiple users can access information simultaneously.
Near-line storage: Refers to data that is accessed from a drive. Retrieval is fast, often in seconds; can be faster or slower than online storage. Retrieval can be slow if multiple users have requested the disc.
Offline storage: Data stored on the shelf, must be retrieved by a person. Retrieval time can take minutes to hours. Low cost to store, more security and reliability, but limited access. Data not easily browseable.
It is difficult to predict just how much a digital imaging project is actually going to cost, and little hard data on the cost, cost effectiveness, and costs over time of digital projects is readily available. Capture and conversion of data often account for only 1/3 of the total costs, while cataloging, description, and indexing account for 2/3. Upfront and ongoing costs can be significant, and economic advantage--and reality--may be better realized through collaborative initiatives or cooperative/regional digitization initiatives, where costs, resources, goals, and expertise can be shared. Initial investment in equipment, staff training, capture and conversion, handling, storing, and housing originals, producing derivative files, CD production, cataloging and building the image database system, and developing Web interfaces are all possible areas of cost for any digitization project. However, the costs of a project do not end after conversion. Ongoing costs that an institution must commit to include the costs of maintaining data and systems over time, including media migration costs and infrastructure costs.
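The 1/3 to 2/3 split can be used for a rough first estimate once you have a quote for capture and conversion. The Python sketch below shows the arithmetic. The function name and the dollar figure are hypothetical, the split is only a rule of thumb, and ongoing costs such as migration and infrastructure are not included.

```python
from fractions import Fraction


def estimate_total_cost(capture_cost, capture_share=Fraction(1, 3)):
    """Scale a capture-and-conversion cost up to a rough project total.

    capture_share is the portion of the total assumed to go to capture
    and conversion; the remainder is cataloging, description, and indexing.
    """
    if not 0 < capture_share <= 1:
        raise ValueError("capture_share must be greater than 0 and at most 1")
    total = capture_cost / capture_share
    return {
        "capture": capture_cost,
        "cataloging": total - capture_cost,
        "total": total,
    }


if __name__ == "__main__":
    # Hypothetical vendor quote for capture and conversion.
    estimate = estimate_total_cost(10000)
    for label, amount in estimate.items():
        print(f"{label}: ${float(amount):,.2f}")
```

With a capture quote of $10,000, the sketch puts cataloging, description, and indexing at $20,000 and the project total at $30,000.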
Colorado Digitization Program • 2150 E. Evans Ave. • Denver, CO 80208
© 1999 - 2003 CDP Copyright Information