Glossary of terms related to the Digital Practice Guidelines.
Archival Master Image
File that represents the best copy produced by a digitising organisation, with best defined as meeting the objectives of a particular project or program. Archival master files represent digital content that the organisation intends to maintain for the long term without loss of quality, fidelity and functionality. Archival master files are the starting point when organisations produce the production master files and/or derivative files that will in turn support a wide range of objectives. The file format most often associated with archival images is TIFF, audio is WAV and video is AVI.
A digital artefact is any undesired alteration in data introduced in a digital process by an involved technique and/or technology. For digital images, an artefact is a digital effect introduced into an image during digitisation that does not correspond to the original image being scanned. Artefacts might include pixellation, dotted or straight lines, regularly repeated patterns, moiré, etc. Heavy JPEG compression creates strong artefacts which noticeably degrade image quality. With sound recordings a compression artefact is a particular class of data error that is usually the consequence of lossy data compression (e.g. converting to MP3), causing a noticeable distortion of media.
Audio Interchange File Format (AIFF) is an audio file format most commonly used on Apple Macintosh computer systems.
Backup, or the process of backing up, refers to making copies of data so that these additional copies may be used to restore the original after a data loss event. Regular backup processes are essential tasks of professional data management and a key component of digital preservation. External hard drives and/or tape systems are the most common backup media.
A method of capturing and processing digital images in large numbers so that minimal staff time is required. This becomes important when digitising large collections of similar items or performing the same task on multiple image files.
The bit depth of an image refers to the number of bits used to describe the colour of each pixel. Greater bit depth allows more colours to be used in the colour palette for the image. 1-bit colour is the lowest number of colours per pixel in which a graphics file can be stored. In 1-bit colour, each pixel is either black or white. 8-bit colour/greyscale has eight bits assigned to each pixel, providing 256 colours or shades of grey. 24-bit colour has 8 bits assigned to each of the red, green, and blue components of every pixel, representing 16.7 million colours. In digital audio, bit depth describes the number of bits of information recorded for each sample. Bit depth directly corresponds to the resolution of each sample in a set of digital audio data.
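The figures quoted above follow from the fact that n bits can encode 2^n distinct values. A minimal sketch (the function name is illustrative, not part of the Guidelines):

```python
def tonal_values(bit_depth: int) -> int:
    """Number of distinct values that bit_depth bits can encode."""
    return 2 ** bit_depth


# 1-bit, 8-bit and 24-bit images, plus 16-bit audio samples
for bits in (1, 8, 16, 24):
    print(f"{bits}-bit: {tonal_values(bits):,} values")
```

The 24-bit result, 16,777,216, is the "16.7 million colours" mentioned in the definition.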
File with a solely digital origin, for which there is no physical precursor, e.g. files created on a digital camera/video/audio recorder, a word-processed or spreadsheet document, or computer-generated graphic artwork. See also Turned digital.
Calibration is the adjusting of a device, such as a camera, scanner or monitor, to make it meet known standards. Colour temperature, brightness and gamma will be set to particular specifications. Colorimeters and densitometers are used to measure the behaviour of the device; the results are compared to master figures; then the device is adjusted to behave in line with the master standards. The file which accomplishes the adjustments is known as a profile.
These include flatbed scanners, drum scanners, film scanners, and digital cameras. They use electronic sensors (e.g. CCDs) to capture images rather than photographic film. Digital audio and video recorders are also capture devices.
CCD (Charge Coupled Device)
CCDs are sensors used in digital cameras and video cameras to record still and moving images. The CCD captures light and converts it to digital data that is recorded by the camera. The quality of an image captured by a CCD is high and its size depends on the resolution of the sensor. In digital cameras, the resolution is measured in Megapixels (millions of pixels).
Complementary Metal Oxide Semiconductor (CMOS) is a commonly used alternative technology to the CCD as a chip for collecting digital data. CMOS sensors can be cheaper and simpler to make and use less power.
CMYK (Cyan, Magenta, Yellow, Black)
One of several colour encoding systems used by printers for combining (secondary) colours to produce a full‐colour image. In CMYK, colours are expressed by the “subtractive primaries” (cyan, magenta, yellow) and black. Modern inkjet printers can have more than one shade of each colour in the CMY set.
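As a rough illustration of how the subtractive primaries relate to RGB values, the sketch below uses the widely quoted naive conversion formula. This is an assumption made for illustration only: it ignores colour profiles, so it is not how a colour-managed print workflow would convert an image.

```python
def rgb_to_cmyk(red: int, green: int, blue: int) -> tuple[float, float, float, float]:
    """Naive, profile-free conversion of 8-bit RGB to CMYK fractions (0.0 to 1.0)."""
    r, g, b = red / 255, green / 255, blue / 255
    k = 1 - max(r, g, b)          # black replaces the darkness shared by all channels
    if k == 1:                    # pure black: avoid dividing by zero
        return (0.0, 0.0, 0.0, 1.0)
    c = (1 - r - k) / (1 - k)
    m = (1 - g - k) / (1 - k)
    y = (1 - b - k) / (1 - k)
    return (c, m, y, k)


print(rgb_to_cmyk(255, 0, 0))     # pure red needs magenta and yellow ink, no cyan
```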
The process of altering colours as they appear in a digital image or in print to ensure they accurately represent the work depicted.
The reduction of image file size for processing, storage, and transmission. The quality of the image may be affected by the compression techniques used and the level of compression applied. There are two types of compression:
- Lossless compression is a process that reduces the storage space needed for an image file without loss of data. If an image has undergone lossless compression, it will be identical to the image before it was compressed. LZW compression for TIFF files is an example.
- Lossy compression discards information but can dramatically reduce file size. The amount of compression can be controlled to suit the purpose of the file. In selecting a compression technique, it is necessary to consider the attributes of the original. Some compression techniques are designed to compress text; others are designed to compress pictures. JPEG files are created with variable, lossy compression. JPEG2000 can use lossless compression but is not currently widely supported.
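The difference between the two types can be shown with a small sketch. It uses Python's standard zlib module (DEFLATE, standing in for LZW, which the standard library does not provide) for the lossless case, and a crude rounding step as a stand-in for lossy compression. The sample data and the quantise helper are illustrative assumptions, not a real image codec.

```python
import zlib

# A made-up run of pixel values with plenty of repetition
original = bytes([10, 10, 10, 10, 200, 200, 200, 200] * 64)

# Lossless: the restored data is byte-for-byte identical to the original
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)
print(len(original), len(compressed), restored == original)


def quantise(data: bytes, step: int = 16) -> bytes:
    """Lossy stand-in: round each value down to a multiple of step."""
    return bytes((value // step) * step for value in data)


# Lossy: nearby values collapse together and the originals cannot be recovered
lossy = quantise(bytes([13, 14, 201, 203]))
print(list(lossy))
```

After quantising, 13 and 14 are indistinguishable, which is the kind of discarded detail that makes lossy formats unsuitable for archival masters.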
Smoothly varying gradation of tones found in colour and B&W photographs, and (some) original artwork. Printing presses and digital inkjet printers simulate continuous tone by producing precise patterns of extremely fine dots.
Act of matching colour and tones between monitors, scanners and printers. Profiling software will refer the colour behaviour of each device to a known standard. The colour and tone information sent to each device will be adjusted to keep colour and tones consistent across all the devices. The original, the scan, the monitor display, the inkjet print and the offset print will then all match.
A geometric representation (generally in three dimensions) of the colours that can be visually perceived or generated using a particular colour model; the parts of the visible light spectrum used to describe an image. Colour spaces vary in their scope according to the range of colours involved. Examples of colour spaces are Adobe RGB (1998) and sRGB.
'White' light varies in its colour; sometimes it is reddish (tungsten light or sunsets) with a low colour temperature; sometimes it is bluish (cloudy days or in snowy mountains) with a high colour temperature. Colour temperature is measured on the Kelvin scale. D50 is the designated CIE standard illuminant representing a daylight colour temperature of 5,000 K, corresponding to warm daylight near sunrise or sunset. D50 is the standard used in the graphics industry for viewing conditions. D65 (6,500 K) is the standard for monitor calibration.
Image that has been created from another image for specific purposes e.g. thumbnails for the web display or production files used to provide prints. Techniques to create derivative images include sampling to a lower resolution, using lossy compression techniques, resizing or cropping or saving to a different file format.
Type of camera where the image is recorded by a sensor, called a “charge-coupled device” or CCD. Instead of saving the picture on analogue film like traditional cameras, digital cameras save photos in digital memory, most commonly an SD or Compact Flash card. Digital cameras range greatly in size and quality, from cameras in mobile phones to 50MP professional camera systems.
Unique identifier of a digital file. This can be an alphanumeric with meaningful (descriptive) elements or system generated random or sequential numbers.
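Both styles of identifier can be sketched in a few lines. The naming pattern below (a collection code, an underscore and a zero-padded sequence number) is a hypothetical scheme chosen for illustration; the Guidelines do not prescribe it.

```python
import uuid


def make_identifier(collection: str, sequence: int, width: int = 6) -> str:
    """Meaningful prefix plus a zero-padded sequential number (hypothetical scheme)."""
    return f"{collection}_{sequence:0{width}d}"


def random_identifier() -> str:
    """System-generated random identifier with no descriptive elements."""
    return uuid.uuid4().hex


print(make_identifier("MAPS", 42))
print(random_identifier())
```

Zero-padding keeps sequential identifiers in the correct order when files are sorted by name.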
An electronic photograph scanned from an original document, made up of a set of picture elements ("pixels"). Each pixel is assigned a tonal value (black, white, a shade of grey, or colour) and is represented digitally in binary code (zeros and ones). The term "image" does not imply solely visual materials as source material; rather, a digital image is simply a representation of whatever is being scanned, whether it is manuscripts, text, photographs, maps, drawings, blueprints, halftones, musical scores, 3‐D objects, etc. Digital Images can be Born Digital or Turned Digital (reformatted).
Digital preservation is the set of processes and activities that ensure continued access to, and functionality of, information and all kinds of records, scientific and cultural heritage existing in digital formats.
The representation of an object, image, sound or document by a discrete set of its points or samples, most commonly the conversion from printed paper, film, audio tape or other physical media formats to a digitally encoded format where an object is represented as colour or greyscale pixels, or 1s and 0s.
To transmit a file from one computer to another. Usually implies retrieving a file from a remote computer or server (on the internet or a network) to a local one. FTP (File Transfer Protocol) is commonly used for larger files.
DPI stands for dots per inch, and was originally used specifically as a term in printing, providing a measure of how many dots of ink are placed on a print over a distance of one inch. The terms DPI and PPI (pixels per inch) are used somewhat interchangeably today, though PPI is the preferred term.
The Dublin Core set of metadata elements provides a small and fundamental group of text elements through which most resources can be described and catalogued. Using only 15 base text fields, a Dublin Core metadata record can describe physical resources such as books, digital materials such as video, sound, image, or text files, and composite media like web pages.
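A Dublin Core record can be pictured as a simple mapping from element names to values. The sketch below lists the 15 base elements and checks a record against them; the sample values and the helper function are invented for illustration.

```python
# The 15 elements of the Dublin Core Metadata Element Set
DUBLIN_CORE_ELEMENTS = (
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def unknown_elements(record: dict) -> list:
    """Return any keys in the record that are not Dublin Core elements."""
    return sorted(key for key in record if key not in DUBLIN_CORE_ELEMENTS)


# Hypothetical record for a digitised photograph; every element is optional
record = {
    "title": "Harbour view",
    "creator": "Unknown photographer",
    "format": "image/tiff",
    "identifier": "PHOTO_000042",
}

print(unknown_elements(record))                     # all keys are valid
print(unknown_elements({"colour": "sepia"}))        # "colour" is not an element
```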
The ratio between the largest and smallest possible values of a changeable quantity, such as in sound (volume) and light (brightness). When digitising images it is essential to accurately capture the complete brightness range from the shadows to the highlights; with audio it is important to capture the loudest and the softest parts of the sound. For image sensors, the highest density is the D-max; the lowest density is the D-min.
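Because dynamic range is a ratio, it is usually quoted on a logarithmic scale. The sketch below assumes the common convention of 20 × log10 for amplitude ratios in audio, and uses the fact that optical density is itself a base-10 logarithm, so a density range converts directly to a brightness ratio. The function names and sample figures are illustrative.

```python
import math


def dynamic_range_db(largest: float, smallest: float) -> float:
    """Ratio of two amplitudes expressed in decibels (20 * log10 convention)."""
    return 20 * math.log10(largest / smallest)


def density_range_to_ratio(d_max: float, d_min: float) -> float:
    """Brightness ratio implied by an optical density range (density is log10 based)."""
    return 10 ** (d_max - d_min)


# Theoretical range of 16-bit audio: full scale compared with one quantisation step
print(round(dynamic_range_db(2 ** 16, 1), 2))

# A hypothetical original with D-max 3.2 and D-min 0.2 spans roughly 1000:1
print(round(density_range_to_ratio(3.2, 0.2)))
```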
The range of colours a device such as a monitor or printer can produce or display. Colour spaces (Adobe RGB 1998, sRGB) also have their own gamut or range of colours that they can render accurately. Any colours that a device cannot produce are known as “out-of-gamut”. A printed output will have a smaller gamut or range of colours than can be seen on a monitor.
An image rendered with only black, white, and a range of shades of grey, with no chromatic (colour) data. Most commonly, digital greyscale images contain 8 bits per pixel, allowing for 256 shades or levels of intensity. The larger the number of shades of grey, the better the image will look (and the larger the file will be).
Making changes (i.e. tonal adjustments, cropping, colour corrections etc.) to an image using image processing software; altering the image from its original state when captured.
A method of creating new digital information by estimating in-between values from within a set of known data points. Often used to create larger file sizes than were captured originally. Master images should be captured and saved at the appropriate resolution for that project/object with no interpolation, resizing or manipulation.
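To make the idea concrete, the sketch below enlarges a single row of pixel values by linear interpolation, one of the simplest methods. Imaging software generally uses more sophisticated methods, so treat this only as an illustration; the function name is an assumption.

```python
def upsample_row(row: list, factor: int) -> list:
    """Insert linearly interpolated values between neighbouring samples."""
    if not row:
        return []
    result = []
    for left, right in zip(row, row[1:]):
        for step in range(factor):
            # step 0 reproduces the captured value; later steps are estimates
            result.append(left + (right - left) * step / factor)
    result.append(float(row[-1]))
    return result


print(upsample_row([0, 100, 50], 2))
```

Only the first, third and fifth values in the output were actually captured; the rest are estimates, which is why interpolation adds file size without adding real detail.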
Defined by the Joint Photographic Experts Group in 1992, JPEG is a commonly used method of lossy compression for digital photography. The degree of compression can be adjusted, allowing a selectable trade-off between storage size and image quality. JPEG typically achieves 10:1 compression with little perceptible loss in image quality. JPEG compression is used in a number of image file formats. JPEG/Exif is the most common image format used by digital cameras and other photographic image capture devices; along with JPEG/JFIF, it is the most common format for storing and transmitting photographic images on the World Wide Web. These format variations are often not distinguished, and are simply called JPEG. JPEG is not recommended as an archival master file format because some data is discarded during the compression. The greater the compression, the smaller the file size and the lower the image quality. Every time a JPEG image is manipulated and resaved there is a further loss in quality.
A modification/improvement to the original JPEG file format. The compression methods within the JPEG2000 file format (wavelet based) can produce better results than the standard JPEG (discrete cosine transformation based). It has a lossless option and the artefacts found with high compression levels are less pronounced. A significant feature of JPEG2000 is that images can be decoded into different resolutions, allowing, for example, a viewer to quickly render a low-resolution image from a gigantic file that would otherwise take considerably longer. JPEG2000 files will not be quite as small as a .jpg and the format has not yet had widespread acceptance.
MARC is an acronym for Machine-Readable Catalogue or Cataloguing. It is a system by which data elements within bibliographic records are uniquely labelled for computer handling. MARC is widely used by libraries and other information agencies to exchange bibliographic and related information between systems.
Equivalent to 1,000,000 pixels (= 1MP). A measure of the resolution of sensors in digital cameras. 10MP is common for consumer cameras; 50 and 60 MP sensors are becoming common in professional camera systems.
Structured information that describes, explains, locates, and otherwise makes it easier to retrieve and use an information resource. It is loosely defined as data about data. There are a number of different schemas for structuring Metadata. Metadata is used to describe three aspects of digital documents and data:
- Descriptive metadata is the information used to search and locate an object such as title, author, subjects, keywords, and publisher
- Structural metadata gives a description of how the components of the object are organised and relate to each other
- Administrative metadata refers to the technical information including file type. Sub-types of Administrative metadata are Rights Management metadata, Technical metadata and Preservation metadata
Optical Character Recognition is the mechanical or electronic translation of scanned images of handwritten, typewritten or printed text into machine-encoded text. It is widely used to convert books and documents into electronic files, to computerize a record-keeping system in an office, or to publish the text on a website. OCR text is searchable and can be edited and repurposed into other documents.
Portable Document Format. Documents with images, text and formatting can be saved in PDF format, created by Adobe Systems. It is a file format which is cross-platform and can be opened on any computer and will look like the original. Documents with images and text to be sent to commercial printing houses are usually best saved in this format. PDF/A format is designed for long term archiving. It has fonts embedded, colour spaces are defined and use of standards-based metadata is mandated.
Short for picture elements, the points of information, which make up a digital image or represent it on a screen. Each pixel can represent a number of different shades of grey or gradations (levels) of colour, depending on how many bits are used (see Bit Depth).
Pixels per inch (PPI)
A measurement of the scanning resolution of an image or the quality of an output device. PPI expresses the number of pixels used to capture the digital image.
The original digital file created by a digital camera or scanner. Each manufacturer uses their own version of RAW files which are proprietary and exist in many variants. RAW files are commonly used as capture and working files but are not recommended to be saved as Archival Master files (they are usually processed and modified to produce a TIFF or JPEG file before saving).
The number of pixels (in both height and width) making up an image. The more pixels, the higher the resolution; the higher the resolution, the greater its clarity and definition and the greater the file size. 300PPI means there will be 300 points of information in a line of 1 inch length. Monitor resolution is the number of dots over the whole screen. 1024 x 768 means 1024 dots on 768 lines.
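The relationship between physical size, PPI, pixel dimensions and file size is simple arithmetic. The sketch below works through a hypothetical 8 × 10 inch original scanned at 300 PPI in 24-bit colour; the function names are illustrative, and the file size is the raw uncompressed image data, ignoring any file header.

```python
def scan_dimensions(width_in: float, height_in: float, ppi: int) -> tuple[int, int]:
    """Pixel dimensions produced by scanning an original at a given PPI."""
    return round(width_in * ppi), round(height_in * ppi)


def uncompressed_bytes(width_px: int, height_px: int, bit_depth: int) -> int:
    """Size of the raw image data in bytes, before any compression."""
    return width_px * height_px * bit_depth // 8


width_px, height_px = scan_dimensions(8, 10, 300)
megapixels = width_px * height_px / 1_000_000
size = uncompressed_bytes(width_px, height_px, 24)

print(width_px, height_px)   # pixel dimensions
print(megapixels)            # total pixels in millions
print(size)                  # bytes of uncompressed 24-bit data
```

Doubling the PPI doubles both pixel dimensions, so the pixel count and file size increase fourfold.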
Red, green and blue; the three primary colours of the spectrum which combine to make white light and in different combinations can create all colours. This is an “additive” method used to simulate natural colour on computer monitors.
The audio sampling rate is the number of samples per second that are used to digitise a particular analogue sound. The sample rate of an audio recording partially determines the overall sound quality: a high sample rate produces better quality. The sample rate is measured in Hertz (Hz - cycles per second) and Kilohertz (kHz - thousand cycles per second). For example, CD quality audio has a sample rate of 44100Hz, 16-bit (resolution).
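Sample rate and bit depth together determine how much data uncompressed audio produces. A minimal sketch, assuming two channels (stereo) for the CD example; the function name is illustrative:

```python
def pcm_bytes(sample_rate_hz: int, bit_depth: int, channels: int, seconds: int) -> int:
    """Size in bytes of uncompressed (PCM) audio of the given duration."""
    return sample_rate_hz * bit_depth * channels * seconds // 8


per_second = pcm_bytes(44100, 16, 2, 1)
per_minute = pcm_bytes(44100, 16, 2, 60)

print(per_second)   # bytes per second of CD quality stereo audio
print(per_minute)   # bytes per minute
```

At CD quality this is roughly 10 MB per minute, which is useful when estimating storage for uncompressed WAV masters.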
A device for capturing a digital image from a physical (analogue) original item. There are many types of scanners, such as flatbed scanners, drum scanners, slide scanners, and microfilm scanners.
The degree to which the page is not vertical after printing or scanning. De‐skewing is a process where the computer detects and corrects the skew in an image file.
Tagged Image File Format, an industry standard image file format for bit mapped (pixel based) digital image files. It can support multiple images per file, a wide range of bit depth and is used mostly for high quality imaging and archival storage. Delivery over web is hampered by file sizes, although LZW compression can reduce these file sizes by up to 33%.
Small, low‐resolution images (derived from master files), often hyper linked to a larger version of the same image. Commonly used on web pages as previews and in search result lists.
Digital file created by reformatting a physical original, e.g. by capturing/scanning a paper document, photograph, audio tape or film original. See also Born digital.
Waveform Audio File Format (WAVE, or more commonly known as WAV due to its filename extension) is an audio file format. It is the main format used on Windows systems for raw and typically uncompressed audio.
Working Space (colour)
The current colour space in which the image is being edited on a computer. The image might have been converted from its original colour space to its working space, and the final output space might be different again. Adobe RGB 1998 is a common professional working space (and archival colour space) while sRGB is the most common for images intended for web delivery.
Physical space in which creation and processing of digital content occurs. Ambient lighting needs to comply (as far as possible) with ISO guidelines for luminance (brightness) and colour temperature.