Grayscale


In digital photography, computer-generated imagery, and colourimetry, a greyscale image is one in which the value of each pixel is a single sample representing only an amount of light; that is, it carries only intensity information. Greyscale images, a kind of black-and-white or grey monochrome, are composed exclusively of shades of grey. The contrast ranges from black at the weakest intensity to white at the strongest.[1]

Greyscale images are distinct from one-bit bi-tonal black-and-white images, which, in the context of computer imaging, are images with only two colours: black and white (also called bilevel or binary images). Greyscale images have many shades of grey in between.

Greyscale images can be the result of measuring the intensity of light at each pixel according to a particular weighted combination of frequencies (or wavelengths), and in such cases they are monochromatic proper when only a single frequency (in practice, a narrow band of frequencies) is captured. The frequencies can in principle be from anywhere in the electromagnetic spectrum (e.g. infrared, visible light, ultraviolet, etc.).

A colourimetric (or more specifically photometric) greyscale image is an image that has a defined greyscale colourspace, which maps the stored numeric sample values to the achromatic channel of a standard colourspace, which itself is based on measured properties of human vision.

If the original colour image has no defined colourspace, or if the greyscale image is not intended to have the same human-perceived achromatic intensity as the colour image, then there is no unique mapping from such a colour image to a greyscale image.

Numerical representations


Figure: A sample greyscale image

The intensity of a pixel is expressed within a given range between a minimum and a maximum, inclusive. This range is represented in an abstract way as running from 0 (or 0%) (total absence, black) to 1 (or 100%) (total presence, white), with any fractional values in between. This notation is used in academic papers, but it does not define what "black" or "white" is in terms of colourimetry. Sometimes the scale is reversed, as in printing, where the numeric intensity denotes how much ink is employed in halftoning, with 0% representing the paper white (no ink) and 100% a solid black (full ink).

In computing, although the greyscale can be computed through rational numbers, image pixels are usually quantized and stored as unsigned integers to reduce the required storage and computation. Some early greyscale monitors could display only up to sixteen different shades, which were stored in binary form using 4 bits. Today, greyscale images (such as photographs) intended for visual display (both on screen and printed) are commonly stored with 8 bits per sampled pixel. This pixel depth allows 256 different intensities (i.e., shades of grey) to be recorded, and also simplifies computation, as each pixel sample can be accessed individually as one full byte. However, if these intensities were spaced equally in proportion to the amount of physical light they represent at that pixel (called a linear encoding or scale), the differences between adjacent dark shades could be quite noticeable as banding artifacts, while many of the lighter shades would be "wasted" by encoding a lot of perceptually indistinguishable increments. Therefore, the shades are instead typically spread out evenly on a gamma-compressed nonlinear scale, which better approximates uniform perceptual increments for both dark and light shades, usually making these 256 shades enough (just barely) to avoid noticeable increments.
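The trade-off between linear and gamma-compressed encoding can be illustrated with a small sketch. It assumes a pure power-law gamma of 2.2 as a stand-in for a real transfer function, and the function names are illustrative only. It quantizes an intensity to an unsigned integer code and counts how many 8-bit codes fall within the darkest 1% of linear light under each encoding.

```python
def quantize(intensity: float, bits: int = 8) -> int:
    """Map an intensity in [0, 1] to an unsigned integer code of the given bit depth."""
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    top = (1 << bits) - 1  # highest code, e.g. 255 for 8 bits
    return round(intensity * top)


def codes_covering_darkest(fraction: float, gamma: float, bits: int = 8) -> int:
    """Count the codes whose decoded linear light is at most `fraction` of white.

    gamma = 1.0 models a linear encoding; gamma = 2.2 is an assumed
    power-law approximation of a gamma-compressed encoding.
    """
    top = (1 << bits) - 1
    return sum(1 for code in range(top + 1) if (code / top) ** gamma <= fraction)


print(quantize(1.0))                      # 255
print(quantize(1.0, bits=4))              # 15
print(codes_covering_darkest(0.01, 1.0))  # 3  (linear encoding)
print(codes_covering_darkest(0.01, 2.2))  # 32 (gamma-compressed encoding)
```

Under these assumptions the gamma-compressed scale devotes roughly ten times as many codes to the darkest 1% of light, which is where a linear scale would show banding.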

Technical uses (e.g. in medical imaging or remote sensing applications) often require more levels, to make full use of the sensor accuracy (typically 10 or 12 bits per sample) and to reduce rounding errors in computations. Sixteen bits per sample (65,536 levels) is often a convenient choice for such uses, as computers manage 16-bit words efficiently. The TIFF and PNG (among other) image file formats support 16-bit greyscale natively, although browsers and many imaging programs tend to ignore the low-order 8 bits of each pixel. Internally, for computation and working storage, image processing software typically uses integer or floating-point numbers of 16 or 32 bits.
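As a rough illustration of what ignoring the low-order 8 bits means, the following sketch reduces a 16-bit sample to 8 bits in two ways: by simply dropping the low byte, and by rescaling with rounding. The function names are hypothetical, and actual software may use either approach or something else.

```python
def to_8bit_truncate(sample16: int) -> int:
    """Keep only the high-order byte of a 16-bit sample."""
    return sample16 >> 8


def to_8bit_rounded(sample16: int) -> int:
    """Rescale 0..65535 to 0..255 with rounding to the nearest code."""
    return (sample16 * 255 + 32767) // 65535


def to_16bit(sample8: int) -> int:
    """Expand 0..255 to 0..65535 so that 255 maps exactly to 65535."""
    return sample8 * 257


print(to_8bit_truncate(0xABCD))  # 171, i.e. 0xAB
print(to_8bit_rounded(65535))    # 255
print(to_16bit(255))             # 65535
```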

Converting colour to greyscale


Figure: A colour photo converted to greyscale

Conversion of an arbitrary colour image to greyscale is not unique in general; different weightings of the colour channels effectively represent the effect of shooting black-and-white film with different-coloured photographic filters on the camera.
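The non-uniqueness can be seen in a minimal sketch that applies different channel weightings to the same pixel. The weight sets below are purely illustrative stand-ins for coloured filters and are not taken from any standard; the sketch also ignores gamma encoding.

```python
def weighted_grey(pixel, weights):
    """Return a grey value as a normalized weighted sum of (r, g, b) components in [0, 1]."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(c * w for c, w in zip(pixel, weights)) / total


# Hypothetical weightings, loosely mimicking coloured filters on the lens.
RED_FILTER = (1.0, 0.0, 0.0)
BLUE_FILTER = (0.0, 0.0, 1.0)
EQUAL_MIX = (1.0, 1.0, 1.0)

sky_blue = (0.2, 0.4, 0.9)
print(weighted_grey(sky_blue, RED_FILTER))   # 0.2: the sky renders dark
print(weighted_grey(sky_blue, BLUE_FILTER))  # 0.9: the sky renders light
print(weighted_grey(sky_blue, EQUAL_MIX))    # a value in between
```

The same colour maps to very different greys depending on the weighting, which is why a conversion needs an explicit criterion such as luminance preservation.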

Colourimetric (perceptual luminance-preserving) conversion to greyscale

A common strategy is to use the principles of photometry or, more broadly, colourimetry to calculate the greyscale values (in the target greyscale colourspace) so as to have the same luminance (technically relative luminance) as the original colour image (according to its colourspace).[2][3] In addition to the same (relative) luminance, this method also ensures that both images will have the same absolute luminance when displayed, as can be measured by instruments in its SI units of candelas per square metre, in any given area of the image, given equal whitepoints. Luminance itself is defined using a standard model of human vision, so preserving the luminance in the greyscale image also preserves other perceptual lightness measures, such as L* (as in the CIE 1976 L*a*b* colour space), which is determined by the linear luminance Y (as in the CIE 1931 XYZ colour space). That linear luminance is referred to here as Ylinear to avoid any ambiguity.

To convert a colour from a colourspace based on a typical gamma-compressed (nonlinear) RGB colourmodel to a greyscale representation of its luminance, the gamma compression function must first be removed via gamma expansion (linearization) to transform the image to a linear RGB colourspace. The appropriate weighted sum can then be applied to the linear colour components (Rlinear, Glinear, Blinear) to calculate the linear luminance Ylinear, which can be gamma-compressed back again if the greyscale result is also to be encoded and stored in a typical nonlinear colourspace.[4]

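For the common sRGB colour space, gamma expansion is defined as

$$
C_{\mathrm{linear}} =
\begin{cases}
\dfrac{C_{\mathrm{srgb}}}{12.92}, & \text{if } C_{\mathrm{srgb}} \le 0.04045 \\[6pt]
\left(\dfrac{C_{\mathrm{srgb}} + 0.055}{1.055}\right)^{2.4}, & \text{otherwise}
\end{cases}
$$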

where Csrgb represents any of the three gamma-compressed sRGB primaries (Rsrgb, Gsrgb, and Bsrgb, each in range [0,1]) and Clinear is the corresponding linear-intensity value (Rlinear, Glinear, and Blinear, also in range [0,1]). Then, linear luminance is calculated as a weighted sum of the three linear-intensity values. The sRGB colour space is defined in terms of the CIE 1931 linear luminance Ylinear, which is given by

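$$
Y_{\mathrm{linear}} = 0.2126\,R_{\mathrm{linear}} + 0.7152\,G_{\mathrm{linear}} + 0.0722\,B_{\mathrm{linear}}
$$
[5]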
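The two steps described here, gamma expansion followed by a weighted sum, can be sketched as follows. The function names are illustrative, the threshold and exponent are the standard sRGB constants, and the inputs are assumed to be gamma-compressed sRGB components in the range [0, 1].

```python
def srgb_to_linear(c: float) -> float:
    """Gamma-expand one sRGB component from [0, 1] to linear intensity in [0, 1]."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_luminance(r: float, g: float, b: float) -> float:
    """Compute Ylinear from gamma-compressed sRGB components in [0, 1]."""
    return (0.2126 * srgb_to_linear(r)
            + 0.7152 * srgb_to_linear(g)
            + 0.0722 * srgb_to_linear(b))


print(round(linear_luminance(0.0, 1.0, 0.0), 4))  # 0.7152 for pure green
print(round(srgb_to_linear(0.5), 3))              # 0.214: mid-grey code is far below half the light
```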

These three particular coefficients represent the intensity (luminance) perception of typical trichromat humans to light of the precise Rec. 709 additive primary colours (chromaticities) that are used in the definition of sRGB. Human vision is most sensitive to green, so this has the greatest coefficient value (0.7152), and least sensitive to blue, so this has the smallest coefficient (0.0722). To encode greyscale intensity in linear RGB, each of the three colour components can be set to equal the calculated linear luminance (replacing Rlinear, Glinear, Blinear by the values Ylinear, Ylinear, Ylinear to get this linear greyscale), which then typically needs to be gamma-compressed to get back to a conventional nonlinear representation.[6] For sRGB, each of its three primaries is then set to the same gamma-compressed Ysrgb, given by the inverse of the gamma expansion above as

$$
Y_{\mathrm{srgb}} =
\begin{cases}
12.92\,Y_{\mathrm{linear}}, & \text{if } Y_{\mathrm{linear}} \le 0.0031308 \\[6pt]
1.055\,Y_{\mathrm{linear}}^{1/2.4} - 0.055, & \text{otherwise}
\end{cases}
$$

Because the three sRGB components are then equal, indicating that it is actually a grey image (not colour), it is only necessary to store these values once, and we call this the resulting greyscale image. This is how it will normally be stored in sRGB-compatible image formats that support a single-channel greyscale representation, such as JPEG or PNG. Web browsers and other software that recognizes sRGB images should produce the same rendering for such a greyscale image as it would for a "colour" sRGB image having the same values in all three colour channels.
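Putting the whole pipeline together, a minimal sketch of a luminance-preserving conversion for 8-bit sRGB pixels might look like this. The function names are illustrative, the constants are the standard sRGB ones, and a real implementation would normally operate on whole arrays rather than single pixels.

```python
def srgb_to_linear(c: float) -> float:
    """Gamma-expand one sRGB component in [0, 1]."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(y: float) -> float:
    """Gamma-compress a linear value in [0, 1] back to sRGB encoding."""
    return 12.92 * y if y <= 0.0031308 else 1.055 * y ** (1 / 2.4) - 0.055


def srgb8_to_grey8(r: int, g: int, b: int) -> int:
    """Convert one 8-bit sRGB pixel to a single 8-bit sRGB-encoded grey value."""
    r_lin, g_lin, b_lin = (srgb_to_linear(v / 255) for v in (r, g, b))
    y_linear = 0.2126 * r_lin + 0.7152 * g_lin + 0.0722 * b_lin
    return round(linear_to_srgb(y_linear) * 255)


print(srgb8_to_grey8(255, 0, 0))      # 127
print(srgb8_to_grey8(0, 255, 0))      # 220
print(srgb8_to_grey8(0, 0, 255))      # 76
print(srgb8_to_grey8(128, 128, 128))  # 128: greys are left unchanged
```

Only the single returned value needs to be stored per pixel; writing it to all three channels would give the equivalent "colour" sRGB image.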

Luma coding in video systems

For images in colour spaces such as Y'UV and its relatives, which are used in standard colour TV and video systems such as PAL, SECAM, and NTSC, a nonlinear luma component (Y') is calculated directly from gamma-compressed primary intensities as a weighted sum. Although this is not a perfect representation of the colourimetric luminance, it can be calculated more quickly, without the gamma expansion and compression used in photometric/colourimetric calculations. In the Y'UV and Y'IQ models used by PAL and NTSC, the Rec. 601 luma (Y') component is computed as

$$
Y' = 0.299\,R' + 0.587\,G' + 0.114\,B'
$$

where we use the prime to distinguish these nonlinear values from the sRGB nonlinear values (discussed above) which use a somewhat different gamma compression formula, and from the linear RGB components. The ITU-R BT.709 standard used for HDTV developed by the ATSC uses different colour coefficients, computing the luma component as

$$
Y' = 0.2126\,R' + 0.7152\,G' + 0.0722\,B'.
$$

Although these are numerically the same coefficients used in sRGB above, the effect is different because here they are being applied directly to gamma-compressed values rather than to the linearized values. The ITU-R BT.2100 standard for HDR television uses yet different coefficients, computing the luma component as

$$
Y' = 0.2627\,R' + 0.6780\,G' + 0.0593\,B'.
$$

Normally these colourspaces are transformed back to nonlinear R'G'B' before rendering for viewing. To the extent that enough precision remains, they can then be rendered accurately.

But if the luma component Y' itself is instead used directly as a greyscale representation of the colour image, luminance is not preserved: two colours can have the same luma Y' but different CIE linear luminance Y (and thus different nonlinear Ysrgb as defined above) and therefore appear darker or lighter to a typical human than the original colour. Similarly, two colours having the same luminance Y (and thus the same Ysrgb) will in general have different luma by any of the Y' luma definitions above.[7]
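The mismatch between luma and luminance can be checked numerically. The sketch below computes luma with the Rec. 601, BT.709 and BT.2100 weights, then compares a pure green with a grey that has the same BT.709 luma. As a simplifying assumption, the nonlinear values are treated as sRGB-encoded when computing luminance, even though video systems use a somewhat different transfer function; the names are illustrative.

```python
LUMA_WEIGHTS = {
    "BT.601": (0.299, 0.587, 0.114),
    "BT.709": (0.2126, 0.7152, 0.0722),
    "BT.2100": (0.2627, 0.6780, 0.0593),
}


def luma(rp: float, gp: float, bp: float, standard: str = "BT.709") -> float:
    """Weighted sum applied directly to gamma-compressed R'G'B' values in [0, 1]."""
    wr, wg, wb = LUMA_WEIGHTS[standard]
    return wr * rp + wg * gp + wb * bp


def srgb_to_linear(c: float) -> float:
    """Assumed transfer function: standard sRGB gamma expansion."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_luminance(rp: float, gp: float, bp: float) -> float:
    """Weighted sum applied after linearization (BT.709/sRGB primaries)."""
    return (0.2126 * srgb_to_linear(rp)
            + 0.7152 * srgb_to_linear(gp)
            + 0.0722 * srgb_to_linear(bp))


green = (0.0, 1.0, 0.0)
grey = (0.7152, 0.7152, 0.7152)  # chosen to have the same BT.709 luma as green

print(luma(*green), luma(*grey))                          # both approximately 0.7152
print(linear_luminance(*green), linear_luminance(*grey))  # clearly different
```

Under these assumptions the grey has the same luma as the green but a much lower linear luminance, so using Y' directly as the grey value would render the green too dark.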

Grayscale as single channels of multichannel colour images


Colour images are often built of several stacked colour channels, each of them representing value levels of the given channel. For example, RGB images are composed of three independent channels for red, green and blue primary colour components; CMYK images have four channels for cyan, magenta, yellow and black ink plates, etc.

Here is an example of colour channel splitting of a full RGB colour image. The column at left shows the isolated colour channels in natural colours, while at right there are their greyscale equivalences:

Figure: Composition of RGB from three greyscale images

The reverse is also possible: a full colour image can be built from its separate greyscale channels. By mangling channels, using offsets, rotating and other manipulations, artistic effects can be achieved instead of accurately reproducing the original image.
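Channel splitting and recomposition can be sketched with plain nested lists, assuming an image is stored as rows of (r, g, b) tuples. This representation and the function names are chosen only for illustration; imaging libraries provide their own equivalents.

```python
def split_channels(image):
    """Split rows of (r, g, b) tuples into three single-channel greyscale images."""
    red = [[px[0] for px in row] for row in image]
    green = [[px[1] for px in row] for row in image]
    blue = [[px[2] for px in row] for row in image]
    return red, green, blue


def merge_channels(red, green, blue):
    """Rebuild an RGB image from three greyscale images of equal size."""
    return [
        [(r, g, b) for r, g, b in zip(r_row, g_row, b_row)]
        for r_row, g_row, b_row in zip(red, green, blue)
    ]


image = [
    [(255, 0, 0), (0, 255, 0)],
    [(0, 0, 255), (10, 20, 30)],
]

red, green, blue = split_channels(image)
print(red)                                        # [[255, 0], [0, 10]]
print(merge_channels(red, green, blue) == image)  # True: faithful reconstruction
print(merge_channels(blue, green, red)[0][0])     # (0, 0, 255): swapped channels
```

Passing the channels back in a different order, as in the last line, is one simple example of the channel manipulation mentioned above.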

Grayscale modes


Some operating systems offer a greyscale mode. It may be bound to a hotkey, and in some cases the binding is configurable.

It is also possible to install a greyscale mode extension in some browsers.

References


  1. Johnson, Stephen (2006). Stephen Johnson on Digital Photography. O'Reilly. ISBN 0-596-52370-X.
  2. Poynton, Charles A. (1998). "Rehabilitation of gamma". Photonics West '98 Electronic Imaging. International Society for Optics and Photonics.
  3. Poynton, Charles. Constant Luminance.
  4. Lindbloom, Bruce. RGB Working Space Information (retrieved 2013-10-02).
  5. Stokes, Michael; Anderson, Matthew; Chandrasekar, Srinivasan; Motta, Ricardo. "A Standard Default Color Space for the Internet – sRGB". See the matrix at the end of Part 2.
  6. Burger, Wilhelm; Burge, Mark J. (2010). Principles of Digital Image Processing Core Algorithms. Springer Science & Business Media. pp. 110–111. ISBN 978-1-84800-195-4.
  7. Poynton, Charles (1996). "The magnitude of nonconstant luminance errors", in A Technical Introduction to Digital Video. New York: John Wiley & Sons.