What are the basic low level steps involved in raw file conversion? In this article I will discuss what happens under the hood of digital camera raw converters in order to turn raw file data into a viewable image, a process sometimes referred to as ‘rendering’. We will use the following raw capture to show how image information is transformed at every step along the way:
Rendering = Raw Conversion + Editing
The rendering process can be divided into two major components, Raw Conversion and Editing. Raw conversion is necessary to turn raw image data into a standard format that downstream software like Editors and Viewers, and hardware output devices, understand – usually a colorimetric RGB color space like Adobe RGB or sRGB.
Editors on the other hand take image information in a standard color space and apply adjustments to it, to make the displayed image more ‘appropriate’ and/or ‘pleasing’ to the photographer.
An example of a pure raw converter is dcraw by David Coffin; an example of a pure editor is Photoshop by Adobe. Most Raw Converters actually combine raw conversion with some or a lot of Editing functionality (e.g. Capture NX, Lightroom, Capture One etc.).
The 7 Steps of Basic Raw Conversion
Where raw conversion ends and editing begins is somewhat of a fuzzy line but, depending on how one decides to segment them, there are only 7 steps involved in the basic conversion of a raw file into a standard (colorimetric) color space like sRGB. They are often but not always applied in this sequence:
- Load linear data from the raw file and subtract Black Levels
- White Balance the data
- (Optional: apply a linear brightness correction)
- Properly Clip image data
- Demosaic it
- Apply Color Transforms and Corrections
- Apply Gamma
In basic raw conversion the process is linear until the very end. The gamma curve in the last step is ideally undone by the display device so the end-to-end imaging system – from light hitting sensor to light hitting eyes – is also approximately linear. If one is after a ‘faithful’ rendering of the image as captured from the sensing plane this is all one needs from a raw converter.
+ Contrast: Adapting to Output Device DR
As we will see, however, basic linear raw conversion of files from current digital cameras is almost never satisfactory as-is because the vast majority of output devices (photo paper, monitors) have lower contrast ratios than that captured by a good quality DSC these days. Therefore a contrast correction is also pretty well mandatory in order to actively guide the latter’s larger Dynamic Range into the former’s smaller one. It can be just a simple curve – or something more sophisticated, with separate local and global controls for shadows, highlights, clarity etc. (standard sliders in commercial raw converters and editors). It is normally performed in the editing portion of rendering – but if one is after ‘accurate’ color, contrast should ideally be part of Color Correction step 6 above, during raw conversion.
After the seven basic raw conversion steps an editor may then be used to objectively correct for imperfections in the specific capture and display process – and/or to make the image subjectively more pleasing to the artist’s eye. For instance, it can correct lens distortion and lateral chromatic aberration, apply noise reduction, or apply capture/output sharpening. Some of these functions could arguably benefit from being performed in linear space before certain raw conversion steps above, but for most purposes they are optional and they can be done just as well in an Editor after rendering in a gamma space.
In this article we will concentrate on the basic raw conversion portion of the rendering equation and leave editing for another time.
1. Load Raw Data and Subtract Blacks
This first step in raw conversion is simply to load linear image data from the raw file into memory as-is. Since raw file formats tend to be different and proprietary to camera manufacturers most raw converters use variations on open source dcraw to perform this task. The following command line will repackage uncompressed raw image data into a linear 16-bit TIF file so that it can be read by the image processing application of choice (Octave/Matlab, ImageJ, PS etc.):
dcraw -D -4 -T
The recorded raw values are linear with incoming light intensity but they are typically stored with camera and channel specific positive offsets. These so-called Black Levels often have values in the few hundred to few thousand DN and need to be subtracted from the original raw data so that pixels show zero intensity with zero light. dcraw will do this for you (albeit somewhat generically) when the command line dcraw -d -4 -T is used instead of the one above. As it turns out Nikon subtracts the Black Level of each channel before writing data to D610 raw files so either command works equally well for our reference capture.
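As a minimal sketch of what black level subtraction amounts to, the snippet below removes a constant offset from raw DN values and normalizes the result to the 0..1 range. The function name, the 600 DN offset and the 16383 DN saturation level are illustrative assumptions, not values taken from the D610 file (which, as noted, already has its black levels subtracted).

```python
def subtract_black(raw, black_level, white_level):
    """Subtract a black level from raw DN values and normalize to 0..1.

    raw: list of rows of integer DN values.
    black_level, white_level: hypothetical per-camera constants.
    """
    span = white_level - black_level
    # Values below the offset (noise) are clamped to zero here, which is a
    # simplification: some workflows keep them to avoid biasing the shadows.
    return [[max(dn - black_level, 0) / span for dn in row] for row in raw]


# Hypothetical 14-bit example: offset of 600 DN, saturation at 16383 DN.
raw = [[600, 1000], [590, 16383]]
linear = subtract_black(raw, black_level=600, white_level=16383)
print(linear)
```

Real files often carry a separate black level per channel, in which case the same subtraction would be applied channel by channel.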
The loaded data at this stage simply represents the ‘grayscale’ intensity of the image – but each value has a one-to-one correspondence with a position on the sensing plane and with the specific color filter that it sits under. In Bayer sensors the layout is in rows of alternating red and green – and green and blue – filters. The exact layout is specified by the order of the first quartet of the active area on the sensor, in this case RGGB as shown below, but three other orders are possible.
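The mapping from pixel position to filter color can be sketched in a couple of lines. The helper name cfa_color is hypothetical; it assumes the pattern string lists the first quartet in reading order (top-left, top-right, bottom-left, bottom-right).

```python
def cfa_color(row, col, pattern="RGGB"):
    """Return the filter color ('R', 'G' or 'B') sitting over pixel (row, col).

    pattern lists the first 2x2 quartet of the active area in reading order:
    top-left, top-right, bottom-left, bottom-right.
    """
    return pattern[2 * (row % 2) + (col % 2)]


# Print the filter colors over the top-left 2x4 corner of an RGGB sensor.
for r in range(2):
    print(" ".join(cfa_color(r, c) for c in range(4)))
```

Passing "BGGR", "GRBG" or "GBRG" as the pattern covers the three other possible orders.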
The D610’s Bayer color filter array is laid out as RGGB quartets. The raw data looks as follows at this stage, an apparently underexposed grayscale image:
If it looks pitch black to you (as it does to me) then your browser does not know how to display properly tagged linear data. Interestingly WordPress’s editor seems to show it fine while, once published, Chrome itself does not. For those of us in this predicament this is what the cfa grayscale image should look like instead:
This is what raw cfa data looks like straight off the file. It looks a bit dark because I Exposed this image To The Right in order not to clip the highlights of the waterfall – we’ll fix that later in the linear brightness correction step.
It looks like a fully formed grayscale image because of its small size but it’s not: even in uniformly illuminated areas pixelation caused by the different color information collected under adjacent color filters of differing strength can easily be seen by zooming in. This is what the portion of the image just below the left WhiBal gray card’s hole looks like at 600%:
2. White Balance Image Data
Because of the illuminant’s spectral energy distribution (in this case a partly overcast sky through the foliage) and the spectral sensitivity of the filters in the color filter array sitting on the sensor, different color pixels record proportionately lower or higher counts even under the same light. This is especially obvious in neutral parts of the image which we expect to show the same mean count in each color plane when displayed. In this case the values on the neutral gray card show that red and blue pixels recorded on average 48.8% and 75.4% of the count of green pixels respectively.
The next step is therefore to apply White Balance to linear cfa image data by multiplying every red pixel by 2.0493 (1/48.8%) and every blue pixel by 1.3256 (1/75.4%) to obtain a Camera Neutral grayscale image (in Adobe parlance):
Note how the pixelation is now gone at the top of the image – the neutral portion that was used as a reference for the multipliers – but of course not in the colored portions of the image.
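A rough sketch of this step on mosaiced data follows. It assumes an RGGB layout and normalized values held in nested lists; the function name and the tiny 2x2 test patch are made up for illustration, while the two multipliers are the ones quoted above.

```python
def white_balance(cfa, r_gain, b_gain, pattern="RGGB"):
    """Scale red and blue CFA pixels so that a neutral patch reads equal in all channels."""
    gains = {"R": r_gain, "G": 1.0, "B": b_gain}
    return [
        [value * gains[pattern[2 * (r % 2) + (c % 2)]] for c, value in enumerate(row)]
        for r, row in enumerate(cfa)
    ]


# One RGGB quartet from a hypothetical neutral patch: green reads 0.1,
# red and blue read proportionately lower, as on the gray card.
patch = [[0.1 / 2.0493, 0.1],
         [0.1, 0.1 / 1.3256]]
balanced = white_balance(patch, r_gain=2.0493, b_gain=1.3256)
print(balanced)
```

After the multiplication all four pixels of the quartet read about the same, which is why the pixelation disappears in neutral areas only.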
3. (Correct Linear Brightness)
This is somewhat of a subjective step and arguably should not be performed if one is after a ‘faithful’ rendering of the image projected on the sensing plane as captured in the raw data. However nobody always nails Exposure, and different cameras often place middle gray at different percentages of full scale in the raw file, so it is useful to be able to correct rendered brightness. That’s why many raw converters call the related slider Exposure Correction or Compensation (EC). Adobe also has a related DNG tag for normalizing Exposure recorded by different cameras called BaselineExposure.
The image looks a bit dark for my taste and in fact the 48% WhiBal gray card measures out at 17% of full scale as-is. Linear brightness correction simply means multiplying every white balanced pixel in the data by a constant. If we trusted that there were no highlights worth keeping above 100% that constant would be 48/17, equal to a 1.5 stops correction. In this case I chose to apply a subjectively conservative +1.1 stops of linear brightening by multiplying all file data by a factor of 2.14, with this result:
That’s better. But as you can see the price to pay in order to keep the brightness correction linear is blown highlights in the waterfall. That’s where the advanced nonlinear highlight compression and shadow recovery sliders found in most real raw converters come in handy.
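The arithmetic behind the stops and factors quoted in this section can be checked with a few lines. The helper names are illustrative only; a stop is taken to be a factor of two in linear intensity.

```python
import math


def stops_to_factor(stops):
    """Linear multiplier corresponding to a correction expressed in stops."""
    return 2.0 ** stops


def factor_to_stops(factor):
    """Correction in stops corresponding to a linear multiplier."""
    return math.log2(factor)


def brighten(data, stops):
    """Multiply every white balanced value by the same constant (no clipping here)."""
    k = stops_to_factor(stops)
    return [[v * k for v in row] for row in data]


print(round(factor_to_stops(48 / 17), 2))  # bringing the gray card from 17% to 48%
print(round(stops_to_factor(1.1), 2))      # multiplier for +1.1 stops
print(brighten([[0.17]], 1.1))             # where the gray card lands after +1.1 stops
```

Values pushed above 1.0 by the multiplication are left as they are in this sketch, since dealing with them is the job of the clipping step that follows.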
4. Ensure Even Clipping of the White Balanced Data
If the green channel was originally clipped in the raw data, the other two may also need to be clipped to full scale after white balancing in order to deal with the resulting nonlinearity. Full scale is shown in the histograms below as normalized value 1.0 and it’s clear that the green channel in the original raw data of our capture was clipped (because of all the values piled up there) while the other two were not.
The reason for clipping at full scale is that when color information is not available from all three channels the rendered color will most likely be wrong. In the final image it could show as, for instance, a pinkish tinge in the highlights near saturation, below right. Very annoying in snowy mountain landscapes for instance. So clipping is an all-or-nothing solution to this problem.
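In code, the all-or-nothing approach is just a clamp. This sketch assumes white balanced data normalized so that full scale is 1.0; the function name is hypothetical.

```python
def clip_full_scale(data, full_scale=1.0):
    """Clamp every value to the range 0..full_scale."""
    return [[min(max(v, 0.0), full_scale) for v in row] for row in data]


# Red and blue values that exceed full scale after white balancing are pulled
# back to 1.0 so that they match the already clipped green channel.
print(clip_full_scale([[1.2, 0.5], [-0.1, 1.0]]))
```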
The alternative to clipping would be to make some assumptions about the missing color and fill the relative data in: in advanced raw converters this is typically controlled by algorithms or sliders with names like highlight ‘reconstruction’. There are many ways to accomplish this feat. For instance if only one channel is missing, as is the case of green between 1.0 and 1.2 in Figure 8, the simplest is to assume that the highlight is near white and the image is properly white balanced – and guess that the one clipped value in a pixel is the average of the other two. You can also see that in this capture that strategy would at most be able to reconstruct about 1/4 stop of highlights. Normalization back to 1.0 would then be needed to bring that 1/4 stop back into the fold.
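The simplest strategy described above can be sketched as follows. For convenience the sketch works on per-pixel (R, G, B) triplets rather than on mosaiced data, which is an assumption on my part, and the function names are made up. It is only the naive guess outlined in the text, not what any particular converter does.

```python
def reconstruct_green(rgb, full_scale=1.0):
    """If green is clipped, guess it as the average of red and blue.

    Assumes the highlight is near white and the data is white balanced.
    """
    r, g, b = rgb
    if g >= full_scale:
        g = max(g, (r + b) / 2.0)
    return (r, g, b)


def renormalize(pixels, new_max):
    """Divide by the new maximum so reconstructed highlights fit back into 0..1."""
    return [tuple(c / new_max for c in px) for px in pixels]


highlight = reconstruct_green((1.2, 1.0, 1.1))  # green was clipped at 1.0
print(highlight)
print(renormalize([highlight], new_max=1.2))
```

Dividing by 1.2 darkens the whole image by about a quarter of a stop, which is the price of bringing the recovered highlights back into range.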
5. Demosaic CFA Data
The cfa image was until this point on a single ‘grayscale’ plane therefore areas of color appeared pixelated when viewed at 100%. The time has come to demosaic it, separating the red, green and blue pixels shown in Figure 2 onto their own individual, full-sized color planes by guesstimating the missing data (the white squares in the image below).
Any number of advanced demosaicing algorithms can be used for this step. Most these days are mature and very good, although some are better than others depending on subject. Some raw converters, like open source RawTherapee and dcraw, offer the user a choice of several algorithms. The majority of raw converters however do not offer that option: they adopt or adapt one demosaicer and use it under the hood. Do you know what demosaicing algorithm your raw converter of choice uses?
For this test I decided to cheat and simply collapse every RGGB quartet into a single RGB pixel, keeping the R and B values for each quartet as they are in the raw data and averaging the two G ones (equivalent to dcraw -h mode). This is effectively 2×2 nearest neighbor demosaicing/resizing and results in a more manageable, smaller, half sized image (half linearly on each side, one quarter the area):
Figure 11 shows that our raw capture data is now in RGB format with three fully populated color planes. The colors are those recorded by the camera (can we say ‘camera space’?) as shown by your browser through your video path and monitor. They seem a bit bland but not too far off. They are not in a standard RGB color space, so imaging software and hardware down the line do not necessarily know what to make of them. The next step is therefore to convert these colors to a generally understood colorimetric standard.
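The quartet collapse used for this test can be sketched in a few lines. The sketch assumes an RGGB layout with even dimensions and nested lists as the image container; the function name is hypothetical.

```python
def demosaic_half(cfa):
    """Collapse each RGGB quartet into one (R, G, B) pixel.

    R and B are taken as-is, the two greens are averaged. The output has
    half the width and half the height of the input.
    """
    out = []
    for i in range(len(cfa) // 2):
        line = []
        for j in range(len(cfa[0]) // 2):
            r = cfa[2 * i][2 * j]
            g = (cfa[2 * i][2 * j + 1] + cfa[2 * i + 1][2 * j]) / 2.0
            b = cfa[2 * i + 1][2 * j + 1]
            line.append((r, g, b))
        out.append(line)
    return out


# Two quartets side by side (2 rows x 4 columns of CFA data).
cfa = [[1, 2, 3, 4],
       [4, 5, 6, 7]]
print(demosaic_half(cfa))
```

Proper demosaicing algorithms instead interpolate the missing values at every pixel position and keep the full resolution.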
6. Color Transformation and Correction
This is one of the less intuitive but more critical steps needed in order to produce a rendered image of pleasing color. Every manufacturer guards the recipes of the color filters in the CFA of their sensors closely but with the right equipment the relative spectral sensitivity functions are not too difficult to reverse engineer. In fact you can take an approximate peek at those of your camera with a cheap spectrometer.
With those in hand – and a lot of assumptions about typical illuminants, scenes and viewers – a compromise linear matrix can be generated to transform color obtained through a camera’s CFA image (like that shown in figure 11 above) to standard color understood by software like editors or browsers and devices like monitors or printers.
Luckily for us some excellent laboratories measure and calculate these linear matrices – and make them freely available online. DXOmark.com for instance produces matrices to convert data from white balanced raw data to sRGB for two illuminants for every camera in their database (see the ‘Color Response’ tab). This is the one for the Nikon D610 under illuminant D50:
The best compromise matrix depends on the color temperature of the illuminant at the time of capture so the actual matrix used is typically interpolated from a couple of references, often A and D65. The converted data is then adapted to the illuminant expected by the final color space, D65 for sRGB for instance. The result is a matrix such as the one shown in Figure 12. It is then a simple matter of multiplying it by the RGB value of each demosaiced pixel after step 5 above as shown in the next post.
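One way to interpolate between two reference matrices is sketched below. It weights them linearly in inverse color temperature, which is my reading of the approach described in the DNG specification; whether a given converter does exactly this is an assumption. The function names are made up, and 2856 K and 6504 K are the nominal temperatures of illuminants A and D65.

```python
def interpolation_weight(cct, cct_low=2856.0, cct_high=6504.0):
    """Weight given to the low-temperature (illuminant A) matrix.

    Linear in 1/temperature and clamped outside the two references.
    """
    if cct <= cct_low:
        return 1.0
    if cct >= cct_high:
        return 0.0
    return (1.0 / cct - 1.0 / cct_high) / (1.0 / cct_low - 1.0 / cct_high)


def interpolate_matrix(m_low, m_high, weight_low):
    """Element-wise blend of two matrices of the same shape."""
    return [
        [weight_low * a + (1.0 - weight_low) * b for a, b in zip(row_low, row_high)]
        for row_low, row_high in zip(m_low, m_high)
    ]


w = interpolation_weight(5000.0)  # e.g. an illuminant estimated at 5000 K
print(w)
```

The estimated color temperature itself would come from the white balance multipliers, a step that is not shown here.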
Adobe offers a more flexible process, as outlined in its Digital Negative (DNG) specification. Instead of going straight from camera CFA to colorimetric color space, it takes a hub-and-spoke approach, converting first into a connection color space (XYZ D50) through multiplication of the white balanced and demosaiced data by an interpolated linear Forward Matrix; and then from there into a final color space like sRGB through standard matrices. Sometimes it also applies additional nonlinear color corrections according to custom profiles while in XYZ (HSV corrections via ProPhoto RGB HueSatMap and LookTable in DNG speak).
The Forward Matrices of the camera that took the picture are written into every DNG converted raw file, bless Adobe’s soul. I lifted the ones for the D610 from there and Bradford-adapted XYZD50 -> sRGBD65 matrices from Bruce Lindbloom’s site in order to produce the final raw-converted image:
The colors are now what they should be for display by software or devices that expect data in the sRGB color space. Just in case you were wondering this image looks virtually identical to one produced by Nikon’s Capture NX-D raw converter with the ‘Flat’ profile. However it does not look very incisive because of the poor contrast ratio of our monitors.
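The two-stage matrix chain can be sketched as below. The XYZ D50 to linear sRGB matrix is the Bradford-adapted one as I recall it from Bruce Lindbloom's site. The forward matrix is a made-up placeholder, not the D610's: it only respects the property that white balanced camera neutral (1, 1, 1) maps to the D50 white point.

```python
# Bradford-adapted XYZ (D50) -> linear sRGB (D65) matrix.
XYZ_D50_TO_SRGB = [
    [3.1338561, -1.6168667, -0.4906146],
    [-0.9787684, 1.9161415, 0.0334540],
    [0.0719453, -0.2289914, 1.4052427],
]

# PLACEHOLDER forward matrix (camera RGB -> XYZ D50), NOT the D610's.
# Its rows sum to the D50 white point (0.9642, 1.0, 0.8249).
FORWARD_PLACEHOLDER = [
    [0.7, 0.2, 0.0642],
    [0.3, 0.6, 0.1],
    [0.0, 0.1, 0.7249],
]


def mat_mul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]


def mat_vec(m, v):
    """Product of a 3x3 matrix and a 3-element vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]


# The two steps collapse into a single camera -> sRGB matrix applied per pixel.
CAMERA_TO_SRGB = mat_mul(XYZ_D50_TO_SRGB, FORWARD_PLACEHOLDER)
print(mat_vec(CAMERA_TO_SRGB, [1.0, 1.0, 1.0]))  # camera neutral should stay neutral
```

Since both stages are linear they can be combined once per image, after which the conversion costs a single 3x3 multiplication per pixel.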
7. Apply Gamma
This last step depends on the chosen output color space, in this case sRGB’s approximately 2.2 gamma. I mention it separately only to indicate the point at which the rendering process becomes necessarily non-linear. From this point on the image is in a colorimetric, gamma color space and can be properly handled by your Editor of choice and/or displayed as-is. In theory all previous steps were instead known and linear, hence easily reversible.
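For reference, the sRGB encoding is not a pure 2.2 power law but a piecewise curve with a short linear segment near black. A minimal sketch, with a hypothetical function name and input assumed to be linear and normalized to 0..1:

```python
def srgb_encode(linear):
    """Map a linear value in 0..1 to its sRGB-encoded value in 0..1."""
    linear = min(max(linear, 0.0), 1.0)
    if linear <= 0.0031308:
        return 12.92 * linear                     # linear segment near black
    return 1.055 * linear ** (1.0 / 2.4) - 0.055  # power-law segment


# An 18% linear gray ends up a little below the middle of the 8-bit range.
print(round(255 * srgb_encode(0.18)))
```

The function would be applied to each of the three channels of every pixel after the color transform.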
In 2016 a contrast correction is pretty well always needed before display in order to proactively choose how to squeeze the camera’s larger Dynamic Range into the output device’s smaller one. For instance, depending on your tolerance for noise, the D610 has a DR of 12+ stops while my more than decent monitor has a contrast ratio of about 500:1 (or about 9 stops) when calibrated and profiled. This means that the bottom three+ stops of usable tones from the camera would never get displayed because they would not make it past the monitor’s backlight.
A curve will subjectively redistribute tones throughout the range so that some shadow gradations will be more visible at the expense of some highlights (that’s why this is called a ‘Tone Curve’ in Adobe DNG speak). Applying a touch of sharpening and the ‘Increase Contrast’ curve in Photoshop CS5 to figure 13 just above produces the final rendered image:
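The stops figures above follow from taking the base-2 logarithm of the contrast ratio, as this small check shows. The 12 stops for the camera is the round figure quoted in the text; the function name is made up.

```python
import math


def contrast_ratio_to_stops(ratio):
    """Express a contrast ratio such as 500:1 as a number of stops."""
    return math.log2(ratio)


monitor_stops = contrast_ratio_to_stops(500)  # just under 9 stops
camera_stops = 12.0                           # round figure quoted in the text
shortfall = camera_stops - monitor_stops
print(round(monitor_stops, 2), round(shortfall, 2))
```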
Of course applying a contrast curve at such a late stage does cause some changes in chromaticity and saturation but that’s what happens when you implement these adjustments in a gamma RGB space after rendering. In any case this is typically how it has historically been done by widely used raw converters and it is therefore the procedure and look that we have become accustomed to over the years. The alternative for best perceptual ‘accuracy’ would instead be to use one of Torger’s excellent non-linear color profiles with built-in neutral tone reproduction operator during color correction step 6 – and never touch the tones again.
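To give an idea of what a simple contrast curve does, here is a generic S-curve applied to gamma-encoded values in 0..1. It is not Photoshop's 'Increase Contrast' preset, whose control points I do not know; the smoothstep blend and the strength parameter are arbitrary choices made for illustration.

```python
def tone_curve(x, strength=0.5):
    """Blend the identity with a smoothstep S-curve.

    Keeps 0, 0.5 and 1 fixed, darkens values below 0.5 and brightens values
    above it. strength in 0..1 keeps the curve monotonic.
    """
    s = 3.0 * x * x - 2.0 * x * x * x  # smoothstep
    return x + strength * (s - x)


# Shadows go down, highlights go up, mid gray stays put.
for v in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(v, round(tone_curve(v), 4))
```

Applying such a curve to each RGB channel separately is what produces the shifts in chromaticity and saturation mentioned above.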
So to recap, the only steps needed for basic raw conversion, linear in luminosity and color, are:
- Get linear image data from the raw file and subtract Black Levels
- White Balance it
- (Optionally correct its Brightness)
- Make sure it is properly clipped
- Demosaic it
- Apply Color Transforms and Corrections
- Apply Gamma
And you are done. Basic raw conversion demystified, not so complex after all.
Technical note: the call used in Octave/Matlab to produce the images in this article is shown below, with the 7 basic steps in the yellow section:
s = raw2RGB('DSC_4022', 'ROI', 1.1)
If you use this script, save the file as a TIFF once it has run, load it in a color managed editor and ‘Apply’ the chosen color space in order to see its colors properly.
Notes and References
1. See Anders Torger’s excellent site on the subject of color profiles.
2. dcraw’s main site is here and windows executables can be downloaded from here.
3. See here for a description of Bayer CFA quartets.
4. sRGB gamma applied for deficient browsers to all images except figures 2 and 3.
5. See here for an article on how to roughly measure the spectral distribution of your CFA + illuminant with an inexpensive spectrometer designed by a Harvard team.
6. Adobe’s Digital Negative specification version 1.4.0.0 corresponding to ‘Process 2012’ can be found here.
7. Reverse Matrices as used by dcraw are also there but the DNG spec says that “The use of the forward matrix tags is recommended for two reasons. First, it allows the camera profile creator to control the chromatic adaptation algorithm used to convert between the calibration illuminant and D50. Second, it causes the white balance adjustment [..] to be done by scaling the camera coordinates rather than by adapting the resulting XYZ values, which has been found to work better in extreme cases.”
8. You can read the detail of how the relative color matrices were applied here.
9. See section 8 here for how sRGB’s gamma is defined.
10. The Matlab/Octave scripts used to generate some of the figures in this post can be downloaded from here.