My camera sports a 14-stop Engineering Dynamic Range (eDR). What bit depth do I need to safely encode all of the tones captured from the scene by a linear sensor? As we will see, the answer is not 14 bits just because that's the eDR, but it's not too far from it either, for reasons that information science will make clear in this article.
When photographers talk about grayscale ‘tones’ they typically refer to the number of distinct gray levels present in a displayed image. They don’t want to see distinct levels in a natural, slowly changing gradient like a dark sky: if it’s smooth they want to perceive it as smooth when looking at their photograph. So they want to make sure that all possible tonal information from the scene has been captured and stored in the raw data by their imaging system.
Preserving Detected Information Quality
An idealized way to accomplish this objective would be to store the number of photoelectrons collected by the sensor one for one: this is equivalent to setting the system’s gain to 1 DN/e-.
If that were the case a camera like the D800e for instance, which according to sensorgen.info clips at about 55k e- at base ISO, would require 16 bits to encode every photoelectron to full scale (the ceiling of log2 of 55k). At ISO 200 it clips at about 26k e-, so only 15 bits would be required – and so on according to the following table.
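As a quick sanity check, the whole-bit counts can be computed directly from the clipping figures quoted above (both from sensorgen.info, per the text):

```python
import math

# Approximate D800e clipping counts in photoelectrons, as quoted in the text
clipping_e = {100: 55_000, 200: 26_000}

for iso, clip in clipping_e.items():
    # Whole bits needed to assign one raw level per photoelectron up to clipping
    bits = math.ceil(math.log2(clip))
    print(f"ISO {iso}: clips at ~{clip:,} e- -> {bits} bits")
# ISO 100 -> 16 bits, ISO 200 -> 15 bits
```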
Pretty easy then; we are done, right? Well, actually, thanks to Information Theory we can do better than that.
A Question of Balance
To fully preserve the information coming off the sensor we don’t actually need as many raw levels as the number of photoelectrons a pixel could possibly collect. We would in the absence of noise. But there is always a certain amount of random noise present in an imaging system, partly inherent in the Signal itself (Poissonian shot noise) and partly added by the electronics (Gaussian read noise). This noise limits the accuracy with which the e- rolling off the sensor can be measured, and so determines the precision required to fully record the corresponding information.
The question we are trying to answer is similar to asking whether it makes sense to record weights obtained with an old hand balance to X decimal places. If we tasked a number of volunteers at a market with weighing the same tomato several times, asking them each time to write down the weight to, say, 5 decimal places in units of kg, we would soon find that all digits beyond a certain decimal place, say the second, vary randomly. There is no information in randomness – therefore no additional information would be captured by writing down figures beyond the second decimal place.
The complete recordable information is actually fully contained within the first two decimal places (column A), which represent the measurement’s significant figures. Noting down the rightmost three digits is a waste of time, ink and space because they contain no additional information whatsoever, by definition.
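A tiny simulation makes the point; the tomato’s ‘true’ weight and the balance’s repeatability below are made-up numbers, chosen only so that the noise lands around the third decimal place:

```python
import random

random.seed(0)
true_weight_kg = 0.18342  # the tomato's 'true' weight (made up for this demo)
noise_sd_kg = 0.003       # assumed hand-balance repeatability of ~3 g

# Five volunteers each record the weight to 5 decimal places
readings = [random.gauss(true_weight_kg, noise_sd_kg) for _ in range(5)]
for w in readings:
    print(f"{w:.5f} kg")
# The first two decimal places agree across readings; the digits beyond
# them vary randomly from reading to reading and carry no information.
```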
Read Noise is Random Too
Similarly, the lowest, darkest Signal in e- coming out of a current ILC sensor at base ISO has a noise component superimposed on it – a standard deviation that depends on the capabilities of the sensor – which makes its last ‘digits’ fluctuate randomly. If we assume that the downstream electronics are very clean, we can say that this random noise is approximately equal to read noise. Just as in the balance example above, there is no point in writing down random data – so we can choose to save space and time by only recording ‘bits’ that do not flip randomly because of noise. There is no information to be gained by recording the Signal in steps (least significant bits) smaller than the random read noise – doing so would only waste time and space (Janesick discusses this in chapter 11 of his fine book).
For instance, if our sensor had random read noise of 10e- at base ISO it would be wasteful to record the incoming signal at 1e- precision. Information Science tells us that reading and recording it at 10e- precision is enough to ensure that all information (signal) from the sensor is fully accounted for. This means that to achieve our objective of recording all information from the sensor, with a well designed system we would only need one tenth as many raw levels as when counting every e-, as shown below:
Similarly, the read noise of the D800e is 5.2 e- at base ISO, so we would only need 1/5.2 as many raw levels as its clipping e- count. That means that in order to fully encode all of the tones from a scene the D800e needs to store its raw data in no less than 54,924e- / 5.2e- = 10,562 levels – which represent somewhat less than 14 bits of information (log2 of 10,562 levels). Any more would be wasteful.
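The D800e arithmetic can be checked in a couple of lines, using the clipping count and read noise figures quoted above:

```python
import math

clip_e = 54_924     # D800e clipping count at base ISO (per the text, from sensorgen.info)
read_noise_e = 5.2  # D800e read noise in e- at base ISO

# One raw level per read-noise-sized step is enough to preserve all information
levels = clip_e / read_noise_e
bits = math.log2(levels)
print(f"{levels:.0f} levels = {bits:.2f} bits")  # ~10562 levels, ~13.37 bits
```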
The Required Bit Depth is…
By now you’ve realized that the two key elements in determining the bit depth necessary to fully encode tones from the scene in a properly designed imaging system are its clipping level and its read noise as set up (meaning at the desired ISO). The encoded bit depth needs to be at least the base-two logarithm of the ratio between the clipping level of the ADC and the read noise at its input, both in the same units.
Only the manufacturer can measure the actual noise level at the input of the ADC. What we can estimate instead, thanks to Photon Transfer Curves, is the random read noise referred to the output of the photosites in physical units of photoelectrons (e-). If analog amplification and transfer of the e- to the ADC add little noise, we can assume that the estimated noise out of the photosites is about the same as that at the input of the ADC**. In this case the required bit depth in order to fully encode information out of the photosites can be simplified to

required bit depth = log2( clipping level in e- / read noise in e- )
This looks suspiciously like a DR formula, and in fact it’s what sensorgen.info calls DR in its pages. But in this context it means required bit depth, so feel free to peruse them to compare the bit depth required by different cameras at different ISOs. Canons in general for instance seem to be quite inefficient in their choice of bit depth. Sony seems to go the other way.
Note that read noise also determines by definition the ideal sensitivity of the system k (in e-/DN, equal to 1/gain).
Encoding image information from the sensor in fewer bits than indicated by the formula above will result in some information loss; encoding in more bits than suggested should not have any impact on IQ.
Note that as ISO is raised the number of bits required to fully encode image information diminishes, a fact often taken advantage of by camera manufacturers. This may show up as gaps in 14-bit histograms at higher ISOs (gasp!). Not to worry: you will not see any more visible banding in your images than if the gaps weren’t there (examples in this article). After all, what’s the point of showing a 14-bit histogram of 12- or 10-bit scene information?
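A toy illustration of where those histogram gaps come from (this is not any camera’s actual pipeline, just the arithmetic): if only 12 bits’ worth of levels are stored in a 14-bit container, each occupied level lands on a multiple of 4 with three empty levels in between.

```python
import random

random.seed(1)

# 100k simulated 12-bit raw values, scaled up into a 14-bit container
samples_12bit = [random.randrange(4096) for _ in range(100_000)]
samples_14bit = [v << 2 for v in samples_12bit]  # multiply by 4

occupied = sorted(set(samples_14bit))
# Only every 4th 14-bit level is ever used: gaps in the 14-bit histogram
print(all(v % 4 == 0 for v in occupied))  # True
```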
In Fact It’s Even Less
In this article we have only dealt with the effect of read noise on imager design, bit depth and the quality of the information recorded. Read noise dominates in the deepest shadows. But further up in the tonal range there is another source of random noise which can be used to dither the data and reduce the number of levels required to fully encode scene information rolling off a sensor: shot noise inherent in the Signal, which grows as its square root in units of e-.
This fact is exploited by a number of ILC manufacturers in their encoding (e.g. Sony) and ‘lossy’ compression modes (e.g. Nikon). Emil Martinec has written a nice post that explains the process involved: fewer bits required, faster, less space used at no cost to IQ. If you’ve followed the balance scale example above it should be apparent that information in a raw file gives up nothing but random digits by utilizing those modes. So if you need the improved speed or space that they afford, go for it without remorse.
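The idea behind such schemes can be sketched with a toy square-root (‘variance stabilizing’) code: because shot noise grows as the square root of the Signal, the encoding step can grow with it without discarding real information. The 14-bit range and the code below are illustrative assumptions, not any manufacturer’s actual compression curve.

```python
import math

FULL_SCALE = 16_383  # 14-bit clipping level in DN (illustrative)

def encode(signal):
    """Compress: roughly one code per shot-noise-sized step."""
    return round(math.sqrt(signal))

def decode(code):
    """Expand back to the linear domain."""
    return code * code

# Codes needed to span the whole 14-bit range
print(encode(FULL_SCALE) + 1)  # 129 codes, i.e. just over 7 bits

# Round-trip a bright signal: the error stays below its shot noise
s = 10_000
print(abs(decode(encode(s)) - s))  # 0 here; in general under ~sqrt(s)
```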
One last thing. The fact that in a properly designed imaging system the least significant bit (LSB) should approximately correspond to its read noise in units of e- does not mean that the lowest Signal detectable by it (hence discernible by us visually) is 1 LSB/DN. Far from it: in a well designed 14-bit sensor, thanks to the dithering action of read noise, Signal can be detected well below 14 stops from clipping at base ISO. The question is whether such a signal is of sufficient quality for the purpose at hand. Engineers (eDR) and Photographers (PDR) think not. Astrophotographers may disagree, as you can read in this article.
** That is not always the case. This subtle difference can sometimes result in interesting PTC responses near base ISO for overly clean sensors, even when the estimated input-referred read noise is larger than 1 LSB (latest Exmors; see for example some curves at Jim Kasson’s).