In this article we shall find that the effect of a Bayer CFA on the spatial frequencies and hence the ‘sharpness’ captured by a sensor compared to those from a corresponding monochrome imager can go from nothing to halving the potentially unaliased range based on the chrominance content of the image projected on the sensing plane and the direction in which the spatial frequencies are being stressed.
A Little Sampling Theory
We know from Goodman and previous articles that the sampled image $s(x,y)$ captured in the raw data by a typical current digital camera can be represented mathematically as the continuous image on the sensing plane $i(x,y)$ multiplied by a rectangular lattice of Dirac delta functions positioned at the center of each pixel:
$$s(x,y) = i(x,y) \cdot \mathrm{comb}\left(\frac{x}{p}\right) \mathrm{comb}\left(\frac{y}{p}\right) \qquad (1)$$
with the $\mathrm{comb}$ functions representing the two dimensional grid of delta functions, sampling pitch $p$ apart horizontally and vertically. To keep things simple the sensing plane is considered here to be the imager’s silicon itself, which sits below microlenses and other filters, so the continuous image is assumed to incorporate their effects as well as those of pixel aperture.
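Because spatial domain multiplications become convolutions in the frequency domain, the Spectrum of the sampled image $S(f_x,f_y)$ is just the Fourier Transform of the continuous image $I(f_x,f_y)$ convolved with the transform of the comb functions:
$$S(f_x,f_y) = I(f_x,f_y) \ast\ast \; p^2 \, \mathrm{comb}(p f_x) \, \mathrm{comb}(p f_y) \qquad (2)$$
with $\ast\ast$ indicating two dimensional convolution and $p$ the sampling pitch.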
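Equation (2) implies that copies of the continuous spectrum sit at every integer multiple of one cycle per pitch, so energy above 0.5 c/p lands on top of lower frequencies. A minimal one dimensional sketch in Python, assuming unit pitch and cosine test signals of my own choosing (the function name is purely illustrative), suggests that a 0.7 c/p cosine yields the same samples as a 0.3 c/p one:

```python
import math

def sample_cosine(freq_cycles_per_pitch, n_samples):
    """Sample cos(2*pi*f*x) at integer positions x = 0, 1, 2, ... (unit pitch)."""
    return [math.cos(2 * math.pi * freq_cycles_per_pitch * x)
            for x in range(n_samples)]

# One frequency below Nyquist (0.5 c/p) and one above it, mirrored around 0.5.
below_nyquist = sample_cosine(0.3, 16)
above_nyquist = sample_cosine(0.7, 16)

# Largest sample-by-sample difference between the two sampled signals.
max_difference = max(abs(a - b) for a, b in zip(below_nyquist, above_nyquist))
print(max_difference)
```

The difference is at the level of floating point rounding: once sampled, the two signals cannot be told apart, which is aliasing in its simplest form.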
We saw what that looked like in 3D in the article on Aliasing but this time I am going to show the magnitude of the Discrete Fourier Transform of a typical test target captured by a monochrome digital camera in 2D as an image, with image brightness representing the energy of the relative spatial frequency:
On the left is a Siemens star target captured by the fine folks at DPReview.com with a Leica Monochrome Typ 216 at base ISO. On the right is the linear magnitude of the DFT of the raw capture as performed by Matlab/Octave, otherwise known as its Spectrum. It is difficult to see what’s going on above right because of the very large energy excursion involved, so normally a logarithm of the Spectrum is shown instead – keeping in mind that this approach tends to overemphasize low energy effects. Below the same data is displayed as ln(1+Spectrum):
The horizontal and vertical units are cycles per pixel pitch with zero cycles/pitch (c/p) top left and one c/p at the other three corners: (0,0), (1,0), (1,1), (0,1) clockwise for linear spatial frequencies $f_x$ and $f_y$ in the x and y directions respectively. Nyquist frequencies are then halfway along the top, bottom, left and right edges of Figure 2, corresponding to the yellow lines.
The DFT routine only shows one period of the baseband Spectrum but, mentally, tile Figure 2 vertically and horizontally forever because the function is actually infinitely periodic, per equation (2). The tiling is caused by the orthogonal Dirac delta combs that effectively modulate baseband out at one cycle per sampling pitch spacing. When you do that it is sometimes useful to look at the baseband Spectrum with the origin (0,0) shifted to the center of the relative image:
Now the origin of the ($f_x$,$f_y$) spatial frequencies is in the center of the Spectrum, the four corners representing (-0.5,-0.5), (0.5,-0.5), (0.5,0.5), (-0.5,0.5) c/p clockwise starting top left. Nyquist frequencies therefore run along the edges. It’s clear that as long as all the energy of the spatial image fits within the Nyquist Frequency boundaries there will be no aliasing. On the other hand it is obvious by looking at Figure 2 that this is not the case here because some of the Spectrum extends into neighboring quadrants, therefore exceeding Nyquist. We can see the same thing in Figure 3 by noticing the ‘reflection’ of rays at the edges of the Spectrum.
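The display steps described so far (DFT magnitude, ln(1+Spectrum), origin moved to the center) can be sketched as follows. This is not the processing used for the figures: it assumes a tiny synthetic 8x8 image of my own instead of the star, and a brute-force DFT so that it runs with the standard library only. In Matlab/Octave the same steps would normally be done with `fft2` and `fftshift`.

```python
import cmath
import math

def dft2(image):
    """Brute-force 2D DFT of a small image given as a list of rows."""
    rows, cols = len(image), len(image[0])
    spectrum = []
    for v in range(rows):            # vertical frequency index
        out_row = []
        for u in range(cols):        # horizontal frequency index
            total = 0j
            for y in range(rows):
                for x in range(cols):
                    phase = -2 * math.pi * (u * x / cols + v * y / rows)
                    total += image[y][x] * cmath.exp(1j * phase)
            out_row.append(total)
        spectrum.append(out_row)
    return spectrum

def center_origin(spectrum):
    """Move the (0,0) frequency bin to the middle of the array."""
    rows, cols = len(spectrum), len(spectrum[0])
    return [[spectrum[(v - rows // 2) % rows][(u - cols // 2) % cols]
             for u in range(cols)] for v in range(rows)]

N = 8
# Hypothetical test image: a constant plus a horizontal cosine at 0.25 c/p.
image = [[1.0 + math.cos(2 * math.pi * 0.25 * x) for x in range(N)]
         for _ in range(N)]

# ln(1 + magnitude) compresses the large excursion between DC and the rest.
log_spectrum = [[math.log1p(abs(c)) for c in row] for row in dft2(image)]
centered = center_origin(log_spectrum)

# Middle row of the centered spectrum: horizontal frequency in c/p, log energy.
for k in (2, 4, 6):
    print((k - N // 2) / N, round(centered[N // 2][k], 3))
```

After centering, bin index k along an axis corresponds to (k - N/2)/N cycles per pixel, so the DC term sits in the middle and the cosine shows up symmetrically at -0.25 and +0.25 c/p.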
The Impact of a Bayer CFA
So that is the story for a monochrome sensor. What impact will the introduction of a Bayer Color Filter Array have on the Spectrum of a current digital camera all else equal? Linearity and superposition apply so one way to look at the CFA image is to pretend that we are actually sub-sampling four separate continuous images, one per color plane, at a spacing of every other monochrome pixel according to the Bayer layout. We can then estimate the spectrum of each sub-sampled image separately – and add the individual results up to obtain the spectrum of the CFA image.
Each sub-sampled color plane in Figure 4 has half the linear pixels of the fully populated monochrome sensor, so its sampling pitch is doubled ($2p$) compared to the monochrome pitch ($p$). Twice the pitch means convolving the baseband spectrum with a comb that has half the spacing between the Dirac deltas in equation (2), therefore halving the Nyquist frequency. To verify this intuition we can take a look at the spectrum of a Siemens star, again captured in the raw data by the fine folks at DPReview.com, this time with a Bayer CFA Nikon D7200 at base ISO. It is shown below with the same spatial frequency scale as in Figure 3. Note pixelation in the spatial image to the left due to the different relative intensity in the four color planes:
Sure enough the baseband signal of the Bayer CFA D7200 appears to be repeating at twice the linear frequency of the Monochrome Typ 216, reducing by half the pristine unaliased Spectrum. Nyquist here appears to occur at -1/4 and 1/4 cycles per monochrome pitch, versus -1/2 and 1/2 before.
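A one dimensional analogue of a single sub-sampled color plane may help here. The sketch below assumes a made-up smooth signal and simply zeroes every other sample, as if only one color's sites had been kept; it is an illustration of the intuition, not of the D7200 data.

```python
import cmath
import math

def dft_magnitude(signal):
    """Magnitude of the 1D DFT, one value per frequency bin k (k/N c/p)."""
    n = len(signal)
    return [abs(sum(s * cmath.exp(-2j * math.pi * k * x / n)
                    for x, s in enumerate(signal)))
            for k in range(n)]

N = 16
# Hypothetical low-frequency signal on the full (monochrome) lattice.
full = [1.0 + math.cos(2 * math.pi * x / N) for x in range(N)]
# Keep only every other site, as one color plane of a CFA would.
subsampled = [value if x % 2 == 0 else 0.0 for x, value in enumerate(full)]

full_mag = dft_magnitude(full)
sub_mag = dft_magnitude(subsampled)

# Bin N/2 is 0.5 c/p: empty for the full signal, a baseband replica otherwise.
print(round(full_mag[N // 2], 6), round(sub_mag[N // 2], 6))
```

The sub-sampled signal carries a copy of baseband centered on 0.5 c/p, so the range free of overlap shrinks from ±0.5 to ±0.25 cycles per monochrome pitch.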
However, this intuitive explanation is somewhat unsatisfying. For instance, why couldn’t a Bayer CFA sensor behave like a monochrome sensor if the subject were neutral and the raw data properly white balanced before demosaicing?
A Frequency Domain Bayer CFA Model
David Alleysson and, subsequently, Eric Dubois came up with a mathematical model that clearly explains the effect of color on a Bayer CFA sampled image in the frequency domain; their papers are worth a read. In this article I will use Dubois’ notation, which I find a little easier to follow.
Alleysson’s insight, as explained by Dubois, is based on assuming three fully populated planes (i.e. not subsampled, each the size of the equivalent full size monochrome sensor), receiving exactly the same light from the scene, each behind a large color filter with spectral sensitivity equivalent to that of the respective R, G or B Color Filter Array on the digital sensor, as shown below left.
Following the simple math in Dubois’ letter we can see that the image captured in the raw data by a Bayer CFA sensor can be expressed as follows:
$$f_{CFA}[n_1,n_2] = f_L[n_1,n_2] + f_{C1}[n_1,n_2] \, e^{j\pi(n_1+n_2)} + f_{C2}[n_1,n_2] \left(e^{j\pi n_1} - e^{j\pi n_2}\right) \qquad (3)$$
where
- $f_{CFA}$ is the Bayer CFA image captured in the raw data
- $[n_1,n_2]$ are the coordinates of the full size image on the sampling lattice, 0,1,2,3… #samples horizontally and vertically resp.
- $f_L$ is the full resolution baseband grayscale (luma) component of the image on the sensing plane, as better defined below
- $f_{C1}$ and $f_{C2}$ are the two full resolution chrominance components better defined below
- $e^{j\pi(n_1+n_2)}$, $e^{j\pi n_1}$ and $e^{j\pi n_2}$ are the exponential terms responsible for the checkered look of the Bayer CFA image.
By full resolution I mean not sub-sampled but with the same pixel layout and resolution as an equivalent monochrome sensor. As Dubois says ‘This can be interpreted as a baseband luma component $f_L$, a chrominance component $f_{C1}$ modulated at the spatial frequency (0.5, 0.5), and a second chrominance component $f_{C2}$ modulated at the two spatial frequencies (0.5, 0) and (0, 0.5), where spatial frequencies are expressed in cycles per pixel…’, not too far from our earlier intuition. The exponential terms in equation (3) are responsible for the modulation, hence for the replicas of $f_{C1}$ at the corners and $f_{C2}$ at the cardinal points.
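To see where the carriers in Dubois' model put their energy, here is a small sketch that takes the DFT of the two sign patterns $(-1)^{n_1+n_2}$ and $(-1)^{n_1} - (-1)^{n_2}$ on an 8x8 lattice. The lattice size and the function name are my own choices for illustration.

```python
import cmath
import math

def nonzero_frequencies(pattern):
    """Return (fx, fy) in cycles/pixel of the DFT bins that hold energy."""
    rows, cols = len(pattern), len(pattern[0])
    found = []
    for v in range(rows):
        for u in range(cols):
            total = sum(pattern[n2][n1]
                        * cmath.exp(-2j * math.pi * (u * n1 / cols + v * n2 / rows))
                        for n2 in range(rows) for n1 in range(cols))
            if abs(total) > 1e-6:          # ignore floating point residue
                found.append((u / cols, v / rows))
    return found

N = 8
# exp(j*pi*(n1+n2)) is just +1/-1 in a checkerboard.
diagonal_carrier = [[(-1) ** (n1 + n2) for n1 in range(N)] for n2 in range(N)]
# exp(j*pi*n1) - exp(j*pi*n2): vertical stripes minus horizontal stripes.
axial_carrier = [[(-1) ** n1 - (-1) ** n2 for n1 in range(N)] for n2 in range(N)]

diagonal_peaks = nonzero_frequencies(diagonal_carrier)
axial_peaks = nonzero_frequencies(axial_carrier)
print(diagonal_peaks)
print(axial_peaks)
```

The checkerboard carrier has a single line at (0.5, 0.5) and the stripe carrier has lines at (0.5, 0) and (0, 0.5). Multiplying a chrominance image by either pattern therefore shifts its spectrum to those frequencies.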
Full Res Grayscale, Subsampled Chrominance
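$f_L$, $f_{C1}$ and $f_{C2}$ are defined in terms of full resolution $f_R$, $f_G$ and $f_B$ color plane data as follows:
$$\begin{bmatrix} f_L \\ f_{C1} \\ f_{C2} \end{bmatrix} = \begin{bmatrix} \tfrac{1}{4} & \tfrac{1}{2} & \tfrac{1}{4} \\ -\tfrac{1}{4} & \tfrac{1}{2} & -\tfrac{1}{4} \\ -\tfrac{1}{4} & 0 & \tfrac{1}{4} \end{bmatrix} \begin{bmatrix} f_R \\ f_G \\ f_B \end{bmatrix} \qquad (4)$$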
This is interesting because Equation (3) suggests that mathematically a Bayer CFA image consists of a full resolution baseband grayscale component $f_L$ with Nyquist frequency at the monochrome 0.5 c/p, just like the monochrome sensor, and that only the chrominance components $f_{C1}$ and $f_{C2}$ are sub-sampled at twice the pitch. As a result the chrominance components get modulated out to their respective monochrome Nyquist frequencies and potentially corrupt the otherwise pristine baseband signal, possibly halving monochrome Nyquist and the useful frequency range as shown in Figure 5.
A Bayer CFA raw file contains a full resolution grayscale image because of the correlation between adjacent color pixels, which our earlier thought experiment ignored.
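The decomposition can be checked numerically. The sketch below assumes Dubois' definitions, luma $(R+2G+B)/4$ and chrominances $(-R+2G-B)/4$ and $(-R+B)/4$, together with a Bayer layout with green where $n_1+n_2$ is even and red where $n_1$ is odd. The random planes and helper names are hypothetical. It builds a mosaic the direct way, by picking one color per site, and compares it with luma plus modulated chrominance.

```python
import random

def bayer_channel(n1, n2):
    """Color at column n1, row n2 (assumed layout: G on even n1+n2, R on odd n1)."""
    if (n1 + n2) % 2 == 0:
        return "G"
    return "R" if n1 % 2 == 1 else "B"

N = 6
rng = random.Random(0)
# Three full resolution planes with arbitrary content.
planes = {c: [[rng.random() for _ in range(N)] for _ in range(N)] for c in "RGB"}

max_error = 0.0
for n2 in range(N):
    for n1 in range(N):
        r, g, b = (planes[c][n2][n1] for c in "RGB")
        # Direct mosaic: keep only the color that the CFA passes at this site.
        direct = planes[bayer_channel(n1, n2)][n2][n1]
        # Frequency domain model: luma plus two modulated chrominance terms.
        luma = (r + 2 * g + b) / 4
        c1 = (-r + 2 * g - b) / 4
        c2 = (-r + b) / 4
        model = luma + c1 * (-1) ** (n1 + n2) + c2 * ((-1) ** n1 - (-1) ** n2)
        max_error = max(max_error, abs(direct - model))

print(max_error)
```

The two constructions agree to rounding error at every site, whatever the content of the planes, which is why the raw file can be read as a full resolution grayscale image with chrominance riding on carriers.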
To Subsample or not to Subsample
The other insight comes from Equation (4), where we can easily see that the sub-sampling function of the exponential terms in Equation (3) on the $f_{C1}$ and $f_{C2}$ full resolution images is immaterial when $f_{C1}$ and/or $f_{C2}$ are equal to zero.
For instance it is obvious that $f_{C2}$ disappears whenever $f_R$ = $f_B$, that is when those two color channels are the same. I simulated this case by taking the full resolution raw data of the Monochrome Typ 216 and applying a factor of 0.6 to the pixels that would have corresponded to the R and B channels in a Bayer CFA file, a common scenario in raw data captured by current digital cameras in daylight. Below to the left you can see the demosaiced image, to the right the Spectrum of the mosaiced CFA as described. Note the missing baseband replicas at the cardinal points:
The same will be true anywhere the $f_R$ and $f_B$ planes are equal.
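A stripped-down version of this experiment can be sketched with a flat gray patch in place of the star. The layout (green where $n_1+n_2$ is even, red where $n_1$ is odd), the patch and the helper names are assumptions made for illustration only. The red and blue sites are scaled by 0.6 and the DFT is evaluated just at the frequencies of interest.

```python
import cmath
import math

def bayer_channel(n1, n2):
    """Color at column n1, row n2 (assumed layout: G on even n1+n2, R on odd n1)."""
    if (n1 + n2) % 2 == 0:
        return "G"
    return "R" if n1 % 2 == 1 else "B"

def dft_magnitude_at(image, fx, fy):
    """DFT magnitude at one spatial frequency (fx, fy) in cycles per pixel."""
    total = 0j
    for n2, row in enumerate(image):
        for n1, value in enumerate(row):
            total += value * cmath.exp(-2j * math.pi * (fx * n1 + fy * n2))
    return abs(total)

N = 8
scale = {"R": 0.6, "G": 1.0, "B": 0.6}   # red and blue equal, as in the text
mosaic = [[scale[bayer_channel(n1, n2)] for n1 in range(N)] for n2 in range(N)]

baseband = dft_magnitude_at(mosaic, 0.0, 0.0)
corner = dft_magnitude_at(mosaic, 0.5, 0.5)       # carrier of the first chrominance
cardinal_x = dft_magnitude_at(mosaic, 0.5, 0.0)   # carriers of the second chrominance
cardinal_y = dft_magnitude_at(mosaic, 0.0, 0.5)
print(baseband, corner, cardinal_x, cardinal_y)
```

With red equal to blue the energy at the two cardinal points drops to rounding error, while the corner replica remains because green still differs from the red and blue average.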
On the other hand $f_{C1}$ disappears when the sum of the two $f_G$ channels is equal to the sum of $f_R$ and $f_B$. In this case I applied factors of 1.4 and 0.6 to the R and B planes respectively. Note the missing baseband images at the corners:
The same will be true anywhere the sum of the two $f_G$ planes is equal to the sum of $f_R$ and $f_B$. Here for instance factors of 0.7 and 1.3 respectively were applied instead, to the same effect:
Of course both $f_{C1}$ and $f_{C2}$ disappear where $f_R$ = $f_G$ = $f_B$, the case where the subject is neutral and the color planes each see the same intensity. Here the grayscale baseband component is left alone, uncorrupted by chrominance:
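The four cases above reduce to simple arithmetic on one neutral pixel. The sketch below assumes Dubois' definitions, luma $(R+2G+B)/4$ and chrominances $(-R+2G-B)/4$ and $(-R+B)/4$, with green fixed at 1 and the red and blue factors quoted in the text. The function and case names are mine.

```python
def luma_chroma(r, g, b):
    """Luma and the two chrominance components for one pixel's R, G, B values."""
    luma = (r + 2 * g + b) / 4
    c1 = (-r + 2 * g - b) / 4   # modulated to the corners (0.5, 0.5)
    c2 = (-r + b) / 4           # modulated to the cardinal points
    return luma, c1, c2

# (R factor, G factor, B factor) applied to a neutral subject.
cases = {
    "R and B scaled by 0.6": (0.6, 1.0, 0.6),
    "R x 1.4, B x 0.6": (1.4, 1.0, 0.6),
    "R x 0.7, B x 1.3": (0.7, 1.0, 1.3),
    "neutral": (1.0, 1.0, 1.0),
}

results = {name: luma_chroma(*rgb) for name, rgb in cases.items()}
for name, (luma, c1, c2) in results.items():
    print(f"{name}: L={luma:.2f} C1={c1:.2f} C2={c2:.2f}")
```

Up to floating point rounding, the first case zeroes the second chrominance only, the next two zero the first chrominance only, and the neutral case zeroes both, leaving luma alone in the spectrum.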
What about in practice?
That’s how it works in theory. In the real life Bayer CFA Spectrum of the earlier D7200 capture, white balancing the raw data before demosaicing effectively suppressed $f_{C1}$ and $f_{C2}$ as shown in Figure 12. There’s the approximation of the full resolution grayscale image present in the raw data of a Bayer sensor (don’t forget that what is actually shown is the natural logarithm of the magnitude of the Discrete Fourier Transform, which tends to overemphasize low level signals).
White balancing the raw data was not able to completely eliminate $f_{C1}$ and $f_{C2}$ because of residual slight differences in the information collected by each channel, some physical and some due to non-idealities in the system: for instance differences in the amount of noise, diffraction, lateral chromatic aberration, or non-uniformities in the sensor.
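The effect of white balancing on the chrominance replicas can be sketched on synthetic data. Everything here is an assumption made for illustration: a neutral flat patch, made-up channel responses of 0.5, 1.0 and 0.7 for R, G and B, ±2% random variation per pixel, a layout with green where $n_1+n_2$ is even and red where $n_1$ is odd, and gains taken from the channel means. It is not the procedure applied to the D7200 file.

```python
import cmath
import math
import random

def bayer_channel(n1, n2):
    """Color at column n1, row n2 (assumed layout: G on even n1+n2, R on odd n1)."""
    if (n1 + n2) % 2 == 0:
        return "G"
    return "R" if n1 % 2 == 1 else "B"

def dft_magnitude_at(image, fx, fy):
    """DFT magnitude at one spatial frequency (fx, fy) in cycles per pixel."""
    return abs(sum(value * cmath.exp(-2j * math.pi * (fx * n1 + fy * n2))
                   for n2, row in enumerate(image)
                   for n1, value in enumerate(row)))

def channel_mean(image, channel):
    """Average of the raw values at the sites of one CFA color."""
    values = [image[n2][n1] for n2 in range(len(image))
              for n1 in range(len(image[0])) if bayer_channel(n1, n2) == channel]
    return sum(values) / len(values)

N = 16
rng = random.Random(1)
response = {"R": 0.5, "G": 1.0, "B": 0.7}   # hypothetical channel sensitivities
raw = [[response[bayer_channel(n1, n2)] * (1 + rng.uniform(-0.02, 0.02))
        for n1 in range(N)] for n2 in range(N)]

# White balance: scale R and B so their means match the green mean.
gains = {c: channel_mean(raw, "G") / channel_mean(raw, c) for c in "RGB"}
balanced = [[raw[n2][n1] * gains[bayer_channel(n1, n2)] for n1 in range(N)]
            for n2 in range(N)]

chroma_frequencies = [(0.5, 0.5), (0.5, 0.0), (0.0, 0.5)]
before = [dft_magnitude_at(raw, fx, fy) for fx, fy in chroma_frequencies]
after = [dft_magnitude_at(balanced, fx, fy) for fx, fy in chroma_frequencies]
for freq, b, a in zip(chroma_frequencies, before, after):
    print(freq, round(b, 3), round(a, 3))
```

In this toy setup the replicas shrink substantially after balancing, yet a small residue survives at the cardinal points because the per-pixel variation differs from site to site, in the same spirit as the channel differences listed above.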
So how does introducing a CFA affect sharpness?
In conclusion we have seen that the effect of a Bayer CFA on the spatial frequencies and hence the ‘sharpness’ captured by a sensor compared to those from a corresponding monochrome imager can go from nothing to halving the potentially unaliased range based on the chrominance content of the image projected on the sensing plane and the direction in which the spatial frequencies are being stressed.
This kind of analysis was responsible for a flurry of papers on frequency domain demosaicing which were state-of-the-art about ten years ago and are still going strong today.
Notes and References
1. Joseph W. Goodman. Introduction to Fourier Optics, 3rd Edition, p. 22.
2. David Alleysson, Sabine Susstrunk, Jeanny Herault. Linear demosaicing inspired by the human visual system. IEEE Transactions on Image Processing, 2005, 14 (4), pp. 439-449.
3. Eric Dubois. Frequency-Domain Methods for Demosaicking of Bayer-Sampled Color Images. IEEE Signal Processing Letters, 2005, 12 (12), p. 847.
4. Nai-Xiang Lian, Lanlan Chang, Yap-Peng Tan, Vitali Zagorodnov. Adaptive Filtering for Color Filter Array Demosaicing. IEEE Transactions on Image Processing, October 2007, 16 (10).
5. Eric Dubois CFA Demosaicing page.