Bayer CFA Effect on Sharpness

In this article we shall find that the effect of a Bayer CFA on the spatial frequencies, and hence the ‘sharpness’, captured by a sensor compared to a corresponding monochrome imager can range from nothing to halving the potentially unaliased range.  The outcome depends on the chrominance content of the image projected on the sensing plane and on the direction in which the spatial frequencies are being stressed.

A Little Sampling Theory

We know from Goodman[1] and previous articles that the sampled image (I_{s} ) captured in the raw data by a typical current digital camera can be represented mathematically as  the continuous image on the sensing plane (I_{c} ) multiplied by a rectangular lattice of Dirac delta functions positioned at the center of each pixel:

(1)   \begin{equation*} I_{s}(x,y) = I_{c}(x,y) \cdot comb(\frac{x}{p}) \cdot comb(\frac{y}{p}) \end{equation*}

with the comb functions representing the two dimensional grid of delta functions, spaced sampling pitch p apart horizontally and vertically.  To keep things simple the sensing plane is considered here to be the imager’s silicon itself, which sits below microlenses and other filters, so the continuous image I_{c} is assumed to incorporate their effects as well as those of the pixel aperture.

Because spatial domain multiplications become convolutions in the frequency domain, the Spectrum F_{s} of sampled image I_{s} is just the Fourier Transform of the continuous image F_{c} convolved with the transform of the comb functions:

(2)   \begin{equation*} F_{s}(f_{x},f_{y}) = F_{c}(f_{x},f_{y}) \ast\ast [p^2 comb(pf_{x}) comb(pf_{y})] \end{equation*}

with \ast\ast indicating two dimensional convolution and p sampling pitch.
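The periodicity implied by equation (2) can be checked numerically in one dimension.  The sketch below is in Python with NumPy (my choice for illustration), and the sample count and test frequency are arbitrary assumptions.  It samples a cosine at 0.3 c/p and one at 0.7 c/p at unit pitch: once sampled the two should be indistinguishable, because the replica centered on 1 c/p folds 0.7 c/p back to 0.3 c/p.

```python
import numpy as np

p = 1.0                      # sampling pitch (one pixel)
n = np.arange(64) * p        # sample positions at pixel centers

f_low = 0.3                  # c/p, below Nyquist (0.5 c/p)
f_high = 1.0 / p - f_low     # 0.7 c/p, above Nyquist

s_low = np.cos(2 * np.pi * f_low * n)
s_high = np.cos(2 * np.pi * f_high * n)

# Once sampled, the two cosines produce the same sequence of values
aliased = np.allclose(s_low, s_high)
print(aliased)
```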

Monochrome Spectrum

We saw what that looked like in 3D in the article on Aliasing but this time I am going to show the magnitude of the Discrete Fourier Transform of a typical test target captured by a monochrome digital camera in 2D as an image, with image brightness representing  the energy of the relative spatial frequency:

Figure 1. Left: Siemens star target as captured by a Monochrome Typ 216 in the raw data.  Right: Spectrum (magnitude of the Discrete Fourier Transform) of the image on the left.

On the left is a Siemens star target captured by the fine folks at DPReview.com with a Leica Monochrome Typ 216 at base ISO.  On the right is the linear magnitude of the DFT of the raw capture as performed by Matlab/Octave, otherwise known as its Spectrum.   It is difficult to see what’s going on above right because of the very large energy excursion involved, so normally a logarithm of the Spectrum is shown instead – keeping in mind that this approach tends to overemphasize low energy effects.  Below the same data is displayed as ln(1+Spectrum):
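As a rough illustration of that display step, here is a minimal sketch in Python with NumPy rather than Matlab/Octave.  The synthetic image and the function name log_spectrum are my own stand-ins, not the actual raw capture or the code used for the figures.

```python
import numpy as np

def log_spectrum(image):
    """Return (|DFT|, ln(1 + |DFT|)) of a 2D array, origin at top left."""
    spectrum = np.abs(np.fft.fft2(image))
    return spectrum, np.log1p(spectrum)

# Hypothetical stand-in for a raw capture: a bright square on a dark field
image = np.zeros((64, 64))
image[24:40, 24:40] = 1000.0

spectrum, log_spec = log_spectrum(image)

# The DC term dominates the linear Spectrum; the log compresses the range
print(spectrum.max(), log_spec.max())
```

The zero frequency term of the linear Spectrum is the sum of all pixel values, which is why it swamps everything else when displayed on a linear scale.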

Figure 2. ln(1+Spectrum) of image in Figure 1.  Origin is top left.  The yellow lines represent Nyquist frequencies.

The horizontal and vertical units are cycles per pixel pitch with zero cycles/pitch (c/p) top left and one c/p at the other three corners: (0,0), (1,0), (1,1), (0,1) clockwise for linear spatial frequencies f_{x} and f_{y} in the x and y directions respectively.  Nyquist frequencies are then half way along the top, bottom, left and right edges of Figure 2, corresponding to the yellow lines.

The DFT routine only shows one period of the baseband Spectrum but, mentally, tile Figure 2 vertically and horizontally forever because the function is actually infinitely periodic, per equation (2).  The tiling is caused by the orthogonal Dirac delta combs, which effectively replicate baseband at intervals of one cycle per sampling pitch.  With that in mind it is sometimes useful to look at the baseband Spectrum with the origin (0,0) shifted to the center of the relative image:

Figure 3.  Baseband spectrum in Figure 2, centered.  Nyquist frequencies are along the edges of the image.

Now the origin of the (f_{x},f_{y}) spatial frequencies is in the center of the Spectrum, the four corners representing (-0.5,-0.5), (0.5,-0.5), (0.5,0.5), (-0.5,0.5) c/p clockwise starting top left.  Nyquist frequencies therefore run along the edges.  It’s clear that as long as all the energy of the spatial image fits within the Nyquist Frequency boundaries there will be no aliasing.  On the other hand it is obvious by looking at Figure 2 that this is not the case here because some of the Spectrum extends into neighboring quadrants, therefore exceeding Nyquist.  We can see the same thing in Figure 3 by noticing the ‘reflection’ of rays at the edges of the Spectrum.
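The two layouts can be reproduced with a small sketch, again in Python with NumPy as an assumption on my part.  It builds an 8x8 image containing a single cosine at 0.25 c/p along x, then uses fftfreq and fftshift to label the centered Spectrum in c/p.

```python
import numpy as np

n = 8                                   # samples per side (assumed even)
fx = np.fft.fftfreq(n, d=1.0)           # c/p in DFT order: 0 first, then negatives
fx_centered = np.fft.fftshift(fx)       # from -0.5 up to just below +0.5

# A cosine at 0.25 c/p along x, constant along y
x = np.arange(n)
image = np.tile(np.cos(2 * np.pi * 0.25 * x), (n, 1))

spectrum = np.abs(np.fft.fft2(image))
centered = np.fft.fftshift(spectrum)    # origin moved to index (n//2, n//2)

# Frequencies along the fy = 0 row where the centered Spectrum has energy
peaks = fx_centered[np.nonzero(centered[n // 2] > 1e-9)]
print(fx_centered)
print(peaks)
```

Note that with an even number of samples the shifted axis includes -0.5 c/p but stops one bin short of +0.5 c/p, so Nyquist appears on one edge only of each axis.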

The Impact of a Bayer CFA

So that is the story for a monochrome sensor.  What impact will the introduction of a Bayer Color Filter Array have on the Spectrum of a current digital camera all else equal?   Linearity and superposition apply so one way to look at the CFA image is to pretend that we are actually sub-sampling four separate continuous images, one per color plane, at a spacing of every other monochrome pixel according to the Bayer layout.  We can then estimate the spectrum of each sub-sampled image separately – and add the individual results up to obtain the spectrum of the CFA image.

Figure 4. One way to think of the effect of a Bayer CFA on the Spectrum captured by  a sensor is to consider separate color planes sampled every other pixel.  Here the two G channels are shown superimposed.

Each sub-sampled color plane in Figure 4 has half as many pixels in each linear dimension as the fully populated monochrome sensor, so sampling pitch (p) is doubled for a Bayer CFA sensor compared to monochrome.  Twice the pitch means convolving the baseband spectrum with combs with half the spacing between the Dirac deltas in equation (2), therefore halving the Nyquist frequency.  To verify this intuition we can take a look at the spectrum of a Siemens star, again captured in the raw data by the fine folks at DPReview.com, this time with a Bayer CFA Nikon D7200 at base ISO.  It is shown below with the same spatial frequency scale as in Figure 3.  Note pixelation in the spatial image to the left due to the different relative intensity in the four color planes:
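The halving of the replica spacing can also be seen on synthetic data.  In the sketch below (Python with NumPy; the test pattern, the choice of which pixel in each 2x2 block to keep and the threshold are all assumptions for illustration) a full resolution plane is zeroed everywhere except at one site per 2x2 block, and the bins holding energy are counted before and after.

```python
import numpy as np

n = 32
y, x = np.mgrid[0:n, 0:n]

# Full resolution plane: a constant plus a low frequency cosine (2/32 c/p in x)
plane = 1.0 + np.cos(2 * np.pi * (2 / n) * x)

# Keep one pixel in every 2x2 block (say the R sites), zero elsewhere
mask = ((x % 2 == 1) & (y % 2 == 0)).astype(float)
subsampled = plane * mask

full_spec = np.abs(np.fft.fft2(plane))
sub_spec = np.abs(np.fft.fft2(subsampled))

# (row, col) bins holding energy; bin k corresponds to k/n c/p
full_bins = sorted(zip(*np.nonzero(full_spec > 1e-6)))
sub_bins = sorted(zip(*np.nonzero(sub_spec > 1e-6)))
print(len(full_bins), len(sub_bins))
print(sub_bins)
```

The sub-sampled plane should show the original three bins plus copies of them centered on bin 16 in x, in y and in both, that is replicas every 0.5 c/p instead of every 1 c/p.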

Figure 5.  Left: Raw file of Siemens star captured by DPReview.com with a Bayer CFA Nikon D7200 at base ISO.  Right: Spectrum of the raw file, computed as the magnitude of the DFT of the linear image to the left as-is.

Sure enough the baseband signal of the Bayer CFA D7200 appears to be repeating at twice the linear frequency  of the Monochrome Typ 216, reducing by half the pristine unaliased Spectrum.  Nyquist here appears to occur at -1/4 and 1/4 cycles per monochrome pitch, versus -1/2 and 1/2 before.

However, this intuitive explanation is somewhat unsatisfying.  For instance, why couldn’t a Bayer CFA sensor behave like a monochrome sensor if the subject were neutral and the raw data properly white balanced before demosaicing?

Full Res Grayscale, Subsampled Chrominance

David Alleysson[2] and, subsequently, Eric Dubois[3] came up with a mathematical model that clearly explains the effect of color on a Bayer CFA sampled image in the frequency domain.  Their papers are worth a read.  In this article I will use Dubois’ notation, which I find a little easier to follow.

Alleysson’s insight, as explained by Dubois, is based on assuming three fully populated planes (i.e. not subsampled, each the size of the equivalent full size monochrome sensor), receiving exactly the same light from the scene, each behind a large color filter with spectral sensitivity equivalent to that of the respective R, G or B Color Filter Array on the digital sensor.

Figure 7.  Another way to model the impact of a Bayer CFA on the Spectrum captured by a digital sensor is to consider three full resolution images, each under one of the three color filters of a typical CFA, and to separate them into full resolution grayscale and chrominance components.

Following the simple math in Dubois’ letter we can see that the image captured in the raw data by a Bayer CFA sensor can be expressed as follows:

(3)   \begin{equation*} \begin{aligned} CFA_{(x,y)} &= L_{(x,y)} \\ &+ C_{1(x,y)}\cdot e^{i\frac{2\pi(x+y)}{2}}\\ &+ C_{2(x,y)}\cdot (e^{i\frac{2\pi x}{2}}- e^{i\frac{2\pi y}{2}}) \end{aligned} \end{equation*}

where:

  • CFA is the Bayer CFA image captured in the raw data
  • (x,y) are the integer coordinates of the full size image on the sampling lattice: 0, 1, 2, 3, … up to the number of samples horizontally and vertically respectively
  • L is the full resolution baseband grayscale (luma) component of the image on the sensing plane, as better defined below
  • C_{1} and C_{2} are the two full resolution chrominance components, better defined below
  • the e terms are the complex exponentials responsible for modulating the chrominance components out to Nyquist.

By full resolution I mean not sub-sampled but with the same pixel layout and resolution as an equivalent monochrome sensor.  As Dubois says, ‘This can be interpreted as a baseband luma component, a chrominance component modulated at the spatial frequency (0.5, 0.5), and a second chrominance component modulated at the two spatial frequencies (0.5, 0) and (0, 0.5), where spatial frequencies are expressed in cycles per pixel…’, not too far from our earlier intuition.  The e terms in equation (3) are responsible for the modulation, hence for the replicas of C_{1} at the corners and C_{2} at the cardinal points.

L, C_{1} and C_{2} are defined in terms of full resolution R, G and B color plane data as follows:

(4)   \begin{equation*} \begin{aligned} L_{(x,y)} &= +\tfrac{1}{4}R_{(x,y)} +  \tfrac{1}{2}G_{(x,y)} + \tfrac{1}{4}B_{(x,y)}\\ C_{1(x,y)} &= -\tfrac{1}{4}R_{(x,y)} +  \tfrac{1}{2}G_{(x,y)} - \tfrac{1}{4}B_{(x,y)}\\ C_{2(x,y)} &= -\tfrac{1}{4}R_{(x,y)} + \tfrac{1}{4}B_{(x,y)} \end{aligned} \end{equation*}
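Equations (3) and (4) can be checked against a directly mosaiced image.  The sketch below (Python with NumPy, my choice) assumes the Bayer layout implied by the signs in equation (3): G where x+y is even, R where x is odd and y even, B where x is even and y odd.  The three random planes are arbitrary test data.

```python
import numpy as np

rng = np.random.default_rng(0)
h, w = 6, 8
R, G, B = rng.random((3, h, w))         # arbitrary full resolution planes

y, x = np.mgrid[0:h, 0:w]               # integer lattice coordinates

# Equation (4)
L = 0.25 * R + 0.5 * G + 0.25 * B
C1 = -0.25 * R + 0.5 * G - 0.25 * B
C2 = -0.25 * R + 0.25 * B

# Equation (3): at integer n, exp(i*2*pi*n/2) = (-1)**n
cfa_model = L + C1 * (-1.0) ** (x + y) + C2 * ((-1.0) ** x - (-1.0) ** y)

# Direct mosaic under the assumed layout
cfa_direct = np.where((x + y) % 2 == 0, G, np.where(x % 2 == 1, R, B))

print(np.allclose(cfa_model, cfa_direct))
```

A different Bayer phase (for instance R at the origin) would change the signs of the modulating terms but not the overall picture.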

This is interesting because Equation (3) suggests that mathematically a Bayer CFA image consists of a full resolution baseband grayscale component L with Nyquist frequency at \frac{1}{2} monochrome c/p just like the monochrome sensor – and only the chrominance components C_{1} and C_{2} are sub-sampled at twice the pitch.  As a result the chrominance components  get modulated out to their respective monochrome Nyquist frequencies and potentially corrupt the otherwise pristine baseband signal, possibly halving monochrome Nyquist and the useful frequency range as shown in Figure 5.
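The modulation is easiest to see with a uniform colored patch, where L, C_{1} and C_{2} are constants.  In the sketch below (Python with NumPy; the patch values and the Bayer layout with G where x+y is even, R where x is odd and y even are assumptions) the DFT of the mosaic, normalized by the number of pixels, should contain L at the origin, C_{1} at (0.5, 0.5) c/p, and C_{2} with opposite signs at (0.5, 0) and (0, 0.5) c/p.

```python
import numpy as np

n = 16
r, g, b = 0.8, 1.0, 0.4                  # hypothetical uniform patch values

y, x = np.mgrid[0:n, 0:n]
cfa = np.where((x + y) % 2 == 0, g, np.where(x % 2 == 1, r, b))

F = np.fft.fft2(cfa) / cfa.size          # normalized so DC equals the mean

dc = F[0, 0].real                        # (0, 0) c/p
corner = F[n // 2, n // 2].real          # (0.5, 0.5) c/p
cardinal_x = F[0, n // 2].real           # fx = 0.5, fy = 0
cardinal_y = F[n // 2, 0].real           # fx = 0, fy = 0.5

# Values predicted by equation (4)
L = 0.25 * r + 0.5 * g + 0.25 * b
C1 = -0.25 * r + 0.5 * g - 0.25 * b
C2 = -0.25 * r + 0.25 * b
print(dc, corner, cardinal_x, cardinal_y)
print(L, C1, C2)
```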

A Bayer CFA raw file contains a full resolution grayscale image L because of the correlation between adjacent color pixels, which our earlier thought experiment ignored.

To Subsample or not to Subsample

The other insight comes from Equation (4), where we can easily see that the sub-sampling function of the e terms in Equation (3) on the C_{1} and C_{2} full resolution images is immaterial when C_{1} and/or C_{2} are equal to zero.

For instance it is obvious that C_{2} disappears whenever R = B, that is when those two color channels are the same.  I simulated this case by taking the full resolution raw data of the Monochrome Typ 216 and applying a factor of 0.6 to the pixels that would have corresponded to the R and B channels in a Bayer CFA file – a common scenario in raw data captured by current digital cameras in daylight.  Below to the left you can see the demosaiced image, to the right the Spectrum of the mosaiced CFA as described.  Note the missing baseband replicas at the cardinal points:

Figure 8.  Left: The Siemens Star monochrome image in Figure 1 with assumptive R and B channels multiplied by 0.6 and then demosaiced.  Right: The spectrum of the relative CFA image.  Note the missing C2 replicas at the cardinal points.

The same will be true anywhere the R and B planes are equal.

On the other hand C_{1} disappears when the sum of the two G channels is equal to the sum of R and B.  In this case I applied factors of 1.4 and 0.6 to the R and B planes respectively, so that R + B = 2G.  Note the missing baseband images at the corners:

Figure 9. Left: The Siemens Star monochrome image in Figure 1 with assumptive R channel multiplied by 1.4 and the B channel multiplied by 0.6 and then demosaiced.  Right: The spectrum of the relative CFA image. Note the missing C1 replicas in the corners and that 1.4+0.6 = 2.

The same will be true anywhere the sum of the two G planes is equal to the sum of R and B.  Here for instance factors of 0.7 and 1.3 respectively were applied instead, to the same effect:

Figure 10. Left: The Siemens Star monochrome image in Figure 1 with assumptive R channel multiplied by 0.7 and the B channel multiplied by 1.3 and then demosaiced.  Right: The spectrum of the relative CFA image. Note the missing C1 replicas in the corners and that 0.7+1.3 = 2.

Of course both C_{1} and C_{2} disappear where R = G = B, the case where the subject is neutral and the color planes each see the same intensity.  Here the grayscale baseband component is left alone, uncorrupted by chrominance:

Figure 11. Left: The Siemens Star monochrome image in Figure 1 with assumptive R, B and G channels multiplied by 1 and then demosaiced.  Right: The spectrum of the relative Bayer CFA image.  Note the absence of C1 and/or C2 replicas.
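The four cases above can be mimicked on synthetic data.  The following is a sketch under stated assumptions, not the processing used for the figures: Python with NumPy, a smooth band-limited test image in place of the Typ 216 capture, a hypothetical helper called chroma_carriers, and a Bayer layout with G where x+y is even and R where x is odd and y even.

```python
import numpy as np

def chroma_carriers(mono, r_gain, b_gain):
    """Scale the would-be R and B sites of a monochrome image and return the
    normalized DFT magnitude at the C1 and C2 carrier frequencies."""
    h, w = mono.shape
    y, x = np.mgrid[0:h, 0:w]
    gains = np.where((x + y) % 2 == 0, 1.0,
                     np.where(x % 2 == 1, r_gain, b_gain))
    F = np.abs(np.fft.fft2(mono * gains)) / mono.size
    c1 = F[h // 2, w // 2]               # corner, (0.5, 0.5) c/p
    c2 = F[0, w // 2]                    # cardinal point, (0.5, 0) c/p
    return c1, c2

# Hypothetical smooth test image with unit mean and only low frequencies
n = 32
y, x = np.mgrid[0:n, 0:n]
mono = 1.0 + 0.5 * np.cos(2 * np.pi * 3 * x / n) * np.cos(2 * np.pi * 2 * y / n)

for r_gain, b_gain in [(0.6, 0.6), (1.4, 0.6), (0.7, 1.3), (1.0, 1.0)]:
    c1, c2 = chroma_carriers(mono, r_gain, b_gain)
    print(r_gain, b_gain, round(c1, 6), round(c2, 6))
```

Because the test image has unit mean and no energy near the carriers, the two numbers returned should simply be the magnitudes of the C_{1} and C_{2} coefficients from equation (4) for each pair of factors.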

What about in practice?

That’s how it works in theory.  In the real life Bayer CFA Spectrum of the earlier D7200 capture, white balancing the raw data before demosaicing effectively suppressed C_{1} and C_{2} in its Spectrum, as shown in Figure 12.  What remains is an approximation of the full resolution grayscale image L present in the raw data of a Bayer sensor (don’t forget that what is actually shown is the natural logarithm of the magnitude of the Discrete Fourier Transform, which tends to overemphasize low level signals).

Figure 12. Left: The Siemens Star captured by a D7200 with a Bayer CFA in Figure 5, white balanced and then demosaiced.  Right: The natural logarithm of the Spectrum of the white balanced Bayer CFA.
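The effect of white balancing on the chrominance carriers can be sketched on synthetic data.  Everything in the block below is an assumption for illustration (Python with NumPy, a neutral test pattern, relative channel responses of 0.5 and 0.7 for R and B, a little Gaussian noise standing in for non-idealities, and the same Bayer layout as in Dubois’ equations); it is not the processing applied to the D7200 file.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 32
y, x = np.mgrid[0:n, 0:n]
is_r = (x % 2 == 1) & (y % 2 == 0)       # assumed R sites
is_b = (x % 2 == 0) & (y % 2 == 1)       # assumed B sites

# Hypothetical neutral subject and relative raw channel responses
scene = 1.0 + 0.5 * np.cos(2 * np.pi * 3 * x / n)
k_r, k_b = 0.5, 0.7                      # R and B relative to G, assumed

raw = scene.copy()
raw[is_r] *= k_r
raw[is_b] *= k_b
raw += rng.normal(0.0, 0.01, raw.shape)  # a little noise as a non-ideality

# White balance: undo the relative channel responses before demosaicing
balanced = raw.copy()
balanced[is_r] /= k_r
balanced[is_b] /= k_b

def carriers(cfa):
    """Normalized DFT magnitude at the C1 corner and a C2 cardinal point."""
    F = np.abs(np.fft.fft2(cfa)) / cfa.size
    return F[n // 2, n // 2], F[0, n // 2]

c1_raw, c2_raw = carriers(raw)
c1_wb, c2_wb = carriers(balanced)
print(c1_raw, c2_raw)
print(c1_wb, c2_wb)
```

After white balancing the carrier magnitudes should drop to the level of the noise, small but not exactly zero.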

White balancing the raw data was not able to completely eliminate C_{1} and C_{2} because of residual slight differences in the information collected by each channel, some physical and some due to non-idealities in the system: for instance differences in the amount of noise, diffraction, lateral chromatic aberration, or non-uniformities in the sensor.

So how does introducing a CFA affect sharpness?

In conclusion we have seen that the effect of a Bayer CFA on the spatial frequencies, and hence the ‘sharpness’, captured by a sensor compared to a corresponding monochrome imager can range from nothing to halving the potentially unaliased range.  The outcome depends on the chrominance content of the image projected on the sensing plane and on the direction in which the spatial frequencies are being stressed.

This kind of analysis was responsible for a flurry of papers on frequency domain demosaicing which were state-of-the-art about ten years ago and are still going strong today.

Notes and References

1. Introduction to Fourier Optics 3rd Edition, Joseph W. Goodman, p. 22.
2. Linear demosaicing inspired by the human visual system. David Alleysson, Sabine Susstrunk, Jeanny Herault. IEEE Transactions on Image Processing, Institute of Electrical and Electronics Engineers, 2005, 14 (4), pp.439-449.
3. Frequency-Domain Methods for Demosaicking of Bayer-Sampled Color Images. Eric Dubois. IEEE SIGNAL PROCESSING LETTERS, 2005, 12 (12), p. 847.
4. Adaptive Filtering for Color Filter Array demosaicing. Nai-Xiang Lian, Lanlan Chang, Yap-Peng Tan, Vitali Zagorodnov. IEEE Transactions on Image Processing, 2007, 16 (10).
5. Eric Dubois CFA Demosaicing page.

 

 
