We’ve seen how humans perceive color in daylight thanks to three types of photoreceptors in the retina, called cones, which absorb light from the scene with different sensitivities across the arriving spectrum.

A photographic digital imager attempts to mimic the workings of cones in the retina by placing different color filters, arranged in a Color Filter Array (CFA), on top of its photoreceptors, which we normally call pixels. In a Bayer CFA configuration there are three filters, named for the predominant wavelengths each lets through (red, green and blue), arranged in quartets as shown below:

It is the quality of the combined filtering of the imaging system (lenses, UV/IR filters, CFA, sensor etc.) that determines how accurately a digital camera is able to capture color information from the scene. So what are the characteristics of better systems, and can perfection be achieved? In this article I will pick up the discussion where it was last left off and, ignoring noise for now, attempt to answer this question using CIE conventions, in the process gaining insight into the role of the compromise color matrix and developing a method to visualize its effects.^{[1]}

Just be aware that when I say CFA for brevity in this article I really mean the combined effect of its spectral sensitivity functions, the responsivity of the detector, and the filtering effects of lenses and of any infrared, ultraviolet or other filter present.

#### Eye CFA Proxy = CMF

Since we are sticking with CIE conventions we might as well start with Color Matching Functions, those miraculous curves determined experimentally in 1931 that, multiplied wavelength-by-wavelength by irradiance from the scene, transform color information into XYZ coordinates.

Coordinates in the CIE XYZ color space are said to have one-to-one correspondence to colors perceived by the average person, represented by the CIE Standard Observer. If two sets of coordinates are the same, the two colors will look identical to such a person; if they are sufficiently different, they will look different.

In other words, if an observer and a digital camera were co-located at a scene, and if the camera were able to transform its captured image data to the exact same XYZ values as those produced by a Standard Observer present at the scene, we would have the perfect CFA, capable of capturing any tone the observer might be able to see.

#### The (CIE) CFA for Perfect Color

How is the objective of having image data match one-to-one the observer in XYZ achieved? The most straightforward way is to have the color filters in the CFA match the Observer Color Matching Functions – and let a linear color matrix take care of translating the captured raw data to XYZ, as described in a series of articles starting here. The process can be summarized as follows:

- In order to know what colors the Observer would see from the scene, all one needs to do is convert spectral irradiance from the scene to CIE XYZ by multiplying it by the respective color matching functions in Figure 3.
- If the CFA of a digital camera looked like Figure 3 in relative energy units, the captured raw data would represent CIE XYZ colors natively; we could say that the camera meets the Luther-Ives condition^{[2]}, that its color space is CIE XYZ, and that all tones would match perfectly those seen by the average (CIE) observer.
- Since the CFA is alas fixed but the illumination changes from scene to scene, a linear matrix would then in theory be needed just to correct for different illuminants.
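The first step above can be sketched numerically. The CMF table here is a toy Gaussian stand-in (the real CIE 1931 or 2006 tables must be substituted), and all array names are my own, for illustration only:

```python
import numpy as np

# Wavelength grid and an equi-energy illuminant (flat spectral irradiance)
wl = np.arange(380.0, 731.0, 5.0)       # nm, 5 nm step
irradiance = np.ones_like(wl)           # equi-energy illuminant

def gauss(mu, sigma):
    """Unit-peak Gaussian over the wavelength grid."""
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Toy stand-in for the xbar, ybar, zbar color matching functions (one column each)
cmf = np.stack([gauss(600, 40), gauss(555, 45), gauss(450, 25)], axis=1)

# Multiply wavelength-by-wavelength and sum: a discrete integral giving XYZ
XYZ = (irradiance[:, None] * cmf).sum(axis=0) * 5.0   # 5 nm step width
```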

So here is the perfect CFA under a spectrally flat, equi-energy illuminant (CIE Illuminant E, equal energy per small wavelength interval). From this point on I will use the new and improved 2006 two-degree Color Matching Functions shown in Figure 4 instead of the classic 1931 ones in Figure 3^{[3]}.

And in fact, if we let the optimum matrix finding algorithm described earlier loose on such a CFA, we obtain a perfect score of 100 on the Sensitivity Metamerism Index (SMI, a measure of color accuracy). Here the job of the color matrix would simply be to account for the illuminant, with its diagonal equal to the latter’s XYZ coordinates (which in the case of Illuminant E happen to be 1,1,1):

#### Reading the Matrix

Recall that the matrix effectively performs an invertible linear transformation from one color space to another. In the context of this post it transforms white balanced raw data (r,g,b) to the CIE XYZ color space:

$$\begin{bmatrix} X \\ Y \\ Z \end{bmatrix} = \begin{bmatrix} m_{11} & m_{12} & m_{13} \\ m_{21} & m_{22} & m_{23} \\ m_{31} & m_{32} & m_{33} \end{bmatrix} \begin{bmatrix} r \\ g \\ b \end{bmatrix} \tag{1}$$

Projection of a white balanced raw intensity triplet (r,g,b) to XYZ is performed by multiplying it into the matrix row by row, so that for instance X = m11·r + m12·g + m13·b, and similarly for Y and Z.

In the case of Matrix 1 above the diagonal terms m11, m22 and m33 are all ones while all others are zero. That is good because, for instance, we want the raw data from a uniform neutral patch of the brightest white (1,1,1 when normalized to the 0→1 range) to map to the coordinates of the white point in the projected space, which in this case are also (1,1,1) for the equi-energy illuminant E.
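As a quick sanity check in code (a sketch, with Matrix 1 taken to be the identity as stated):

```python
import numpy as np

# Matrix 1: diagonal terms of 1, off-diagonal terms of 0 (equi-energy illuminant)
M = np.eye(3)

# Brightest uniform neutral patch, white balanced raw data normalized to 0..1
rgb = np.array([1.0, 1.0, 1.0])

# Row-by-row multiplication: X = m11*r + m12*g + m13*b, and so on for Y and Z
XYZ = M @ rgb

# White maps to the white point of the equi-energy illuminant: (1,1,1)
```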

#### Obtaining the Perfect CFA

A manufacturer could produce a CFA with Spectral Sensitivity Functions in the shape of Gaussians, with means of 596.8, 560.2 and 447.1nm and standard deviations of 33.0, 44.0 and 23.5nm for the r, g and b channels respectively –

sprinkling about 20% of the blue curve into the red CFA to obtain that bump in the blue wavelengths – and come out with color close to perfection, an SMI of 98.3 (dots are the Gaussian CFA sensitivities, lines are CMFs for comparison)^{[4]}:
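Such a synthetic CFA can be sketched as follows, using the means and standard deviations quoted in the article and a 20% admixture of the blue curve into red (array names are my own):

```python
import numpy as np

wl = np.arange(380.0, 731.0, 1.0)   # wavelengths, nm

def gauss(mu, sigma):
    """Unit-peak Gaussian spectral sensitivity over the wavelength grid."""
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Gaussian SSFs with the stated means and standard deviations
red   = gauss(596.8, 33.0)
green = gauss(560.2, 44.0)
blue  = gauss(447.1, 23.5)

# Sprinkle about 20% of the blue curve into red for the blue-wavelength bump
red_bumped = red + 0.20 * blue
```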

Again, the job of the matrix here would mainly be to adjust for the current illuminant. It wouldn’t be as clean as the matrix corresponding to Figure 5 because there are some small differences, but close enough:

Note that the terms along the diagonal are all about 1.0 (for the equi-energy illuminant E) and those off the diagonal are close to zero.

While we have the matrix out, though, shouldn’t we be able to use it to also generate the red bump around 450nm? After all it seems to be centered near the blue CFA peak – and we have seen how the matrix’s job is to add together different proportions of energy going through the individual CFA channels.

#### The Perfect CFA, Take II

So couldn’t we just have a single red peak in the relative CFA curve and let the matrix take care of adding in some blue? This is what the perfect color CFA would look like then:

Now the perfect CFA has just three peaks, one for each channel. Red looks taller than in the previous figure because it lost the bump around 450nm, and the curves are designed to all have the same area, that is to integrate to the same raw value under the equi-energy illuminant E. Letting Matlab’s *lsqnonlin* search routine find the best compromise color matrix for this CFA produces the following result:
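The search itself need not be exotic: with raw captures of known patches and their reference XYZ values, even an ordinary linear least-squares fit recovers the matrix. This Python sketch with synthetic data stands in for the Matlab *lsqnonlin* routine used in the article (which minimizes a perceptual error metric instead); names and the 24-patch target are assumptions:

```python
import numpy as np

# Synthetic stand-in for a 24-patch test target: white balanced raw triplets
rng = np.random.default_rng(0)
raw_rgb = rng.random((24, 3))

# Pretend "reference" XYZ values generated by a known matrix, for demonstration
M_true = np.array([[0.83, 0.00, 0.16],
                   [0.00, 1.00, 0.00],
                   [0.00, 0.04, 1.00]])
ref_xyz = raw_rgb @ M_true.T

# Least-squares fit of the 3x3 matrix mapping raw (r,g,b) rows to XYZ rows
M_fit, *_ = np.linalg.lstsq(raw_rgb, ref_xyz, rcond=None)
M_fit = M_fit.T    # rows now map (r,g,b) -> X, Y, Z respectively
```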

Virtually identical to Matrix 2 – except for the resized red component top left and the new coefficient that popped up top right, the blue component in the red channel: that’s the bump we got rid of in the CFA, reinstated by the optimizing routine.

#### Visualizing the effect of the Matrix

In fact we can visualize the CFA directly in XYZ space by applying the matrix to the CFA spectral curves themselves, wavelength by wavelength – it’s a linear system after all. In other words, once multiplied by the matrix under illuminant E, the resulting red curve is 0.8341 times the red CFA curve in Figure 6, minus 0.0030 times the green curve, plus 0.1623 times the blue curve, and so on for the others. If we perform those operations on each of the three curves we effectively transform them to the perceptual XYZ color space. There we can compare them directly to the respective experimentally derived Color Matching Functions:
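That weighted sum is a one-liner. The coefficients below are the red-row values quoted above; the Gaussian curves are stand-ins for the three-peak CFA of Figure 6:

```python
import numpy as np

wl = np.arange(380.0, 731.0, 1.0)   # wavelengths, nm

def gauss(mu, sigma):
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Stand-ins for the three-peak CFA curves of Figure 6
red   = gauss(596.8, 33.0)
green = gauss(560.2, 44.0)
blue  = gauss(447.1, 23.5)

# The transformed red curve in XYZ: a weighted sum per the matrix's red row
x_like = 0.8341 * red - 0.0030 * green + 0.1623 * blue
```

Note how the blue-wavelength bump reappears in `x_like` around 447nm even though the physical red curve has a single peak.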

Voilà, once transformed to XYZ by the matrix the red peak has been brought back to size (83.41% of its value in Figure 6) and the bump in the blue wavelengths is back (+16.23% blue) despite the fact that in the real world the CFA has now just three peaks.

Note however that the r,g,b curves in Figures 7 and 5 aren’t exactly equal: for instance the blue curve in XYZ is slightly raised around 550nm, the result of adding 4.45% green per the relevant matrix coefficient – an addition that is of course not there in the CMF. That’s a result of the compromises necessary to obtain the best matrix possible in the given conditions. Nothing to worry about at this stage though: SMI is still an unheard-of 98.3.

#### Bumps are Red, Violets are Blue

It then becomes apparent that it is not strictly necessary for the red filter to leak into the blue wavelengths in order to reproduce violet, as once suspected. All one needs are properly positioned CFA peaks and a well tuned compromise matrix. Easier said than done, I guess, otherwise we would all be walking around with cameras producing virtually perfect color.

On the other hand, with the red bump baked into the CFA a sensor could collect a wider bandwidth signal in the red channel, with potential white balance benefits and no penalty in color accuracy – so if it’s feasible and it meets design objectives, why not have it? In fact this is not an all or nothing affair: one could choose to have just a little red bump in the blue wavelengths if that worked better with other imager compromises, and let the matrix take over from there. I wonder if that’s why the CFAs of many current cameras show just such little compromises, albeit not in nearly as well behaved a form as in the figures above.

#### CFA IQ in the XYZ Color Space

I found this exercise useful to better understand what is happening when searching for the optimal matrix (by using raw values captured off a known target and compared to published reference data). In the end it turns out that **the objective of the linear color system of a digital camera is to make linear combinations of the three CFA spectral sensitivities look as much as possible like CIE Color Matching Functions** under the given illuminant, in XYZ space. The recipe for the combinations is contained in the compromise color matrix.

IQ, as far as color in a digital camera is concerned and ignoring noise, is determined by how closely the CFA curves resemble the Color Matching Functions in XYZ after transformation by a linear matrix. The closer they are to the CMFs, the more accurate the color.
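One way to turn that statement into a number: fit the best linear combination of the CFA curves to the CMFs and look at the residual. This is a toy closeness score, not the actual SMI computation (which compares rendered patch colors), and all curves here are Gaussian stand-ins:

```python
import numpy as np

wl = np.arange(380.0, 731.0, 1.0)   # wavelengths, nm

def gauss(mu, sigma):
    return np.exp(-0.5 * ((wl - mu) / sigma) ** 2)

# Stand-ins for the camera's CFA curves and for the CMFs (note the CMF red bump)
cfa = np.stack([gauss(596.8, 33.0), gauss(560.2, 44.0), gauss(447.1, 23.5)], axis=1)
cmf = np.stack([gauss(600, 40) + 0.2 * gauss(450, 25),
                gauss(555, 45),
                gauss(450, 25)], axis=1)

# Best linear combination of CFA columns approximating each CMF column
M, *_ = np.linalg.lstsq(cfa, cmf, rcond=None)

# Root-mean-square residual: smaller means the transformed CFA sits closer to the CMFs
rms = float(np.sqrt(np.mean((cfa @ M - cmf) ** 2)))
```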

#### Conclusion

How hard would it be for a manufacturer to produce a CFA with Spectral Sensitivity Functions in the shape of Gaussians centered at 596.8, 560.2 and 447.1nm, with standard deviations of 33.0, 44.0 and 23.5nm, for the r, g and b channels respectively – taking into consideration the other filtering usually present in digital photography (lens, hot mirrors etc.)? A question for the materials science folks – but we know that there are many compromises to be made, especially as they pertain to noise. This is the reason why I suspect I have personally never seen an SMI of 90 or above in practice. Until then, here is one proposed ‘perfect’ color CFA in more familiar form, normalized so that all Spectral Sensitivity Functions peak at 1.

Of course, since we are starting with white balanced raw data, which is proportional to the area under the curves, if we put this CFA through the routine that computes the optimum Compromise Color Matrix we will get Matrix 3 again, and the CFA in XYZ will look exactly as in Figure 7. The only difference is the coefficients required to white balance the raw data, which effectively reset the curves to their natural state seen in Figure 6.

In a future article we will look at the SSFs of the CFA of a few recent cameras transformed to XYZ space to see how closely they are able to match CMFs.

#### Notes and References

1. Lots of provisos and simplifications for clarity, as always. I am not a color scientist, so if you spot any mistakes please let me know.
2. A camera or colorimeter is said to be colorimetric if it satisfies the Luther condition (also called the “Maxwell-Ives criterion”): the product of the spectral responsivity of the photoreceptor and the spectral transmittance of the filters must be a linear combination of the Color Matching Functions. See here (in German; right click to translate it).
3. Note that the CFA curves in this article are in relative energy units and cannot be used to read off Quantum Efficiency, or be compared without conversion to diagrams in quantal units such as those found in manufacturer spec sheets. These curves are often also shown with their peaks normalized to one, but for the purposes of this post their scale is determined by having the same area. Ignoring noise for now, what counts most for color is their shape, not their size, because the matrix optimizing routine adjusts size linearly and automatically to achieve the best possible color accuracy.
4. The means and standard deviations of the synthetic CFA’s Gaussian sensitivity functions were obtained by minimizing the wavelength-by-wavelength root mean square differences to the indicated CMFs.

Do you think the rods in the retina contribute nothing to the perception of colour?

Hi Grierson,

Good question, the discussion above is for photopic conditions, where stimuli have luminances of several cd/m^2 or more and cones alone rule. R.W.G. Hunt goes on to say in his excellent book Measuring Colour that “there is a gradual change from photopic to scotopic vision as the illumination level is lowered, and in this mesopic form of vision both cones and rods make significant contributions to the visual response”. Scotopic vision is instead based on a single type of receptor, the rods, and is therefore monochromatic.