Linearity in the Frequency Domain

For the purposes of ‘sharpness’ (spatial resolution) measurement in photography, cameras can be considered shift-invariant, linear systems.

Shift invariant means that the imaging system responds exactly the same way no matter where light from the scene falls on the sensing medium.  We know that in a strict sense this is not true: for instance, a pixel has a square active area, so it cannot have an isotropic response by definition.  However, when using the slanted edge method of linear spatial resolution measurement, we can make the system effectively shift invariant by careful preparation of the testing setup, for example by slanting the edge only a few degrees off the vertical or horizontal (around 5° is typical).

Linearity in the Spatial Domain

Linear means that if uncorrelated light from two different subjects is projected onto the sensing plane, the sensor’s response will equal the sum of what its response would have been to light from each of them individually: twice the light, twice the effect.  Indeed, since Einstein we know that light can be thought of as made up of quanta of energy (photons), and the total response is the sum of the effects of each of them.  This principle is called superposition.

For example, say light from a flashlight at full power, illuminating a diffuse reflecting surface captured by a digital camera, resulted in a mean value of 100 in the raw data of a typical pixel.  If the flashlight’s output power were halved, that pixel would record a mean raw value of 50.  And if two such identical flashlights, one at full and the other at half power, were co-located and shone simultaneously on the same subject, the pixel’s mean raw value would be 150.
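The flashlight example amounts to simple addition.  Here is a minimal sketch of that superposition, assuming a hypothetical noiseless pixel whose raw value is proportional to incident power (the responsivity constant is made up for illustration):

```python
# Sketch of sensor linearity (superposition), assuming an ideal noiseless
# pixel whose raw value is proportional to the light reaching it.

def raw_value(flashlight_powers, responsivity=100.0):
    """Mean raw value of a pixel given a list of relative source powers.

    responsivity is a hypothetical constant (raw DN per unit power).
    Linearity means the response to combined sources is the sum of the
    responses to each source alone.
    """
    return sum(responsivity * p for p in flashlight_powers)

full = raw_value([1.0])        # one flashlight at full power -> 100.0
half = raw_value([0.5])        # same flashlight at half power -> 50.0
both = raw_value([1.0, 0.5])   # both shining together -> 150.0

assert both == full + half     # superposition holds
```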

If equal-energy-per-small-bandwidth light S_e went through a filter with the spectral characteristics shown below, we could calculate the response of a sensing medium beneath it as the sum of its responses at each individual frequency.  S_e would appear as a horizontal line in the graph below.

Green CFA

Thanks to linearity we can calculate the total mean number of photons that would reach the sensing medium from such illumination, because we know that each photon carries energy

(1)   \begin{equation*} E_{photon} = \frac{hc}{\lambda} = \frac{1240}{\lambda} \text{  eV} \cdot \text{nm} \end{equation*}

where h and c are Planck’s constant and the speed of light respectively.  The total number of photons is then simply the sum of the per-wavelength counts over every wavelength \lambda.

For simplicity we can approximate the continuous answer by evaluating responses every small-bandwidth interval, small typically meaning 10 nm or less for the level of accuracy required in a photographic context.  For instance, if the constant-energy illumination S_e corresponded to 15 eV every 10 nm, the sensing medium could expect that figure divided by E_{photon}, or a mean of about 5.7 photons from the 10 nm interval centered on 475 nm.  Repeating this calculation every 10 nm over the effective range of wavelengths and adding the results up gives the total mean number of photons reaching the sensing plane below the filter, in this case 143.14.  Because we assume that the whole imaging system is linear, this figure should be proportional to the relative values recorded in the raw data.
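The per-band arithmetic can be sketched as follows.  The 15 eV per-band energy is an assumption chosen so that the band centered on 475 nm yields about 5.7 photons, as in the text; the filter transmission samples are made up purely for illustration, standing in for the measured CFA data:

```python
# Photons per 10 nm band for constant-energy illumination S_e,
# using E_photon = hc / lambda = 1240 / lambda (eV, lambda in nm).

HC = 1240.0              # Planck's constant times speed of light, eV * nm
ENERGY_PER_BAND = 15.0   # eV per 10 nm band (assumed illumination level)

def photons_per_band(center_wavelength_nm):
    """Mean photon count in a 10 nm band centered on the given wavelength."""
    e_photon = HC / center_wavelength_nm   # energy of one photon, eV
    return ENERGY_PER_BAND / e_photon

n_475 = photons_per_band(475.0)   # about 5.7 photons

# Total photons reaching silicon under a filter: weight each band by the
# filter's transmission and sum.  These transmission samples are made up;
# the real calculation runs over the full measured CFA curve.
transmission = {455: 0.20, 465: 0.35, 475: 0.55}
total = sum(t * photons_per_band(wl) for wl, t in transmission.items())
```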

Works the Same for Red, Green and Blue CFA Filters

The black line above represents the spectral sensitivity of the average green color filter found in the CFA of 23 semi-current digital cameras as measured by Dengyu Liu; so, assuming light S_e of constant energy per 10 nm of wavelength, the mean number of photons striking the silicon under the green CFA filter would be 143.14.  Performing the same exercise on the red and blue CFA channels we would get 64.39 and 123.28 mean photons under each color filter respectively.

Average RGB CFA Photons per 10 nm

Note that the total number of mean photons under each filter is in the proportion 0.401, 1 and 0.972 for the r, g and b raw channels respectively, showing that the peaks of these curves matter less than their areas for the purpose of calculating the number of interacting photons.  And longer-wavelength red ends up counting relatively less than shorter-wavelength blue.

Linearity in the Frequency Domain

Linearity is key to the physics of light regardless of wavelength, and because the Fourier transform is itself a linear operation, superposition carries over to the frequency domain as well.  In fact, if we assume shift invariance, the Fourier transform of the sum of two stimuli in the spatial domain is equal to the sum of their individual Fourier transforms:

    \[ I_1 + I_2 \Leftrightarrow\mathcal{F}(I_1)+\mathcal{F}(I_2) \]

This is of particular significance to spatial resolution measurements because the MTF curves which tell us so much about the sharpness characteristics of our cameras and lenses are nothing but the modulus of the Fourier transform of the impulse response (or point spread function) of the imaging system, normalized to one at the origin.
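This linearity of the Fourier transform is easy to verify numerically, here with NumPy's FFT on two arbitrary one-dimensional "stimuli":

```python
# Numerical check that the Fourier transform is linear: the transform of
# the sum of two signals equals the sum of their transforms, to within
# floating-point rounding error.
import numpy as np

rng = np.random.default_rng(0)
i1 = rng.random(256)   # two arbitrary spatial-domain stimuli
i2 = rng.random(256)

lhs = np.fft.fft(i1 + i2)
rhs = np.fft.fft(i1) + np.fft.fft(i2)

assert np.allclose(lhs, rhs)
```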

Linearity at Work: An Example with Diffraction

Imagine, for instance, that light S_e with the constant spectral profile described earlier were incident on a sensor’s silicon photodiode after having gone through a relative aperture of f/11 and the average green filter above.  The effect on the sensor can be thought of as the sum of the system’s response to each individual mean photon at its wavelength.

To calculate the expected MTF curve we simply evaluate the un-normalized Optical Transfer Function due to diffraction for every mean photon, add them up, take the modulus of the result and normalize it to one at the origin.

With a few simplifying assumptions about phase we can do the same thing with MTF curves directly, weighted by the number of photons at each wavelength.  In fact, with diffraction we can go straight to MTF50, bypassing the OTF calculation entirely, because MTF50 due to diffraction is equal to

(2)   \begin{equation*} MTF50_{diffraction} = \frac{0.404}{\lambda \cdot N} \end{equation*}

with N the relative aperture or f-number.  Simplifying again, we can calculate MTF50 as above every 10 nm at f/11 at each band’s center wavelength, multiply the result by the number of photons within that bandwidth, and add up the normalized results to get the total MTF50 due to diffraction on the sensor in those conditions: 70.0 lp/mm in this case.  This is the reading we would get when measuring MTF50 off a plane under the green CFA filters only.
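A minimal sketch of this photon-weighted average, using equation (2).  The photon weights below are made up for illustration; the real calculation uses the mean photon counts under the measured green CFA curve:

```python
# Photon-weighted MTF50 of diffraction, per equation (2):
# MTF50 = 0.404 / (lambda * N), with lambda in mm and N the f-number.

N = 11.0   # relative aperture (f-number)

def mtf50_diffraction(wavelength_nm, f_number):
    """MTF50 due to diffraction, in line pairs per mm."""
    wavelength_mm = wavelength_nm * 1e-6
    return 0.404 / (wavelength_mm * f_number)

# (center wavelength in nm, mean photons) pairs every 10 nm.
# These weights are hypothetical stand-ins for the green-filter data.
spectrum = [(495, 2.0), (515, 4.5), (535, 5.0), (555, 4.0), (575, 1.5)]

total_photons = sum(n for _, n in spectrum)
mtf50_green = sum(n * mtf50_diffraction(wl, N)
                  for wl, n in spectrum) / total_photons
# roughly 69 lp/mm with these made-up weights; the measured green CFA
# data in the text gives 70.0 lp/mm
```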

The system being linear, we could in fact plug 70 lp/mm and f/11 back into equation (2) and determine that if the same total number of photons passed through the same relative aperture at a single wavelength of 524.7 nm, the measured MTF would be exactly the same.
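Inverting equation (2) to recover that equivalent wavelength is one line of arithmetic, sketched here with the figures from the text:

```python
# Invert equation (2): given MTF50 (lp/mm) and f-number, find the single
# wavelength at which monochromatic diffraction would give the same MTF50.

def equivalent_wavelength_nm(mtf50_lp_mm, f_number):
    """Wavelength (nm) whose diffraction MTF50 equals the given value."""
    wavelength_mm = 0.404 / (mtf50_lp_mm * f_number)
    return wavelength_mm * 1e6   # mm -> nm

lam_g = equivalent_wavelength_nm(70.0, 11.0)   # ~524.7 nm (green channel)
lam_b = equivalent_wavelength_nm(79.0, 11.0)   # ~464.9 nm (blue channel)
# Plugging in the rounded red figure of 62.5 lp/mm recovers ~587.6 nm,
# slightly off the text's 587.9 nm purely because of rounding.
```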

Likewise, MTF50 due to diffraction at f/11 measured off planes under the red and blue average CFA filters would be 62.5 and 79.0 lp/mm, equivalent to the total number of photons under each filter concentrated at the single wavelengths of 587.9 and 464.9 nm respectively, depicted below as vertical lines.

Note that the line heights represent the relative number of mean photons for illuminant S_e under each CFA filter, which we calculated earlier to be in the ratio 0.401, 1 and 0.972 for the R, G and B channels respectively.  If the illuminant were to change, so would their ratio and position.

Average RGB CFA Photons per 10 nm

Superposition of r, g and b Responses

The vertical lines in the figure just above are significant because they represent the performance in the frequency domain of each raw color channel individually, for this specific example.  Because the two green filters in the CFA are identical, we effectively have three color planes (r, g, b) with separate spatial characteristics in the frequency domain.  One would think that, because of linearity, the individual performance of each color plane could be added together to provide a composite system result, as was done with photons of different wavelengths earlier:

    \[ MTF50_L = c_r \cdot MTF50_r + c_g \cdot MTF50_g + c_b \cdot MTF50_b \]

But this being the frequency domain, where do the coefficients c_{r,g,b} come from?  Next.