Sampling in Imaging

This article and the following one will discuss the effect on resolution of digitizing a continuous optical image.

The sampling process carried out by the sensor results in a digital value corresponding to the intensity at each pixel's location. These so-called Data Numbers are ideally stored as-is in the raw file and are proportional to infinitesimal point samples of a new continuous image: the optical image smoothed by the characteristics of the pixels' effective active area, known as the pixel aperture function.

Figure 1. Simulated Pixel Aperture Function of a 4um pitch Back Side Illuminated pixel in isolation. Note the diffusion beyond the ±2um theoretical pixel boundaries suggested by the pitch.

Smoothing by a finite pixel area reduces resolution.

Pixel Aperture, IPS, PSF

A pixel’s effective active area is typically known as its aperture in imaging, though when produced as an intensity map it is also referred to as Intra Pixel Sensitivity or pixel Point Spread Function, depending on context.

The function depends on the sensor’s physical characteristics: whether it is Front or Back Side Illuminated, its architecture, presence and type of microlenses, filters, etc.

Figure 2. The circuitry between arriving photons and the photodiode reduces the effectiveness of Front Side Illuminated pixels compared to Back Side Illuminated. Image courtesy of Cmglee under license, modified to show both options side by side.

As far as we are concerned it is just a detailed two-dimensional map of how likely photons impinging on different portions of the area of a pixel are to be captured.  Think of it as a wavelength dependent two-dimensional filter representing sub-pixel Quantum Efficiency weights.

In the rain-drops-in-bucket analogy, the effective aperture of a pixel can be envisioned as an uneven, leaky cheesecloth that funnels photons to a flow meter near its center. Its leakiness and shape can vary, from square to round to pillow-like, as determined by its physical makeup.

For instance, this is how the folks at RIT measured the Intra Pixel Sensitivity of the BSI sensor on the Kepler space mission; you can find the relevant paper in the notes:[1]

Figure 3. Normalized intra-pixel response of Kepler’s CCD at 600nm, Figure 11 of the paper “Direct measurement of the intra-pixel response function of the Kepler Space Telescope’s CCDs“, Vorobiev et al, 2018

The intensity map on the left and the contour map on the right show, unsurprisingly, that the pixel is better at collecting photons landing near its center than towards the edges, with the rate of change dependent on wavelength.  The map is normalized to a relative maximum weight of one, shown as pure white, worsening through increasingly dark grays down to black for zero.  The shapes become less symmetrical and well behaved towards the edges of the sensor.  The scale above is large because that particular sensor is apparently used for infrared work.

Pixel PSF in the Spatial Domain

BSI pixel aperture can be modeled as the convolution of hyperbolic secants or gaussians with the ideal aperture of a square window function.[2]   Below I used gaussians with the suggested standard deviation for 4 \mu m pitch BSI pixels with light around 550nm:

Figure 4.  Pixel Aperture Function as an image (left) and with intensity projected to the horizontal axis (right).  The yellow lines represent the boundaries of the ideal square pixel.

A single pixel in isolation is shown to emphasize the fact that the pixel’s PSF extends beyond the theoretical square area normally associated with it, because of physical effects like diffusion and crosstalk.  These effects can extend to several pixels away but in practice today’s sensor manufacturers are working hard to contain them much closer to a pixel’s boundaries.[3]
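The aperture model described above can be sketched numerically as a square window convolved with a gaussian. The pitch and standard deviation below are illustrative assumptions for this sketch, not measured values from any particular sensor:

```python
import numpy as np
from scipy.signal import fftconvolve

# Grid in microns; pitch and diffusion sigma are illustrative assumptions
pitch = 4.0   # pixel pitch in um
sigma = 0.5   # assumed gaussian standard deviation in um, modeling diffusion/crosstalk
dx = 0.02     # sample spacing in um
x = np.arange(-2 * pitch, 2 * pitch, dx)
X, Y = np.meshgrid(x, x)

# Ideal square aperture: 1 inside the theoretical pixel boundaries, 0 outside
square = ((np.abs(X) <= pitch / 2) & (np.abs(Y) <= pitch / 2)).astype(float)

# Gaussian kernel normalized to unit volume
gauss = np.exp(-(X**2 + Y**2) / (2 * sigma**2))
gauss /= gauss.sum()

# Pixel aperture function: square window smoothed by the gaussian,
# normalized to a peak of 1 as in Figure 3
aperture = fftconvolve(square, gauss, mode="same")
aperture /= aperture.max()
```

Plotting `aperture` as an intensity map reproduces the qualitative shape of Figures 1 and 4, including the non-zero sensitivity beyond the ±2um boundaries.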

Nevertheless, the decreasing sensitivity towards the edges obviously means that fewer impinging photons will make it through there than in the center, resulting in a 'Fill Factor' of potentially less than 100%: the effective area of the pixel is then less than that implied by the pixel pitch.

The ideal model of 100% Fill Factor would look like a perfectly sharp square and top hat respectively in the Figure above, with sides along the edges of the pixel.  However, the depicted pixel would also qualify as 100% FF because of how it is modeled, with its effective area wider than the physical one without losses, as we will see in the next section.

Pixel PSF in the Frequency Domain

In the ideal case a perfect square aperture of width w shows a frequency response equal to a sinc with its first null at 1/w.  The function above is close to ideal, so its Spatial Frequency Response is near that of a theoretically perfect square aperture up to 1 cycle per pixel pitch (c/p), though the gaussian used to model diffusion/crosstalk abates it substantially at higher frequencies, as can be seen in the Figure below:

Figure 5. Comparison of the Spatial Frequency Response of an ideal perfect square aperture vs the one with diffusion in Figure 4. The plot is for spatial frequencies in the horizontal and vertical directions.

The coincident null at 1 c/p is to be expected in this case, since in the frequency domain the simulated pixel aperture response is the result of multiplying the transform of the gaussians by that of a perfect square aperture, which forces a zero there; see the Appendix.
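That forced zero is easy to verify numerically: convolution in the spatial domain becomes multiplication in the frequency domain, so the sinc of the square aperture is multiplied by the transform of the gaussian, which is everywhere positive. The sigma below is an assumed illustrative value:

```python
import numpy as np

pitch = 4.0   # pixel pitch in um
sigma = 0.5   # assumed diffusion standard deviation in um
f = np.linspace(0, 2 / pitch, 1001)  # spatial frequency in cycles/um, up to 2 c/p

# MTF of an ideal square aperture of width w = pitch: |sinc(w f)|, first null at 1/w
mtf_square = np.abs(np.sinc(pitch * f))  # np.sinc(x) = sin(pi x)/(pi x)

# Fourier transform of the gaussian diffusion kernel (always > 0)
mtf_gauss = np.exp(-2 * (np.pi * sigma * f) ** 2)

# Spatial convolution = frequency multiplication, so the null at 1 c/p survives
mtf_pixel = mtf_square * mtf_gauss
```

The product `mtf_pixel` tracks `mtf_square` at low frequencies and falls increasingly below it towards 1 c/p and beyond, as in Figure 5.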

However, that is not a given when manufacturers tweak or remove microlenses in order to obtain an effectively smaller active area, with the objective of reducing smoothing and therefore maximizing 'sharpness', as we have seen in some recent Medium and Full Format landscape cameras.  In that case the effective top hat width w would be less than the pitch p, and the first null induced by the lower Fill Factor would move to a frequency higher than the ideal 1 c/p.

Front Side Illuminated pixels have smaller sensing areas and their IPSs are less well behaved than Back Side Illuminated ones since they have to deal with interference and blockage by wiring, circuits and other structures in front of the photodiode.  However most camera sensors have moved to BSI over the last decade or so, in part because a larger effective area portends a higher Signal to Noise Ratio.

For additional detail and insight on pixel aperture in the frequency domain refer to this earlier article.

The Sampling Process

In what follows we will assume for simplicity flat, perfectly square contiguous pixel apertures filling 100% of a pixel’s area and no more, with effective aperture width w the same as pitch p:

Figure 6  Pixels count photons impinging on their surface during Exposure.  The count for each pixel is stored sequentially in the raw file.  In this 100% FF case Pixel aperture corresponds to a perfect square with sides equal to pitch, resulting in gapless tiling of the pixels.

The job of pixels in an imaging sensor is to count (integrate) photons projected onto their area during Exposure – after attenuation by the pixel aperture function and losses due to photoelectric conversion (QE).  The photoelectron count corresponding to each pixel is a local area sample of the intensity of the optical image.  This in essence is what sampling an image means.

The camera then writes these values, up to a gain, to the raw file as so-called Data Numbers (DN), to be later retrieved in order to reconstruct the image for display.  Ideally the values written to the raw file are unprocessed and maintain a linear relationship with the incoming light.

The integration of the intensity of an image in a moving neighborhood weighted by a function is known as 2D convolution. In the context of an imaging sensor, this mathematical concept is used to model the effect of the pixel’s physical structure.

Specifically, the light collection by the pixel aperture acts as a convolution of the continuous optical image projected by the lens onto the sensing plane with the continuous pixel aperture function. The smoothed image resulting from this convolution can therefore also be considered to be continuous.

The camera then samples this smoothed, continuous image at discrete points corresponding to the center of each pixel.[*] While physically the pixel collects the total integrated charge across its area, mathematically, this process is modeled as taking an infinitesimally small point sample of the convolved image at each pixel’s center. This is accomplished by a Dirac delta function acting on the smoothed image, as shown in red in Figures 7 and 8.

In other words, the pixel aperture functions as an inherent, image-wide low-pass filter, convolved with the optical image. The smaller the effective photosensitive area, the further away the first null of the filter, and the less the image is smoothed before sampling occurs.

Figure 7. Left: Highly magnified intensity distribution on sensing plane of light from two noiseless point sources at the Rayleigh Criterion after having gone through an unaberrated lens with circular aperture. Right: Same light after the effect of perfect square pixel apertures with 100% fill factor. The red dots are the location of the delta samples, they are pixel pitch p=0.61λN apart.  Sampled intensity is proportional to the Data Numbers written to the raw file.

Note above right the smoothing and spreading out of the two perfectly imaged stars to the left, due to the filtering action of the pixel aperture.  Sampling is then brought home by a delta function picking the value of the intensity resulting from the convolution at the location corresponding to the center of each pixel's aperture (the red dots).

The Meaning of Raw Data

So as far as sampling is concerned, we can ideally think of the values stored in the raw file as infinitesimally small point samples of the intensity of the image projected by the lens, after smoothing by a single image-wide filter with the same characteristics as the pixel Aperture Function.

Below are the images of the two simulated stars in Figure 7 with their intensity projected to the horizontal axis for clarity.  They show the effect of convolution by the pixel aperture (orange curve) on the continuous optical image (blue curve), and the delta samples of the smoothed curve at the center of each pixel that make up the unprocessed raw data (the sampled values are the heights of the red deltas).

Figure 8  Intensity profile of the two stars in Figure 7, showing the effect of convolution with a perfect 100% FF square pixel aperture (orange curve) and delta sampling of the curve at the center of each pixel (red arrows).  The heights of the red arrows are proportional to the values written to the raw file.  The image is being sampled every 0.61λN, not quite enough for perfect reconstruction.

Image Reconstruction

The separation of the function of a pixel into two independent operations – filtering by its aperture followed by point sampling – has important implications, because it allows immediate application of concepts like the \text{sinc} interpolation formula for ideal reconstruction, and more intuitive analysis in the frequency domain.  The value of a pixel in the raw data represents a point on a curve, not a little square.[4]

In fact as long as the sampling rate is at least twice the highest spatial frequency present in the optical image, the Nyquist-Shannon theorem tells us that we should be able to reconstitute the smoothed continuous image from sampled data perfectly, at least in this ideal unprocessed, noiseless, monochrome case.[5]

In most photographic scenarios, the quality of the optical image is ultimately limited by diffraction caused by the nearly circular lens aperture.  This imposes a maximum spatial frequency, known as the extinction frequency (f_{ext}), which is equal to 1/(\lambda N), where \lambda is the wavelength of light (e.g. in microns) and N is the lens working f-number.

This optical limit means that the Modulation Transfer Function (\text{MTF}) of the image is zero beyond f_{ext}, effectively making any photographic image band limited.  Since the act of light collection by the pixel aperture is modeled as a convolution with this optical image, the resulting signal’s spectrum is the multiplication of the optical \text{MTF} by the pixel aperture \text{MTF}.  Because multiplication by zero is zero, the final sampled signal is also band limited by the optical f_{ext}, regardless of the size of the pixel aperture (though a large aperture may impose an even stricter practical limit).

Therefore, to ensure perfect reconstruction independently of the subject (i.e., to satisfy the Nyquist criterion for the diffraction limit), we need the pixel pitch (sampling period) to be no greater than 0.5 \lambda N.

Figure 9. Perfect reconstruction of the ideal image smoothed by the pixel aperture in Figure 8, this time with pitch corresponding to twice the highest frequency in the optical image, as suggested by the Nyquist-Shannon Theorem.  Phase was randomly chosen.
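A reconstruction like the one in Figure 9 can be sketched in one dimension with the \text{sinc} interpolation formula. Here an arbitrary band limited signal (a single sinusoid below Nyquist, frequency and sample count chosen purely for illustration) is sampled at unit pitch and rebuilt by summing shifted sinc kernels:

```python
import numpy as np

# 1D sketch of Whittaker-Shannon (sinc) reconstruction from point samples
p = 1.0                              # sampling period (pixel pitch)
f0 = 0.3                             # signal frequency in cycles/pitch, below Nyquist (0.5)
samples_x = np.arange(-50, 51) * p   # sample locations
samples = np.cos(2 * np.pi * f0 * samples_x)

x = np.linspace(-5, 5, 1001)         # dense grid for the reconstructed curve
# sinc interpolation formula: s(x) = sum_k s[k] * sinc((x - k p)/p)
recon = np.sum(samples[:, None] * np.sinc((x[None, :] - samples_x[:, None]) / p), axis=0)

truth = np.cos(2 * np.pi * f0 * x)
max_err = np.abs(recon - truth).max()  # small, apart from truncation of the sample train
```

With the sampling criterion met, the reconstructed curve lands back on the original continuous signal to within the error caused by using a finite number of samples.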

Assuming mean wavelengths of about 0.55um and f-numbers in the f/4-11 range, that would require pixel pitches around 1-3 microns in typical landscape conditions.
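Those figures follow directly from the criterion p \le 0.5 \lambda N:

```python
# Maximum pixel pitch for perfect reconstruction of a diffraction limited image:
# p <= 0.5 * lambda * N, i.e. half the period of the extinction frequency
lam = 0.55                 # mean wavelength in microns
for N in (4, 5.6, 8, 11):  # typical landscape f-numbers
    p_max = 0.5 * lam * N  # microns
    print(f"f/{N}: max pitch {p_max:.2f} um")
```

Running this prints maximum pitches from about 1.1 um at f/4 to about 3.0 um at f/11.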

That criterion is not quite met in Figure 8 so we could expect some aliasing in the displayed image, though its amount would be limited because fortunately in photography there is diminishing energy at higher frequencies as a result of imperfections in the setup.

We can therefore often relax that constraint, potentially accepting less than ideal reconstruction, as we all know given the 3-5 micron pitches of current enthusiast digital cameras.  Some sensors close the gap by enabling IBIS to shift the sensor by sub-pixel amounts and superimposing multiple captures, something that works well with static subjects, reducing aliasing very effectively while leaving the spatial frequency response unaltered.

In general, absent further processing, the best we can aim for in digital imaging is to reconstruct the continuous image smoothed by pixel aperture, though advanced post processing like deconvolution can often help to get us closer to the geometric image.

The effect of digitization on resolution is next.

 

Appendix for the Mathematically Inclined

Assuming a continuous optical image I_c(x,y) representing photon intensity on the sensing plane and an M×N pixel sensor (e.g. 6000×4000) on the (x,y) physical axes, the Data Numbers written to the raw file at pixel index location (m,n) are equal to

(1)   \begin{align*} DN[m, n]_\lambda &= QE\!\cdot\! g\!\cdot\!\left[ I_c \ast h  \ast a \right]\!(x, y) \cdot \text{III}\!\left(\frac{x}{p}, \frac{y}{p}\right) \quad \\ & \text{evaluated at physical } (x, y) = (mp, np) \end{align*}

where:

  • all variables are wavelength dependent and
  • \ast denotes 2D convolution
  • \cdot element-wise multiplication
  • QE quantum efficiency in photoelectron/photon
  • g system gain in DN/photoelectron
  • h(x, y) filter stack effects that we are ignoring here (e.g. AA)
  • a(x, y) pixel aperture function
  • m = 0, 1, 2, \ldots, M-1, m \in \mathbb{Z}^+ (integer column index)
  • n = 0, 1, 2, \ldots, N-1, n \in \mathbb{Z}^+ (integer row index)
  • p pixel pitch in microns
  • \text{III}\left(\frac{x}{p}, \frac{y}{p}\right) = \displaystyle\sum_{m} \sum_{n} \delta\left(\frac{x}{p} - m, \frac{y}{p} - n\right) is the 2D Dirac comb.
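A discretized numerical sketch of equation (1), with the filter stack h taken as an identity (as in the text) and illustrative values assumed for QE, g and the sensor dimensions:

```python
import numpy as np
from scipy.signal import fftconvolve

# Discrete sketch of equation (1): DN[m,n] = QE * g * [I_c conv a](mp, np),
# on a grid oversampled by 'ov' relative to the pixel pitch
ov, M, N = 8, 6, 4                    # oversampling factor, columns, rows (illustrative)
QE, g = 0.8, 2.0                      # illustrative quantum efficiency and gain
rng = np.random.default_rng(0)
I_c = rng.random((N * ov, M * ov))    # stand-in for the continuous intensity

a = np.ones((ov, ov)) / ov**2         # 100% FF square aperture, unit area
smoothed = fftconvolve(I_c, a, mode="same")

# The Dirac comb keeps only the values at the pixel centers (mp, np)
rows = np.arange(ov // 2, N * ov, ov)
cols = np.arange(ov // 2, M * ov, ov)
DN = QE * g * smoothed[np.ix_(rows, cols)]   # shape (N, M)
```

In a real sensor the aperture a(x,y) would be the measured, non-ideal map of Figure 4, and QE and g would be wavelength and camera dependent.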

In the frequency domain convolutions become multiplications and vice versa, so:

(2)   \begin{equation*} \tilde{DN}(f_x, f_y) = \frac{QE\!\cdot\! g}{p^2}\! \left[ \tilde{I}_c\! \cdot\! H\! \cdot\! A \right] \!(f_x, f_y) \ast \text{III}(p f_x, p f_y) \end{equation*}

with f_x, f_y horizontal and vertical spatial frequencies respectively and 1/p the sampling frequency.  See the article on physical units for how they are determined in practice.

 

Notes and References

1. “Direct measurement of the intra-pixel response function of the Kepler Space Telescope’s CCDs“, Vorobiev et al, 2018
2. “Improved Intra-Pixel Sensitivity Characterization Based on Diffusion and Coupling Model for Infrared Focal Plane Array Photodetector“, Li Zhong et al, 2021.
3. I have looked for IPS maps for current enthusiast and smartphone cameras without success. I would be happy to hear from anyone who could point me to some non-paywalled sources.
4. “A Pixel Is Not A Little Square, A Pixel Is Not A Little Square, A Pixel Is Not A Little Square“, Alvy Ray Smith, January 1995
5. Jim Kasson has an excellent series of posts showing the basics of image reconstruction and its limitations.
6. The code used to produce the above plots can be downloaded by clicking here.

2 thoughts on “Sampling in Imaging”

  1. This is a very nice article.

    As pixel sizes decrease, and approach the wavelength in size, another approach has to be taken based on full electromagnetic theory. Some of the latest phone cameras can have pixels as small as 0.6 micron square. To carry out a similar analysis to the one above for these small pixels, one needs EM finite element modelling analysis, unfortunately. There are novel effects, one aspect of which is described in the open access paper by Catrysse et al, “Subwavelength Bayer RGB color routers with perfect optical efficiency”, Nanophotonics 11(10) 2381-2387 (2022).

    1. Thanks Chris, reference appreciated, interesting new tech. One of the sources I looked at for current tech suggested a smaller standard deviation for smaller pixels, around 0.1 for 1 micron pixels. But it seemed short on facts so I look forward to more robust sources.
