This article is about specifying the units of the Discrete Fourier Transform of an image and the various ways that they can be expressed. This apparently simple task can be fiendishly unintuitive.
The image we will use as an example is the familiar Airy Disk from the last few posts, at f/16 with light of mean 530nm wavelength. Zoomed in to the left in Figure 1; and as it looks in its 1024×1024 sample image to the right:
Feeding the image in figure 1b to Matlab’s Fast Fourier Transform routine produces its two dimensional Discrete Fourier Transform. The DFT is the same size as the original image, so also 1024×1024 values. It is shown below in Figure 2a after taking its magnitude and normalizing it to one at the origin, the top left hand corner. The result of these operations is referred to as the Modulation Transfer Function. Figure 2b is a plot of just the top row, all 1024 samples of it. It represents the spatial frequency response of the imaging system in the horizontal direction.
Let’s concentrate on the one dimensional simple case first: what are the spatial resolution units of the plot, horizontal linear frequency in the direction ? They would be the units of the DFT’s image x-axis. In fact, because the sampling points on the image plane are evenly spaced in the x and y directions, they would also be the units of the y-axis, .
Cycles/set and Cycles/sample
The DFT only knows that it was fed a set () of equally spaced samples (n), say a line of 1024 numbers representing the projection on the XY plane of the Airy function in Figure 1a (its Line Spread Function). Its output is infinitely periodic but by convention it produces just the first period which contains the same number of entries as the initial set, 1024 in this case. Therefore by definition, for every integer between zero and n-1 the DFT computes a value by summing up all figures in the input set according to the following function:
with n the number of samples in the set. The results of the DFT, , are laid out in sequence from to . Every value in the DFT’s output therefore corresponds to an increasing integer spatial frequency : 0 cycles per set, 1 cycle per set, 2 cycles per set… all the way to the number of input samples (n) minus one, in this case 1023. The units of the 1D DFT are therefore
So cycles per input set are the natural units of the DFT. In our example the continuous Airy pattern on the sensing plane was sampled by a square grid, 1024 samples on the side. The output of the DFT would therefore correspond to linear spatial frequencies from 0 to 1023 cycles/set in 1 cycle/set increments, and those would be the units along the axes of the plot in Figure 2b.
We can express these units in cycles/sample by dividing frequencies in cycles/set by the number of evenly spaced samples/set (n) – watch how the units cancel out – to obtain the frequency range from 0 to almost 1 cycle/sample in 1/1024th cycles/sample increments, as shown in Figure 3a:
So the DFT’s output can always be referred to units of cycles/set and cycles/sample. In the absence of additional information, however, these units cannot be directly related to linear space or to other cycles/sample measurements.
Cycles per Pixel, lp/mm, lp/ph
Fortunately we normally do have information that relates the set or sample to physical units. In photography we know the size of the sensor that captured the image and the layout of the evenly spaced pixels it contains. The layout is typically a rectangular grid with evenly spaced pixels a distance pixel-pitch apart horizontally and vertically, center to center.
Assuming that the data set fed to the DFT was the result of one sample per pixel we can replace the label of the axis of Figure 3a below with cycles/pixel pitch (c/p) – and it’s then easy to convert to more practical units like lp/mm, lp/ph etc. as explained in an earlier article.
So that’s how you get units for the output of a Discrete Fourier Transform. If you’ve read this far you may also be interested in this practical take on the same subject. The rest of this post explains more in detail how some of the units in previous articles of this series were arrived at.
One MTF, Twice the Data
By convention the DFT produces an image in the frequency domain which is the same size as the original in the spatial domain (1024 values on the side), but only the first half of it is needed when ignoring phase, as we are doing for now. In such a case the second half is the mirror image of the DFT in the first half. To make the transformed data look more intuitive when seen in isolation, and remembering that the shown function is infinitely periodic, it is sometimes useful to shift the right part of the plot in Figure 3a one cycle/sample to the left, indicating the shifted units below zero as negative frequencies. The plot would then appear as in Figure 3b, with the x-axis representing frequencies from -0.5 to almost 0.5 cycles/sample in 1/1024th cycles/sample increments:
-0.5, -0.5+1/n, -0.5+2/n, -0.5+3/n, … , 0.5-1/n cycles/sample.
If the available image data referred to super-sampling at a rate of 5 samples/pixel, as in the examples of the last few articles, we would have five times the samples and therefore need to multiply shifted (or unshifted) above by this figure. Watch how the units in Figure 3b cancel out, yielding a range of -2.5 to almost 2.5 cycles/pixel at 5/1024th c/p increments (about 1/204th c/p)*.
If a pixel is 5 micrometers on the side, we could work that into the units: -2.5 to almost 2.5 c/p in 5/1024th cycles/pixel increments divided by 0.005 mm/pixel yields a range of -500 to almost 500 lp/mm in increments of 1000/1024th lp/mm.
Both of these cases are shown below, zoomed into the frequencies of interest from zero to 1 c/p, which in this case correspond to zero to almost 200 lp/mm.
If one knows the vertical size of the photographic sensor, say typically 24mm for a Full Frame camera, one can multiply units in lp/mm by 24mm/picture height to obtain spatial resolution in lp/ph. Alternatively one can obtain lp/ph by multiplying spatial resolution in cycles/pixel by the number of pixels on the sensor’s size.
Spatial frequency is often expressed in lp/mm or lp/ph when evaluating the performance of an imaging system for photographic applications.
Slicing and Dicing
So those are the units corresponding to the horizontal and vertical axes of Figure 2. They are the units and in the vertical and horizontal directions respectively. But what happens if we take a 45 degree radial slice from the origin to the opposite corner, as shown in Figure 5a below (yellow line)? This would represent the imaging system’s frequency response in the diagonal direction. The number of samples in the set would still be the same, 1024, but the samples would physically be further apart as a result of the square sampling grid.
Remember that a sample or a set does not have physical dimensions. So if the result of the yellow diagonal FFT slice in cycles/set is plotted together with the earlier horizontal FFT in Figure 2 we would get this misleading graph:
We know the two dimensional Airy disc and its MTF are rotationally symmetric, so the linear result should be the same no matter the direction of the slice of the 2D MTF. To correct the situation all we need to do is convert them both to comparable units. If the (horizontal) pixel pitch is 5um, samples will be 5 um apart diagonally, because the sampling grid is square. Therefore we need to multiply the diagonal spatial frequency units associated to the relative DFT output by the difference in sample spacing, . Here is the result, zoomed into the zero to 1 c/p range. The pixel in the units is the horizontal or vertical, as opposed to the diagonal, pixel pitch:
Slices in other directions would result in different linear sampling intervals which would need to be expressed in compatible units for comparison to others.
The ‘down’ Sampled Image
As a final example let’s revisit the sampled image of the previous article. In that case we pretended that Figure 7a below represented a continuous image. Here we will assume that it is the image in the raw data of the actual capture of an Airy pattern that we wish to downsize 5:1. The image in Figure 7b represents the nearest-neighbor downsampled result (nearest-neighbour implemented as averaging).
The original capture to the left is the same Airy intensity pattern of Figure 1b above, captured in a 1024×1024 sample raw file. Every new pixel to the right corresponds to 5×5 of those before downsampling. That means that the downsized image will be made up of 204×204 pixels. The DFT of the downsampled image will therefore refer to vertical and horizontal spatial resolutions from 0 to 203 cycles/set. Since there are 204 pixels/set we divide the two figures to obtain a range of 0 to almost 1 cycles/new pixel in increments of 1/204th cycles/pixel. Using these spatial frequencies paired with the result of the DFT of the downsampled image produced the plots that compare the performance of the downsampled vs the original image, as shown in the previous article. Pixel in the units refers to the new downsampled pixel 5 um on the side.
From here it’s easy to convert to typical units like lp/mm or lp/ph as discussed earlier.
*As an aside, MTF Mapper super-samples the Line Spread Function at 8 samples/pixel.