Opening Raspberry Pi High Quality Camera Raw Files

The Raspberry Pi Foundation recently released an interchangeable lens camera module based on the Sony  IMX477, a 1/2.3″ back side illuminated sensor with 3040×4056 pixels of 1.55um pitch.  In this somewhat technical article we will unpack the 12-bit raw still data that it produces and render it in a convenient color space.

Figure 1. 12-bit raw capture by Raspberry Pi High Quality Camera with 16 mm kit lens at f/8, 1/2 s, base ISO. The image was loaded into Matlab and rendered Half Height Nearest Neighbor in the Adobe RGB color space with a touch of local contrast and sharpening.  Click on it to see it in its own tab and view it at 100% magnification. If your browser is not color managed you may not see colors properly.

My First Open Source ILC

When the HQ module was announced a couple of weeks ago I was excited to discover that it came with a CS standard mount, opening the possibility of using any lens ever made with it – as long as it respects back-flange limits and an adapter is available.  Finally an inexpensive open source ‘camera’ with a decent sensor and interchangeable lenses of potentially photographic quality.

Mine arrived a few days ago.  The CS mount affixed to the board has V1.0 2018 markings and it came with a CS-C adapter, which was promptly used to attach the 16mm f/1.4-16 C ‘kit’ lens.  The lens is from CGL Electronic Co. LTD, a Chinese company specializing “in the R&D, production and sale of accessories for smartphones, as well as Bluetooth products”.  One of their lines is Megapixel CCTV lenses, and this one is spec’d at 10 MP, as printed on the box.  What size P that refers to, we are not told.  Its field of view is about 27.6 degrees on the diagonal, equivalent to roughly 88mm on Full Frame.

The Sensor

The IMX477 was released in December 2016 by Sony.  It is a 1/2.3″ 4:3 Back Side Illuminated, Stacked Exmor CMOS sensor designed for “consumer use camcorder” applications, though in this article I will evaluate its ability to capture still images, thus  ignoring Sony’s stated use case.  Given its tinkering potential, I am sure I will not be the first or last to do so.

There are 3040×4056 pixels usable for imaging, with a 1.55um pitch.  This portends a sensor active area of 4.712 x  6.287 mm with a 7.857mm diagonal, implying a 5.51x multiplier compared to, say, a Full Frame Nikon Z7 with 5520×8288 4.35um pixels.
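
These geometry figures are easy to verify; here is a quick back-of-the-envelope check in Python, with all inputs taken from the text above:

pitch_um = 1.55
h_px, w_px = 3040, 4056
h_mm = h_px * pitch_um / 1000                 # 4.712 mm
w_mm = w_px * pitch_um / 1000                 # 6.287 mm
diag_mm = (h_mm ** 2 + w_mm ** 2) ** 0.5      # 7.857 mm
crop = (24 ** 2 + 36 ** 2) ** 0.5 / diag_mm   # Full Frame diagonal / HQ diagonal ~ 5.51
print(f'{16 * crop:.0f} mm')                  # the 16 mm kit lens ~ 88 mm FF equivalent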

It sports a Bayer Color Filter Array in the BGGR configuration.  If the figure reported in the MakerNote is to be trusted, its fully electronic shutter has a minimum exposure time of  1/8771.9 of a second and it has been clocked at a maximum of 239 s.   It is capable of producing 12-bit raw data when in still Mode 3.  You can read more about its specs and performance in the next article.

Unpacking the 12-bit Raw Data

When the -r or --raw switch is used with the Pi’s still image command raspistill -md 3 (see the post scriptum at bottom for raspiraw), 12-bit raw CFA data is appended to the resulting 8-bit jpg file, in a block starting with the characters ‘BRCM’.   The first part of the block is a 2^15 byte header, which I ignore, followed by the sensor array data written row-wise: (3040+16) rows of 4056×12/8 + 28 = 6112 bytes each.  The key is realizing that there are 3056 total rows and formatting the data accordingly.

Each row consists of 4056 12-bit elements (4056×12/8 = 6084 bytes), followed by 12 bytes of zeros and 16 bytes of non-imaging data.  The 12 bytes of zeros mark the end of the active area all the way down to the 3040th row.  After that there are 16 additional rows of full-length system data.  In the past this non-imaging system data included optical black pixels that helped determine BlackLevels dynamically, but recent sensors tend not to have such clearly demarcated patches and I was not able to identify any.  Should you know more about these service rows and columns I would be interested in the details.

From the start of every row to the twelve zeros, pixel raw values are packed in triplets: three 8-bit bytes are written for every two 12-bit pixels in the following format:

AAAAAAAA BBBBBBBB BBBBAAAA

The first two bytes hold the eight high bits of the two pixels respectively, while the third contains their two sets of four low bits as shown.  Unpacking them is easily accomplished in Matlab [and, with a bit of adaptation, Octave or your interpreter of choice; see the comments for some Python code][1] as follows, where vector ‘bin’ holds pixel data row-wise:

Figure 2. Unpacking 12 bit raw data loaded from Raspberry Pi High Quality Camera jpeg generated by raspistill -r.  Full Matlab code available at the bottom of the article.
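
Since Figure 2 shows the Matlab code as an image, here is a minimal Python/numpy sketch of just the unpacking step, operating on one row at a time (a fuller reader-contributed Python version appears in the comments below):

import numpy as np

def unpack12(packed):
    # 'packed' holds one row's 6084 active bytes as uint8; every 3 bytes
    # encode 2 pixels as AAAAAAAA BBBBBBBB BBBBAAAA
    p = packed.astype(np.uint16)
    a = (p[0::3] << 4) | (p[2::3] & 0x0F)  # pixel A: high byte then low nibble
    b = (p[1::3] << 4) | (p[2::3] >> 4)    # pixel B: high byte then low nibble
    row = np.empty(a.size * 2, dtype=np.uint16)
    row[0::2], row[1::2] = a, b
    return row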

The full function used on this page can be downloaded from the link at the end of the article.  So now we have the Pi HQ Camera’s 12-bit raw CFA data in 16-bit integer format.

Exif and MakerNotes

There is very little information in the jpeg Exif tags and some of it is incorrect unless explicitly set by the user.  For instance, anything related to the lens, like focal length or f-number, is unreliable because the module and the lens don’t speak to each other.  The goodies are instead in the MakerNotes, where we can find white balance multipliers, a compromise color matrix and more.  The field is made up of a few hundred characters; here is the one from the capture in Figure 3 below:

ev=-1 mlux=-1
exp=900 ag=256 focus=255
gain_r=3.238 gain_b=1.515 greenness=0 ccm=8466,-3816,-550,-476,6390,-1816,302,-1790,5588,0,0,0 md=0 tg=247 247 oth=247 216 b=0 f=247 247 fi=0
ISP Build Date: Feb 12 2020, 12:39:13 VC_BUILD_ID_VERSION: 53a54c770c493957d99bf49762dfabc4eee00e45 (clean) VC_BUILD_ID_USER: dom VC_BUILD_ID_BRANCH: master

Ignore ev (EC) and mlux, which appear fixed.  Then:

  • exp is Exposure Time in microseconds
  • ag divided by 256 is Analog Gain, related to ISO, so in this case it had a value of 1 (the range is 1 to 16)
  • gain_r and gain_b are the white balance multipliers; greenness has so far always been zero in my experience
  • ccm is the Compromise Color Matrix: divide by 4096 and drop the last row of offsets, which in my tests has always been zero.

I don’t know what the rest of the entries are, but I suspect they are related to automatic exposure because they all turn to zero when that mode is turned off (-ex off, undocumented but the setting I always use).
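
For those who want to script this, here is a minimal sketch of pulling the useful fields out of the MakerNote string; the string below is the excerpt quoted above, trimmed to the relevant fields:

import re
import numpy as np

makernote = ('exp=900 ag=256 gain_r=3.238 gain_b=1.515 '
             'ccm=8466,-3816,-550,-476,6390,-1816,302,-1790,5588,0,0,0')

def field(name):
    return re.search(name + r'=([-\d.,]+)', makernote).group(1)

exposure_s  = int(field('exp')) / 1e6    # microseconds to seconds
analog_gain = int(field('ag')) / 256     # 1.0 at base ISO
wb_gains    = float(field('gain_r')), float(field('gain_b'))
ccm = np.array(field('ccm').split(','), dtype=float).reshape(4, 3) / 4096
ccm = ccm[:3]                            # drop the row of zero offsets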

Decoding the Matrix

The matrix in the MakerNote, once divided by 4096 and stripped of the row of zero offsets, looks like this:

 2.067  -0.932  -0.134
-0.116   1.560  -0.443
 0.074  -0.437   1.364

It changes with the lighting, so I would guess that the module interpolates it based on the estimated illuminant.  Where does this matrix take us?  Each of its rows sums to approximately one, and it looks very much like a demosaiced, white-balanced data to sRGB matrix.

To find out I took my setup to the balcony on a veiled, sunny city afternoon to capture a purposely slightly defocused ColorChecker 24 target.  Here is the image produced by the Pi’s GPU-accelerated on-board engine, straight Out Of Camera:

Figure 3. Out of Camera sRGB jpeg image produced by raspistill -r with a Raspberry Pi High Quality Camera and 16mm CS lens at f/5.6. It is purposely defocused to smooth out the impact of irregularities in the patches.

It looks a bit desaturated.  Extracting the raw values of the 24 CC patches as described in the article on determining the Forward Matrix, we obtain a fit against BabelColor’s 30 database, assuming about a 5800K D illuminant as suggested by the color meter.
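
In outline (the linked article has the full recipe), the fit boils down to a linear least-squares problem; a conceptual sketch, with hypothetical file names standing in for the measured patch values and the reference data:

import numpy as np

# 24x3 white-balanced, black-subtracted raw patch means and the matching
# BabelColor reference XYZ values adapted to D5800 (file names hypothetical)
raw_wb  = np.loadtxt('cc24_raw_wb.txt')
xyz_ref = np.loadtxt('cc24_xyz_d5800.txt')

# Solve raw_wb @ M.T ~ xyz_ref for the 3x3 Forward Matrix M
M = np.linalg.lstsq(raw_wb, xyz_ref, rcond=None)[0].T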

The white-balanced data to sRGB matrix looks fairly similar to the one in the HQ Camera’s MakerNotes, suggesting that its purpose is indeed the same, something I have since confirmed with subsequent captures.  A Sensitivity Metamerism Index (SMI) of 88 indicates a pretty good fit – thus colorimetrically friendly CFA dyes, well done!   These are the dE2000 errors for the target under the current D5800 illuminant and the white balanced ‘raw’ to XYZ Forward Matrix above:

Figure 4. CIEDE2000 difference from BabelColor30 ColorChecker database reference data when the derived matrix was applied to the relative raw capture. It indicates a very good fit, suggesting colorimetrically friendly CFA spectral sensitivity functions in the HQ camera.

Very good results.  Using the D5800 Forward Matrix so discovered we can easily calculate matrices for any of the standard color spaces around this color temperature, as described in the linked article.
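
As a sketch of that last step, continuing from the least-squares snippet above: to go from the Forward Matrix M to a white-balanced raw to linear sRGB matrix one applies the standard XYZ(D65) to sRGB matrix, with a chromatic adaptation step from D5800 to D65 in between (omitted here for brevity; see the linked article):

import numpy as np

# Standard XYZ(D65) -> linear sRGB matrix
XYZ_TO_SRGB = np.array([[ 3.2406, -1.5372, -0.4986],
                        [-0.9689,  1.8758,  0.0415],
                        [ 0.0557, -0.2040,  1.0570]])

cam_to_srgb = XYZ_TO_SRGB @ M  # M: Forward Matrix from the previous sketch
# Normalize rows so that camera white (1,1,1) stays white
cam_to_srgb /= cam_to_srgb.sum(axis=1, keepdims=True)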

Rendering 12-bit Raw Data to sRGB

We now have the raw data, white balance multipliers and matrix necessary to render the captured raw image to a final color space.  I used simplified demosaicing that results in a half-height image, with every final pixel corresponding to a BGGR quartet in the CFA, similar to dcraw -h.  Within each final pixel the R and B channels retain the respective BlackLevel-subtracted ‘raw’ values while the two G values are averaged, as described in the article on rendering:

Figure 5. Standard Matlab/Octave code[1] to convert raw CFA data to 16-bit standard color spaces. It produces a half-height image without interpolation.  See the linked article for details.
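
Figure 5, too, is shown as an image, so here is a rough Python equivalent of the same half-height pipeline; a sketch only, with the white balance gains and ccm lifted from the Figure 3 MakerNote and the black level discussed next:

import numpy as np

def render_half(cfa, black=256.3, gain_r=3.238, gain_b=1.515):
    # One output pixel per BGGR quartet (B G / G R), no interpolation
    cfa = cfa.astype(float) - black
    b = cfa[0::2, 0::2]
    g = (cfa[0::2, 1::2] + cfa[1::2, 0::2]) / 2
    r = cfa[1::2, 1::2]
    # White balance (gains are relative to green), scale to [0, 1]
    rgb = np.clip(np.dstack((r * gain_r, g, b * gain_b)) / (4095 - black), 0, 1)
    # MakerNote ccm, assumed to act as the white-balanced raw ->
    # linear sRGB matrix as argued above
    ccm = np.array([[8466, -3816,  -550],
                    [-476,  6390, -1816],
                    [ 302, -1790,  5588]]) / 4096
    rgb = np.clip(rgb @ ccm.T, 0, 1)
    # Linear -> gamma-encoded sRGB, 16-bit output
    srgb = np.where(rgb <= 0.0031308, 12.92 * rgb,
                    1.055 * rgb ** (1 / 2.4) - 0.055)
    return np.round(srgb * 65535).astype(np.uint16)
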
The Black Level is about 256.3 at base gain/ISO in my unit at room temperature, as we will see in the next article.  Applying that script to the captured raw CFA data results in the following ‘final’ sRGB image:

Figure 6. Raspberry Pi HQ Camera with 16mm kit lens at f/5.6 sRGB image.  It was converted from raw after loading the raspistill -r jpeg, unpacking the raw data, subtracting black level, applying white balance multipliers and the matrix.

Much better, though we know from previous posts that this linear rendition probably still needs to be mapped into the smaller Contrast Ratio of typical display devices with the help of a Tone Mapping Operator or, at the very least, a bit of contrast.

And Full-Rez to Adobe RGB

Of course after doing all that it becomes apparent that the HQ’s small pixels bump against their physical limitations.   Images from this sensor are bound to look a bit fuzzy and noisy compared to those produced by larger cousins of similar resolution when demosaiced to full size and shown at 100%:  there are only so many photoelectrons to be captured and so much diffraction to be oversampled when you are 1.55 microns on the side.

With mixed lighting and an estimated color temperature of 3400K, the compromise color matrix from black subtracted, white balanced camera space to Adobe RGB came out as follows:

Figure 7.  Raspberry Pi HQ Camera raw capture in Figure 1, f/8 1/2s, base ISO, full demosaicing by the MLRI algorithm and the application of a touch of local contrast and sharpening. Make sure your browser is not zoomed (ctrl-zero) then click on the image to view  at 100%. It’s in the Adobe RGB color space so if the browser is not color managed colors will not display properly.

Full-size demosaicing reveals that focus is a little forward of where I had intended.  Manual focusing is a time-consuming exercise with this setup unless one has a monitor plugged into the Pi’s HDMI output and can look at it interactively while doing so.[2]  It’s the same raw file as in Figure 1, whose rendition I tend to prefer: viewed at 100% on a monitor, the half-height resolution seems to be a better match for this capture, sweeping some small-pixel weaknesses under the rug.

Conclusions

In its DIY, Open Source context this little module shines and is a major improvement over previous iterations.  The new interchangeable lens CS standard mount opens up brilliant new possibilities.  Good work Foundation!

In the next article we will further characterize the sensor for still photography use.

 

P.S. raspiraw

There exists another routine called raspiraw, designed for the fastest possible frame rate.  It just reads the raw data out to file as quickly as it can, without any of the usual processing performed by raspistill.  For instance it does not generate an OOC Jpeg or collect metadata.  Nevertheless it seems that nobody has been able to get more than about 11 fps out of full resolution 12-bit Mode 3, and I am no exception.

I installed it following the procedure outlined in this very helpful post and ran it by just specifying shutter speed in microseconds (-eus) and output file name (-o).  It produces a file containing raw data in the same format as that produced by raspistill, without the header required by the pi-specific version of dcraw (the header can be added with the -hd switch).

The file contains just 3040 rows, each with the output of the 4056 pixels followed by 12 bytes of zeros and 16 bytes of unknown information as before.  One would assume that the additional 16 rows seen earlier are therefore assembled by raspistill from ‘shadow’ metadata frames.   Reading the linked thread, one discovers that the camera needs memory equivalent to six full raw images: three actual images and three piggy-backed ‘metadata’ images.  One suspects the metadata images to be for on-the-fly Automatic Lens Shading Corrections and the like.[3]  If you know more about what information they contain or why there are three for single image captures drop me a line.

One also discovers that the sensor has an RGGB CFA layout, which only becomes BGGR once the image is flipped to the correct orientation for the 16mm lens I used.

The gain switch (-g) has a working range of 0 to 1023 but its effect appears to be non-linear over that range.  raspiraw files can be opened in Matlab / Octave or your interpreter of choice with the script linked in the notes below.
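
For instance, here is a minimal Python sketch for a Mode 3 raspiraw file, assuming no -hd header was requested (the file name is hypothetical):

import numpy as np

# 3040 rows of 4056*12/8 = 6084 active bytes plus 12 zero bytes and
# 16 bytes of unknown data, for a stride of 6112 bytes per row
raw  = np.fromfile('capture.raw', dtype=np.uint8)
rows = raw.reshape(3040, 6112)[:, :6084]
cfa  = np.zeros((3040, 4056), dtype=np.uint16)
for i in range(2):
    cfa[:, i::2] = ((rows[:, i::3].astype(np.uint16) << 4)
                    | ((rows[:, 2::3] >> (i * 4)) & 0xF))
# Remember: as read out the CFA layout is RGGB; flip the image to the
# correct orientation (where it becomes BGGR) before demosaicing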

 

Notes and References


1. The Matlab/Octave function used on this page to open, unpack and render full resolution Raspberry Pi HQ Camera 12-bit raw stills created by raspistill -r -md 3 and raspiraw can be downloaded from here.
2. To control the HQ Cam when on the go, one can use a phone or tablet running RealVNC while connected to the same network as the headless Pi via hotspot or a battery-powered wi-fi ‘puck’, as follows:  ssh to the Pi and 1) vncserver -randr=1600x1200 (chosen VNC window size); then start RealVNC on the tablet and 2) right-click on the VNC icon above the Pi’s desktop, RealVNC Viewer > Menu > Options > Troubleshooting > Optimize screen capture – select ‘Enable direct capture mode’; 3) raspistill -T 20000 -r -o test.jpg --focus, then maximize the figure of merit for best focus in the shown region of interest during the 20 second preview.
3. The libcamera driver manual and the related Sony IMX477 json file provide a wealth of insights into the low-level workings of this and other sensors.

 

17 thoughts on “Opening Raspberry Pi High Quality Camera Raw Files”

  1. Wow! Fantastic analysis!

    Really appreciate you sharing the outcome of your time and effort. And you explain it so well.

    I’m off to experiment with your Octave function. Thanks again.

    1. My pleasure Steve. The code works as-is in Matlab, you may have to apply some minor adaptations to get it to work in Octave. If you have any questions drop me a line via the About tab top right.

      Jack

  2. Great post, very useful links.
    Just one small correction:
    Under P.S raspiraw you mention that to add a header to the output use -h. That should actually be -hd according to the listing in GitHub.

  3. Thank you very much. This is the only place I could find the relevant information to decode the raw HQ images. I wrote the equivalent decoding in python (no demosaicing):
    import numpy as np
    import io

    # Read in the whole binary image tail of the
    # .jpg file with appended raw image data
    with open('piraw.jpg', 'rb') as filraw:
        filraw.seek(-18711040, io.SEEK_END)
        imbuf = filraw.read()
    if imbuf[:4] != b'BRCM':
        print('Binary data start tag BRCM was NOT found at this seek position')
    else:
        print('Binary data start tag BRCM was found at this seek position')

    # Image data proper starts after 2^15 bytes = 32768
    imdata = np.frombuffer(imbuf, dtype=np.uint8)[32768:]

    # Reshape the data to 3056 rows of 6112 bytes each and crop to 3040 rows of 6084 bytes
    imdata = imdata.reshape((3056, 6112))[:3040, :6084]

    # Convert to 16 bit data
    imdata = imdata.astype(np.uint16)

    # Make an output 16 bit image
    im = np.zeros((3040, 4056), dtype=np.uint16)
    # Unpack two pixels from every 3 bytes in each row
    for byte in range(2):
        im[:, byte::2] = (imdata[:, byte::3] << 4) | ((imdata[:, 2::3] >> (byte * 4)) & 0b1111)

    [Edit: there may have been a transcribing error in the unpacking line of code just above, see my reply to James’ comment below]

  4. Thank you for your excellent efforts, Jack!

    The last line in Derek’s post was giving me problems in Python3. It may be the result of a transcription error or the comments filtering changing the text [Edit:…]

  5. Hi James, thanks for your thoughts.

    I am not fluent in Python so I don’t feel qualified to comment on the unpacking line in Derek’s post. The inefficient but intuitive way to think about it is to create two images corresponding to each pixel, one with just the 8 high bits shifted left by four and one with the 4 low bits only. Then add (or OR) the two together. My naive attempt:

    im[:, byte::2] = ( (imdata[:, byte::3] << 4) | ((imdata[:, 2::3] >> (byte * 4)) & 0b1111) )

    Something like that – but caveat emptor since I don’t know Python and the comment editor does its own weird formatting.

  6. Thank you again, Jack.

    I too am a novice with python (and raw data manipulation), and it seems I was throwing away some of the image data with my edit. Yours appears to capture the total bit depth. Very nice!

    If you think it’s appropriate you can remove my previous post so as not to confuse others.

  7. Fine Work!

    You wrote
    “Nevertheless it seems that nobody has been able to get more than about 11 fps out of full resolution 12-bit Mode 3, and I am no exception.”

    Do you think it is possible to read out only a small stripe (e.g. 10×4056 px) at a frame rate of about 100fps?

    Or can the sensor (sensors memory) only be read out fully at once?

    Greets from Bavaria

    1. Hi Freisei,

      Above I was referring to full-rez snapshots taken with the HQ Cam in Mode 3. As I assume you know, there are several video modes in the HQ Cam that can record lower resolutions at over 100fps (Mode 4), though I am not sure one exists to do what you wish. There is a gentleman who goes by the name HermannSW on the Pi Forums Camera Board who is an expert at high fps video. You may want to search the board for his posts and/or PM him about your application.

      Jack

  8. In your code “loadRaspiraw.m”, you use a table “asn” with 11 entries. And of course the corresponding ccm-table with 11 entries. One of the entries in the “asn”-table, the one for a cct = 3000K, “3000,0.4782,0.4221;” is missing in the current (Sep 2022) tuning file for the IMX477. I assume that you have simply added that one to your code?

    Another question: I noticed wildly varying component values for the ccms (in the tuning file as well as in “loadRaspiraw.m”) when the color temperature is varying. For example, the first component of the ccm jumps from 1.73511 to 2.06374 when the cct changes from 2970 to 3000.

    In total, there are five ccms in the tuning file, all in the range of 3000 to 5000, which look a little bit peculiar. Maybe the light source used in these calibrations was a little bit spiky?

    As far as I understand it, the ccm which is actually embedded in a .DNG-capture is created by libcamera on the basis of those ccms taken from the tuning file. So it’s probably not a good idea to use the one embedded in the DNG file for processing raw images?

    There are also DCPs based on your work (https://github.com/davidplowman/Colour_Profiles/tree/master/imx477), which in principle give a much smoother variation of the ccm as a function of the cct. So that would probably be a better way to read and process the raw image data?

    1. Hello cpixip,

      It’s been a while, for loadRaspiraw it looks like I simply copied the ccms that the foundation offered up for the HQ cam at the time. IIRC asn stands for AsShotNeutral (the DNG label, ct_curve in the link), for 3000K I just copied the values for 2970K.

      But you are absolutely correct, those matrices and asns looked incompatible and lacked relevant information for their use. David did not know what to make of them, which is why I estimated my own.

      The dcps you link above should work better because they were created directly from raw files David provided, by using Anders Torger’s excellent Lumariver Profile Designer (open source DcamProf). Though I also did not have much information as to the quality of the light sources and their spectra. Feel free to use a DNG workflow with them or to lift the relative data from the ‘Repo’ file for use in Matlab.

      However, as you probably know I am not a fan of interpolating between sources of wildly different, possibly unknown, spectra – see for example Figure 7 and 8 here. So for color-critical work you would be better served by profiles based on similar sources (e.g. D50-D65, LED2700-LED3300, FluorescentXXXX… etc.)

      Jack

      1. Jack, thank you very much for the additional information. That matches my expectations.

        My use case is using this sensor in film scanning applications. And for such an application, a single dedicated ccm would of course be sufficient (possibly different ones for different film stock).

        This task is complicated a little by the fact that it is close to impossible to acquire color targets of old film stock in the small size required (Super-8/Kodachrome = 5.69×4.22 mm) for a classical calibration approach.

        Currently, I am trying to circumvent this by simulating the whole sensor-illumination setup, based on (partially) known filter curves of the CFA and IR-filters used in the Raspberry Pi’s IMX477. Initially I wasn’t expecting too much, but the results I’m getting aren’t too far removed from a classic color proof calibration of the sensor. But still some additional work is needed.

        I would like to take this opportunity to thank you for your excellent website, which is a veritable treasure trove of all things related to color science and cameras!
