Video card gamma table bit depth considerations


  • #27617

    petrakeas
    Participant

    This is more of a technical post about my findings on the way NVIDIA handles the 1D LUTs (gamma table), in contrast to what DisplayCal does. I don’t know whether this is the responsibility of DisplayCal or Argyll CMS, but here it is.

    As we all know, NVIDIA cards (except for Quadro ones) don’t apply dithering to the output after applying the LUT. Hence, when the GPU output is 8 bit, the effective bit depth is 8 bit even though the LUT’s internal bit depth is 16 bit. This can be confirmed with DisplayCal’s “Report on uncalibrated display device”. Using DisplayCal Profile Loader, you can change the LUT bit depth and switch between the following values: 8, 10, 12, 14, 16. In theory, 8 and 16 bit should produce exactly the same result on screen (since no dithering is applied and no other processing happens afterwards). However, they are different. This puzzled me and made me dig deeper until I found the reason.
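
    A minimal Python sketch of what “effective 8 bit” means here (the 0.8 scaling curve is purely illustrative, not taken from any real profile):

    import numpy as np
    # Hypothetical calibration curve: scale the ramp to 80% (illustrative only).
    ramp8 = np.arange(256)                                          # 8-bit input ramp
    lut16 = np.round(ramp8 / 255 * 0.8 * 65535).astype(np.uint16)   # 16-bit LUT entries
    # With no dithering the GPU keeps only the top 8 bits of each entry,
    # so the extra 16-bit precision is simply thrown away.
    out8 = lut16 >> 8
    print(np.unique(out8).size)   # 205 distinct output levels instead of 256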

    NVIDIA truncates 16 bit to 8 bit in a different way than DisplayCal does. Thus, when using an 8-bit bit depth in DisplayCal, the final color mapping is different and slightly wrong.

    NVIDIA maps 16 bit to 8 bit using multiples of 256 in the following way:

    0-255 -> 0
    256-511 -> 1
    512-767 -> 2
    …
    65280-65535 -> 255

    For example, the 16 bit value “260” falls in the 256-511 range and will result in the 8-bit color value “1”.
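
    A quick way to express this mapping (a sketch in Python, assuming it is plain integer truncation of the upper byte, as the ranges above suggest):

    def nvidia_16_to_8(v16):
        # Keep only the top 8 bits: every value in [n*256, n*256 + 255] maps to n.
        return v16 // 256          # equivalent to v16 >> 8
    print(nvidia_16_to_8(260))     # 1   (falls in the 256-511 range)
    print(nvidia_16_to_8(65280))   # 255 (falls in the 65280-65535 range)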

    DisplayCal, on the other hand, snaps the 16 bit value to the nearest multiple of 257 (not 256).

    For example, the 16 bit value “5350” is quantized by DisplayCal Profile Loader to “5397” (21×257), which corresponds to the 8-bit value 21.
    When NVIDIA processes the LUT,
    “5350” maps to 20 (it falls in the range 5120-5375), while
    “5397” maps to 21 (it falls in the range 5376-5631).

    So, the final color values on screen are different.
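
    A small sketch of the mismatch, assuming DisplayCal snaps to the nearest multiple of 257 and NVIDIA then truncates by 256 as described above:

    def displaycal_quantize(v16):
        # Snap the 16-bit value to the nearest multiple of 257 (257 = 65535 / 255),
        # i.e. to the nearest value that is exactly representable in 8 bits.
        return round(v16 / 257) * 257
    def nvidia_16_to_8(v16):
        return v16 // 256                              # plain truncation, as the hardware appears to do
    v = 5350
    print(nvidia_16_to_8(v))                           # 20 <- the un-quantized 16-bit LUT entry
    print(nvidia_16_to_8(displaycal_quantize(v)))      # 21 <- the 8-bit "quantized" entry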

    Another issue I observed with my RTX 2080 Max-Q (but not with my GTX 1080) is that color values are interpolated between LUT entries even at 8 bit. For example, consider the following part of a LUT:

    37 9159 9640 9495
    38 9406 9933 9763
    39 0 0 0
    40 9904 10511 10300

    where bin “39” has been set to (0, 0, 0) on purpose. As expected, the color value “39” appears black on screen. What I didn’t expect was color value “40” being slightly darker on screen, indicating that it was interpolated between the “39” (0, 0, 0) color data and the “40” (9904, 10511, 10300) color data.

    When the GPU output is 10 bit, it is expected (and it does happen) that one bin value affects four color values, since the 10-bit values (1024 in total) are interpolated between the 256 LUT entries.
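
    A sketch of a linear-interpolation model that reproduces this behaviour (the exact index mapping is my assumption, not anything NVIDIA documents; the toy LUT below is an identity curve with bin 39 forced to black, as in the test above):

    def apply_lut_interpolated(code, lut, in_levels):
        # Map an input code (0 .. in_levels-1) onto the 256-entry LUT with linear
        # interpolation between neighbouring bins, then truncate to 8 bits.
        pos = code / (in_levels - 1) * 255        # fractional LUT index
        lo = int(pos)
        hi = min(lo + 1, 255)
        frac = pos - lo
        v16 = (1 - frac) * lut[lo] + frac * lut[hi]
        return int(v16) // 256
    lut = [i * 257 for i in range(256)]           # identity curve
    lut[39] = 0                                   # bin 39 forced to black
    # With 10-bit input, roughly four codes share each LUT bin, so the codes
    # around bin 39 (about code 156) are pulled towards black by the interpolation.
    for code in range(152, 162):
        print(code, apply_lut_interpolated(code, lut, 1024))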

    To run these tests I used X-Rite’s LUT tester, which lets you read the LUT data and set your own. Note that to set your own LUT, you need to close DisplayCal Profile Loader so it doesn’t overwrite your data. I used the attached grayscale pattern from Lagom, viewed in IrfanView with color management disabled. For the 10-bit tests, I connected to an LG C8 with 12-bit GPU output and used madVR in a 10-bit fullscreen window with dithering disabled and PC-level output, using the attached 16-bit pattern. I used the attached test LUT that sets the “39” bin to black. You can also find attached the default LUTs and the ones created by DisplayCal (the 8-bit version is the one created by DisplayCal Profile Loader when setting the bit depth to 8 bits).

    In conclusion, it seems that the way the LUT is applied differs between NVIDIA generations and differs from what DisplayCal Profile Loader assumes, which leads to precision errors.

    #27764

    Petros Drakoulis
    Participant

    Wow. So this is the reason why the 8-bit and 16-bit settings look so different. I see annoyingly strong banding with 8 bits, while 16 bits fixes it, up to the monitor’s capabilities.

    #27766

    Vincent
    Participant

     

    To run these tests I used X-Rite’s LUT tester, which lets you read the LUT data and set your own. Note that to set your own LUT, you need to close DisplayCal Profile Loader so it doesn’t overwrite your data. I used the attached grayscale pattern from Lagom, viewed in IrfanView with color management disabled. For the 10-bit tests, I connected to an LG C8 with 12-bit GPU output and used madVR in a 10-bit fullscreen window with dithering disabled and PC-level output, using the attached 16-bit pattern. I used the attached test LUT that sets the “39” bin to black. You can also find attached the default LUTs and the ones created by DisplayCal (the 8-bit version is the one created by DisplayCal Profile Loader when setting the bit depth to 8 bits).

    If the X-Rite tester uses the same OS APIs as the i1Profiler LUT loader (XRGamma.exe or something like that on Windows), it causes truncation no matter which GPU and LUT bit depth you have, so using it should be avoided (again, if it uses the same API).

     

    In conclusion, it seems that the way the LUT is applied differs between NVIDIA generations and differs from what DisplayCal Profile Loader assumes, which leads to precision errors.

    That is a long-standing flaw in NVIDIA’s hardware design. Unless they do it the same way AMD does, you’ll have banding over 8-bit connections.
    Direct truncation to 8 bit, as NVIDIA does it, will result in banding too.

    Wow. So this is the reason why the 8-bit and 16-bit settings look so different. I see annoyingly strong banding with 8 bits, while 16 bits fixes it, up to the monitor’s capabilities.

    If you mean “color managed”, that is usually the application’s fault (Photoshop, GIMP, etc.): they truncate in a fast, simple way. ACR/LR/Capture One do it the proper way, with dithering (a dither applied in the app before the data is sent to the LUT).
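
    A minimal sketch of that idea, with plain random noise standing in for whatever dither those applications actually use:

    import numpy as np
    rng = np.random.default_rng(0)
    def truncate(v16):
        return v16 // 256                         # the fast, simple way: causes banding
    def dither_then_truncate(v16):
        # Add up to one 8-bit step of random noise before truncating, so the
        # fractional part survives on average instead of being thrown away.
        noise = rng.integers(0, 256, size=np.shape(v16))
        return np.minimum((v16 + noise) // 256, 255)
    patch = np.full(10000, 5350)                  # a flat patch of the 16-bit value 5350
    print(truncate(patch).mean())                 # 20.0 exactly: the fractional part is lost
    print(dither_then_truncate(patch).mean())     # ~20.9: the true level 5350/256 ≈ 20.9 survives on average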

