Adventures in Scene Referred Space - Part One
A colorful conversation
Did you know your computer is in an unhealthy relationship? One that’s holding it back?
I’m not talking about the relationship it has with you - that’s between you, your computer, and your Candy Crush addiction. No, I’m talking about the relationship your computer has with its display.
It doesn't matter whether your machine is a laptop hooked up to its internal display or a desktop hooked up to an external flat-screen monitor, TV, or even projector. Your display is letting your computer down.
You see, a computer can think about color and light (henceforth discussed somewhat interchangeably) in a way that accurately reproduces the physics of how light works in the real world. But a display can’t output it that way. Even the most modern 4K HDR TVs can’t represent the way color and light work in the real world. They’re limited to a specific range of color, light, or energy that’s just a thin slice of the real world.
We describe the way displays show color and light as a limited “Display Referred Space”.
What’s going on?
There are a number of ways in which displays can’t or don’t accurately represent the real world, but the most often discussed is how displays represent the intensity of light (AKA luminance).
Because a display can only show a limited range of luminance, a “Color Transform” is applied to computed color before it is displayed to ensure that the result can be displayed within the limitations of display technology.
You’ll likely have stumbled upon the term “sRGB” at some point in the past when futzing with your computer. That’s a color space standard that includes one of the most common Color Transforms. To really get into the weeds of it, check this out.
Most of the time, a computer display will use the “sRGB” transform or one of its similarly named variants, though there are others: high-definition widescreen TVs, for example, use a Color Transform called “BT.1886”.
A Color Transform does exactly what the name suggests - it transforms an input range of colors into an output range of colors. It does this with two key operations: first, it defines exactly how colors are re-mapped from input to output (more complex than we need to discuss here), and second, it defines how luminance (the intensity of light) is transformed.
Let’s break that down a bit:
Real-world luminance works in a “linear” way from dark to bright - like this:
But our eyes don’t perceive light in a linear way. We’re not as attuned to noticing subtle shifts at the darkest and lightest ends of luminance so we see luminance more like this:
And it turns out that this “flaw” in our eyes can be used to a display’s advantage: because a display has a limited ability to reproduce luminance, applying this gamma curve to color information limits the amount of “wasted” data spent on the extremes of light and dark that we can’t perceive anyway, and in turn “saves” that limited luminance information for the middle of the tonal range, where our eyes are more attuned to inaccuracy.
Notice something else about this chart? Numbers have appeared. A display can’t show an unlimited amount of luminance, so it’s “compressed” down into a range the display can handle, typically referred to as zero-to-one.
Think of it as a way of compressing all the possible light in the world into a limited framework that makes sense for our eyes. It’s not unlike music compression algorithms that work to remove only the parts of sound we can’t typically hear and save the sounds that we are very attuned to, giving us a compressed, but pleasant listening experience.
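If you’d like to see that curve as something concrete rather than a chart, here’s a minimal sketch of the standard sRGB encoding formula in Python (assuming NumPy is available; the function name is just for illustration):

```python
import numpy as np

def linear_to_srgb(x):
    """Encode linear light (0.0 to 1.0) with the standard sRGB transfer curve.

    Code values are spent unevenly: more of them go to the darks and
    mid-tones, where our eyes notice small differences, and fewer to the
    brightest end, where we don't.
    """
    x = np.clip(x, 0.0, 1.0)
    return np.where(x <= 0.0031308,
                    12.92 * x,
                    1.055 * np.power(x, 1.0 / 2.4) - 0.055)

# A handful of linear values and where they land after encoding:
linear = np.array([0.0, 0.05, 0.18, 0.5, 1.0])
print(np.round(linear_to_srgb(linear), 3))
# -> roughly [0., 0.248, 0.461, 0.735, 1.]
```

Middle gray (a linear value of 0.18) lands at roughly 0.46 once encoded, which is exactly the redistribution the curve exists for: more precision for the tones we’re sensitive to, less for the extremes.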
Working with light in 3D rendering
So, the physiology of our eyes impacts the way we perceive light, and displays take advantage of this fact to optimize color representation to a limited set without any perceived loss of information. But if we want our 3D renders to accurately calculate the complex interactions of real-world light, we need a different approach when the actual number-crunching takes place.
In short, we need to make light calculations in a color space that isn't limited by the ways displays work. The correct approach is to employ what’s called a linear workflow and it uses the real-world behavior of light as shown in the first graph in the previous section.
Thankfully, a start-to-finish linear workflow is something many CG packages adopt by default now; most of the time we don’t even have to think about it. We set up a scene, model some elements, bring in some textures, apply some materials, add some lighting, and render the results. Light and color will have been computed as linear values, and only at the very end, purely so the result makes sense to our eyes, is it output to our display in Display Referred Space using an appropriate Color Transform like sRGB.
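As a small, hedged illustration of why this matters (NumPy again; the helper names are mine), consider averaging a pure black pixel with a pure white one - the kind of thing a blur or an anti-aliased edge does constantly. Averaging linear light and encoding at the end gives a different result from averaging values that have already been display-encoded:

```python
import numpy as np

def linear_to_srgb(x):
    """Standard sRGB encoding, same curve as the earlier sketch."""
    return np.where(x <= 0.0031308,
                    12.92 * x,
                    1.055 * np.power(x, 1.0 / 2.4) - 0.055)

black, white = 0.0, 1.0   # two pixels, expressed as linear light

# Physically correct: average the light itself, then encode for display.
correct = linear_to_srgb((black + white) / 2)                  # ~0.735

# Common mistake: average the already display-encoded values.
naive = (linear_to_srgb(black) + linear_to_srgb(white)) / 2    # 0.5

print(round(float(correct), 3), round(float(naive), 3))
# The naive result is visibly too dark - which is why renderers do their
# light math in linear space and only encode at the very end.
```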
So, why do we care?
Let’s go back to our music analogy. If you’re a music fan, you may have heard the expression “garbage in, garbage out”. It’s a term recording engineers use to explain why it’s important to record every instrument or vocal at the highest possible quality: processing and layering multiple audio tracks into a final mix can often degrade them, so to have the best chance of a high-quality final output, your inputs need to be of the highest possible quality.
Now, take a moment to think about how there’s an analog for that in the world of 3D rendering. If we want to render the real world accurately, we need to make sure that the image inputs we use (typically photos, video, textures, and HDRIs) AND the workflow we use during light computation represent the real world as faithfully as possible - in other words, that they’re linear.
We want to make sure that we’re not putting in garbage, but putting in gold. Which brings us briefly to the “Garbage Out” part of the equation...
Saving color information
Let’s say we’ve been working in a linear workflow and kept all that accurate color and luminance goodness. Just because we can’t display it doesn’t mean we should lose it. Particularly if we want to pipe our rendered output into the input of another part of our CG pipeline such as compositing.
If we output our final render to a common file format like a PNG we’ll have locked our render into Display Referred Space because we’ve limited the complex nature of light down to just a slice of its potential information.
Without getting into a math lesson, a standard 8-bit file like a PNG or JPG can only encode around 16.7 million distinct colors (256 levels per channel). That’s a lot, but it’s still not enough to represent color in the real world. And the gamma curve of the Color Transform (usually sRGB) will typically have been baked into the luminance information so that it’s “display ready”.
This is where 32-bit “Floating Point” file formats such as OpenEXR come into play. These files store 32 bits per channel - four times that of a standard 8-bit file - and they aren’t limited to integers: they can hold numbers with decimal points (hence “floating point”). Using these files opens up the storage of “Deep Color” - information that can represent not just millions, but billions of possible colors. Further, these files don’t typically have a Color Transform baked in, so the full range of luminance is retained, including values beyond what a display can show. So, gold in. Gold out.
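For a rough feel of the difference, here’s another NumPy sketch with made-up pixel values - no real file I/O, just the rounding and clipping an 8-bit format would impose versus what 32-bit floats can hold:

```python
import numpy as np

# A tiny "render": scene-linear pixel values, including light well above 1.0
# (a bright highlight, say, or the sun in an HDRI).
render = np.array([0.001, 0.18, 0.75, 1.0, 4.7, 16.0], dtype=np.float32)

# Saving to an 8-bit format means clipping to 0-1 and quantizing to 256 steps
# per channel (256^3 is where the "~16.7 million colors" figure comes from).
eight_bit = np.round(np.clip(render, 0.0, 1.0) * 255).astype(np.uint8)
recovered = eight_bit.astype(np.float32) / 255

print(recovered)   # the 4.7 and 16.0 highlights have collapsed to 1.0

# A 32-bit float format such as OpenEXR simply stores the numbers as they
# are - values above 1.0 and tiny fractions survive untouched.
as_float = render.astype(np.float32)
print(as_float)    # [ 0.001  0.18   0.75   1.     4.7   16.  ]
```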
But wait...
Beyond linear - Scene Referred Space
Ho boy, talking about color management is always a head-scratcher, so if you’ve made it this far, thank you. But there’s one final very important concept I want to introduce you to before I bring this post to a close and follow up with some practical examples. This is leading to something, trust me.
The concept is “Scene Referred Space” - think of it as the Yin to Display Referred Space’s Yang. The final (actually first, because it’s an input) piece of bread in our color management sandwich.
We’ve established that a display can only display a limited set of values known as a “Display Referred Space” and that this space inherently has a curve applied to it that both mimics the way our eyes perceive light and has the added bonus of allowing color information to be compressed.
And, we’ve established that if we use a linear workflow, we don’t use this curve during light computation, and therefore more accurately reflect the way light behaves in the real world within our computer.
But here’s the kicker. Remember that in our linear workflow diagram, we are also working in a zero-to-one space. What we’ve actually been working in is technically more accurately called “Display Linear Space” - a linear workflow within the upper and lower limits of our display.
But light in the real world doesn’t work like this. It’s not compressed into a specific set range. Light can range from zero to who-knows-what.
This is where the notion of “Scene Referred Space” comes into play. Scene Referred Space describes the lighting conditions recorded by a camera in a scene. That scene can be a real world scene captured with a DSLR, phone camera, or professional movie camera, or it can be a virtual scene captured by a camera in CG.
What’s important is that this range of light isn’t limited by an upper threshold of one. This is the true gold we should be digging for in our CG workflow, our inputs, and our output.
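One last sketch to make that open-ended range tangible (NumPy, with illustrative values). Hard-clipping scene-referred luminance at 1.0 throws highlight detail away; a view transform compresses it into the displayable range instead. Reinhard’s simple x / (1 + x) is used here purely as a textbook stand-in - filmic and ACES transforms are far more sophisticated, but the principle is the same:

```python
import numpy as np

# Scene-referred luminance: open-ended, like light in the real world.
scene = np.array([0.05, 0.5, 1.0, 3.0, 12.0, 60.0])

# Naive route to the display: hard-clip everything above 1.0.
clipped = np.clip(scene, 0.0, 1.0)
# -> [0.05 0.5  1.   1.   1.   1.  ]   all highlight detail is gone

# A view transform compresses the range instead. Reinhard's x / (1 + x)
# is just the simplest example of the idea.
compressed = scene / (1.0 + scene)
# -> roughly [0.048 0.333 0.5   0.75  0.923 0.984]   highlights stay distinct

print(clipped)
print(np.round(compressed, 3))
```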
Wrapping up
I’m going to leave you with one final head-scratcher: unlike a virtual camera in CG, a physical camera like your DSLR contains its own internal processing - its own Color Transform that doesn’t just capture the photons of light entering the lens but alters them, sometimes subtly, sometimes significantly, before saving to the memory card (a form of garbage introduced before we’ve even got going).
And this is where all this has been leading...
If we want to seamlessly mix photographed elements with CG elements, we need to make sure both inputs are pure “Scene Referred Space” gold and not already-tainted garbage. To do this, we need to make sure that what we capture with our real-world camera is a pure expression of physical light, that our workflow handles light and color in a physically correct manner not limited by an arbitrary upper threshold, and that the files we output respect the physics of light so they can be leveraged downstream in the pipeline.
Only then, at the final moment when we’re ready to take in our work with our own eyeballs, should we transform it (think: visually compress it) into “Display Referred Space” for presentation on a display.
Stay tuned for my next post for a practical expression of all this theory.
Feeling Bold?
If you feel like you’ve begun to get a grasp on why color management is critical in a CG pipeline, here’s a great video suggestion for you. In it, compositing and color pipeline supervisor Alex Fry from Animal Logic discusses the challenges of working in limited color spaces during the production of CG animation, and how working in a Scene Referred Space (in this case ACES, a newer standard that’s gradually establishing its foothold) frees light to behave exactly as it does in the real world, allowing studios to reach physically correct renders much more quickly and accurately.
Some of the terms might be a bit “inside baseball” but armed with what you now know about Color Transforms and Curves, the gist should start to make a lot of sense.
Thanks
If you’ve found this post helpful, please consider following me on Twitter for updates on future posts.
Once again, a huge thanks has to go to Troy Sobotka: industry veteran, the brain behind Filmic Blender (standard in Blender as of version 2.79 and inspired by the ACES workflow discussed in the video), and a wealth of knowledge on all things color and lighting. He opened my eyes to the importance of a Scene Referred workflow during the production of a recent VFX project. Be sure to follow him on Twitter.