Complexity Of Overdrive for Variable Refresh Rate (G-SYNC, FreeSync)

Post by **Chief Blur Buster** » 23 Jun 2020, 23:07

The Complexity of VRR Overdrive

Variable refresh rate overdrive is extremely difficult.

Some panels (especially VA) benefit from frametime prediction: overdrive tuning that uses a multi-frametime history of the last 2 frametimes plus the next 1 frametime. Knowing the next frametime in advance is very hard to do without buffering frames, which adds input lag.

The Variable Overdrive Problem

With fixed overdrive, you can have varying artifacts for varying refresh rates -- e.g. more ghosting (smearing) at certain frame rates, and more coronas (bright afterimages) at other frame rates. If you're not familiar with LCD motion artifacts, see LCD Motion Artifacts 101.

That's variable artifacts for generic untuned/uncertified Adaptive Sync that doesn't use VRR-aware overdrive tuning.

Good Variable Refresh Rate Overdrive is Horrendously Complex

That's why the G-SYNC chip has been so impressive in the quality of its overdrive tuning. It's horrendously complex, since you need to correctly calculate an Overdrive Gain based on multiple frametimes (all of them different).

Sometimes knowing only the previous frametime (refresh interval) is not sufficient information to have high quality variable-refresh overdrive panels. And some panel electronics don't behave very well with realtime precise changes to Overdrive Gain.

Basically, the scaler needs some basic knowledge of how much the panel is ghosting/coronaing at ALL times, and must dynamically change overdrive for the next refresh cycle to compensate for an accumulated excess/deficit (which can look different for fast-slow, slow-slow, slow-fast and fast-fast pairs of frametimes over the last 2 refresh cycles), while also trying to predict the next frametime, to pre-emptively reduce artifacts in advance of the next expected refresh cycle.

Even without knowing what companies do -- a lot of complexity lurks underneath good variable refresh rate overdrive. I know this because I have done custom overdrive tuning for some panels before.

Science of The Generic Overdrive Algorithm

It can also be done in software (overdrive executed in a GPU shader). Fixed refresh rate overdrive is simply an A(B)=C lookup table operation, where A is the original grey, B is the destination grey, and C is an intentionally undershot/overshot grey that speeds up the transition from A to B.

Depending on panel or firmware, C might be a final value or a delta value (difference versus 'B' destination color), but the OD LUT is the same simple High School Mathematics.

The Overdrive Lookup Table for Fixed Hz

This overdrive LUT operation is executed 3 times per pixel (once for each subpixel: R, G, B). So a transition from A=20 to B=200 may require C=215 (the overdriven value to speed up 20 to 200). While at the panel level it's voltages, you can actually map voltages in software simply by using dimmer pixels or brighter pixels. This is how "ATI Radeon Overdrive" worked 15 years ago, a software-based overdrive. Most scalers are simply doing the same thing; they're using exactly the same kind of LUT operation to control the panel's overdrive voltages -- greyscale colors representing voltages.
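As a rough illustration of the A(B)=C lookup, here is a minimal Python sketch. The table contents are placeholders (an identity table with no overdrive, plus the single 20 -> 200 -> 215 entry from the example above), and the names `od_lut` and `overdrive_pixel` are made up for this sketch; a real table would come from GtG measurements of the panel.

```python
# Hypothetical 256x256 overdrive LUT, indexed as od_lut[a][b] = c.
# Placeholder contents: identity (c == b everywhere, i.e. no overdrive).
od_lut = [[b for b in range(256)] for _a in range(256)]

# The example transition from the text: going from 20 to 200 is driven as 215.
od_lut[20][200] = 215


def overdrive_pixel(prev_rgb, next_rgb, lut=od_lut):
    """Apply the A(B)=C lookup once per subpixel (R, G, B)."""
    return tuple(lut[a][b] for a, b in zip(prev_rgb, next_rgb))


# Red goes 20 -> 200 (overdriven); green and blue do not change.
result = overdrive_pixel((20, 20, 64), (200, 20, 64))
```

Subpixels that don't change simply pass through, since the placeholder table returns B unchanged for them.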

Fixed-Hz overdrive is easy-peasy: you just run a simple LUT operation.

Overdrive Gain is often simply a multiplier to an OD LUT

In most panels, Overdrive Gain is simply a multiplier value to the delta between B and C (to amplify/attenuate).
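A small sketch of that multiplier, assuming C is stored as a final value (not a delta) and that the result is clamped to the 8-bit range. The function name `apply_gain` and the clamping behavior are assumptions for illustration, not any particular scaler's firmware.

```python
def apply_gain(b, c, gain):
    """Scale the overdrive delta (c - b) by gain, then clamp to 0..255."""
    delta = c - b               # how far the LUT pushes past the destination
    driven = b + gain * delta   # gain 1.0 = LUT as calibrated, 0.0 = no overdrive
    return max(0, min(255, round(driven)))


as_calibrated = apply_gain(200, 215, 1.0)  # unchanged LUT value
no_overdrive = apply_gain(200, 215, 0.0)   # falls back to destination grey
amplified = apply_gain(200, 215, 2.0)      # doubled overshoot
```

Note how quickly amplified values hit the 255 ceiling near bright destinations, which is one reason some colors are beyond overdriveable range.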

Corners cut in fixed-Hz Overdrive

Most panels use a truncated LUT (such as 9x9, 17x17 or 65x65) instead of a true full 256x256 overdrive lookup table, which can make Overdrive look MUCH better, especially on panels with small GtG-heatmap problem spots. The scaler interpolates a low-resolution OD LUT, since a full OD LUT takes up scaler memory (64 kilobytes per specific video mode).
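To show what interpolating a low-resolution OD LUT might look like, here is a sketch using plain bilinear interpolation over a 17x17 table. The even node spacing over 0..255 and the bilinear method are assumptions for illustration; actual scalers may space nodes and interpolate differently.

```python
N = 17  # nodes per axis (assumed evenly spaced over 0..255)


def make_identity_coarse_lut(n=N):
    """Placeholder coarse LUT: c == b at every node (no overdrive)."""
    step = 255 / (n - 1)
    return [[j * step for j in range(n)] for _i in range(n)]


def lookup_coarse(lut, a, b):
    """Bilinearly interpolate a coarse n x n OD LUT at greys a, b (0..255)."""
    n = len(lut)
    fa = a * (n - 1) / 255           # fractional node position on the A axis
    fb = b * (n - 1) / 255           # fractional node position on the B axis
    i0 = min(int(fa), n - 2)         # lower node index, kept inside the table
    j0 = min(int(fb), n - 2)
    ta, tb = fa - i0, fb - j0        # blend factors within the cell
    row0 = lut[i0][j0] * (1 - tb) + lut[i0][j0 + 1] * tb
    row1 = lut[i0 + 1][j0] * (1 - tb) + lut[i0 + 1][j0 + 1] * tb
    return row0 * (1 - ta) + row1 * ta


coarse = make_identity_coarse_lut()
value = lookup_coarse(coarse, 20, 200)  # identity table, so this is ~200
```

A 17x17 table stores 289 entries instead of 65,536, but any GtG problem spot narrower than one cell (about 16 grey levels here) gets smoothed away by the interpolation.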

Generating OD LUTs

You run GtG analysis on a panel for all color combos, and generate an OD LUT based off that. Unfortunately, this is not as easy as it sounds, as there can be over 60,000 different GtG numbers for a single panel -- see forum thread

Now try to add VRR compatibility...

But now, try to add frametimes to the simple overdrive lookup table A(B)=C ... even just one frametime lookbehind, you have to generate multiple OD LUTs for different framerates. You can simply interpolate between multiple LUTs for simplicity.

To save scaler memory -- you can generate OD LUTs at intervals. For example, you generate OD LUTs for 30fps, 40fps, 50fps, 60fps, 70fps, 80fps, 90fps, 100fps, 110fps, 120fps, 130fps, 140fps.

That would be 12 different LUTs, at 64 kilobytes per 256x256 OD LUT, consuming about 768 kilobytes of scaler memory for much better VRR overdrive. This is a crude simplistic example, but you get the idea.

And when a game runs at 47 frames per second (1/47sec frametime) you might mathematically calculate an interpolated value as 0.3 times 40fps OD LUT value, plus 0.7 times the 50fps OD LUT value. Now you've added VRR OD!
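The 47fps arithmetic can be sketched like this. It blends linearly in frame rate (matching the 0.3 / 0.7 weights above) rather than in frametime, and the names `LUT_RATES`, `blend_weights` and `vrr_overdrive` are hypothetical; `luts` is assumed to be a dict mapping each calibrated frame rate to a 256x256 table.

```python
LUT_RATES = list(range(30, 150, 10))  # 30, 40, ..., 140 fps (12 LUTs)


def blend_weights(fps, rates=LUT_RATES):
    """Return ((low_rate, w_low), (high_rate, w_high)) for a linear blend."""
    fps = max(rates[0], min(rates[-1], fps))  # clamp to calibrated range
    for lo, hi in zip(rates, rates[1:]):
        if lo <= fps <= hi:
            t = (fps - lo) / (hi - lo)
            return (lo, 1 - t), (hi, t)


def vrr_overdrive(luts, fps, a, b):
    """Blend the two neighbouring OD LUT values for the current frame rate."""
    (lo, w_lo), (hi, w_hi) = blend_weights(fps)
    return w_lo * luts[lo][a][b] + w_hi * luts[hi][a][b]


(lo_rate, w_lo), (hi_rate, w_hi) = blend_weights(47)
# lo_rate = 40 with weight ~0.3, hi_rate = 50 with weight ~0.7
```

Frame rates outside the calibrated range are clamped to the nearest LUT here; that is a choice made for the sketch, not something the post specifies.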

This can improve VRR overdrive quite a bit, but there's still more room for improvement...

But previous-frame frametime is not always enough information...

Overdrive artifacts look different for these situations of last 2 refresh cycles:

Long frametime immediately followed by short frametime
Short frametime immediately followed by long frametime
Long frametime immediately followed by long frametime
Short frametime immediately followed by short frametime

Those LCD GtG curves will intersect differently in all those cases = DIFFERENT overdrive artifacts!

As we know, frametime equals refresh interval whenever running within VRR range. So for 48fps-240fps, you've got 1/48sec down to 1/240sec frametimes.

As framerates change, consecutive differing frametimes create different overdrive artifacts (ghosting or coronas). This is a bigger problem with slower-responding panels, and with colors beyond overdriveable range (e.g. primary colors).

Now you're having to modify your VRR overdrive algorithm to factor in 2 frametimes. Whether it's the 2 previous frametimes, or the previous-and-next frametime.
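One way to picture factoring in 2 frametimes is a grid of LUTs indexed by (previous rate, current rate), blended bilinearly. This is only a sketch under that assumption; the 12-rate grid, the dict-of-tables layout and the function names are all hypothetical, and real scalers may do something quite different.

```python
RATES = list(range(30, 150, 10))  # calibrated rates per axis: 30..140 fps


def axis_weights(fps, rates=RATES):
    """Two neighbouring calibrated rates and their linear blend weights."""
    fps = max(rates[0], min(rates[-1], fps))  # clamp to calibrated range
    for lo, hi in zip(rates, rates[1:]):
        if fps <= hi:
            t = (fps - lo) / (hi - lo)
            return [(lo, 1 - t), (hi, t)]


def two_frametime_weights(prev_fps, cur_fps):
    """Weights over the four LUTs surrounding (prev_fps, cur_fps)."""
    return {
        (p, c): wp * wc
        for p, wp in axis_weights(prev_fps)
        for c, wc in axis_weights(cur_fps)
    }


def vrr_overdrive_2(luts, prev_fps, cur_fps, a, b):
    """Blend OD values from LUTs keyed by (previous rate, current rate)."""
    weights = two_frametime_weights(prev_fps, cur_fps)
    return sum(w * luts[key][a][b] for key, w in weights.items() if w > 0)


# Short frametime (1/120sec) immediately followed by long frametime (1/47sec):
weights = two_frametime_weights(120, 47)
```

A fast-then-slow pair and a slow-then-fast pair land on different LUTs in the grid, which is the whole point of the extra axis.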

Some good VRR overdrive uses next-frametime prediction

That's another complexity thrown into variable overdrive for variable refresh rate displays. The best-looking VRR occurs if you know about at least the last few refresh cycles (frametimes), followed by the 1 next refresh cycle (frametime).

For fast pixel response displays that can always complete near GtG 100% (say, about GtG98%) before the next max-Hz refresh cycle -- then you really only need 1 frame lookbehind, and possibly 1 frame lookahead (budget VRR overdrive omits that though).

For panels unable to reach near GtG100% (i.e. GtG98%-GtG99%) after 1 refresh cycle, you ideally want multiple-frame lookbehind and/or lookahead (or prediction), for improved overdrive.

Now simple High School Math becomes complex University Calculus... Almost.

Scaler memory is a big problem for good VRR overdrive

If you go with a memory-based approach, you need to test multiple frametime combinations. Remember the 12 OD LUT situation (768 kilobytes)? If you generate precalibrated LUTs for a 2-frametime history, you've got a 12x12 situation of 144 OD LUTs. Now you're consuming about 9 megabytes of scaler memory! Many scalers don't even have that much RAM.

Now, if you're trying to massively improve VA-panel overdrive (the VA problem GtG heatmap spots), at 240 Hz refresh rate, you now may need to literally do 20 LUTs per 1-frametime depth, 400 LUTs per 2-frametime depth, or 8,000 OD LUTs per 3-frametime depth. Ouch. That's half a gigabyte of scaler RAM for amazing VA-panel 240Hz VRR (assuming the user's room temperature matches the laboratory calibration temperature -- temperature really changes the look of VA-panel overdrive ghosting in dark shades).

To save memory, you can use low-resolution LUTs (such as 17x17 instead of 256x256), but that can create problems for some panels that have very thin GtG-heatmap hotspots; you need much higher resolution OD LUTs to fix those.
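The memory numbers above are easy to reproduce. This sketch assumes one byte per LUT entry and one table per frametime combination (which is what the 64 kilobytes per 256x256 LUT figure implies); the function names are made up for illustration.

```python
KIB = 1024
MIB = 1024 * 1024


def lut_bytes(resolution=256, bytes_per_entry=1):
    """Size of one resolution x resolution OD LUT (assumed 1 byte per entry)."""
    return resolution * resolution * bytes_per_entry


def scaler_memory(luts_per_axis, depth, resolution=256):
    """Bytes for luts_per_axis**depth precalibrated LUTs."""
    return (luts_per_axis ** depth) * lut_bytes(resolution)


one_frametime = scaler_memory(12, 1) / KIB        # 12 LUTs, in KiB
two_frametimes = scaler_memory(12, 2) / MIB       # 144 LUTs, in MiB
va_three_deep = scaler_memory(20, 3) / MIB        # 8,000 LUTs, in MiB
va_three_deep_17 = scaler_memory(20, 3, 17) / MIB # same, with 17x17 LUTs
```

With full 256x256 tables this gives 768 KiB, 9 MiB and 500 MiB respectively. Dropping to 17x17 tables brings the 8,000-LUT case down to a little over 2 MiB, at the cost of the hotspot resolution problem described above.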

As an alternative way to save memory, one can simply create a math formula (calculus/algebra) to replace the OD LUTs, but that becomes extremely hard for ASICs/scalers. LUT operations + interpolation between LUTs are much simpler, especially if you're outsourcing to Chinese/Taiwanese scaler companies, trying to communicate across language barriers.

Improving VRR OD by another 10% gets geometrically more complex, so there's a curve of diminishing returns.

VRR overdrive can also be done in software via a virtual display driver

Remember ATI Radeon Overdrive? It can be done again for VRR overdrive.

Pixel voltages are simply pixel brightnesses, and thusly, overdrive can be done at the software level, as long as the processing is done at refresh granularity rather than frame granularity.

In early tests, I've been able to achieve better-looking VRR overdrive (for certain frametime combinations) in a software GPU shader than in the display scaler itself. It appears some scalers/TCONs don't have enough capacity to handle the realtime processing of a slightly more complex overdrive.

The VRR Overdrive Rabbit Hole...

All of this is generic Overdrive knowledge, that has been written about for years in many public locations; including via Google Scholar and SID journals over the last two decades. I've just resummarized the overdrive complexity problem for VRR.

This has enabled me to understand and appreciate how horrendously complex dynamic overdrive for VRR panels can become... Hats off to vendors (such as NVIDIA) that have actually mastered this art -- it is a large part of the value of the G-SYNC premium.