Page 3 of 7

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 06 Jan 2019, 16:17
by pegnose
@Chief Blur Buster:
This is seriously getting out of hand!! :D

I will read your post carefully and maybe come back with one or two questions, if I may.

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 06 Jan 2019, 21:38
by Chief Blur Buster
pegnose wrote:So RTSS replaces the present() function with one that is able to wait until the given frame time requirement is fulfilled. Effectively the system is CPU limited (hence 0 pre-rendered frames), even if actually not much work is done.
There are many tactics of waiting. Let's also note the GeDoSaTo technique of delaying inputread+render+flip too, which is not typically done by RTSS. It's possible for a capper to present quickly, followed immediately by a predictive blocking-wait (based on historic frametimes) before returning to the caller's Present(), putting the next inputread-render-flip closer to the VSYNC intervals, like this:

Image
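As a rough illustration of that predictive blocking-wait idea, here is a toy Python sketch (class and method names are hypothetical; a real capper like RTSS hooks Present() at the API/driver level rather than running in a scripting language). It presents immediately, then blocks for a duration derived from historic frametimes so the next inputread+render+flip lands just before the next interval:

```python
class PredictiveCapper:
    """Toy sketch of a predictive blocking-wait frame capper.

    Present() returns immediately; we then block until
    (next slot start - predicted render cost), so the next
    inputread+render+flip completes close to the interval."""

    def __init__(self, target_frametime_s, history_len=8):
        self.target = target_frametime_s      # e.g. 1/144 for 144 Hz
        self.history = []                     # recent measured frametimes
        self.history_len = history_len
        self.last_slot = None                 # start of the current slot

    def record_frametime(self, frametime_s):
        """Feed in a measured render time after each frame."""
        self.history.append(frametime_s)
        if len(self.history) > self.history_len:
            self.history.pop(0)

    def predicted_render_time(self):
        """Predict the next frame's render cost from recent history."""
        if not self.history:
            return 0.0
        return sum(self.history) / len(self.history)

    def delay_after_present(self, now):
        """How long to block *after* Present() before returning to the game."""
        if self.last_slot is None:
            self.last_slot = now
            return 0.0
        next_slot = self.last_slot + self.target
        self.last_slot = next_slot
        return max(0.0, (next_slot - now) - self.predicted_render_time())
```

The key design point is that the wait happens after the present rather than before it, so the frame that was just rendered reaches the display immediately, while the *next* input read is pushed as late as possible.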

There is a new framerate capper here too:
viewtopic.php?f=10&t=4724
pegnose wrote:
RealNC wrote:The highest value you can use in NVidia Profile Inspector is 8. Maybe the driver allows even higher values, but there's no point in having an option for that. Even 8 is pretty much useless. Games become unplayable.

Not sure what the default ("app controlled") is when a game doesn't specify a pre-render queue size. I've heard it's 3, and that's decided by DirectX. But not sure. It could be 2. Or it might depend on the number of cores in the CPU. The whole pre-render buffer mechanism and the switch to asynchronous frame buffer flipping came about because mainstream CPUs started to have more than one core in the 2000s.
Makes sense. Wow, 8!
It can be useful in Quad-SLI situations, since then it's only 2 per card.

Look at the Quad Titan setup I wrote about five years ago for an eye-opening example. Whoo!

Correct me if I am wrong, but it is my understanding that prerendered frames can be useful for assisting SLI framepacing to fix microstutters. (Hopefully it's not multiplied by the number of cards, like 8 times 4 equals 32 -- that would be insanely ridiculous.)
pegnose wrote:@Chief Blur Buster:
This is seriously getting out of hand!! :D
I will read your post carefully and maybe come back with one or two questions, if I may.
No worries, but since people were bringing up the "scanout latencies" discussion, I had to make sure that was covered. Most of this is far beyond the average layperson's ability to conceptualize, but this topic thread has certainly attracted highly skilled discussion.

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 06 Jan 2019, 22:20
by jorimt
pegnose wrote:From the previous discussion I understood that VRR with V-Sync enabled in NVCP has the same issues as mere V-Sync, correct? Only if you disable V-Sync with VRR, and accept some minor tearing here and there, you can get rid of all the down-sides?
I *think* you're talking about the issue (correct me if I assumed wrong here) where G-SYNC + V-SYNC, when within the refresh rate, can exhibit more hitching during frametime spikes or transitions from 0 to x frames (and vice versa) when compared to standalone V-SYNC, no sync, or G-SYNC + V-SYNC "Off"?

Because G-SYNC (even paired with the V-SYNC option) has none of the issues that standalone V-SYNC has within the refresh rate; it has a different issue (and cause) entirely.

If that's what you're referring to, it's because: 1) G-SYNC + V-SYNC appears to have what could be called a brief "re-initialization" or "re-connection" period (between the module and the GPU) when there is an extremely abrupt transition in the framerate (FYI, there were official reports of a 1ms polling rate on the original modules, and it was never confirmed whether that polling rate diminished or was eliminated entirely in later module iterations), and 2) since the scanout speed on the display is static and unchanging, unavoidable timing misalignments between framerate and scanout progress can occur during these abrupt transitions. Where the module attempts to adhere to the scanout (to completely avoid tearing), this creates frame delivery delay that the other methods would not, since in extreme instances the module may have to hold the next frame and skip part of (or sometimes even all of) a scanout cycle before delivery, and thus repeat display of the previous frame once or more in the meantime...

At very high frametimes (or when instantly going from a very high frametime to a very low one, or vice versa) that can obviously become problematic, causing a visible hitch (or hitches), separate from (but ultimately directly triggered by) the system, which causes the frametime spikes (that are still visible with or without any form of syncing) in the first place.

Point "1" has never been fully confirmed (and isn't easy to test for), and as for point "2," even if there isn't a "polling rate" between the monitor module and the GPU, the issue can only really be diminished/avoided by nearly constant, perfect frametimes (which is virtually unachievable the vast majority of the time on all but older/least demanding games paired with the most powerful systems) or (the more achievable) continued increase of the static scanout speed (240Hz, 480Hz, etc).

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 13 Jan 2019, 17:03
by pegnose
Thank you very much, this post was extremely interesting and helped my understanding!
Chief Blur Buster wrote: Which means latency is more uniform for the whole screen plane during VSYNC OFF, unlike for VSYNC ON
At the time a particular line appears on the screen, that line is less old than if it had been created with VSYNC ON. So latency is more uniform with respect to when data appears on the screen. But once all lines have been drawn and the image is complete, the latency with VSYNC ON is more uniform again from a certain standpoint, because all pixels were created at the same time and have the same latency with respect to the time point of viewing (while with VSYNC OFF the image in fact consists of multiple images).
Chief Blur Buster wrote: For VRR, best stutter elimination occurs when refreshtime (the time the photons are hitting eyeballs) is exactly in sync with the time taken to render that particular frame. But VRR does current refreshtime equal to the timing of the end of previous frametime (one-off)...
Enjoy. :D
I am having trouble understanding this, particularly the last sentence. Do you happen to have a figure that illustrates this more clearly? Which time depends on which time from one frame earlier?

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 13 Jan 2019, 17:34
by pegnose
Chief Blur Buster wrote: There are many tactics of waiting. Let's also note the GeDoSaTo technique of delaying inputread+render+flip too, which is not typically done by RTSS. It's possible for a capper to present quickly, followed immediately by a predictive blocking-wait (based on historic frametimes) before returning to the caller's Present(), putting the next inputread-render-flip closer to the VSYNC intervals, like this:

Image
Ah, jorimt and RealNC were talking about that, too. Seems like the optimal solution. So, some games implement this?
Chief Blur Buster wrote: There is a new framerate capper here too:
viewtopic.php?f=10&t=4724
Nice.
Chief Blur Buster wrote:
pegnose wrote:
RealNC wrote:The highest value you can use in NVidia Profile Inspector is 8. Maybe the driver allows even higher values, but there's no point in having an option for that. Even 8 is pretty much useless. Games become unplayable.

Not sure what the default ("app controlled") is when a game doesn't specify a pre-render queue size. I've heard it's 3, and that's decided by DirectX. But not sure. It could be 2. Or it might depend on the number of cores in the CPU. The whole pre-render buffer mechanism and the switch to asynchronous frame buffer flipping came about because mainstream CPUs started to have more than one core in the 2000s.
Makes sense. Wow, 8!
It can be useful in Quad-SLI situations, since then it's only 2 per card.

Look at the Quad Titan setup I wrote about five years ago for an eye-opening example. Whoo!
Holy shit! :D
Chief Blur Buster wrote: Correct me if I am wrong, but it is my understanding that prerendered frames can be useful for assisting SLI framepacing to fix microstutters. (Hopefully it's not multiplied by the number of cards, like 8 times 4 equals 32 -- that would be insanely ridiculous.)
Interesting, never tried that. Unfortunately, I have sold my 1080 SLI setup in favor of a 2080 Ti. Basically no VR game supported VR SLI.

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 13 Jan 2019, 17:42
by pegnose
jorimt wrote:
pegnose wrote:From the previous discussion I understood that VRR with V-Sync enabled in NVCP has the same issues as mere V-Sync, correct? Only if you disable V-Sync with VRR, and accept some minor tearing here and there, you can get rid of all the down-sides?
I *think* you're talking about the issue (correct me if I assumed wrong here) where G-SYNC + V-SYNC, when within the refresh rate, can exhibit more hitching during frametime spikes or transitions from 0 to x frames (and vice versa) when compared to standalone V-SYNC, no sync, or G-SYNC + V-SYNC "Off"?

Because G-SYNC (even paired with the V-SYNC option) has none of the issues that standalone V-SYNC has within the refresh rate; it has a different issue (and cause) entirely.

If that's what you're referring to...
Honestly, I can't say anymore without rereading many of my past comments. I might have only meant the frame stacking issue. But this was much more interesting! Even when you think you know most of the intricate details of a technique, there is still so much in between. :)

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 14 Jan 2019, 01:58
by Chief Blur Buster
pegnose wrote:Ah, jorimt and RealNC were talking about that, too. Seems like the optimal solution. So, some games implement this?
Very few. It takes a lot of intentional programmer work to do that. I've seen some emulators with a configurable inputread-delay to reduce input lag.

However, it's nowadays much easier to simply use VRR for that, because you can use an accurate software timer to tighten the inputread+render+display cycle: a VRR display refreshes immediately when a frame is Present()'d to the display output (there's no such thing as a scheduled refresh interval with VRR...)

However, that does piddly squat for fixed-Hz use cases, as well as for those of us who want perfect framepacing at low latencies for ULMB/motion blur reduction. (That's a big raison d'être of the new RTSS scanline sync feature -- it makes great low-lag stutterless ULMB possible in some low-GPU-overhead games.)
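The "accurate software timer" part of the VRR approach can be sketched roughly like this (Python; the function name and the 138fps example are my own illustration, not any game's actual code). The usual trick is to sleep coarsely, then busy-spin the last fraction of a millisecond, since OS sleeps alone are too imprecise for tight framepacing:

```python
import time

def wait_until(deadline, spin_margin_s=0.0005):
    """Block until time.perf_counter() reaches `deadline`.

    Coarse sleep first (cheap, but imprecise), then busy-spin
    the final `spin_margin_s` for sub-millisecond accuracy."""
    while True:
        remaining = deadline - time.perf_counter()
        if remaining <= 0:
            return
        if remaining > spin_margin_s:
            time.sleep(remaining - spin_margin_s)
        # else: fall through and spin until the deadline passes

# usage sketch: pace an inputread+render+Present() loop at ~138 fps
# (read_input/render/present are hypothetical game calls)
#
# next_deadline = time.perf_counter()
# while running:
#     next_deadline += 1 / 138
#     read_input(); render(); present()
#     wait_until(next_deadline)
```

On a VRR display, each Present() in that loop triggers an immediate refresh, so the loop's timer directly controls when photons hit eyeballs.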
pegnose wrote:Interesting, never tried that. Unfortunately, I have sold my 1080 SLI setup in favor of a 2080 Ti. Basically no VR game supported VR SLI.
I've got an Oculus Rift, so I guess I should be replacing my GTX 1080 Ti Extreme with an RTX 2080 Ti eventually.

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 14 Jan 2019, 02:13
by Chief Blur Buster
jorimt wrote:If that's what you're referring to, it's because 1) G-SYNC + V-SYNC appears to have what could be called a brief "re-initialization" or "re-connection" period (between the module and the GPU) when there is an extremely abrupt transition in the framerate (FYI, there were official reports of a 1ms polling rate on the original modules, and it was never confirmed whether that polling rate either diminished or was entirely eliminated in further module iterations), and 2) since the scanout speed itself on the display is static and unchanging, unavoidable timing misalignments between framerate and scanout progress can occur during these abrupt transitions where the module attempts to adhere to the scanout (to completely avoid tearing), which will create frame delivery delay where the other methods would not, since the module may have to hold the next frame and skip part of (or sometimes, even all of a scanout cycle) before delivery in extreme instances, and thus repeat display of/refresh with the previous frame once or more in the meantime...
From what I am understanding in newer developments:

FreeSync is fundamentally unidirectional, and VRR can be operated unidirectionally. I think some newer G-SYNC implementations are now unidirectional from what I'm seeing, i.e. closer to simplified NVIDIA-specific piggyback enhancements on top of VESA Adaptive-Sync panels. Further engineering appears to be simplifying VRR implementation, which may mean that the difference between G-SYNC and FreeSync is gradually diminishing at the low end and mid-range. High-end G-SYNC probably still requires some 2-way behaviours for best operation, but the low end doesn't need it anymore -- in fact, NVIDIA already piggybacks on VESA Adaptive-Sync with probable simplifications to VRR-based overdrive algorithms that avoid the need for FPGAs and such. The best FreeSync monitors improved to the point where the Venn diagram of quality begins to overlap the cheaper/basic G-SYNC monitors -- and NVIDIA decided it was time to certify certain FreeSync monitors as meeting G-SYNC quality requirements.

Now....
Theoretically (if the driver lets it be precise), the granularity of FreeSync is simply 1 unit of the horizontal scanrate, so a 160 kHz horizontal scan rate (seen in custom resolution utilities) means 1/160000sec granularity in timing your refresh cycle delivery.

Picture this: the GPU is simply in an "infinite loop" of outputting Vertical Back Porch scanlines until the game Present()s the new refresh cycle (that's the variable-length blanking interval of VRR). Whereupon the GPU output immediately begins scanning out the new refresh cycle on the very next scanline, right after the current (offscreen) Back Porch pixel row finishes scanning out of the GPU output.

So, with proper driver software design, the 1ms granularity can be reduced to just a few microseconds (the granularity of the horizontal scanrate), all but eliminating the granularity and making it safer to framerate-cap much closer to VRR max-Hz (e.g. 143.9fps instead of 141fps for 144Hz). Newer VRR displays will have to be lag-tested to see if tighter caps are possible without adding input lag.
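To put a number on that granularity, the theoretical timing resolution is just one scanline period of the horizontal scanrate (the helper name below is my own, for illustration):

```python
def vrr_granularity_us(horizontal_scanrate_hz):
    """Theoretical VRR timing granularity: one horizontal
    scanline period, expressed in microseconds."""
    return 1e6 / horizontal_scanrate_hz

# a 160 kHz horizontal scan rate -> 1/160000 sec per scanline,
# i.e. 6.25 microseconds of granularity
```

Compare that with the reported 1ms (1000 microsecond) polling granularity of the original module: roughly a 160x finer timing grid.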

In practice, software precision limitations may prevent such precision, but theoretically you can have few-microsecond precision transitioning from VRR to VSYNC ON and vice versa, designed so that there's never any framebuffer backpressure or monitor refresh-blocking during ultra-tight framerate caps from a good microsecond-accurate framerate capper such as RTSS (e.g. a 143.9fps cap during 144Hz VRR operation). That "3fps below" capping seems like it should be unnecessary, but it appeared necessary due to the original 1ms polling granularity. Newer VRR tests will have to determine whether tighter caps are lag-free on newer VRR displays.

Basically, the graphics drivers and the GPU chip would control repeat-refreshing behaviours (LFC algorithms) instead of the monitor, and high framerates would simply throttle at the driver level instead of at the display. So, fundamentally, VRR is optimizable into a unidirectional protocol, and analog FreeSync even works on certain CRTs.
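The essence of a driver-side LFC decision could be sketched like this (Python; the VRR window values and helper name are hypothetical illustrations, not any vendor's actual algorithm): when a frame is slower than the panel's minimum refresh rate, repeat-refresh it enough times that each refresh lands back inside the VRR window.

```python
def lfc_schedule(frametime_s, panel_min_hz=48, panel_max_hz=240):
    """Pick a refresh interval and repeat count for one frame.

    If the frame is slower than the panel's minimum refresh rate,
    repeat-refresh it n times so each refresh falls back inside
    the VRR window (the essence of low-framerate compensation)."""
    shortest = 1.0 / panel_max_hz   # fastest the panel can refresh
    longest = 1.0 / panel_min_hz    # slowest before it must refresh
    if frametime_s <= longest:
        # in range: one refresh, clamped to the panel's max rate
        return max(frametime_s, shortest), 1
    # out of range: smallest repeat count that fits the window
    n = 2
    while frametime_s / n > longest:
        n += 1
    return frametime_s / n, n
```

For example, a 20fps frame (50ms) on a hypothetical 48-240Hz window would be shown as three ~16.7ms refreshes, each well inside the panel's range.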

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 15 Jan 2019, 18:28
by pwn
Greetings.
I have one question that has been bothering me a lot; please answer if you easily can.
I'll say in advance that my English is not very good (which I regret), so I don't fully follow your technical conversation.
But based on the conversation, do I understand correctly that the best configuration for playing CS:GO is a locked 237 FPS (I have a 240Hz monitor supporting Gsink), and not uncapped 300+ FPS?

Re: Pre-rendered frames etc. (continued from G-Sync 101 arti

Posted: 15 Jan 2019, 19:35
by Chief Blur Buster
pwn wrote:But based on the conversation, do I understand correctly that the best configuration for playing CS:GO is a locked 237 FPS (I have a 240Hz monitor supporting Gsink), and not uncapped 300+ FPS?
Yes, but only if using G-SYNC (what you called Gsink).

If you use VSYNC OFF instead, use a higher cap (e.g. 300fps or 500fps) to gain the advantages of framerates above refresh rates.

You do have a choice of:
1. G-SYNC + 237fps cap
2. VSYNC OFF + higher cap (300fps or 500fps)
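In code form, the conventional G-SYNC cap behind option 1 is simply refresh rate minus a small margin (helper name is my own illustration):

```python
def gsync_fps_cap(max_refresh_hz, margin_fps=3):
    """The conventional 'refresh rate minus 3' G-SYNC framerate cap,
    which keeps frametime variance inside the VRR window so frames
    never collide with the max refresh rate."""
    return max_refresh_hz - margin_fps

# gsync_fps_cap(240) -> 237, matching option 1 above
# gsync_fps_cap(144) -> 141, the classic 144Hz cap mentioned earlier
```

As discussed earlier in the thread, that 3fps margin may shrink toward the refresh rate (e.g. 143.9fps at 144Hz) if future VRR implementations reduce the polling granularity.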