I've created the equivalent of "symbolic links" to this thread from both the Input Lag and Programming forums, because this thread spans multiple topics.
Great programmer talk here too.
Now I'm going to open the famous Blur Busters Latency Pandora Box: latency gradients. These are scanout latencies and how they're affected by the various sync technologies.
An additional factor to understand is that GPU output serializes a 2D framebuffer into a 1D transmission in raster fashion: left-to-right, top-to-bottom. This has been the standard scanout direction for roughly the last 100 years (whether a 1930s analog television broadcast or a 2020s DisplayPort cable).
Now visualize this as a scanout diagram. Here are the famous Blur Busters time-based diagrams:
So the button-to-pixels input lag of a framebuffer is actually a latency gradient.
During VSYNC ON and GSYNC, the latency gradient spans the full framebuffer height at the scanout velocity (during VRR, scanout velocity is always that of max Hz, e.g. a 40Hz refresh cycle on a 240Hz GSYNC monitor is still scanned out in 1/240sec).
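To put numbers on that gradient, here is a minimal Python sketch. It assumes a constant scanout velocity and ignores the blanking interval. The function name `scanout_latency_ms` and the 1080-row frame are illustrative choices, not part of any real API.

```python
def scanout_latency_ms(row, total_rows, max_refresh_hz):
    """Milliseconds from the start of scanout until `row` leaves the port.

    Assumption: constant scanout velocity, blanking interval ignored.
    On VRR, pass the monitor's max Hz, since scanout always runs at that speed.
    """
    scanout_time_ms = 1000.0 / max_refresh_hz
    return scanout_time_ms * row / total_rows


# A 240Hz-capable VRR monitor, regardless of the current frame rate:
for label, row in (("top", 0), ("center", 540), ("bottom", 1080)):
    print(f"{label:>6}: {scanout_latency_ms(row, 1080, 240):.2f} ms")
# top: 0.00 ms, center: 2.08 ms, bottom: 4.17 ms
```

The same function with `max_refresh_hz=60` gives the gradient of a fixed 60Hz VSYNC ON display, where the bottom edge trails the top edge by almost a full 16.7ms.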
During VSYNC OFF, each frameslice is an independent latency gradient. Three frameslices per refresh cycle (e.g. 180fps at 60Hz) means each frameslice is roughly 1/3 of the screen height (taking into account the VBI size between refresh cycles).
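The frameslice arithmetic can be sketched as follows. This assumes a perfectly steady frame rate and a vertical total of 1125 scanlines (1080 visible plus 45 of blanking, the common 1080p60 timing). `frameslice_lines` is a hypothetical helper name.

```python
def frameslice_lines(fps, refresh_hz, vertical_total):
    """Scanlines (including any VBI lines) covered by one frame at VSYNC OFF.

    Assumption: perfectly even frame pacing and constant scanout velocity.
    """
    return vertical_total * refresh_hz / fps


VISIBLE_LINES = 1080
VERTICAL_TOTAL = 1125  # assumed: 1080 visible lines + 45 lines of VBI

lines = frameslice_lines(fps=180, refresh_hz=60, vertical_total=VERTICAL_TOTAL)
print(lines)                  # 375.0 scanline-times per frameslice
print(lines / VISIBLE_LINES)  # about 0.35 of the visible screen height
```

A slice that straddles the VBI shows fewer visible lines, which is why the on-screen slices are only roughly one third of the screen height.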
VSYNC ON = Present() blocks until the next refresh interval (if the frame queue, when one is used, is full)
VSYNC OFF = Present() is nonblocking and splices into the scanout in realtime
VRR = Present() controls the timing of the refresh cycle (the monitor begins refreshing the moment the software Present()s)
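The three behaviors above can be modeled in a few lines of Python. This is a deliberately simplified sketch, not how any driver is implemented: it assumes no frame queue, refresh cycles that start at exact multiples of the refresh period for VSYNC ON, and a VRR monitor that cannot begin a new refresh sooner than one max-Hz period after the previous one. The mode strings and `display_start_ms` are made-up names.

```python
import math


def display_start_ms(present_ms, mode, refresh_hz=60.0, last_refresh_start_ms=0.0):
    """When the presented frame begins reaching the graphics port.

    Simplified model: no frame queue, no driver or compositor overhead.
    For "vrr", refresh_hz is the monitor's maximum refresh rate.
    """
    period_ms = 1000.0 / refresh_hz
    if mode == "vsync_on":
        # Wait for the next refresh cycle boundary.
        return math.ceil(present_ms / period_ms) * period_ms
    if mode == "vsync_off":
        # Splice into the scanout already in progress.
        return present_ms
    if mode == "vrr":
        # The monitor refreshes on Present(), but not faster than max Hz.
        return max(present_ms, last_refresh_start_ms + period_ms)
    raise ValueError(f"unknown mode: {mode}")


print(display_start_ms(20.0, "vsync_on", refresh_hz=60.0))   # next 60Hz boundary
print(display_start_ms(20.0, "vsync_off", refresh_hz=60.0))  # 20.0, immediately
print(display_start_ms(20.0, "vrr", refresh_hz=240.0))       # 20.0, refresh starts now
```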
These symbolic scanout diagrams help software developers understand how deterministic the latency from Present() to photons is.
Sure, other factors abound in the latency chain. There's often a fixed absolute lag, GtG lag, queued framebuffer lag, monitor processing lag, etc. Some monitors are virtually lagless (e.g. many eSports TN monitors, which display pixels essentially in realtime off the port). We're omitting all of this and focusing on latency from the API to the graphics port. But these symbolic diagrams will be a big help to software developers who want to understand area-related latencies better.
Present() essentially "splices" into the existing scanout during VSYNC OFF
(Think of scanout as the act of serializing a 2D framebuffer out through the 1D graphics port transmission)
240fps at 60Hz means all pixels output on the GPU port have no more than about 4ms of lag (1/240sec, or roughly 4.2ms, maximum). The top edge of each frameslice has the least lag (being the first pixels to display), and the bottom edge of each frameslice has the most lag (being the last pixels to display).
This means latency is more uniform across the whole screen plane during VSYNC OFF than during VSYNC ON.
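A small simulation illustrates the uniformity claim. It is a sketch under stated assumptions: Present() is called at perfectly even intervals starting at t=0, scanout velocity is constant, and the vertical total is an assumed 1125 scanlines. `row_latencies_ms` is a hypothetical name.

```python
def row_latencies_ms(fps, refresh_hz, vertical_total):
    """Age of the frame being scanned out at each scanline of one refresh cycle.

    Assumptions: Present() fires every 1/fps seconds starting at t=0,
    and each scanline leaves the port at a constant rate.
    """
    frame_ms = 1000.0 / fps
    line_ms = 1000.0 / (refresh_hz * vertical_total)
    latencies = []
    for line in range(vertical_total):
        t = line * line_ms               # when this scanline is transmitted
        latencies.append(t % frame_ms)   # time since the most recent Present()
    return latencies


vsync_off = row_latencies_ms(fps=240, refresh_hz=60, vertical_total=1125)
full_frame = row_latencies_ms(fps=60, refresh_hz=60, vertical_total=1125)

print(f"240fps at 60Hz, worst row: {max(vsync_off):.2f} ms")
print(f" 60fps at 60Hz, worst row: {max(full_frame):.2f} ms")
```

With four frameslices per refresh cycle, no row is older than one frametime, whereas a single full-height frame leaves the bottom rows almost a whole refresh cycle behind the top.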
Also, those who are familiar with the Leo Bodnar lag tester know that its top/center/bottom readings show increasing amounts of lag. The Leo Bodnar is a 1080p 60Hz VSYNC ON lag tester. However, VSYNC OFF breaks the scanout latency barrier to make sub-frame scanout latencies possible, at the penalty of tearlines. The higher the framerate, the smaller the latency gradients become, and aiming becomes more predictable/smoother (e.g. ultra-high-framerate CS:GO).
This is why game developers must pay attention to framepacing, and make sure that the gametimes are in sync with frametimes, to prevent stutters and erratic latencies. Sub-frame millisecond errors are still hugely visible as annoying stutter & annoying latency jittering. I've met some 60fps @ 60Hz games that felt like they had random internal latency jitter as big as 15 milliseconds (nearly one refresh cycle) with gametimes badly out of sync with frametimes. BAD, BAD! Don't do this, game developers, please.
The easiest method is to simply keep gametimes in sync with Present() times, though fluctuating frametimes make this an imperfect algorithm. The gametime at the center of a frametime can sometimes be a more ideal metric (the exact middle of variable-height frameslices from variable frametimes) for perfect latency averaging. But this requires predicting rendertimes, so this "ideal latency approach" is almost never done. Gametimes are usually just synchronized to the top edges of frameslices, which is "Good Enough", especially for consistent framerates.
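The two gametime targets can be sketched like this. Everything here is hypothetical illustration: the moving-average predictor is just one simple way to guess the next rendertime, and the class and function names are invented for the example.

```python
from collections import deque


class FrametimePredictor:
    """Predicts the next frametime as the mean of recent ones (an assumption)."""

    def __init__(self, window=8):
        self.samples = deque(maxlen=window)

    def add(self, frametime_ms):
        self.samples.append(frametime_ms)

    def predict(self, default_ms=1000.0 / 60.0):
        if not self.samples:
            return default_ms
        return sum(self.samples) / len(self.samples)


def gametime_top_edge(present_ms):
    """The "Good Enough" target: the gametime matches the Present() time."""
    return present_ms


def gametime_mid_slice(present_ms, predicted_frametime_ms):
    """The "ideal" target: the middle of the time the frameslice stays visible."""
    return present_ms + predicted_frametime_ms / 2.0


predictor = FrametimePredictor()
for ft in (4.0, 6.0):
    predictor.add(ft)

print(gametime_top_edge(100.0))                        # 100.0
print(gametime_mid_slice(100.0, predictor.predict()))  # 102.5
```

The mid-slice target is only as good as the prediction, which is the reason the text above calls the top-edge approach good enough for consistent framerates.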
Present() triggers the refresh cycles on VRR monitors:
(VRR, including G-SYNC and FreeSync, is essentially variable-sized blanking intervals that temporally space out the dynamic/asynchronous refresh cycles, which are triggered by software timing. As long as the Present() interval is within the VRR range, the display refresh cycle timing is always software-triggered on a VRR monitor.)
Now stutters can still show up in VRR if gametime intervals grossly diverge from frame rendering times, e.g. with very erratic frame rendering times where one frame is a 1/40sec render and the next frame is a 1/200sec render. This is often because the display time of a refresh cycle is based on the PREVIOUS frame render (the frame fully delivered to the monitor). If that PREVIOUS frame render was very fast but the next frame render is very long, that "fast rendered frame" stays on screen for a much longer duration (because the next frame, a slow one, is still rendering). So rendertime and displaytime fall further out of sync. Stutters start showing through again during VRR, as VRR is not perfect at eliminating every single stutter.
For VRR, the best stutter elimination occurs when refreshtime (the time the photons are hitting eyeballs) is exactly in sync with the time taken to render that particular frame. But under VRR, the current refreshtime is set by the end of the previous frametime (off by one), so very bad stutters will still show through if frametimes grossly diverge from refreshtimes (because it's 1-frame-trailing).
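The 1-frame-trailing effect can be illustrated with a short sketch. It assumes frames render back-to-back, that gametime is sampled when each frame starts rendering, and that every frametime stays inside the VRR range. `vrr_timing_errors_ms` is a hypothetical helper.

```python
def vrr_timing_errors_ms(render_times_ms):
    """Per-frame mismatch between on-screen motion and gametime motion.

    Assumptions: frame i+1 starts rendering the moment frame i is presented,
    and gametime is sampled at the start of each render. Then the gametime
    step from frame i to i+1 equals render time i, while the step between
    their Present() calls (when photons change) equals render time i+1.
    """
    errors = []
    for i in range(len(render_times_ms) - 1):
        gametime_step = render_times_ms[i]
        display_step = render_times_ms[i + 1]
        errors.append(display_step - gametime_step)
    return errors


print(vrr_timing_errors_ms([10.0, 10.0, 10.0]))      # [0.0, 0.0]
print(vrr_timing_errors_ms([25.0, 5.0, 25.0, 5.0]))  # [-20.0, 20.0, -20.0]
```

Steady frametimes produce zero error, while alternating 1/40sec and 1/200sec renders produce a 20ms swing in either direction on every frame, which shows up as stutter even with VRR enabled.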
The moral of the story is that very accurate gametimes are a must in modern engine programming, where the render runs off the gametimes, intervals between renders are erratic, and the engine must not contribute additional microstuttering (= same thing as latency jittering) where unnecessary.
The worse the microstutter from errors, the more latency jitter there is, and the harder it is to do twitch aims (even at the same framerate). Because of bad programming (microstutter = latency jittering), I've seen worse aiming at 80fps than in a well-optimized engine running at 50fps.
This concludes another part of the Blur Busters Latency Pandora Box series. I do apologize for opening this Pandora's Box, but this topic is rather interesting!
Enjoy.