Crossposting an educational thread.
hmukos wrote: ↑
22 May 2020, 19:02
I would understand this if multiple frameslices to be scanned out in this cycle would be somehow preremembered and scanned only after as a whole. But doesn't scanout happen in realtime and show frame as soon as it is rendered?
There's a cable scanout and a panel scanout, and they both can be higher/lower than each other.
Jorim nailed most of it for the panel scanout level, though there should be two separate scanout diagrams to help understand context (scanout diagram for cable, scanout diagram for panel) whenever the scanout are different velocities.
However, there are some fundamental clarifications that is needed.
The frameslices are still compressed together because the frameslices are injected at the cable level, but the monitor motherboard is buffering the 60Hz refresh cycle to scanout in 1/240sec.
Fixed-scanrate panels create input lag at refresh rates lower than max-Hz, unless Quick Frame Transport is used to compensate.
I would bottom-align the 60Hz like this, however:
Scaler/TCON scan conversion "compresses the scanout downwards" towards the time delivery of the final pixel row. So about 3/4ths of 60Hz scanout is delivered before the panel begins refreshing at full 1/240sec velocity.
Flexible Scanrate Panels
However, some panels are scanrate multisync, such as the ASUS XG248, which has excellent low 60Hz console lag:
Learn more about Quick Frame Transport
For more information about compensating for buffering lag, you can use Quick Frame Transport (Large Vertical Totals) to lower latency of low refresh rates on 240Hz panels: Custom Quick Frame Transport Signals
The Quick Frame Transport creates this situation:
This can dramatically reduce strobe lag, but Microsoft and NVIDIA needs to fix their graphics drivers to use end-of-VBI frame Present(). Look at the large green block, so frame Present() needs to be at the END of the green block, to be closer to the NEXT refresh (less lag!).
Microsoft / NVIDIA Limitation Preventing QFT Lag Reductions
Unfortunately, Quick Frame Transport currently only reduces lag if you simultaneously use RTSS Scanline Sync (with negative number
tearline indexes) to move Present() from beginning of VBI to the end of VBI. So hacks have often been needed.
This simulates a VSYNC ON with a inputdelayed Present() as late as possible into the vertical blanking interval.
The software API, called Present(), built into all graphics drivers and Windows, to present a frame from software to the GPU. Normally Present() blocks (doesn't return to the calling software) until the blanking interval. But Present() blocks until the very beginning of VBI (after the final scan line) before releasing. Many video games does the next keyboard/mouse read at that instant right after Present() returns. So it's in our favour to delay returning from Present() until the very end of the VBI: That delays input reads closer to the next refresh cycle! Thus, delayed Present() return = lower input lag because keyboard/mouse input is read closer to the next refresh cycle.
A third party utility, called RTSS, has a new mode called "Scanline Sync", that can be used for Do-It-Yourself Quick Frame Transport.
Then that dramatically reduces VSYNC ON input lag (anything that's not VSYNC OFF) on both fixed-scanrate and flexible-scanrate panels, because the 60Hz scanout velocity is the same native velocity of 240Hz.
Great for reducing strobe lag, too!
(Not everyone at Microsoft, AMD, and NVIDIA fully understand this.)
We successfully reduced the input lag of ViewSonic XG270 PureXP+ by 12 milliseconds less input lag, while ALSO reducing strobe crosstalk, with this technique. ViewSonic XG270 120Hz PureXP+ Quick Frame Transport HOWTO
Earlier, I tried large Front Porches, hoping that Microsoft inputdelayed to the first scanline of VBI before unblocking return from Present() API call. But unfortunately, Microsoft/NVIDIA unblocks Present() during VSYNC ON at the END of visible refresh (before first line of Front Porch). Arrrrrrgh. Turning Easy QFT, into Complex QFT.
But Wait! G-SYNC and FreeSync are Natural Quick Frame Transports
Want an easier Quick Frame Transport? Just use a 60fps cap at 240hz VRR. All VRR GPUs always transmit refresh cycles at maximum scanout velocity. Present() immediately starts delivering the first scanline at that instant (if monitor not currently busy refreshing or repeat-refreshing) since the monitor slaved to the VRR.
Present() is already permanently connected to the end of VBI during VRR operation. Unless the monitor is still busy refreshing (frametime faster than max Hz) or the monitor is busy repeat-refreshing (frametime slower than min Hz). As long as frametimes stay within the panel's VRR range, software is 100% controlling the timing of the monitor's refresh cycles
This is why emulator users love high-Hz G-SYNC displays for lower emulator lag.
60fps at 240Hz is much lower latency than a 60hz monitor, because of the ultrafast 1/240sec scanout already automatically included with all 60fps material on all VRR monitors! The magic of delivering AND refreshing a "60Hz" refresh cycle in only 4.2 milliseconds (both cable and panel), means ultra-low latency for capped VRR
This is why VRR is is the world's lowest latency "Non-VSYNC-OFF" sync technology.
It doesn't help when you need to use fixed-Hz (consoles, strobing, non-VRR panels).
This Posts Helps you to:
- Understand Flexible-Scanrate LCD panels (most 1080p 144Hz panels, few 1080p 240Hz panels)
- Understand Fixed-Scanrate LCD panels (most 1080p 240Hz panels, most 144Hz 1440p panels)
- Understand Quick Frame Transport
- Understand Quick Frame Transport's ability to workaround low-Hz lag on Fixed-Scanrate Panels
- Understand VRR
- Understand How VRR is similar to Quick Frame Transport
- If you are a software developer, Understand that software controls triggering variable refresh monitor's refresh cycle via Present()