Some observations on Scanline Sync

Aldagar · Post by **Aldagar** » 09 Apr 2020, 14:03

Over the last few days I've been studying the behaviour of Scanline Sync and its interaction with other sync technologies. I'm writing this for anyone who might be curious about it, but also to raise some doubts of my own. Please understand that it is based on my own technical knowledge, so it may contain errors, and I'm fully aware I might have misinterpreted the results; feel free to correct me if you see any mistake. Also, English is not my native language, but I will do my best to explain things well.

The software I've used is AMD's OCAT, Nvidia's FrameView and CapFrameX. These programs give you detailed data for every single frame they capture, such as when it started (presented to the CPU), when its rendering completed (sent to the back buffer) and when it was displayed (sent to the front buffer). They also let you see frame time graphs, and CapFrameX offers input lag approximation and until-displayed-time graphs too.

I would like to point out that while overlay monitoring software such as Afterburner seems to measure frame time as the delta between two consecutive frames being submitted to the GPU, the programs I used measure it as the delta between two consecutive frames being presented to the CPU. I believe this is the more appropriate way to do it and a better representation of frame pacing since, assuming game time and render time are in sync, the data being processed by the CPU determines what will be displayed in a frame. This is also why Afterburner shows perfectly stable frame times when using the RTSS framerate cap: RTSS works by blocking the CPU from delivering a new frame to the GPU until a specified time interval has elapsed.
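To make that definition of frame time concrete, here is a minimal Python sketch. It assumes you have exported a list of per-frame start timestamps in milliseconds from a capture tool; the function name and the sample values are made up for illustration and do not come from OCAT, FrameView or CapFrameX.

```python
def frame_times_ms(start_times_ms):
    """Return the delta between consecutive frame start timestamps (ms)."""
    return [later - earlier
            for earlier, later in zip(start_times_ms, start_times_ms[1:])]


# Hypothetical timestamps (ms) at which four frames were presented to the CPU.
starts = [0.0, 16.7, 33.3, 50.1]
deltas = frame_times_ms(starts)

# The spread between the longest and shortest delta is a simple pacing metric.
spread = max(deltas) - min(deltas)
```

A tool that timestamps frames at a different point (for example at GPU submission) would feed a different list into the same calculation, which is why two tools can disagree about how stable the frame times are.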

I used the data from these three programs to cross-check the results. I would like to post screenshots, but since it's a lot of cramped numerical data I don't think it's a good idea. I encourage anyone who's interested to try these tests on their own.

I used the following list of games to test Scanline Sync with different engines and APIs:

Assassin's Creed Unity
Assassin's Creed Syndicate
Battlefield 1
Battlefield V
Doom 2016
Doom Eternal
Black Mesa
Crysis 1, 2 & 3
Dark Souls 1, 2 & 3
Hollow Knight
Mad Max
Call of Duty Modern Warfare 2019 (I got a permanent ban from Activision servers, I think they mistakenly detect monitoring software as cheats, so I don't recommend testing this game)
Sleeping Dogs
Skyrim Special Edition
The Witcher 3
Ghost Recon Wildlands
Wolfenstein The New Order
World War Z

All the observations were made on a 60Hz monitor, so 1 refresh cycle = ~16.67 ms.

From what I can gather, Scanline Sync makes the GPU try to push a new frame to the front buffer (or to the back buffer when combined with other sync methods) when the display reaches a certain scanline (the index you enter in RTSS) and, at the same time, presents the next frame to the CPU. This means that, leaving other sources of input lag aside, you will only get about one refresh cycle of latency, but also that your hardware must be able to render each frame in less time than that.

VSYNC adds a back buffer, adding up to another refresh cycle of input lag, and swaps it with the front buffer usually at the beginning of the Vertical Blank Interval (also known as VBI or VBlank), but sometimes at the end of the VBI depending on the game's implementation.

But that's not all. VSYNC also involves CPU pre-rendered frames. The majority of games use a configuration of 3 pre-rendered frames. This means that when frame 1 begins being displayed (or is sent to the front buffer), the CPU has already rendered frames 2, 3 and 4, and when frame 1 is halfway through its refresh, the CPU will begin rendering frame 5. That's 3 and a half refreshes of input lag (roughly 58 ms at 60Hz!).
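The arithmetic behind that figure can be sketched in a few lines of Python. This assumes the added latency is simply the number of refreshes a frame waits multiplied by the refresh period, which ignores every other source of input lag; the function names are made up for illustration.

```python
def refresh_period_ms(refresh_hz):
    """Duration of one refresh cycle in milliseconds."""
    return 1000.0 / refresh_hz


def queue_latency_ms(refresh_hz, queued_refreshes):
    """Latency contributed by waiting a given number of refresh cycles."""
    return queued_refreshes * refresh_period_ms(refresh_hz)


# 3 pre-rendered frames plus half a refresh, on a 60Hz display.
vsync_latency = queue_latency_ms(60, 3.5)
print(round(vsync_latency, 1))  # 58.3

# The same calculation for 2 refreshes of queueing.
two_refreshes = queue_latency_ms(60, 2)
print(round(two_refreshes, 1))  # 33.3
```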

This is where Scanline Sync comes into play. Scanline Sync will prevent VSYNC from pre-rendering too many frames and alleviate its effect on input lag. With a scanline index of 1 (or exactly at the start of your display's VBI), just as frame 1 begins being displayed, the CPU will start rendering frame 3. That is, exactly 2 pre-rendered frames. The higher the positive scanline index, the later the CPU will start rendering new frames, so you can get down to only 1 pre-rendered frame and a configuration similar to stand-alone Scanline Sync, but again, your hardware will have less headroom to complete the rendering before the next VBI. I should note that Scanline Sync is not always 100% accurate and the numbers regarding the scanline index may vary.

Unfortunately, Scanline Sync is not an all-encompassing solution. Every game works differently (different VSYNC implementation, number of pre-rendered frames, buffer swap at the start or end of the VBI), and Scanline Sync may interact negatively with it. Sometimes it causes stuttering and a sawtooth input lag effect, similar to Low-Lag VSYNC ON. I also think it messes up input reading in some games, because in some situations I get bad camera judder when using keyboard and mouse, but not with a controller.

Another thing I would like to discuss is the behaviour of both sync methods when there is a framerate drop. When the frame buffer queue is emptied, VSYNC speeds up rendering in order to fill it again. Common sense leads me to think that this causes a slow motion effect, since I've observed that after a framerate drop, VSYNC stabilizes by rendering the next frames 3-8 ms apart, but they are still displayed every 16.7 ms.

Scanline Sync, on the other hand, halts the CPU until the next refresh if it misses a sync. That means the next frame will start rendering ~33 ms after the previous one, so you may get more frequent and noticeable stutters, but less input lag and better frame pacing.
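Here is a simplified Python model of that behaviour. It is only a sketch under my own assumptions, not how RTSS is actually implemented: every frame starts exactly on a sync point, and a frame that takes longer than one refresh period pushes the start of the next frame to the following sync point. The function name and render times are made up for illustration.

```python
import math


def scanline_sync_starts(render_times_ms, refresh_hz=60.0):
    """Start time (ms) of each frame when starts are locked to sync points."""
    period = 1000.0 / refresh_hz
    starts = []
    slot = 0  # index of the refresh cycle in which the current frame starts
    for render in render_times_ms:
        starts.append(slot * period)
        # A frame occupies at least one refresh; a late frame occupies more,
        # so the next frame has to wait for the following sync point.
        slot += max(1, math.ceil(render / period))
    return starts


# The second frame takes 20 ms and misses its sync at 60Hz.
starts = scanline_sync_starts([10.0, 20.0, 10.0])
gaps = [b - a for a, b in zip(starts, starts[1:])]
# gaps is roughly [16.67, 33.33]: one normal interval, then a doubled one.
```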

What I still don't get is how Scanline Sync achieves perfect glass-floor frame times, even when the scanline jumps erratically. While VSYNC usually does a good job of keeping input lag and until-displayed time consistent, the time between frame starts fluctuates between 15 and 18 ms. With Scanline Sync, it stays at 16.680-16.690 ms. I'm not sure if it interferes with frame presentation or gametime:rendertime synchronization, but I haven't seen any other framerate limiter (in-game or external) manage to do that.

All these observations led me to wonder how console developers manage to alleviate these issues. I know that many games just use traditional double or triple buffering and suffer from stuttering when they fall below the target framerate, but there are others that feel smoother and have better frame pacing than on PC even while fluctuating below 60 fps or capped at 30 fps. If you cap a game at 30 fps and use VSYNC on PC, you will still get heavy stuttering, since frame rendering and delivery will not be in sync with the display.

I think I found the answer here:
https://developer.android.com/games/sdk/frame-pacing

If I'm not mistaken, you can achieve behaviour similar to the non-pipeline mode described on the Android website just by using Scanline Sync + Enhanced Sync/Fast Sync/borderless windowed (to force triple buffering). For 30 fps, Scanline/2 + VSYNC/borderless works too.

Nvidia users might just use Adaptive VSYNC (if they prefer tearing instead of stuttering) or half-refresh-rate VSYNC, but these features don't seem to work well in all games and Radeon users can't benefit from them.

So, my final conclusions and recommendations are:

In games (or emulators) in which VSYNC is forced ON, or when you don't care about input lag and just want something that works, just let VSYNC do its thing.
For competitive or fast paced games in which reaction speed is key, use stand-alone Scanline Sync.
For graphically demanding games in which you want a smooth and tear-free image, avoiding the input lag and frame pacing issues that come with traditional VSYNC, use Scanline Sync in combination with Enhanced Sync/Fast Sync/borderless and give it a generous margin (scanline index at the middle or even at the top of the display, the equivalent of 1.5 or 2 pre-rendered frames instead of 3.5).
If you can't get a stable frame rate that matches your display's refresh rate, use Scanline/2 + VSYNC/borderless, again, with a generous margin.

YouTube · Post by **Chief Blur Buster** » 10 Apr 2020, 09:47

Great observations.

With that generous game list -- you actually did more tests than I had time to do!

Thank you for posting -- it is consistent with my findings.

Aldagar · Post by **Aldagar** » 10 Apr 2020, 13:54

Thanks for the appreciation. I just wanted to make my own tests and my little contribution!

I expected you to already know all of this. After all, you were involved in the creation of Scanline Sync.

I must say that the games that gave me the most problems were the Ubisoft ones, Skyrim, Crysis and Wolfenstein, all known for being badly optimized. It also gave me the impression that it tends to work better with DX12 and Vulkan (no matter the SyncFlush parameter), although that might be purely coincidental and I could have tested more extensively. In Battlefield, for instance, it works perfectly with DX12 but causes camera judder when playing with a mouse in DX11.

Besides, it's totally understandable that an external tool might not work with every single hardware and software configuration, but when it works, Scanline Sync truly is a godsend.

Something I would like to clarify is how Scanline Sync achieves perfect frame times (delta time between CPU presentations). If I had to guess, I would say in-game limiters either use less accurate event timers or decide when is the best time to present a new frame (due to factors unknown to me).

If I'm not missing something, would it make sense to design an equally accurate frame pacing method for Variable Refresh Rate displays, assuming it doesn't interfere with gametime:rendertime sync?

Another question I have is whether it would be feasible to make Scanline Sync operate like the pipeline mode described on the Android frame pacing website. That is, when the display reaches the sync scanline, frame 1 is presented to the CPU, but it then waits until the next refresh to be submitted to the GPU, at which point frame 2 is presented. Wouldn't that give the hardware more headroom and achieve better frame pacing in demanding situations, at the expense of adding a refresh cycle of input lag?

knypol · Post by **knypol** » 11 Apr 2020, 15:21

I don't want to create a new topic, but I have one simple question:
does Scanline Sync work with a value of -1? I can see a tearline only with values from 10+ (positive) to 1020+. I'm on a PG258Q at 1080p with 144Hz ULMB. Or maybe I should set a positive value?

Aldagar · Post by **Aldagar** » 11 Apr 2020, 15:37

knypol wrote: ↑
11 Apr 2020, 15:21
I don't want to create a new topic, but I have one simple question:
does Scanline Sync work with a value of -1? I can see a tearline only with values from 10+ (positive) to 1020+. I'm on a PG258Q at 1080p with 144Hz ULMB. Or maybe I should set a positive value?

It does work with negative values, but taking the bottom of the display (the end of the VBI) as the reference point, instead of the top of the display as it would with positive values. That's why you don't see a tearline, but it's recommended to give it some margin to avoid seeing occasional tearlines at the top of your display.

In my case, I use an index of -30 on my 1440p monitor with 1481 scanlines. You can see how many scanlines your display has by inserting "SyncInfo=1" in the OSD section of the RTSS config files, then restarting RTSS, selecting the "Show own statistics" option and launching a game.

P.S. I noticed you own a G-SYNC monitor. Maybe you're just experimenting, but you should be using G-SYNC instead of Scanline Sync.

knypol · Post by **knypol** » 11 Apr 2020, 16:09

I want to use ULMB, so I can't use G-SYNC at the same time.

So I made a test, and at 144Hz and 1080p I have a total of 1157 scanlines. The tearline is visible starting from +1 (and is about 1/10th of the monitor height from the top). The tearline disappears at +1040. So my range is 1041-1156? Would setting something like +1100 be best? Do positive values increase input lag compared to negative ones?

P.S. Isn't it strange that with a value of +1 the tearline shows up at 1/10th of the monitor height and not exactly at the beginning?

Aldagar · Post by **Aldagar** » 11 Apr 2020, 16:25

knypol wrote: ↑
11 Apr 2020, 16:09
I want to use ULMB, so I can't use G-SYNC at the same time.

Ah, that makes sense.

knypol wrote: ↑
11 Apr 2020, 16:09
So I made a test, and at 144Hz and 1080p I have a total of 1157 scanlines. The tearline is visible starting from +1 (and is about 1/10th of the monitor height from the top). The tearline disappears at +1040. So my range is 1041-1156? Would setting something like +1100 be best? Do positive values increase input lag compared to negative ones?

P.S. Isn't it strange that with a value of +1 the tearline shows up at 1/10th of the monitor height and not exactly at the beginning?

Scanline Sync is not always 100% accurate and it may take some time to sync, so most of the time you will see the tearline below the actual scanline you entered. It doesn't matter whether the index is positive or negative; it's just a question of which reference point it takes. So, in your case, an index of 1 is the same as -1156. Just experiment with it until the tearline is always hidden inside the VBI.

knypol · Post by **knypol** » 11 Apr 2020, 16:27

I made another test and the RTSS OSD doesn't show negative values for the sync line. If I set +1100, the OSD shows +1100. If I set -57, it also shows +1100 in the OSD.

Aldagar · Post by **Aldagar** » 11 Apr 2020, 16:35

knypol wrote: ↑
11 Apr 2020, 16:27
I made another test and the RTSS OSD doesn't show negative values for the sync line.

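No, it doesn't, because when you enter a negative index, RTSS interprets it as "scanline total minus the absolute value of the index". With your 1157 scanlines, -57 becomes 1157 - 57 = 1100, which is what the OSD shows.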
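As a quick way to check which positive scanline a negative index corresponds to, the conversion can be sketched in Python. This is not RTSS code, just the arithmetic implied by the values reported in this thread, and the function name is made up for illustration.

```python
def effective_scanline(index, total_scanlines):
    """Map a scanline index to the positive scanline it refers to.

    Negative indices count back from the total number of scanlines;
    positive indices are used as they are.
    """
    if index < 0:
        return total_scanlines + index
    return index


# Values reported for a display with 1157 total scanlines.
print(effective_scanline(-57, 1157))    # 1100
print(effective_scanline(-1156, 1157))  # 1
print(effective_scanline(1100, 1157))   # 1100
```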

deama · Post by **deama** » 14 Jun 2020, 13:45

Do the various Scanline Sync values have any effect on input lag? E.g. is there a difference between +1 and +500 or -1 and -500?

EDIT: Alright so I think I found out the answer, and it appears to be yes. I used the presentmon program to measure and found out some info, I posted in the thread:
viewtopic.php?f=10&t=5552&p=53520#p53520
