Input lag on fullscreen

ManWhoSoldTheWorld
Posts: 28
Joined: 18 Jun 2021, 18:58

Re: Input lag on fullscreen

Post by ManWhoSoldTheWorld » 11 Nov 2024, 13:24

Did you ever find a fix for this?

When I run fullscreen borderless (FSB) and fullscreen exclusive (FSE), both run "hardware independent flip".

FSB is consistently more responsive.
- Would have been interesting to try "hardware composed: independent flip" in FSE, but I've never been able to get it running.

Image

Gias
Posts: 41
Joined: 26 Nov 2021, 16:37

Re: Input lag on fullscreen

Post by Gias » 11 Nov 2024, 15:46

ManWhoSoldTheWorld wrote:
11 Nov 2024, 13:24
Did you ever find a fix for this?

When I run fullscreen borderless (FSB) and fullscreen exclusive (FSE), both run "hardware independent flip".
that should be fine.

though technically you're not actually using FSE there. if you're seeing an independent flip presentation when checking the game's presentation model, then the game is using eFSE/FSO or windowed/borderless with flip model (which is good).
ManWhoSoldTheWorld wrote:
11 Nov 2024, 13:24
FSB is consistently more responsive.
should be the same or at least there shouldn't be a difference due to the presentation model in this case

however, if the game in eFSE/FSO (with its in-game fullscreen option) is forcing something else like vsync a certain way or manually lowering your display's max refresh rate (some games do this with their in-game fullscreen option and not in windowed/borderless...), then that could explain a difference in responsiveness... as it would result in the fullscreen option having higher latency etc

ManWhoSoldTheWorld wrote:
11 Nov 2024, 13:24
- Would have been interesting to try "hardware composed: independent flip" in FSE, but I've never been able to get it running.

Image
eh i don't think that getting "hardware composed: independent flip" would change anything in this case (with you already getting one of the independent flip presentations).

short version:

"hardware: independent flip" and "hardware composed: independent flip" basically perform the same.

those bypass the desktop window manager for presentation

plus a game that bypasses the dwm/compositor with an independent flip presentation can get more from the cpu/gpu compared to cases where the game is going through the dwm/compositor (such as when "composed: flip" or "composed: copy with gpu gdi" is being used for presentation; both involve the dwm for composition). microsoft has documented this, and it can be seen in practice/testing too. apps/games that run through the dwm/compositor for presentation may suffer more (in terms of fps and latency) the more you have running in the background or on other displays...

also, games running with an independent flip presentation may use waitable swapchains to reduce latency further in certain scenarios (and waitable swapchains aren't supported in FSE / legacy flip).
--
more regarding the independent flip presentations:

"hardware: independent flip" indicates the game did not take ownership of the screen (either because it was windowed/borderless or because eFSE/FSO was applied), but it is still swapping the displayed surface every frame (via buffer pointer swap/flipping), independently, with the same efficiency as fullscreen exclusive (hardware: legacy flip).

"Hardware Composed: Independent Flip" is the same as the above, but also indicates the app has been granted at least 1 hardware overlay plane (MPO plane).

MPO (multiplane overlay) planes are basically additional hardware scanout planes enabling the gpu to take over composition from the desktop window manager, thus allowing the game to retain independent flip presentation in more scenarios (instead of falling into the subpar "composed: flip" presentation).

and you normally need at least 2 MPO planes assigned and available to the display the game is running on if you want MPOs to save you (keep your game using an independent flip presentation in more scenarios... such as when the windows volume overlay is on top of the game or when your game's window doesn't cover the whole screen).

also typically it's "hardware: independent flip" because the game is on plane 0 (but the system could still leverage the MPOs and the game may retain independent flip presentation in certain scenarios if 2+ MPO planes are available...)

and it's typically "hardware composed: independent flip" if the game got on plane 1 or higher (and if it's on plane 1 and there's only 1 MPO plane available, then the game would still fall into the "composed: flip" presentation (which involves dwm composition) if some external window is on top of the game or if there's a resolution mismatch between the game's window and the desktop etc)

plane 0 is also the dwm's plane (the dwm's swapchain); however, even if the game is on plane 0, dwm composition can still be bypassed if the game has engaged directflip/independent flip optimizations (with or without MPOs if certain criteria have been met...)

directflip/independent flip optimizations are, to put it simply, when the game is using an independent flip presentation.

and it's not possible to get either "hardware: independent flip" or "hardware composed: independent flip" in FSE.

FSE can be "hardware: legacy flip" or "hardware: legacy copy to front buffer" for presentation.

(the sankey diagram you posted doesn't include the "hardware: legacy copy to front buffer" presentation because it's very rare... but it's a subpar exclusive mode presentation due to the copy operation basically. anyway, you'd have to be very unlucky to get that one in fullscreen or be using windows 7...)

tl;dr:

you could try to see if vsync is set a different way by the game when using the in-game fullscreen option. you could also see if the game is lowering your display's max refresh (could check in windows's display settings after launching the game). you may also want to play while monitoring the game's presentation model for a while and see if it's getting stuck with the subpar "composed: flip" presentation at some point for some reason...

i would generally avoid a game's fullscreen mode anyway because of slower alt-tabbing and display mode changes that can be annoying...

windowed/borderless with an independent flip presentation is the good stuff.
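
for the "monitor the presentation model for a while" suggestion, here is a minimal sketch of how a capture could be tallied afterwards. it assumes a PresentMon-style CSV with a column named PresentMode (check the header of your own capture, the name may differ between versions); the sample rows are made up for illustration only.

```python
import csv
import io
from collections import Counter

# Present modes that bypass DWM composition, per the discussion above.
DWM_BYPASS_MODES = {
    "hardware: legacy flip",
    "hardware: legacy copy to front buffer",
    "hardware: independent flip",
    "hardware composed: independent flip",
}


def tally_present_modes(csv_text, column="PresentMode"):
    """Count how many presents used each presentation mode.

    The column name is an assumption; adjust it to match your capture.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row[column].strip() for row in reader)


def composed_share(counts):
    """Fraction of presents that went through DWM composition."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    composed = sum(
        n for mode, n in counts.items()
        if mode.lower() not in DWM_BYPASS_MODES
    )
    return composed / total


# Hypothetical three-row capture, for illustration only.
sample = """Application,PresentMode
game.exe,Hardware: Independent Flip
game.exe,Hardware: Independent Flip
game.exe,Composed: Flip
"""

counts = tally_present_modes(sample)
print(counts)
print(f"share of presents composed by the dwm: {composed_share(counts):.0%}")
```

if the composed share is anything above zero during normal play, that's a hint the game is dropping into "composed: flip" at some point and it's worth finding out what triggers it.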

ManWhoSoldTheWorld
Posts: 28
Joined: 18 Jun 2021, 18:58

Re: Input lag on fullscreen

Post by ManWhoSoldTheWorld » 13 Nov 2024, 17:45

Thank you for the in-depth feedback! :)
I've found out that my GPU-monitor combo does not support MPO (dxdiag)

The picture below shows two runs on the same full CS2 official DM server during the same round. As one can see, the lowest latency between FSO+FSB vs FSE is achieved by FSE (Legacy Flip). Still... I find it impossible to play with FSE -> It seems buffered and as if there are frames missing during a 360hz cycle on the monitor. Same experience with different monitors.

FSO+FSB is on the other hand much more consistent with a lower perceived latency vs FSE.

Image

Gias
Posts: 41
Joined: 26 Nov 2021, 16:37

Re: Input lag on fullscreen

Post by Gias » 12 Dec 2024, 07:22

huh ? other than those looking like presentmon columns... i'm not sure where you pulled those numbers from or what you did exactly, but something looks really wrong there...

waitable swapchains aren't supported in FSE. the lowest latency is not achieved by FSE (legacy flip). FSO or windowed/borderless with independent flip for presentation is normally as good or better

See:
The DWM manages the composition/organization of the desktop display content from various applications, meaning it controls what is rendered and presented to the front of your display and what is held in the background. However, this control has historically resulted in a slight performance overhead vs FSE, where the game has full control.

To get back this performance overhead, we enhanced the DWM to recognize when a game is running in a borderless full screen window with no other applications on the screen. In this circumstance, the DWM gives control of the display and almost all the CPU/GPU power to the game. Which in turn allows equivalent performance to running a game in FSE. Fullscreen Optimizations is essentially FSE with the flexibility to go back to DWM composition in a simple manner.
https://devblogs.microsoft.com/directx/ ... mizations/
Once your swapchain has been “DirectFlipped“, then the DWM can go to sleep, and only wake up when something changes outside of your application. Your app frames are sent directly to screen, independently, with the same efficiency as fullscreen exclusive. This is “Independent Flip“
https://devblogs.microsoft.com/directx/dxgi-flip-model/
If you’ve watched DirectX 12: Presentation Modes In Windows 10, you’ll see talk about "Direct Flip" and "Independent Flip." These are optimizations that are enabled for applications using flip model swapchains. Depending on window and buffer configuration, it is possible to bypass desktop composition entirely and directly send application frames to the screen, in the same way that exclusive fullscreen does.
Flip model presents go as far as making windowed mode effectively equivalent or better when compared to the classic "fullscreen exclusive" mode.
https://github.com/MicrosoftDocs/win32/ ... p-model.md
When these optimizations are used, games that originally use the legacy blt-model presentation can use the newer flip-model presentation instead (if the game is compatible). This results in lower frame latency and lets you use other newer gaming features; for example, Auto HDR, and variable refresh rate (for displays that support it)
https://support.microsoft.com/en-au/win ... 389e535952

https://learn.microsoft.com/en-us/windo ... wap-chains

https://www.youtube.com/watch?v=E3wTajGZOsA

besides microsoft's documentation and presentations from directx developers... below is also Riot finally updating valorant's windowed/borderless mode to use dxgi flip model about 2 years ago... and yeah they basically ended up concluding that it's no longer necessary to use fullscreen and that differences were within margin of error etc

https://www.youtube.com/watch?v=sj3bCaRsL4s

i've also done end-to-end system latency tests using RLA (nvidia's reflex latency analyzer hardware) with different presentation models across different games, and what i've seen basically matches microsoft's documentation etc. I've technically even seen FSO and windowed/borderless with dxgi flip model giving slightly lower latency numbers than FSE in the same game (though really could be seen as being within margin of error or normal variance...)

for example, below are some tests i did with the game deliver us mars with vsync off / uncapped fps

first below are some fps results i got using capframex (here capframex uses presentmon and samples based on present-to-present)

Image
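
as a rough illustration of what "samples based on present-to-present" means: average fps can be derived from the time between consecutive presents. this is only a sketch of the general idea, not capframex's actual code, and the interval values are hypothetical (in a PresentMon-style capture they would come from a per-frame interval column, whose exact name is an assumption you should check against your own file).

```python
def average_fps(ms_between_presents):
    """Average FPS as number of frames divided by total elapsed time.

    Takes per-frame present-to-present intervals in milliseconds.
    """
    total_ms = sum(ms_between_presents)
    if total_ms <= 0:
        raise ValueError("need at least one positive frame interval")
    return 1000.0 * len(ms_between_presents) / total_ms


# Hypothetical intervals in milliseconds, for illustration only.
intervals = [4.0, 5.0, 6.0, 5.0]
print(round(average_fps(intervals), 1))  # 200.0
```

dividing frame count by total time (rather than averaging each frame's instantaneous fps) avoids giving short frames too much weight.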

next are some end-to-end system latency results i got using RLA -- nvidia's reflex latency analyzer from my 1440p 240hz viewsonic monitor along with a reflex compatible mouse. also, the game tested has the reflex flash (provides a consistent flash from the game when doing mouse clicks, which is great for latency tests with my RLA monitor & mouse) and i did 50 samples/mouse clicks multiple times for the averages

  • Deliver Us Mars uncapped (210) fps Fullscreen exclusive (hardware: legacy flip) -

    average (end-to-end) system latency: 20.4 ms

  • Deliver Us Mars uncapped (210) fps windowed borderless fullscreen (hardware composed: independent flip) -

    average (end-to-end) system latency: 20.2 ms

  • Deliver Us Mars uncapped (179) fps windowed borderless fullscreen (composed: flip) -

    average (end-to-end) system latency: 30.5 ms
Image

Image

Image
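
to put the three deliver us mars results above side by side, here is a small sketch that works out the frame time and the latency gap relative to the independent flip run. the fps and latency figures are the ones reported above; the dictionary keys and the helper name are just labels chosen for this example.

```python
# (fps, average end-to-end latency in ms) as reported above
results = {
    "fse / legacy flip": (210, 20.4),
    "borderless / hardware composed: independent flip": (210, 20.2),
    "borderless / composed: flip": (179, 30.5),
}


def compare_to_baseline(results, baseline):
    """Frame time and latency delta of each run relative to a baseline run."""
    _, base_ms = results[baseline]
    summary = {}
    for name, (fps, latency_ms) in results.items():
        delta_ms = latency_ms - base_ms
        summary[name] = {
            "frame_time_ms": round(1000.0 / fps, 2),
            "latency_delta_ms": round(delta_ms, 1),
            "latency_delta_pct": round(100.0 * delta_ms / base_ms, 1),
        }
    return summary


summary = compare_to_baseline(
    results, "borderless / hardware composed: independent flip"
)
for name, row in summary.items():
    print(name, row)
```

with these numbers, legacy flip and independent flip are 0.2 ms apart (about 1%), while composed flip adds roughly 10.3 ms (about 51%), which is far more than the ~0.8 ms longer frame time at 179 fps would account for on its own.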


i've also observed how the game running with dwm composition / "composed: flip" for presentation would drop in performance more when i had more apps running in the background or on other displays. after restarting the pc and having fewer apps open in the background, i managed to get deliver us mars with the composed flip presentation to give me about 195 fps as shown below (whereas previously it was giving about 179 fps with composed flip):

Image

i didn't feel like retaking screenshots for legacy flip and independent flip again then, but i did check those again and both legacy flip and independent flip were still giving me about 210 fps in that area/position. it makes sense to me though. microsoft's documentation also basically mentions that fse/legacy flip and windowed/independent flip can get more from the cpu/gpu compared to running with dwm composition...


anyway, here i tested end-to-end system latency in the game you were playing also: cs2 (for this one i used my 1080p 360hz alienware monitor that also has RLA):

  • counter strike 2 Fullscreen exclusive (hardware: legacy flip) -

    average (end-to-end) system latency: 8.9 ms

  • counter strike 2 windowed borderless fullscreen (hardware: independent flip) -

    average (end-to-end) system latency: 8.7 ms

Image

Image

cs2 also gives the reflex flash with each mouse click from my reflex mouse connected to the RLA monitor, so i also looked at end-to-end system latency while moving and playing. i did not see fse averaging better.

fps can vary and fluctuate a lot in cs2 though depending on the stage and where you're looking within the stage... but i also tested stationary in the same stages / position too, and i did multiple tests with averages of 50 samples/mouse clicks each. i technically saw slightly lower latency numbers with independent flip in cs2, but really again it's basically within margin of error or normal variance...
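
for anyone wanting to check "within margin of error" on their own click samples, here is a minimal sketch. it computes the mean and an approximate 95% confidence interval per run and treats two runs as indistinguishable if the intervals overlap, which is a deliberately conservative rule of thumb rather than a proper significance test. the sample values below are hypothetical, not the RLA measurements from the tests above.

```python
import math
import statistics


def mean_with_ci(samples, z=1.96):
    """Return (mean, half-width of an approximate 95% confidence interval)."""
    mean = statistics.fmean(samples)
    sem = statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, z * sem


def within_margin(a, b):
    """True if the approximate confidence intervals of the two runs overlap."""
    mean_a, half_a = mean_with_ci(a)
    mean_b, half_b = mean_with_ci(b)
    return abs(mean_a - mean_b) <= half_a + half_b


# Hypothetical end-to-end latency samples in ms, for illustration only.
fse = [8.4, 9.3, 8.8, 9.1, 8.9, 9.0, 8.6, 9.2]
flip = [8.2, 9.0, 8.7, 8.9, 8.6, 8.8, 8.5, 9.1]

print(within_margin(fse, flip))
```

with only a few tenths of a millisecond between averages and normal click-to-click spread, runs like these overlap, whereas a gap the size of the composed flip result earlier would not.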
