I hear some people do lots of tweaking, only to be surprised by input lag popping up again. What do you reckon about the hardware cache prefetcher, the feature that predicts ahead to cache code in the CPU cache for performance gains?
I've read that if the instructions are linear, simple, and predictable, then this prefetcher works a treat, as you could imagine from the nature of it.
If the instructions are random, then the hardware prefetcher could become a hindrance. Others have mentioned, and I agree, that a video game is surely random, as the computer does not know what your next input will be, and will not know what you're about to activate in a game, or stare at next (rendering).
So could the hardware cache prefetcher be a nice thing for basic system processes or linear tasks, and not a good idea for gaming? I think if you haven't tuned your PC, slimmed down the background processes, tasks, services etc., then the hardware prefetcher may do a good job ironing that crap out. If you have slimmed down your OS, then the prefetcher becomes a guy twiddling his thumbs somewhat and getting in the way of gaming.
Thoughts? I've not looked much at documentation on this feature, just going off some basic logic of what it does and the nature of certain applications.
For random input lag: Hardware Cache Prefetcher? (BIOS)
Re: For random input lag: Hardware Cache Prefetcher? (BIOS)
That's not how it works. Player input does not change the code. The result of the execution of the code changes, but the code itself does not.
Steam • GitHub • Stack Overflow
The views and opinions expressed in my posts are my own and do not necessarily reflect the official policy or position of Blur Busters.
Re: For random input lag: Hardware Cache Prefetcher? (BIOS)
No. In fact, prefetching and speculative execution are what make modern CPUs so fast. Without them, the CPU you bought in 2023 would be even slower than a CPU you bought 15 years ago.
Re: For random input lag: Hardware Cache Prefetcher? (BIOS)
No. That statement doesn't even make sense. It's like asking whether the result of executing all the instructions of a cooking recipe is in itself a cooking recipe. It's not. The result is food.
Btw, the analogy to recipes is quite fitting for code. Even if you change the ingredients randomly, the recipe steps remain the same. They're fixed and not affected by randomness. The data the code operates on can vary and even be random, but the code itself does not change.
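The fixed-recipe point can be shown with a toy Python sketch (the function and its names are invented purely for illustration): the data passed in is random every time, yet the CPU fetches and executes the exact same instruction sequence on every call.

```python
import random

def prepare_dough(flour_g, water_g):
    # These "recipe steps" are fixed once compiled/interpreted:
    # the CPU sees the same instruction stream on every call.
    mix = flour_g + water_g
    return mix * 0.95  # ~5% sticks to the bowl

# The *data* can be random on every call...
for _ in range(3):
    flour = random.randint(100, 500)
    water = random.randint(50, 300)
    prepare_dough(flour, water)
# ...but the code being executed never changes.
```

The randomness lives entirely in the inputs; the code path is identical each iteration, which is exactly what instruction prefetching relies on.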
Re: For random input lag: Hardware Cache Prefetcher? (BIOS)
1. Add 1 tbsp flour (suddenly becomes carrot instead of flour)
In my head that's what I see for random ingredient swapping but the step staying the same.
That wouldn't be the step staying the same.
If in a game a player was to stare at a tower and then switch to looking at a bench, to the CPU that was random. The fundamental code for how in-game assets are drawn is the same; the sequence of code to draw the bench is different, however.
The binary that sequences the pixel illumination pattern for the monitor to draw the bench is different binary to the tower.
I guess the hardware prefetcher works purely with the functions of the code then. And the output from those functions can't be predicted, due to user input, obviously.
The prefetcher can't predict ahead of time what binaries the user will elicit for the monitor in the way of pixels and asset loading, well, not perfectly.
Re: For random input lag: Hardware Cache Prefetcher? (BIOS)
These things are not meaningfully affected by prefetching and speculative execution. They are too high level. The prefetcher operates on things that happen within nanoseconds. For example, the code that applies geometry transformations when preparing a frame will do that in a loop that walks over all vertices and applies the change. A single multiplication operation takes a few nanoseconds, but it is going to be much slower if the data needs to be fetched from RAM first. The prefetcher is responsible for loading that data (in this case vertices) into the CPU cache so it's readily available once the CPU needs it. There's no user input or randomness here. All of that has happened on a much higher level, and all decisions about what data to operate on and which code to call have already been made long before the code starts doing the heavy lifting and number crunching.

Jonnyc55 wrote: ↑15 Feb 2024, 09:22
I guess the hardware prefetcher works purely with the functions of the code then. And the output from those functions can't be predicted, due to user input, obviously.
The prefetcher can't predict ahead of time what binaries the user will elicit for the monitor in the way of pixels and asset loading, well, not perfectly.
You can code a graphics demo that has no user input. It just renders the same thing every time. If you add user input later on and make it more like a game, the performance will not change at all. It will run just as fast. The randomness of player input is just too far removed from the level at which CPU optimizations operate.
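The vertex loop described above can be sketched in Python (all names here are invented for illustration; a real engine would do this in native code on packed arrays, where the prefetcher's streaming of cache lines actually matters):

```python
def translate_vertices(vertices, dx, dy, dz):
    """Walks the vertex list linearly. A hardware prefetcher sees
    this predictable, sequential access pattern and can stream the
    next cache lines from RAM before the CPU asks for them."""
    out = []
    for (x, y, z) in vertices:           # stride-1, linear traversal
        out.append((x + dx, y + dy, z + dz))
    return out

# The camera movement (dx, dy, dz here) came from input polled earlier.
# By the time this loop runs, there is no randomness left in it: the
# data and the code path are both fully decided.
mesh = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
moved = translate_vertices(mesh, 10.0, 0.0, 0.0)
```

Whatever the player did with the mouse only changed the *values* of `dx`, `dy`, `dz`; the loop itself, and its memory access pattern, is identical every frame.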
Re: For random input lag: Hardware Cache Prefetcher? (BIOS)
With the vertices already loaded, does the fact that a user moves his camera around not push anything new to the CPU then? Surely not all viewing angles of all the vertices are pre-calculated.

RealNC wrote: ↑15 Feb 2024, 10:18
These things are not meaningfully affected by prefetching and speculative execution. They are too high level. [...] The randomness of player input is just too far removed from the level at which CPU optimizations operate.
Even if the CPU was quickly lining up the rest of the summoned calculations into cache within nanoseconds, it would quickly get swamped with new needs from the new camera angles.
I don't know, I just can't get my head around it. I can't help but think the CPU is surely met with something new every time you move the camera around; even if things like models are already calculated, you still have to queue the frame of those objects.
Maybe you have already explained it and I'm just not grasping it. No big worry, it's not important. I tested without the hardware prefetcher anyway, and I didn't see or feel much of anything, other than slight sluggishness around Windows.
It's always interesting, these settings. I do read there are reasons professionals turn it off:
https://docs.oracle.com/cd/E19962-01/ht ... gljyu.html
https://communities.vmware.com/t5/VI-VM ... is%20doing.

Hardware prefetchers work well in workloads that traverse array and other regular data structures. The hardware prefetcher options are disabled by default and should be disabled when running applications that perform aggressive software prefetching or for workloads with limited cache. For example, memory-intensive applications with high bus utilization could see a performance degradation if hardware prefetching is enabled.
A guy responds:

IBM enables the CPU hardware prefetch by default but Intel recommends turning the feature off depending on what the server is doing. Anyone have any preferences?
And then:

I think you would be wrong. Try it and see what happens.
Instruction supply may become a substantial bottleneck in future generation processors that have very long memory latencies and run application workloads with large instruction footprints such as database servers. Prefetching is a well-known technique for improving the effectiveness of the cache hierarchy.
employs a hardware-based breadth-first search of future control-flow to cope with weakly-biased future branches, prescient instruction prefetch uses precomputation to resolve which control-flow path to follow. Furthermore, as the precomputation frequently contains load instructions, prescient instruction prefetch often improves performance by prefetching data.
prefetch uses helper threads to perform instruction prefetch on behalf of the main thread.
A key challenge for instruction prefetch is to accurately predict control flow sufficiently in advance of the fetch unit to tolerate the latency of the memory hierarchy. The notion of prescient instruction prefetch was first introduced as a technique that uses helper threads to improve single-threaded application performance by performing judicious and timely instruction prefetch.
When I read this back and forth, the pros and cons on stuff like this, it gets my imagination going as to all the real nuanced stuff going on deep inside. Which has me testing.

Processor Hardware Prefetcher
When this setting is enabled (disabled is the default for most systems), the processor is able to prefetch extra cache lines for every memory request. Recent tests in the performance lab have shown that you will get the best performance for most commercial application types if you disable this feature. The performance gain can be as much as 20% depending on the application.
For high-performance computing (HPC) applications, we recommend you leave HW Prefetch enabled, and for database workloads we recommend you leave HW Prefetch disabled.
Both prefetch settings do decrease the miss rate for the L2/L3 cache when they are enabled, but they consume bandwidth on the front-side bus, which can reach capacity under heavy load. By disabling both prefetch settings, multi-core setups achieve generally higher performance and scalability.
But there we are.
Re: For random input lag: Hardware Cache Prefetcher? (BIOS)
Input has already been read. Games don't read input in the middle of rendering a frame and then cancel the frame rendering because there's new input. If that were the case, simply moving the mouse quickly would result in your FPS dramatically decreasing, because the mouse movement would constantly abort the rendering of frames. Once the input is read, the next frame is prepared and then rendered. Nothing is going to change in the middle of rendering a frame. Every code path that needs to be taken and all data that is to be operated on is decided upon right after reading input.
Also keep in mind that games don't read user input constantly. They only poll the state of your input devices once per frame. Once input is polled, the game doesn't care about input anymore until the next frame. And that is a very, very long time for a CPU. The timescale that the prefetcher operates in, on the other hand, is tiny.
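The poll-once-per-frame structure described above looks roughly like this toy Python sketch (function names are invented; a real engine reads devices through an API like SDL or Raw Input):

```python
import random

def poll_input():
    # Stand-in for reading the mouse/keyboard state once per frame.
    # Random here simulates unpredictable player input.
    return {"mouse_dx": random.randint(-5, 5)}

def render(camera_x):
    # Frame prepare + render happens here, operating only on
    # data that was fully decided before this call.
    pass

def game_loop(frames):
    camera_x = 0
    for _ in range(frames):
        state = poll_input()           # input read exactly once...
        camera_x += state["mouse_dx"]  # ...then the frame is decided...
        render(camera_x)               # ...then rendered; nothing
                                       # re-reads input mid-frame.
    return camera_x

game_loop(3)
```

By the time `render` runs, the "randomness" of the player is already baked into plain numbers; the millions of low-level operations inside that call are exactly the predictable kind the prefetcher handles.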
Nothing changes while rendering a frame. Once the frame is rendered, the state of the input devices is read again, and the next frame is prepared and rendered based on that input state. During the frame prepare+render step, the game needs to operate on data, and that data is being loaded by the prefetcher into the cache in parallel while the CPU operates on data already in the cache.

Jonnyc55 wrote:
Even if the CPU was quickly lining up the rest of summoned calculations in nanoseconds into cache, it would quickly get swamped with new needs by the new camera angles.
Also, the BIOS feature you're talking about is not about disabling prefetch entirely. It's an additional setting that mostly tweaks the prefetcher to aggressively load more things into cache. You cannot disable prefetch entirely. It's a vital optimization of any CPU for many years now and performance would be crippled without it.