Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Advanced display talk, display hackers, advanced game programmers, scientists, display researchers, display manufacturers, vision researchers & Advanced Display Articles on Blur Busters. The masters on Blur Busters.
User avatar
Chief Blur Buster
Site Admin
Posts: 11647
Joined: 05 Dec 2013, 15:44
Location: Toronto / Hamilton, Ontario, Canada
Contact:

Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Chief Blur Buster » 17 May 2022, 19:22

Remember, we've been cited in more than 25 research papers, so don't skip this. We're already acknowledged as essential reading on Hz.

Introduction for newcomer researchers: Blur Busters is now famously an incubator of new display research ideas for aspiring display researchers. So listen up if you need top grades on your thesis, accolades from your co-researcher peers, or a promotion from your employer. We are well known for catching error margins that display science has missed, and for flagging past research papers that overlooked mentioning error margins. Worried about messing up peer review? Add us -- no charge for using us as part of your peer review team, even if it's merely to help improve your error-margin acknowledgements section. We only ask for a citation. We do peer review nowadays, as unaffiliated independent researchers with excellent eyes.

Cred: Remember, over 100 million people worldwide subscribe to or view the more than 500 content creators who use one of our display testing inventions -- including the 14M-subscriber LinusTechTips YouTube channel and the 9M unique monthly viewers of RTINGS.com (and yes, they acknowledge us), to mention only 2 of those 500+. Even though you may never have heard of Blur Busters in your non-esports community, we are a significant nuts-and-bolts influence in the refresh rate race. Many researchers now get asked "has this been vetted by Blur Busters?" Make your paper more bulletproof, and thus more prestigious. Contact Us if you want us to be a free peer reviewer of your Hz/motion-related display research paper.


Properly Designing a Refresh Rate Blind Test For Average Population

Goal: Determining The Vanishing Point of the Diminishing Curve of Returns for Refresh Rates

Short Summary of Hz-Amplifying Test Variables

Short summary of mandatory test variables that amplify the maximum retina refresh rate detectable by the average human:

1. Perfect framerate=Hz (avoid jitter error margin)
2. VSYNC ON, not VSYNC OFF (avoid jitter error margin)
3. No control device jitter (e.g. don't use a mouse that's only 1000Hz. Read: Why?)
4. Test large 4x-8x geometric differences in refresh rates (e.g. 120Hz vs 480Hz, or 240Hz vs 1000Hz).
5. Ensure pixel response not the limiting factor, or acknowledge it in your Error Margins section
6. Fast motion speeds, that are still eye trackable
7. A test that forces eye tracking (if testing via the motion blur weak link)

(A) ALL OF THE ABOVE MUST BE TRUE for a sample-and-hold display (no flicker technology like CRT, plasma, BFI, strobe)
(B) Refresh rate benefits more than just games. 240Hz browser scrolling on sample-and-hold has 1/4th the motion blur of 60Hz.
(C) Also, re-read Blur Busters Display Research Portal, before you begin to design the test.
(D) The retina refresh rate will only be true for that specific display. Acknowledge that in your error margin section.

An 8K or 16K non-strobed VR headset will have a higher retina refresh rate than a non-strobed 1080p 24" monitor. So some of the maximally-possible test equipment will be unobtainium (despite having been proven by non-display means), so you will have to acknowledge technological limitations, and make sure to design your paper not to be misquoted by mainstream media.

All past Hz tests fail to simultaneously make 1/2/3/4/5/6/7 true, severely lowering the maximum Hz difference the human eye could detect.
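The seven requirements can be expressed as a rough checklist validator. This is a hypothetical sketch: the field names, and the 6x poll-rate / 4x Hz / 2x motion-speed thresholds, are my own encodings of the rules of thumb in this thread, not an established API.

```python
def validate_test_design(cfg: dict) -> list:
    """Return a list of violated requirements (empty list = design OK)."""
    problems = []
    if not cfg.get("framerate_equals_hz"):
        problems.append("#1: framerate must exactly equal Hz")
    if not cfg.get("vsync_on"):
        problems.append("#2: use VSYNC ON or a low-lag equivalent")
    if cfg.get("mouse_poll_hz", 0) < 6 * cfg.get("display_hz", 0):
        problems.append("#3: mouse poll rate should be >=6x the display Hz")
    lo, hi = sorted(cfg.get("compared_hz", (0, 0)))
    if lo == 0 or hi / lo < 4:
        problems.append("#4: compare a >=4x geometric Hz difference")
    if not cfg.get("gtg_faster_than_refresh"):
        problems.append("#5: pixel response must not be the limiting factor")
    if cfg.get("motion_px_per_sec", 0) < 2 * hi:
        problems.append("#6: motion speed should be >=2x the highest Hz")
    if not cfg.get("forces_eye_tracking"):
        problems.append("#7: the test must force eye tracking")
    return problems

# Example: a 240Hz-vs-1000Hz infinite-panning test design
print(validate_test_design({
    "framerate_equals_hz": True, "vsync_on": True,
    "mouse_poll_hz": 8000, "display_hz": 1000,
    "compared_hz": (240, 1000), "gtg_faster_than_refresh": True,
    "motion_px_per_sec": 2000, "forces_eye_tracking": True,
}))  # → []  (no violations)
```

Note that a passive test with no mouse (poll rate and display Hz both absent) trivially satisfies #3, which matches the later discussion of controller-free tests like www.testufo.com/map.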

As the advocacy leaders of the refresh rate race, remember to cite Blur Busters, or TestUFO, or "Mark Rejhon" one way or another in your research paper's references. As of 2022, more than 25 research papers cite me, TestUFO, or Blur Busters. There are many that conveniently forget to cite us despite having obviously been inspired by us and picked our brains. I can help you follow your institution's citation guidelines, with some generic permanent links such as www.testufo.com/mousearrow or www.testufo.com/map, the latter being a perfect test case for retina refresh rates, too.


At 960 pixels/sec, 1000fps at 1000Hz will be practically as clear as stationary (one pixel of motion blur, split as 0.5 pixel of leading-edge motion blur and 0.5 pixel of trailing-edge motion blur). In this respect, 240fps and 1000fps are very easy to tell apart: at 240fps you can't read the street names well, but at 1000fps you definitely can. Now, if you speed the motion up, things become blurry again.
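The arithmetic above is a direct consequence of sample-and-hold persistence: blur width in pixels is just motion speed divided by frame rate. A minimal sketch (the function name is mine, not an established API):

```python
def blur_px(speed_px_per_sec, hz):
    """Motion blur trail width in pixels for framerate=Hz sample-and-hold."""
    return speed_px_per_sec / hz

print(blur_px(960, 1000))   # → 0.96 (~1 px: practically as sharp as stationary)
print(blur_px(960, 240))    # → 4.0  (4 px of smear: street names hard to read)
print(blur_px(1920, 1000))  # → 1.92 (doubling motion speed re-doubles the blur)
```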

Long Summary of Test Variables & Considerations

Crossposted a reply of mine from a different forum:
ackmondual wrote:Besides video games, what other use cases are there for having a 480Hz display (for the sake of argument, let's assume that people can tell the difference and actually make use of the various refresh rates like 120, 240, 360, and 480)? I'm guessing for those that capture high frame rate video and want to show that off as movies, documentaries, what have you?
First, in the endeavour to determine the vanishing point of the diminishing curve of returns — you have to compare geometrically, e.g. 60Hz -> 144Hz -> 360Hz -> 1000Hz, or even larger 4x differences such as 60Hz -> 240Hz -> 1000Hz.

The thresholds where displays diverge from real life are in this heavily-upvoted earlier comment — it may be best to read that before reading the rest of this comment.

It’s important not to compare small Hz differences (e.g. only 240Hz vs 360Hz); historically, this commonly led to false assumptions of an early vanishing point in the diminishing curve of returns.

In the diminishing curve of returns, everyday users, especially non-gamers, need to compare much larger Hz differentials (e.g. 120Hz vs 360Hz, or even 120Hz vs 480Hz) for a wider percentage of the population to easily see the difference in certain use cases. Motion blur differences (e.g. 240Hz vs 360Hz is only a 1.5x blur-trail-size difference) are harder to see than flicker differences (flicker visibility ceases at ~70 Hz).

By using a 4x difference, the blur trail size is easier to tell apart (e.g. like the sports motion blur difference between a 1/120sec and a 1/480sec camera shutter photograph).

Finally, I’ll address use cases.

Some use cases are ‘nice to have’ (e.g. browser scrolling clarity), while other use cases are more important long-term (e.g. making VR perfectly match real life with five-sigma comfort, without the use of current flicker/impulsing methods of motion blur reduction).

Many people don’t eye-track while scrolling or panning like they used to in the CRT days, so user habit may cause them not to notice: scroll, pause, read, scroll, pause, read, etc. However, when someone is forced to (e.g. a continuously panning map readability test), the blur limitations of contemporary two-digit and three-digit refresh rates become apparent on sample-and-hold displays.

A test case designed to force a person to eye-track fast-moving content, while comparing large Hz differences (~4x or more) at motion speeds (in pixels per second) at least twice the highest Hz, is a good blind-test design for amplifying Hz visibility to ~95% of the population in diminishing-returns tests.

Everybody sees differently, and some people are pickier than others. But a random FPS game is not always the best human-population test, even if games are an oft-quoted benefit of high Hz (which they are; but for testing population visibility, there are far better blind tests that reveal Hz differences much more dramatically). A properly designed test can reveal the differences much more clearly.

I am cited in, referred to in, or a coauthor of more than 25 peer-reviewed research papers, and many researchers now recognize me as a good specifier of test variables.

The wow factor varies a lot. Some go “wow” for 60Hz-vs-120Hz (2x), while others go “meh” for 60Hz-vs-120Hz yet “wow” for 60Hz-vs-360Hz (6x differential). So, to design a blind test that amplifies the human visibility of Hz differences even in non-game use cases, the Blur Busters recommended testing variables are:


1. Perfectly framepaced motion at framerate=Hz
Where frame rates are perfectly sync’d to refresh rates. This is what VR games do, because any form of jitter is much more noticeable in reality-simulation use cases: simulator screens, VR screens, etc. So you need to remove the jitter caused by framerate-vs-Hz being out of sync. Even high-frequency jitters/stutters can blend into extra motion blur, much like a fast-vibrating guitar string
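The guitar-string analogy can be put into numbers with a deliberately crude model of my own: treat peak-to-peak frame timing error as extra smear added on top of the ideal persistence blur. The figures are illustrative, not measurements.

```python
def effective_blur_px(speed_px_s, hz, jitter_ms_pp=0.0):
    """Ideal persistence blur plus extra smear from peak-to-peak frame jitter."""
    persistence = speed_px_s / hz                        # sample-and-hold term
    jitter_smear = speed_px_s * (jitter_ms_pp / 1000.0)  # timing error -> px
    return persistence + jitter_smear

# 960 px/sec on a 360 Hz display:
print(effective_blur_px(960, 360))       # → ~2.67 px (perfect frame pacing)
print(effective_blur_px(960, 360, 2.0))  # → ~4.59 px (2 ms of jitter: now
                                         #   blurrier than an ideal 240 Hz)
```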


2. Use VSYNC ON or one of the new low-lag clones of VSYNC ON
Currently, VSYNC OFF adds microjitters that diminish Hz differences, and the goal is a test that tells Hz apart better. Use VSYNC ON + NVIDIA Ultra Low Latency Mode, or the new “RTSS Scanline Sync”. Commonly, VSYNC OFF is used in esports for lower lag. However, VR never uses VSYNC OFF, because VSYNC OFF adds jitters and tearing that distract from the ability to tell Hz apart. Operating system compositors for scrolling/panning are usually VSYNC ON, so this condition is met there

3. No control device jitter weak links.
Example: If testing high-Hz displays via mouse panning, use a good mouse pad combined with a poll rate above 1000Hz. A peer-reviewed research paper confirmed that mouse poll rates need to be significantly higher than display Hz to avoid jittering/aliasing effects. Even high-frequency jitter blends into motion blur, interfering with the ability to tell motion-blur-related differences in Hz. When testing things like browser scrolling, holding the DOWN keyboard arrow will framepace the scrolling much better than dragging a scrollbar, or even use a smooth flick-scroll (like on a 120Hz iPad)
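Why the poll rate must far exceed the display Hz can be simulated in a few lines. Each refresh displays the position from the most recent poll, so beat frequencies between polling and refreshing turn constant motion into uneven on-screen steps (aliasing, seen as microjitter). This is my own toy model, not the cited paper's method:

```python
def step_sizes(poll_hz, display_hz, speed_px_s=1000.0, frames=50):
    """On-screen step per refresh when each frame shows the latest poll."""
    poll_dt = 1.0 / poll_hz
    shown = []
    for f in range(frames):
        t = f / display_hz                       # refresh timestamp
        last_poll = int(t / poll_dt) * poll_dt   # most recent poll before it
        shown.append(speed_px_s * last_poll)     # position this frame displays
    return [round(b - a, 3) for a, b in zip(shown, shown[1:])]

slow = step_sizes(poll_hz=1000, display_hz=360)  # steps wobble between 2 and 3 px
fast = step_sizes(poll_hz=8000, display_hz=360)  # steps nearly uniform (~2.78 px)
print(max(slow) - min(slow), max(fast) - min(fast))
```

The step-size spread shrinks as poll rate rises far above the refresh rate, which is the jitter reduction the thread describes.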

4. Blind test a large geometric difference in refresh rates.
Example: Skip comparing small differences in Hz, such as 144-vs-165 or 240-vs-280. An average-population Hz noticeability test for determining the vanishing point of the diminishing curve of returns requires large geometric Hz jumps up the curve. Test 60Hz vs 240Hz. Or test 120Hz versus 480Hz, or 240Hz versus 1000Hz. These differentials are noticeable to the majority of the population, assuming all of the #1,2,3,4,5,6,7 blind-test-case variables are met.

5. Pixel response that is sufficiently fast
Many LCDs have GtG slow enough to obscure Hz differences. 60Hz vs 360Hz should be 6x less motion blur, but due to slow GtG, it can appear as only a 3x-4x difference to most average users. As pixel response approaches 0, motion blur becomes simpler and linear, following “1ms of frametime (refreshtime) translates to 1 pixel of motion blur per 1000 pixels/sec motion”. (0ms GtG still has lots of motion blur, since 0ms GtG is not 0ms MPRT; they are two separate pixel response benchmarks)
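A simplified sketch of why slow GtG compresses Hz differences. Real GtG curves are nonlinear; here I assume (my own simplification) a flat linear smearing term added to the MPRT persistence blur, just to show the ratio compression:

```python
def perceived_blur_px(speed_px_s, hz, gtg_ms=0.0):
    """Persistence (MPRT) blur plus a crude linear GtG smearing term."""
    mprt_blur = speed_px_s / hz                # 1 px per 1000 px/s per ms
    gtg_blur = speed_px_s * (gtg_ms / 1000.0)  # smear while pixels transition
    return mprt_blur + gtg_blur

speed = 960
ideal = perceived_blur_px(speed, 60) / perceived_blur_px(speed, 360)
lcd = (perceived_blur_px(speed, 60, gtg_ms=5.0)
       / perceived_blur_px(speed, 360, gtg_ms=5.0))
print(round(ideal, 2))  # → 6.0  (0ms GtG: the full 6x blur difference)
print(round(lcd, 2))    # → 2.79 (5ms GtG compresses the difference to ~3x)
```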

6. Motion speed fast enough, but not too fast to eyetrack
A higher resolution display can make Hz difference tests easier. For example, 120Hz-vs-480Hz is much easier for the average population to see on a 4K display than on a 1080p display. However, we have few technological choices: the retina refresh rate of lower-resolution displays is lower than that of higher-resolution displays, and many past Hz tests did not account for this factor. As a rule of thumb, use a motion speed (in pixels per second) at least 2x the highest refresh rate being tested, e.g. 960 pixels/sec motion compared on any displays up to 480Hz. Faster motion speeds like 1920 pixels/sec on 1080p can be hard for some people to track, unless the resolution is also doubled; then there is more time to eye-track (and see imperfections from refresh rate) before the motion disappears off the edge of the screen.

7. A test that forces the person to eye-track
A great example is an infinite-scrolling test or an infinite-panning test. The motion never stops, so the user is forced to try to read/identify objects in the moving scenery/text/etc. This completely isolates the Hz-differential test to the persistence-based motion blur threshold, and essentially guarantees most of the population can tell apart 120Hz-vs-480Hz. Many gamers only stare at a fixed gaze such as crosshairs, and sometimes can’t tell the difference as easily. Past Hz tests did not always factor in differences in user habits of eye tracking; displays look different to stationary eyes versus moving eyes. By isolating a refresh rate weak link in the human vision subsystem, people are denied the chance of rote habit (e.g. scroll, pause, read) and forced to read while scrolling/panning.

Operating system compositors are VSYNC ON, so for non-game apps, understanding #1,2,3,4,5,6,7 lets us identify test cases that meet most or all of the above, assuming the GPU framerate can keep up:

- Holding the down arrow while trying to read the text in a very tall webpage.
- Trying to read street name labels of a continuously panning map (e.g. http://www.testufo.com/map), or dragging Google Maps with a high-pollrate mouse.
- Reading the contents of a continuously-dragging window.
- Rapidly looking for camouflaged details while continuously panning large gigapixel-size images (e.g. space imagery), especially with a high-pollrate mouse.

For game apps, Hz differentials amplify significantly in crosshairless games (games that force you to eye-track all over the place), especially with a low-latency VSYNC ON equivalent in certain games, if your GPU frame rate can keep up and you have upgraded your mouse poll rate to at least 6x the refresh rate. You can also:

- Read nametags above a player in an RTS while mid-pan (like a DOTA2 animation simulation)
- Try to identify camouflaged enemies from the window of a fast-flying, low-altitude helicopter (e.g. Battlefield 3)
- Try to identify faraway enemies in an open-arena game (e.g. Quake 3 Arena) that forces you to keep turning/moving/strafing/etc to avoid getting killed. More eye tracking happens in arena games than, say, CS:GO where esports players can stare stationary at crosshairs, using peripheral vision for the rest of the screen (stationary eyes, even while running, strafing, and turning about).
- Head-turning in a VR headset while trying to read scenery signage. (All modern VR headsets use a low-latency variant of VSYNC ON, because jitter/tearing is a difference from real life and adds nausea.)
- Any FPS game that reliably does framerate=Hz VSYNC ON (not a common esports use case, the above are much more common use cases), and you eye-track objects in turns rather than stare stationary at the crosshairs.

This is only a limited list.

Some are utterly unimportant (especially to those who don’t have motion blur nausea), while others are a matter of simulation criticality (VR).

Today, the only way to reduce the motion blur of currently common frame rates is to use flicker methods (CRT, plasma, black frame insertion, impulsing, strobing, etc). A display's persistence motion blur is tied to how long each pixel stays visible: either a brief 1ms flash per refresh (at a low Hz), or a full thousand 1ms-long frames per second (1000fps at 1000Hz) if you want to avoid flashing/flicker methods.

This is irrelevant for many readers' use cases, but that doesn't dismiss the existence of blind tests in which most of the population can tell apart high triple-digit Hz, as long as the test conditions match #1,2,3,4,5,6,7 to amplify the differences in the diminishing curve of returns -- and everyday cases that exercise this already exist.

Like how many people don’t care about 3:2 pulldown while others very much do, a 4x+ Hz difference is actually (on average, for most) much more noticeable than that (e.g. comparing 120fps@120Hz vs 480fps@480Hz), especially if the test case maximizes resolution to near-retina as well (e.g. a 4K120 versus 4K480 display, or a 4K240 versus 4K1000 display).

Since 4K 1000Hz displays don’t exist outside the lab yet (e.g. monochrome 1920Hz DLP tests), it’s hard for researchers to design a test that maximizes Hz visibility via maximizing resolution. So test cases currently need to be designed around 1080p.

However, 4K does have a higher retina refresh rate than 1080p at the same FOV, assuming the 4K angular resolution is still a human-visible improvement. The vanishing point of the diminishing curve of returns involves a vicious-cycle combination of the human's maximum angular resolving resolution, the widest FOV (for the longest eye-tracking time over that many pixels), the human's fastest eye-tracking speed, and a framerate=Hz high enough to make sample-and-hold motion blur disappear. Even for average humans, the "retina" refresh rate can reach quintuple digits in the most extreme test (e.g. a hypothetical non-strobed 16K-resolution 180-degree-FOV VR headset). In VR, 8K spread over a full 180-degree FOV does not always reach retina resolution when the pixels are enlarged that big, so 16K plus ultra-wide FOV pushes the required retina refresh rate that much higher -- a vicious cycle where resolution and refresh rate amplify each other's limitations if only one is raised.

In short: The retina refresh rate (the refresh rate beyond which there is no further human benefit) is a function of the maximum human angular resolving resolution, combined with the human's maximum eye-tracking speed, in a situation where eye tracking lasts long enough for the person to judge whether motion resolution differs from static resolution (same clarity). Then add a bit of sampling headroom (2x), especially if adding a GPU blur effect to eliminate stroboscopics (Article: www.blurbusters.com/stroboscopics ...). This is why the retina refresh rate for ultra-high-pixel-density displays (16K+ VR displays) goes all the way to quintuple digits when trying to identify tiny fast-moving text.
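A hedged back-of-envelope sketch of that reasoning: require the tracking blur to stay below whatever is visible (one pixel, or the eye's acuity limit, whichever is coarser), then apply the 2x sampling headroom. All numbers (pixels per degree, tracking speeds, 1-arcminute acuity) are illustrative assumptions of mine, not measured values from this thread:

```python
def retina_hz(ppd, track_deg_per_sec, acuity_arcmin=1.0, sampling=2.0):
    """Hz needed so tracking blur stays below what eye and display resolve."""
    speed_px_s = ppd * track_deg_per_sec               # eye-tracking speed, px/s
    # Blur finer than one pixel, or finer than visual acuity, is invisible:
    min_visible_blur_px = max(1.0, ppd * acuity_arcmin / 60.0)
    return sampling * speed_px_s / min_visible_blur_px  # from blur = speed / Hz

# 24" 1080p at desktop distance (~30 pixels per degree), 30 deg/sec tracking:
print(round(retina_hz(ppd=30, track_deg_per_sec=30)))   # → 1800 (4-digit Hz)
# Retina-density VR (~60 ppd) with fast 120 deg/sec head/eye tracking:
print(round(retina_hz(ppd=60, track_deg_per_sec=120)))  # → 14400 (5-digit Hz)
```

Note how the pixel-density term raises the answer until the display reaches the acuity limit: the vicious-cycle effect of resolution and refresh rate amplifying each other.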

For 24"-27" 1080p monitors -- the vanishing point of diminishing curve of Hz returns is still in the 4-digit Hz range for the average population for a surgically optimized test that is at least nominally representative of certain real-world situations. So we can still design a blind test, using the currently commercially available refresh rates, to reliably test Hz differentials for the average human population. Existing blind tests almost have never ensured 1/2/3/4/5/6/7 are simultaneously true.

Also, it’s worth noting that training a person to see 3:2 judder is easier than training a person to see 240Hz-vs-360Hz. However, a 4x Hz differential such as 240Hz-vs-1000Hz (assuming the test variables simultaneously meet requirements #1,2,3,4,5,6,7) certainly becomes much easier to see than 3:2 pulldown judder.
Even a 360 Hz monitor fails requirement #5, as pixel response for some color combinations is longer than 1/360sec. This can quite noticeably throttle the visible differences in Hz.

LCD GtG is slow enough that a prototype 240Hz OLED has less motion blur than a 360Hz LCD, and a 240Hz OLED more linearly/perfectly follows Blur Busters Law (1ms of pixel visibility time translates to 1 pixel of motion blur per 1000 pixels/sec). Brute Hz can partially compensate, but a 1.5x Hz difference is completely leapfrogged by the pixel response difference between LCD and OLED. So when testing displays, GtG needs to be factored in, especially when using much larger Hz differentials (e.g. 120Hz vs 480Hz may only be a 3x difference in blur instead of 4x due to LCD pixel response). The ideal display for Hz diminishing-curve-of-returns testing should have as close to 0ms GtG as possible -- commercial availability permitting, of course -- and this should be acknowledged in the error margins section of any research.

There are many past tests (e.g. 240Hz-vs-360Hz with VSYNC OFF jitter, 1000Hz mouse jitter, and slow pixel response) that fail to consider the important test-case variables when determining the vanishing point of the diminishing curve of returns across all possible motion-related weak links a display can have (versus real life).

Due to error margins, past “recent-ish” tests like 240Hz-vs-360Hz FPS tests can actually become literally only a 1.1x difference instead of the proper 1.5x difference, because of all the error margins stacked on each other (game stutter, mouse jitter, VSYNC OFF jitter, and slow GtG combined). Whereas OS compositors can reliably run 360fps browser smooth-scrolling, to the point where 240Hz-vs-360Hz is limited mainly by slow GtG pixel response rather than other jitter causes.
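The error-margin stacking can be sketched with the same crude additive blur model as earlier (my own illustrative figures for GtG and combined jitter, not measurements from any specific test):

```python
def blur_with_errors(speed_px_s, hz, gtg_ms=0.0, jitter_ms=0.0):
    """Persistence blur plus GtG and jitter smear, all in pixels."""
    return speed_px_s * (1.0 / hz + gtg_ms / 1000.0 + jitter_ms / 1000.0)

speed = 960
clean = blur_with_errors(speed, 240) / blur_with_errors(speed, 360)
dirty = (blur_with_errors(speed, 240, gtg_ms=4, jitter_ms=3)
         / blur_with_errors(speed, 360, gtg_ms=4, jitter_ms=3))
print(round(clean, 2))  # → 1.5  (clean test: the proper 1.5x blur difference)
print(round(dirty, 2))  # → 1.14 (stacked error margins nearly erase it)
```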

Esports athletes can see small Hz differences more easily, while average users would have a hard time doing so. But given a sufficient Hz differential (even between two refresh rates both beyond 240Hz), more than 90% of average users can still tell apart 240Hz vs 1000Hz, with a blind test designed around optimized variables #1,2,3,4,5,6,7.

P.S. If you haven’t seen it yet, check the Display Research Portal for both the peer reviewed content as well as the Coles Notes style explainers of the refresh rate race.

_________


Crossposted from an ArsTechnica comment, because this is relevant to 21st century researchers considering creating a new peer reviewed research paper.
Head of Blur Busters - BlurBusters.com | TestUFO.com | Follow @BlurBusters on Twitter

Forum Rules wrote:  1. Rule #1: Be Nice. This is published forum rule #1. Even To Newbies & People You Disagree With!
  2. Please report rule violations If you see a post that violates forum rules, then report the post.
  3. ALWAYS respect indie testers here. See how indies are bootstrapping Blur Busters research!

User avatar
Discorz
VIP Member
Posts: 999
Joined: 06 Sep 2019, 02:39
Location: Europe, Croatia
Contact:

Re: Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Discorz » 21 May 2022, 11:53

As soon as an interactive device such as a mouse is included, this eye test also becomes a "feel" or latency test. This introduces a new set of requirements that need to be fulfilled: minimizing the latency errors that come along with different setups. E.g. running a 480 fps test at 40% GPU utilization vs 1000 fps at 80% would not be an apples-to-apples comparison latency-wise (refresh rate latency).
Compare UFOs | Do you use Blur Reduction? | Smooth Frog | Latency Split Test
Alienware AW2521H, Gigabyte M32Q, Asus VG279QM, Alienware AW2518HF, AOC C24G1, AOC G2790PX, Setup

User avatar
Chief Blur Buster
Site Admin
Posts: 11647
Joined: 05 Dec 2013, 15:44
Location: Toronto / Hamilton, Ontario, Canada
Contact:

Re: Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Chief Blur Buster » 21 May 2022, 13:04

Discorz wrote:
21 May 2022, 11:53
As soon as an interactive device such as a mouse is included, this eye test also becomes a "feel" or latency test. This introduces a new set of requirements that need to be fulfilled: minimizing the latency errors that come along with different setups. E.g. running a 480 fps test at 40% GPU utilization vs 1000 fps at 80% would not be an apples-to-apples comparison latency-wise (refresh rate latency).
Good catch -- and (a portion of) the solution is fortunately already built into this experiment!

These are definitely additional considerations. I alluded to this under control device jitter, because variable latency manifests as jitter and vice-versa.

Also, following items #1 and #2 indirectly solves this somewhat.

1. Perfectly framepaced motion at framerate=Hz
2. Use VSYNC ON or one of the new low-lag clones of VSYNC ON

This is a de facto frame rate cap (at the refresh rate) that can help reduce GPU utilization. Ensuring reliable framerate=Hz also automatically forces some GPU margin, because you need an overkill GPU to sustain framerate=Hz in fluctuating-framerate content. This automatically keeps utilization below 100%, which helps to an extent. That said, it isn't a perfect guarantee, especially with rendering peaks, but it largely sidesteps the GPU-at-100% issue.

If you're unable to get framerate=Hz, you're already at 100% -- but then you're not meeting the required experimental variables, and the researcher must modify the software or the rig to meet the experimental conditions for a maxed-out-as-currently-technologically-possible "find the retina refresh rate" experiment.

So, following the book of framerate=Hz solves the GPU utilization problem (in a roundabout way).

User avatar
Discorz
VIP Member
Posts: 999
Joined: 06 Sep 2019, 02:39
Location: Europe, Croatia
Contact:

Re: Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Discorz » 23 May 2022, 03:53

Chief Blur Buster wrote:
21 May 2022, 13:04
Also, following items #1 and #2 indirectly solves this somewhat.

1. Perfectly framepaced motion at framerate=Hz
2. Use VSYNC ON or one of the new low-lag clones of VSYNC ON
From what I understood, the same framerates at different GPU utilization will result in different latencies. There is a certain linear-ish curve to it. This also holds true for perfectly flat frametimes. Wouldn't this mean we need to test e.g. 120 vs 480 fps or 60 vs 1000 fps at the same GPU utilization? I'm not even sure if there is a way around this if we want to keep the same resolution scale. We could take the interactive device out of the equation, but we don't want that. On top of this, the nature of fluctuating GPU usage (even with flat frame pacing and times) may still affect it. Or I might be turning in the wrong direction here.

Compare UFOs | Do you use Blur Reduction? | Smooth Frog | Latency Split Test
Alienware AW2521H, Gigabyte M32Q, Asus VG279QM, Alienware AW2518HF, AOC C24G1, AOC G2790PX, Setup

User avatar
Chief Blur Buster
Site Admin
Posts: 11647
Joined: 05 Dec 2013, 15:44
Location: Toronto / Hamilton, Ontario, Canada
Contact:

Re: Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Chief Blur Buster » 23 May 2022, 13:03

Discorz wrote:
23 May 2022, 03:53
This also holds true for perfectly flat frametimes. Wouldn't this mean we need to test e.g. 120 vs 480 fps or 60 vs 1000 fps at the same GPU utilization? I'm not even sure if there is a way around this if we want to keep the same resolution scale. We could take the interactive device out of the equation, but we don't want that. On top of this, the nature of fluctuating GPU usage (even with flat frame pacing and times) may still affect it. Or I might be turning in the wrong direction here.
Well-intentioned but wrong direction:

The intent is testing smoothness, not testing latency.

Also, this thread is valid for designing a passive test too (e.g. watching content rather than playing). Item 3 can be correctly met by the absence of a controller (e.g. watching www.testufo.com/map ...). In this case, latency is a non-issue. You can also see this in realtime at www.testufo.com/animation-time-graph -- the sideways scrolling there does not stutter in Chrome browsers on Windows platforms unless the spikes nearly reach a full refreshtime.

There's no difference in smoothness for perfect framerate=Hz whether latency changes from 1ms to 2ms to 3ms to 4ms, as long as the GPU latency stays well inside the frametime. It doesn't interfere with the smoothness of VSYNC ON or a good framerate=Hz sync technology, as long as it's not a hair-trigger custom algorithm (e.g. an input-delay algorithm that's super-sensitive to frametime changes); most researchers would just test VSYNC ON for simplicity.
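This can be demonstrated with a small simulation of my own (a sketch, not any real compositor's algorithm): with VSYNC ON, a finished frame waits for the next vsync boundary, so displayed timestamps stay perfectly uniform no matter how much render time fluctuates, as long as every frame finishes within one refreshtime.

```python
def display_times(render_times_ms, hz):
    """Vsync timestamps (ms) at which each frame actually appears on screen."""
    refresh_ms = 1000.0 / hz
    shown, t = [], 0.0
    for r in render_times_ms:
        t += r                                    # frame finishes rendering
        vsync = -(-t // refresh_ms) * refresh_ms  # wait for next vsync (ceil)
        shown.append(vsync)
        t = vsync                                 # next frame starts at vsync
    return shown

# Render times fluctuating 1-3 ms, all under a 240 Hz refreshtime (~4.17 ms):
times = display_times([2, 1, 3, 2, 1, 3, 2, 1], hz=240)
gaps = [round(b - a, 3) for a, b in zip(times, times[1:])]
print(gaps)  # every gap is one refreshtime (4.167 ms): perfectly smooth
```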

VR headsets currently use the rough equivalent of "VSYNC ON + NULL" natively, out of necessity, and their varying GPU rendering times (internal frametimes, essentially, despite glass-floor frame pacing at the display level) generally don't cause stutter until frametimes exceed the refreshtime.

So researchers can skip this consideration in a "VSYNC ON" situation. There's merely the simple test-design consideration of making sure your frametimes are always less than refreshtimes, when developing software for a blind test of human vision's "retina refresh rate".

User avatar
Chief Blur Buster
Site Admin
Posts: 11647
Joined: 05 Dec 2013, 15:44
Location: Toronto / Hamilton, Ontario, Canada
Contact:

Re: Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Chief Blur Buster » 23 May 2022, 20:26

Discorz wrote:
23 May 2022, 03:53
This also holds true for perfectly flat frametimes. Wouldn't this mean we need to test e.g. 120 vs 480 fps or 60 vs 1000 fps at the same GPU utilization? I'm not even sure if there is a way around this if we want to keep the same resolution scale. We could take the interactive device out of the equation, but we don't want that. On top of this, the nature of fluctuating GPU usage (even with flat frame pacing and times) may still affect it. Or I might be turning in the wrong direction here.
I forgot to mention...

GPU rendering times don't affect VSYNC ON because gametimes are keyed to the VBIs of VSYNC ON.

(It can be as simple as grabbing a timestamp immediately upon return from a blocking Present() on VSYNC ON -- which returns from the blocking API call when the display hits its own VSYNC in the signal. That's why it's called VSYNC ON.)
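That pattern can be sketched as a hypothetical frame loop (my own illustration; `present` here is a stand-in for a real blocking Present() call, not an actual graphics API):

```python
import time

def frame_loop(present, frames=3):
    """Advance gametime by vsync-aligned deltas, not by render time."""
    last = time.perf_counter()
    for _ in range(frames):
        present()                   # assumed to block until the next vsync
        now = time.perf_counter()   # timestamp grabbed right after Present()
        dt = now - last             # ~ one refreshtime per frame, jitter-free
        last = now
        yield dt

# Stand-in for a blocking Present(): sleep roughly one 240 Hz refreshtime.
for dt in frame_loop(lambda: time.sleep(1 / 240)):
    print(f"gametime step ~ {dt * 1000:.1f} ms")
```

Because the timestamp is taken after the blocking call returns, gametime advances in refresh-cycle steps regardless of how long the GPU spent rendering within each cycle.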

VSYNC stands for Vertical SYNChronization, a component of a display signal that acts as a de facto comma separator between refresh cycles (from a digital point of view). Back in the analog days, it was the signal that vertically moved the electron beam back to the top edge of the screen, to begin scanning a new refresh cycle.

Today it's still in the signal, functioning more like a comma-separator plus a time delay (giving the display motherboard more processing time to prepare a new refresh cycle -- though many display processors are so fast, you only need a tiny 1-scanline VSYNC).


That's why it's called "VSYNC ON" in graphics drivers: it's named after the video signal's VSYNC, something that has existed for about 100 years, since analog TV broadcasts had a VSYNC in them too.

(...Another name is "blanking interval" or "vertical blanking interval", but that is porches/overscan + VSYNC totalled. Porches were used as overscan area on a CRT tube, because tubes weren't perfectly shaped like digital monitors: you needed to overscan a rectangular broadcast onto an odd-shaped tube, and extra overscan was also added to prevent the electron beam from going too far beyond the edges of the tube...)


So this is very old technology. VSYNC existed in the 1920s Baird and Farnsworth TV broadcast experiments! They didn't call it VSYNC yet, but it's exactly the same signal still used on 2020s HDMI and DisplayPort cables. During the analog to digital transition we kept a 1:1 pixel clock.

A 1080p analog signal and a 1080p digital signal can be flawlessly converted to each other via an unbuffered realtime HDMI-to-VGA or VGA-to-HDMI adaptor. You can even mirror the same 1080p HDMI output to both a 1080p digital monitor and a Sony FW900 CRT tube concurrently -- from the same digital GPU output on cards that don't have VGA, even a current RTX 3080. As long as both displays support the resolution, refresh rate, and timings (the ATSC HDTV standard vertical total of 1125, used for both 1080i and 1080p), it works fine.

Digital signals are simply digital versions of the analog signal in a 1:1 pixel symmetry. Everything was preserved during the analog to digital transition, including the VSYNC signal embedded.
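As an arithmetic sanity check, the standard 1080p60 timing mentioned above works out exactly (these are published CEA-861 timing numbers, not mine):

```python
# CEA-861 1080p60: 2200 x 1125 total raster (1920x1080 active + blanking)
# driven at a 148.5 MHz pixel clock.
pixel_clock_hz = 148_500_000
h_total = 2200      # 1920 active pixels + horizontal blanking
v_total = 1125      # 1080 active lines + 45-line VBI (porches + VSYNC)
refresh_hz = pixel_clock_hz / (h_total * v_total)
print(refresh_hz)   # 60.0
```

The same vertical total of 1125 is what lets analog and digital 1080p signals stay 1:1 convertible.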

Now you understand the history of what "VSYNC" really means.

______________

Now back to the modern nomenclature of GPU-based "VSYNC ON", which tells the drivers to synchronize frames to the signal. From a programmer's perspective:

Windows waits for the GPU output to be aligned to a new refresh cycle, before returning from the Present() API.

Even with NULL (NVIDIA Ultra Low Latency, which avoids buffering up frames), these waits are one big reason why VSYNC ON still has more latency than VSYNC OFF -- but it does solve a big stutter weak link.

The magical thing is that at 1000Hz, VSYNC ON latency can become negligible (1ms). The higher the Hz, the smaller the latency difference between VSYNC ON and VSYNC OFF! So even if this visual test also became a latency test, the problem automagically solves itself, because you're forced into ultrabrief frametimes with ultratiny latency differences between sync technologies.
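Under a simplified model (worst-case VSYNC ON wait is roughly one refresh period; VSYNC OFF scans out mid-refresh with near-zero wait), the shrinking penalty is easy to tabulate:

```python
# Worst case, a finished frame waits up to one full refresh period for
# the next VBI under VSYNC ON. The penalty shrinks as Hz rises.
waits_ms = {hz: 1000.0 / hz for hz in (60, 240, 1000)}
for hz, wait in waits_ms.items():
    print(f"{hz:>5} Hz: worst-case VSYNC ON wait ~= {wait:.2f} ms")
```

At 60 Hz that's ~16.67 ms of potential added wait; at 1000 Hz it collapses to 1 ms.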

For an experimental test, a perfect framerate=Hz (a la VSYNC ON) is used to avoid a microstutter weak link that diminishes the difference between refresh rates. It's easier to tell apart 144Hz versus 180Hz if the content isn't microstuttering: 144fps@144Hz vs 180fps@180Hz is generally easier to tell apart than unsynchronized-fps@144Hz vs unsynchronized-fps@180Hz. So we're removing weak links that may lower the "retina refresh rate" measured in an experiment.

Because the timestamp is grabbed at exactly the same location of every refresh cycle, gametimes stay synchronous with refresh cycles:

1. The gametime clock increases monotonically during VSYNC ON, keeping gametime:refreshtime in sync.
2. Object positions moving at constant speed always move in exact steps, despite varying GPU render times.
3. Frame completion still gets internally jittered by the varying GPU rendertime...
4. ...but VSYNC ON does the equivalent of a 1-dimensional snap-to-grid, putting refreshtime back in sync with gametime.
5. Gametimes and refreshtimes stay in perfect sync.

Problem & weak link solved.
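A toy Python simulation of that snap-to-grid behavior (my own illustrative numbers): render time varies every frame, yet object positions step perfectly evenly, because gametime is keyed to the VBI:

```python
import random
random.seed(1)

HZ = 144
SPEED = 1440          # pixels/sec: exactly 10 px per refresh at 144 Hz
gametime = 0.0
positions = []
for frame in range(6):
    render_time = random.uniform(0.001, 0.006)  # varying GPU load, < 1/144 s
    # VSYNC ON snap-to-grid: the frame still lands on the next VBI, so
    # gametime advances in exact 1/HZ steps; render_time is irrelevant
    # as long as it stays shorter than the refresh period.
    gametime += 1.0 / HZ
    positions.append(round(SPEED * gametime))
print(positions)   # [10, 20, 30, 40, 50, 60] -- perfectly even motion
```

Notice render_time never enters the position math -- exactly the point being made above.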

VSYNC ON thereby raises the measurable threshold of retina refresh rate. VSYNC ON isn't perfect (it adds lag), but it's important for a researcher measuring a retina refresh rate threshold from a vision perspective. The sheer nature of retina refresh rates also diminishes latency, since VSYNC ON latency can be optimized to refreshtime instead of frametime -- causing major latency drops as you raise refresh rates in pursuit of humankind's retina refresh rate.

So multiple birds are hit concurrently. The intent was visual smoothness / blur / stroboscopic testing like The Stroboscopic Effect of Finite Frame Rates as well as 1000Hz Journey.

Thus, varying GPU rendertimes have no effect on this test, as long as gametime:refreshtime stays in sync (thanks to rendertimes staying shorter than refreshtime).
Head of Blur Busters - BlurBusters.com | TestUFO.com | Follow @BlurBusters on Twitter

Image
Forum Rules wrote:  1. Rule #1: Be Nice. This is published forum rule #1. Even To Newbies & People You Disagree With!
  2. Please report rule violations If you see a post that violates forum rules, then report the post.
  3. ALWAYS respect indie testers here. See how indies are bootstrapping Blur Busters research!


User avatar
Chief Blur Buster
Site Admin
Posts: 11647
Joined: 05 Dec 2013, 15:44
Location: Toronto / Hamilton, Ontario, Canada
Contact:

Re: Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Chief Blur Buster » 24 May 2022, 16:10

Crossposting for Scott Daly of Dolby (whom I've emailed a link to this thread):
stl8k wrote:
24 May 2022, 13:34
Here's the author giving a talk on this research...

phpBB [video]
For the final slide in this video...

Good research -- now to definitively answer the question on the final slide of the video.

Image

My answer: "Woefully low."

Try these test variables:
1. 16K 180-degree-FOV virtual reality screen (the widest possible "retina resolution" FOV, since increasing FOV and resolution amplifies the visibility of Hz limits)
2. Sample and hold: zero flicker, zero impulse driving (simulate real life, no flicker)
3. Perfect framerate=Hz (avoid jitter error margin)
4. VSYNC ON, not VSYNC OFF (avoid jitter error margin)
5. No control device jitter (e.g. don't use a mouse that's only 1000Hz. Read: Why?)
6. Test large 4x-8x geometric differences in refresh rates (e.g. 120Hz vs 480Hz, or 240Hz vs 1000Hz).
7. Ensure pixel response is not the limiting factor, or acknowledge it in your Error Margins section
8. Fast motion speeds that are still eye-trackable
9. Tests that force eye tracking (if testing via the motion blur weak link)

In this extreme test, the retina refresh rate hits 5 digits (>10,000 Hz).

The problem is that the technology does not exist yet, but we can at least experimentally confirm this:
- Doubling the resolution can double the required retina refresh rate (when doing 2x-4x refresh-rate-differential blind tests).
- Keep pixels within human angular resolving resolution.
- Make the FOV wider, such as in VR, to give more time to eye-track objects, so humans immediately notice Hz limitations.

Retina refresh rates of desktop displays are lower, but far closer to ~4000 Hz based on my research. When this deep into the diminishing curve of returns, you need to compare 4x Hz differences (1000Hz vs 4000Hz) using VSYNC ON framerate=Hz with fast motion speeds (4000 pixels/sec on 4K). So for this test, you need a 4K 1000Hz display versus a 4K 4000Hz display in a 24-27" form factor (retina resolution at that size). You can't blind-test 1.5x-2x differences in refresh rates when you're close to the vanishing point of the diminishing curve of returns -- we have experimentally confirmed that you need to compare much more massive differences (e.g. 4x, possibly more).
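The blur arithmetic behind those motion speeds, using the simple sample-and-hold persistence model (GtG=0 and framerate=Hz assumed):

```python
# Sample-and-hold motion blur width (px) = tracking speed / refresh rate,
# assuming GtG=0 and framerate=Hz (persistence is the only blur source).
def blur_px(speed_px_per_s, hz):
    return speed_px_per_s / hz

speed = 4000   # pixels/sec panning on a 4K display
for hz in (1000, 4000):
    print(f"{hz} Hz -> {blur_px(speed, hz):.0f} px of motion blur")
# 4 px at 1000 Hz vs 1 px at 4000 Hz: a 4x blur-width difference
```

That 4-px-vs-1-px spread is what makes the 4x differential detectable at all this deep into diminishing returns.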

Displays of identical size now exist for all four:
- 1080p 60Hz
- 1080p 240Hz
- 4K 60Hz
- 4K 240Hz

Disable strobing; keep them sample-and-hold.

This provides the basis for multiple blind tests that determine the ease of detecting refresh rate limitations. We use a 4x differential to make blur easier to tell apart (much like a 1/60sec versus a 1/240sec SLR photograph -- though GtG error margin will diminish the 60Hz-vs-240Hz difference), while we use a 2x resolution differential to see how motion speeds affect detectability of Hz limitations.

For example, scrolling at quarter screenwidth per second, half screenwidth per second, and one screenwidth per second. We can use a 4K photo of fine text, and a downconverted photo of the same text (to keep the test-pattern dimensions identical). We test motion speeds until the human notices motion blur. We observe that 4K makes it easier for humans to see Hz limitations when the test is structured this way.
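Plugging those example scroll speeds into the same persistence model (3840-px-wide 4K panel; GtG=0 assumed):

```python
width = 3840   # 4K horizontal resolution
blur = {}
for fraction, label in ((0.25, "quarter"), (0.5, "half"), (1.0, "one")):
    speed = fraction * width   # pixels/sec
    blur[label] = (speed / 60, speed / 240)   # blur width at 60 Hz, 240 Hz
    print(f"{label} screenwidth/sec: {speed/60:.0f} px blur at 60 Hz, "
          f"{speed/240:.0f} px at 240 Hz")
```

Even the slowest speed produces a 16-px-vs-4-px blur spread between 60 Hz and 240 Hz, which is why the 4x differential is easy to see.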

Based on the results of the above tests, we can easily and reliably extrapolate that the retina refresh rate is far in excess of 1000Hz -- guaranteed, at least when comparing large differentials (e.g. 1000Hz vs 4000Hz at framerate=Hz and GtG=0, to avoid the GtG error margin on blur), since such differences are needed when we're in the diminishing curve of returns in the refresh rate stratosphere. Just like with a 1/1000sec and a 1/4000sec SLR photo, you need really fast, ultra-sharp (e.g. 70mm or full-frame) photos of fast sports to see the difference between the two photos. The same holds for display motion blur. It's easier if it's full-FOV (like 180-degree VR).

The important point is that this is easily and reliably extrapolated by scientifically testing resolution's effect on raising the retina refresh rate, with newly-available displays, while focusing on variables that maximize the human visibility of refresh rates.

See the first post of this thread at the top:

Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Certainly this isn't the classical esports use case; we're testing future use cases, not current ones -- such as what Hz we need to max out VR quality for humankind, and other theoretical use cases.

This is already experimentally confirmable and extrapolatable, e.g. 1080p 240Hz versus the brand-new 4K 240Hz displays at the same display size. 240Hz limitations are much easier to see at double the resolution. Combining ultra-high resolution raises the maximum Hz at which humankind benefits. Extrapolating from this is confirmable research, because of motion blur and stroboscopics.

VERY IMPORTANT NOTE: For retina refresh rate testing, you also have to use a zero-temporals display (e.g. LCD, LCoS, MicroLED, MiniLED, OLED), because the temporal dithering of DLP is a major error margin in retina refresh rate testing. However, there are some (super-expensive) workarounds, e.g. multiple concurrent DLP projectors: 24 simultaneous 1-bit monochrome 1920 Hz DLP chips stack-projecting onto the same screen to produce 24-bit color with zero DLP temporals (8 per primary color, each showing only black or full-intensity R, G, or B) -- true native 24-bit with zero temporals. For 30-bit linear color depth, you need 30 DLP projectors (10 one-bit projectors per primary color), and so on.
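A quick arithmetic check of that projector-stacking scheme (toy code, not a real driver): binary-weighted 1-bit planes sum to a full linear channel with zero temporal dithering.

```python
# 8 one-bit projectors per color channel, optically weighted
# 1, 2, 4, ..., 128, sum to a linear 8-bit channel (0-255), with no
# temporal dithering needed to synthesize intermediate levels.
def stacked_intensity(bits):   # bits[0] = LSB projector, on/off
    return sum(b << i for i, b in enumerate(bits))

print(stacked_intensity([1] * 8))           # all 8 on -> 255 (full intensity)
print(stacked_intensity([0, 1] + [0] * 6))  # only the "2" projector -> 2
```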

User avatar
Chief Blur Buster
Site Admin
Posts: 11647
Joined: 05 Dec 2013, 15:44
Location: Toronto / Hamilton, Ontario, Canada
Contact:

Re: Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Chief Blur Buster » 24 May 2022, 16:48

Also, I made a new discovery today, thanks to an inquiry by an ArsTechnica commentator on their new 500 Hz monitor article...

In the proper design of a retina-refresh-rate test, I discovered that the direction of motion can impact readability of text.

Crossposting here, because this affects test design.
Max A wrote:I was able to read Alice without too much difficulty at 480px/s on 6 different 60hz monitors that I tried, although I couldn't keep up with the speed of scrolling. 20fps was quite difficult, and even 30fps required a lot of concentration, but 60fps felt pretty natural.
Interesting error margin to consider -- your eyes may be capturing 1/60sec snippets of the display better than most humans can, or may have trained themselves (over years of smartphone use) to read vertically-blurry text successfully.

Horizontally-blurry text is much harder to read than vertically-blurry text. It'd be interesting to design custom TestUFO tests specifically for you, to see if that's indeed the case (or not).

That being said, it's much blurrier or jitterier than stationary text. Everyone sees differently for different tests -- but I bet even the 60fps text doesn't look as stable and clear as stationary (0 pixels/sec "Pause").

Max A wrote:I did find the map to be unreadable down to 360px/s, however. At 240 it was doable, somewhat harder than reading Alice at 60fps, but dramatically easier than reading at 30fps.
Doubling Hz halves display motion blur, making scrolling text clearer. When doubling Hz, you will be able to read text at twice the motion speed. And quadrupling Hz, you will be able to read the text at quadruple the motion speed. At least up to your eyes' maximum eye tracking speed.

Based on your comment, I did an experiment -- and I made a new discovery that should have been obvious before today.

I did some motion blur style simulations of 480 pixels/sec on a 60 Hz LCD display. You are right, vertically blurred text is much easier to read than horizontally blurred text.

Vertically Blurred Text
(480 pixels/sec vertical, 60 Hz LCD)

Image

Horizontally Blurred Text
(480 pixels/sec horizontal, 60 Hz LCD)

Image

Both have exactly the same motion blur applied, but the first of the two is definitely much more readable!

The readability fails because horizontal blur mashes text into adjacent characters.

Because the text is perpendicular to the axis of scrolling motion in the first image, readability is affected much less, thanks to the whitespace between lines of text.
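The gap-filling effect can be sketched with a 1-D box blur toy model (my own illustrative glyph/line spacings): character gaps are narrower than the blur kernel and get mashed shut, while inter-line whitespace is wider and survives.

```python
# 1-D box blur as a stand-in for sample-and-hold persistence along the
# scroll axis. Glyph gaps (~2 px) vanish under an 8-px blur; line
# spacing (~8 px) largely survives the same blur.
def box_blur(profile, kernel):
    half = kernel // 2
    out = []
    for i in range(len(profile)):
        window = profile[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out

chars = [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 1, 1, 1]   # glyph strokes, 2-px gaps
lines = [1, 1, 1] + [0] * 8 + [1, 1, 1]           # text lines, 8-px spacing

gap_after_h = box_blur(chars, 8)[4]   # mid-gap "ink" after horizontal blur
gap_after_v = box_blur(lines, 8)[7]   # mid-whitespace "ink" after vertical blur
print(round(gap_after_h, 2), round(gap_after_v, 2))   # 0.67 0.11
```

After blurring, the character gap is ~67% "ink" (characters merge), while the line gap stays ~11% "ink" (lines remain separated).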

I suspect this explains why you're able to read vertically scrolling text, but not horizontally scrolling text at similar motion speeds.

Thank you for making me think -- text readability in the horizontal and vertical directions has different motion thresholds caused by display motion blur, and many people are trained by years of smartphone use (flick scrolling) to still be able to read vertically-blurry text.

This insight will guide future retina-refresh-rate tests that might be based on "please read this text like your life depends on it" type experiments in virtual reality, for determining the vanishing point of the retina refresh rate.

________________________


New additional scientific variable recommendation: if testing retina refresh rate by forcing people to read fast-panning tiny text, use horizontal motion instead of vertical motion. This will raise the measured retina refresh rate.

User avatar
Discorz
VIP Member
Posts: 999
Joined: 06 Sep 2019, 02:39
Location: Europe, Croatia
Contact:

Re: Properly Designing A Blind Test That >90% Of Humans Can See 240Hz-vs-1000Hz (non-Game Use Cases Too!)

Post by Discorz » 26 May 2022, 03:38

For #3, control device jitter: in addition to a 2000+ Hz mouse poll rate, I'd suggest using a high-DPI + low in-game-sensitivity combo to reduce jitter even further. This only applies to in-game tests.
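The arithmetic behind the high-DPI + low-sensitivity trick (toy numbers of mine, not Discorz's): keeping DPI × sens constant preserves overall sensitivity while shrinking the per-count step.

```python
# Constant eDPI (DPI x in-game sens) = same overall turn/cursor speed;
# higher DPI just quantizes that motion into finer per-count steps,
# i.e. less visible jitter/stairstepping.
combos = ((400, 4.0), (800, 2.0), (1600, 1.0))
for dpi, sens in combos:
    print(f"{dpi} DPI x {sens} sens: eDPI = {dpi * sens:.0f}, "
          f"step = {sens} units per count")
```

All three combos move the view at the same speed, but the 1600 DPI setup moves in steps 4x finer than the 400 DPI setup.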
Compare UFOs | Do you use Blur Reduction? | Smooth Frog | Latency Split Test
Alienware AW2521H, Gigabyte M32Q, Asus VG279QM, Alienware AW2518HF, AOC C24G1, AOC G2790PX, Setup

Post Reply