valeriy l14 wrote: ↑26 Mar 2021, 10:16
Well I understood this, but still I have one question for you: how long does it take for the eye to capture a moving target (I call it saccadic input lag)?
500ms, 250, or maybe there are superhumans that can do this in 100ms?
This is an area that varies widely, with a lot of factors:
- Your abilities
- Your training
- Initial identification via direct vision area versus peripheral vision
- Brightness and contrast of the object
- Expectations (predictable vs random objects)
- How far your gaze is from the object you need to lock onto
- The initial motion vectors (of your gaze motion & of the object motion)
- How badly it is obscured (including obscured by motion blur, obscured by stutter, etc)
- Etc.
Most of the time, for predictable objects, it is well under a second. Human "target locking" (fixing your gaze on the object) is a separate problem from the display flaws themselves, and is not something Blur Busters studies as much as displays. However, the science/physics overlaps because of the Vicious Cycle Effect (wider FOV, higher resolution, and higher refresh rate all amplify the needs of each other). So out of necessity, I have to be adequately familiar with human vision limitations to properly define the limitations of a display relative to human ability.
Remember, identifying whether the display has motion blur requires multiple steps -- looking for the object (e.g. finding a moving UFO), locking eyes on the object (putting your eyes directly on the UFO), and finally inspecting the object for motion artifacts such as blur/stroboscopics/ghosting/etc (as you track a UFO in TestUFO). These stages typically take hundred(s) of milliseconds each, and add up to perhaps half a second (or so) in total.
So we're covering multiple separate sciences here:
- The human brain
- The human vision system
- The display itself
- The interactions between them that generate human-visible artifacts (such as display-enforced eye-tracking motion blur)
The above animation is predictable:
- Moving UFO keeps reappearing at left edge
- Speed smoothly varies, rather than randomly
This makes it easy to observe the weird display behaviors (lines blending into motion blur) caused by eye-tracking-enforced display motion blur. This animation is cleverly designed to show off display behaviors -- such as teaching the stutter-to-blur continuum (slow-vibrating music strings vibrate visibly, while fast-vibrating music strings vibrate in a blur), which is also demonstrated by framerate-ramping animations on variable refresh rate displays (e.g. www.testufo.com/vrr ...)
Now *within* my area of expertise, cherrypicking the variables to help the user easily identify display flaws: with training/coaching on predictable objects (similar to teaching people to notice artifacts such as the 3:2 pulldown effect, which they can't unsee later)... my experience is that most people only need 0.5 to 1 second to identify the existence (or absence) of display motion blur.
Variables:
- Framerate=Hz
- Smooth horizontal motion (longest dimension of screen)
- Predictive appearance of object at edge of screen (like TestUFO)
- Locking gaze on object (tracking the UFO)
- Inspecting the object (checking if UFO has motion blur)
I am able to do this in approximately half a second or so, using the TestUFO Panning Map Test at 3000 pixels/sec. A street name label is on screen for only ~0.64 seconds on a 1920x1080 display. If I have ULMB enabled with ULMB Pulse Width 30-50, I can lock my eyes on one street name label long enough to read it (e.g. Front Street or Yonge Street), since I'm familiar with the Toronto, Canada map and know where to quickly lock my eyes on a familiar street and start reading its label. So the eye "find-gaze-identify" is complete in less than 1 second if the motion is clear.
- ULMB default ULMB Pulse Width 100 = 1ms MPRT
- ULMB with ULMB Pulse Width 50 = about 0.5ms MPRT
At 3000 pixels/sec, 1ms = 3 pixels of motion blur, which can obscure 6-point street name label text with display motion blur.
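The blur arithmetic above is simple enough to sketch in a few lines; here's a hypothetical helper (the numbers are from this post, the function name is mine):

```python
def blur_px(speed_px_per_sec: float, mprt_ms: float) -> float:
    """Approximate eye-tracking motion blur width in pixels:
    blur = tracking speed (px/sec) x persistence (MPRT, in seconds)."""
    return speed_px_per_sec * (mprt_ms / 1000.0)

# Examples at 3000 pixels/sec panning, using the MPRT figures above:
print(blur_px(3000, 1.0))   # ULMB Pulse Width 100 (~1ms MPRT) -> 3.0 px of blur
print(blur_px(3000, 0.5))   # ULMB Pulse Width 50 (~0.5ms MPRT) -> 1.5 px of blur
```

Three pixels of blur smeared over 6-point text is enough to make the street names unreadable; halving the pulse width halves the smear.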
Under the best ULMB circumstances (ULMB ON + ULMB Pulse Width 50, or a similar "0.5ms MPRT strobe backlight"):
- Some people can do the "find-gaze-identify" on this TestUFO panning map test at 2000 pixels/sec (1 second)
- Some people can do the "find-gaze-identify" on this TestUFO panning map test at 3000 pixels/sec (0.64 second)
- Some people can do the "find-gaze-identify" on this TestUFO panning map test at 4000 pixels/sec (0.5 second)
I can do the "find-gaze-see blur" at 4000 pixels/sec but I can't read the text fast enough, so slowing it down to 3000 pixels/sec or 2880 pixels/sec, I am able to read one street label on the fast-panning street map -- about 0.6 seconds.
But remember, this is the whole process -- finding the object (one street), locking my eyes on the object (street label), and inspecting (reading the street name). Each step would average roughly 200 milliseconds within the 0.6-second deadline (from appearing at the left edge to disappearing at the right edge). But it could be a different ratio such as 200-100-300 or 150-150-400 or 300-50-250 -- that I don't know; someone would have to point a high speed camera at my eyeballs to corroborate the first two parts of the "find-gaze-inspect" behavior -- this is probably an under-researched area.
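The on-screen deadlines quoted above are just screen width divided by panning speed; a quick sketch of that arithmetic (function name is mine):

```python
def on_screen_seconds(width_px: int, speed_px_per_sec: float) -> float:
    """How long a moving object stays visible while panning
    edge-to-edge across a screen of the given width."""
    return width_px / speed_px_per_sec

# On a 1920-pixel-wide screen, the deadlines quoted in this post:
for speed in (2000, 3000, 4000):
    print(speed, round(on_screen_seconds(1920, speed), 2))
# 2000 px/sec -> 0.96 sec, 3000 px/sec -> 0.64 sec, 4000 px/sec -> 0.48 sec
```

The entire find-gaze-inspect sequence has to fit inside that window, which is why faster panning speeds are a harder test.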
Occasionally, Blur Busters does commission research (like www.blurbusters.com/human-reflex ...), and this would be exactly the type of research up my alley if someone wanted to reach out. I've been thinking of writing some more research papers and getting them peer reviewed by others (say, NVIDIA researchers) -- but it's a time management thing too, because I don't get paid to write research papers, so I prefer to write "Popular Science" style articles on the main Blur Busters site to help demystify things for readers.
Informally, the Ballpark is 0.5-Second to 1-Second For 3-Step Eye Tracking "Find-Gaze-Identify"
Long-time TestUFO experience by millions of visitors indicates that the whole 3-step eye-tracking "find-gaze-identify" process tends to take 0.5 to 1 second for predictable moving objects (a la TestUFO). That includes all 3 intermediate steps -- finding the object, gazing at the object, then identifying the details of the object -- each of which may be surprisingly brief, since locking gaze is usually faster than initially finding the object, especially if the object is unexpected.
It's hard to benchmark the separate stages of "find-gaze-identify", but easy to measure the whole thing together. For example, show someone TestUFO's Panning Map Test and ask them to read one street name (any street name), after optimizing for the best possible strobing (short strobe pulse widths). This provides a rough baseline of informal anecdotes, all of which are consistently within the 1-second ballpark. This provides the easiest basis for determining a retina refresh rate.
When I say "retina" refresh rate (a refresh rate equivalent of retina resolutions) -- this is the refresh rate (and frame rate) beyond which no further visible improvements can be derived. This also, of course, assumes 0ms GtG (or, minimally, real-world GtG 100% pixel transitions occurring well within a refresh cycle).
General Rule of Thumb: The Resolution Of the Longest Dimension Of Display is Conveniently the Approximate Retina Refresh Rate*
For the Average Joe User, I use the 1-second eye-tracking benchmark as an easy guideline for calculating practical Retina Refresh Rates:
- 1920x1080 displays = 1920 pixels/sec to pan edge-to-edge in 1 second = likely ~1920fps at ~1920Hz is the retina refresh rate
- 3840x2160 displays = 3840 pixels/sec to pan edge-to-edge in 1 second = likely ~3840fps at ~3840Hz is the retina refresh rate
These aren't hard numbers; there's a lot of fuzz depending on human ability.
For well-trained users, i.e. the top 10% of users, I would approximately double the "retina" refresh rate (~4000Hz for 1080p). And for the Grandma "I can barely tell apart DVD and HDTV" crowd, I would approximately halve the "retina" refresh rate (~1000Hz for 1080p). (All of them, in a TestUFO side-by-side, are able to notice a difference between 240fps vs 120fps vs 60fps when I show family members, so they're easily trainable to see the differences between doublings at those lower refresh rates -- most vision-abled people still see 240Hz benefits *if* they pay attention.)
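The rule of thumb above can be sketched as a back-of-envelope calculator; the ability multipliers loosely follow the "halve for untrained, double for trained" adjustment, and the function name and labels are my own invention:

```python
def retina_hz_estimate(longest_dimension_px: int, ability: str = "average") -> float:
    """Rule-of-thumb retina refresh rate: pixels traversed per second
    for a 1-second edge-to-edge pan, scaled by viewer ability."""
    multiplier = {"untrained": 0.5, "average": 1.0, "trained": 2.0}[ability]
    return longest_dimension_px * multiplier

print(retina_hz_estimate(1920))              # ~1920 Hz for 1080p, average viewer
print(retina_hz_estimate(1920, "trained"))   # ~3840 Hz (roughly the ~4000Hz figure)
print(retina_hz_estimate(1920, "untrained")) # ~960 Hz (roughly the ~1000Hz figure)
print(retina_hz_estimate(3840))              # ~3840 Hz for 4K, average viewer
```

These are ballpark figures only, subject to the caveats below.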
*Caveat; some assumptions made:
- Pixels are big enough to distinguish; and
- Object movement is not beyond human eye-tracking speed abilities (smooth eye pursuit).
Retina-resolution screens (like modern smartphone screens at 300-500+ dpi), where pixels are too tiny to individually identify, start to cap the retina refresh rate at some point. The point where the screen is just "almost retina resolution" probably also dictates the retina refresh rate thereafter (for any higher resolution at the same FOV). So a retina-resolution 4K smartphone screen will be bottlenecked by the tiny size of the screen & the tininess of pixels, which automatically lowers the "retina" refresh rate -- it would be useless to have 3840 Hz on a 4K smartphone screen, for example, because you can't identify individual pixels (i.e. jaggies on a diagonal line) -- TestUFO rendered at 100% DPI scaling would yield extremely tiny UFOs that are hard to see details in.
Easy Comparing For Average Users Usually Requires At Least Hz Doublings (or more)
While esports players can see small Hz improvements, the majority of average users can see framerate doublings when coached (e.g. showing A versus B). For example, 30fps vs 60fps, 60fps vs 120fps, or 120fps vs 240fps (assuming GtG largely faster than the refresh cycle, since GtG overlapping multiple refresh cycles diminishes the differences between refresh rates / frame rates). That's why the Blur Busters recommendation is the geometric upgrade curve -- 60Hz -> 120Hz -> 240Hz -> 480Hz -> 960Hz. The diminishing curve of returns requires more dramatic jumps in refresh rate & frame rate. So people also need to double frame rates while doubling refresh rates if they're focusing on display motion blur (and/or stroboscopics). One can still benefit without upgraded frame rates (e.g. 120fps at 240Hz is lower lag than 120fps at 120Hz), but the human-visible benefit (motion blur) requires frame rates to keep up with refresh rates.
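The diminishing-returns curve is easy to illustrate numerically: on a sample-and-hold display, persistence is roughly one frame time, so each doubling halves the blur width, but the absolute improvement shrinks with each step (a sketch under that assumption; the function name is mine):

```python
def sample_and_hold_blur_px(speed_px_per_sec: float, fps: float) -> float:
    """Sample-and-hold persistence ~= 1 frame time, so
    blur (px) = speed (px/sec) / framerate (with framerate = Hz)."""
    return speed_px_per_sec / fps

# 1000 px/sec panning across the Blur Busters doubling ladder:
for hz in (60, 120, 240, 480, 960):
    print(hz, round(sample_and_hold_blur_px(1000, hz), 2))
```

Going 60Hz to 120Hz removes over 8 pixels of blur at this speed, while 480Hz to 960Hz removes only about 1 pixel -- which is why ever-larger geometric jumps are needed to stay visible.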
Now, if you're starting to hit near the edge of the diminishing curve, you may even need to triple or quadruple refresh rate. For example, on a smartphone, 240Hz-vs-480Hz may be hard to see for most Average Joe Users, but 240Hz-vs-1000Hz may be easy to see (flick scrolling). So more dramatic geometric upgrades to frame rates and refresh rates may be required -- and that's a very hard engineering problem, especially for a battery-constrained device.
In some cases, not everyone sees it until coached to notice (e.g. eye-tracking text while flick-scrolling the screen). When coached this way, most can identify the difference between a 60Hz iPad and a 120Hz iPad, even if they didn't notice before. (As long as they've got at least average brain function & average vision)
As a result, more realistically, we'll hit retina refresh rates on lower resolution desktop displays (1080p) long before this happens on VR displays (where retina refresh rates are really high due to really high resolutions) and long before they come to mobile screens (due to battery requirements, etc).
This is why high Hz is not just for esports anymore; it's got mainstream ergonomic benefits (e.g. more comfortable web browser scrolling). As higher Hz becomes cheap, it'll gradually get mainstreamed (like 120Hz slowly is for interactive content -- phones, tablets, consoles, etc.)
Obviously, this all needs to be rolled into more formalized research -- and perhaps I'll collaborate with a researcher (feel free to contact me at mark [at] blurbusters.com if any reader is a researcher). I've been cited in dozens of peer-reviewed papers relating to display-temporal topics...