You're the confirmed author of CLK (github) -- Acorn Electron, Amstrad CPC, Atari 2600, ColecoVision, Commodore Vic-20, MSX 1, Oric and ZX80/81
It's great that a third emulator author has joined this thread.
Your software is fantastic so I'm looking forward to seeing all these ideas being cherrypicked for your emulator too.
Tommy wrote: When it receives a sync trigger from the feeding signal that prompts a comparison with what the flywheel already believes. The flywheel can't change phase but it can change frequency. So it shifts itself in the direction of the error proportionally to the error subject to a cap.
Neat. Funnily, it sounds similar to the behaviour of my "NO RASTER REGISTER" raster guesser algorithm at startup. I'm bumping an interval estimate upwards/downwards based on which direction my error margin leans in, in a decaying fashion. If I don't give my algorithm a refresh rate (e.g. it has to make a wild guess at the refresh rate), my tearlines often roll 1 or 2 cycles for the first 1/2 to 1 second (like a VHOLD knob miss) and then suddenly disappear (stabilize) as a good history of timestamps is built. I guess it's a figurative digital version of a flywheel spinning up. If I have a genuine raster hook, this doesn't happen.
I need to refine the VSYNC listener startup stabilization more, but this will be an open source project soon (title "Tearline Jedi Demo")...so other people may be able to stabilize the startup better. I'm prioritizing raster stability, but sometimes algorithms for continued stability are 100% contrary to the goal of quick sync (in under 500ms).
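To make the "bump the interval estimate in the direction of the error, with a cap" idea concrete, here is a minimal Python sketch. It is not the Tearline Jedi code and not CLK's flywheel; the names `nudge_interval` and `track`, the gain of 0.25 and the 1ms cap are all assumptions chosen for illustration.

```python
def nudge_interval(estimate, measured, gain=0.25, cap=0.001):
    """Move the interval estimate toward one measured interval (seconds).

    The step is proportional to the error and clamped to +/- cap, so a
    single wild timestamp cannot yank the estimate very far.
    """
    error = measured - estimate
    step = max(-cap, min(cap, gain * error))
    return estimate + step


def track(timestamps, initial_guess):
    """Feed successive vsync timestamps in; return the estimate after each."""
    estimate = initial_guess
    history = []
    for prev, cur in zip(timestamps, timestamps[1:]):
        estimate = nudge_interval(estimate, cur - prev)
        history.append(estimate)
    return history


if __name__ == "__main__":
    # Real display runs at 60Hz, but the guesser starts out assuming 50Hz.
    stamps = [i / 60.0 for i in range(61)]
    for n, est in enumerate(track(stamps, 1.0 / 50.0)[:5], start=1):
        print(f"after {n} vsyncs: {1.0 / est:.2f} Hz")
```

With a bad starting guess the estimate is visibly off for the first handful of refresh cycles and then settles, which is roughly the "flywheel spinning up" behaviour described above.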
Tommy wrote: Also mine is a proper native app on the Mac at least, which makes a common-enough use case being to have multiple emulated machines all over your desktop, and something like 90% of Macs are laptops nowadays
Neat -- My main development machine is a Mac.
(It makes a good Mac + Windows + Linux machine too -- all three cakes in one!)
Though I program mainly in Windows, the beam racing stuff works on Mac too.
Fortunately, the concepts here are 100% cross platform -- fundamentally, VSYNC OFF tearlines are simply rasters.
Make sure you turn off Mac "BeamSync" to get beam racing working on Mac -- you have to enable tearlines on a Mac before one gets the ability to have incredible fun beam-racing the tearlines out of the way. I haven't begun doing that, but there's an API call to do that...
Tommy wrote: I'm not sure that a busy loop will be acceptable to most of my users. Which makes for a bunch of extra factors
Hmmm....
Most of the time, my Mac laptop is plugged in, so I wouldn't care during those moments.
One possible solution? Maybe give the users a choice between....
-- Low-precision beam racing (4 frame slices can work with millisecond sleeping)
-- High-precision beam racing (10, 20+ frame slices, requires busy waiting).
If you use millisecond sleeping, though, you'll simply need larger chase margins/offsets (e.g. 2ms or 3ms chase distance between real-raster and emu-raster). As long as you do the "forgiving method" (full refresh cycle jitter margin technique), you can gently use millisecond sleepers with beam racing. Now that I think about it, I suppose adding 3-4ms of extra lag to save lots of battery power is worth it! With a refresh cycle being 16.7ms, that gives you lots of play margin to tolerate an imprecise sleep.
I am doing some research on more power-efficient sleep methods. Some platforms have a very precise "sleep-until" technique that is almost microsecond accurate. I've not quite unlocked this feature, but what seems to be happening is that some platforms have a microsecond-accurate nanosleep() -- it isn't always that accurate, but it apparently gets really accurate on at least one of my systems! Unfortunately, not all platforms have such accurate microsecond sleeping.
If you gain access to 0.1ms sleep, the good news is that 0.1ms is good enough for 10-frameslice operation. You might need ~0.01ms accuracy for 100-frameslice-per-refresh-cycle operation with a tight beam-racing margin. The determining factor is the horizontal scan rate -- 67.5 KHz for standard 1080p 60Hz, and roughly double that for 1080p 120Hz.
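The arithmetic behind those figures is easy to check. The sketch below is only a back-of-envelope helper (function names are made up for illustration); it spreads frameslices evenly over the whole refresh cycle and ignores the VBI, and it plugs in the 67.5 KHz scan rate mentioned above.

```python
def frameslice_ms(refresh_hz, slices):
    """Duration of one frameslice in milliseconds (VBI ignored)."""
    return 1000.0 / (refresh_hz * slices)


def scanline_us(h_scan_khz):
    """Time the display spends on one scanline, in microseconds."""
    return 1000.0 / h_scan_khz


def jitter_in_scanlines(jitter_ms, h_scan_khz):
    """How many scanlines the real raster moves during a sleep error."""
    return jitter_ms * h_scan_khz


if __name__ == "__main__":
    for slices in (4, 10, 100):
        print(f"{slices:3d} slices @ 60Hz: {frameslice_ms(60, slices):.3f} ms each")
    print(f"one scanline @ 67.5 KHz: {scanline_us(67.5):.1f} us")
    print(f"0.1 ms of sleep error = {jitter_in_scanlines(0.1, 67.5):.2f} scanlines")
```

At 10 frameslices a 0.1ms sleep error is a small fraction of a ~1.67ms slice; at 100 frameslices each slice is only ~0.167ms, which is why the accuracy requirement tightens by about an order of magnitude.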
Perhaps I was too hasty in recommending busywaits only because it is the only 100% reliable way to do it on all platforms (slow laptops included). A "trust it and forget it" mechanism. But if you're targeting Mac only, I think there's a Mac microsleeper (thanks to Apple's religious approaches to power management and all) but I haven't researched that far yet. Macs are (usually) predictable & consistent so if you've found an Apple sub-millisecond microsleeper, it probably works on all or almost all of them. Need to figure out.
You could simply have a toggle for power-priority versus precision-priority. On some platforms, they're the same (thanks to practically microsecond-accurate sleeper), but on some platforms they diverge (busysleep method during precision-priority).
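One way such a toggle could look, sketched in Python for illustration only: power-priority sleeps the whole way and accepts whatever wake-up jitter the OS gives, while precision-priority sleeps most of the way and busy-waits the last stretch. The function name `wait_until`, the 2ms `spin_margin` default and the injectable `now`/`sleep` parameters are my assumptions, not anything from an existing emulator.

```python
import time


def wait_until(deadline, precision_priority, spin_margin=0.002,
               now=time.perf_counter, sleep=time.sleep):
    """Block until roughly `deadline` (seconds on the `now` clock).

    Returns the number of busy-wait polls performed (always 0 in
    power-priority mode), which is handy for seeing the CPU cost.
    """
    # Power-priority: sleep the full remaining time, no spinning.
    # Precision-priority: stop sleeping spin_margin early, then spin.
    margin = spin_margin if precision_priority else 0.0
    remaining = deadline - now()
    if remaining > margin:
        sleep(remaining - margin)
    polls = 0
    while precision_priority and now() < deadline:
        polls += 1
    return polls


if __name__ == "__main__":
    for mode in (False, True):
        target = time.perf_counter() + 0.010
        polls = wait_until(target, precision_priority=mode)
        late_ms = (time.perf_counter() - target) * 1000.0
        print(f"precision={mode}: {polls} polls, woke {late_ms:+.3f} ms from target")
```

On a platform with a near-microsecond sleeper the two modes should behave almost identically; where the sleeper is coarse, power-priority would need the larger chase margin to absorb the difference.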
Tommy wrote: EDIT: and re: whether to handle 120Hz display of 60Hz as a double-speed burst for the first frame followed by a repeat or a blank, or to abandon raster racing, I think I'd prefer the latter because latency is my overriding concern.
Einstein is relative: It doesn't add audio latency relative to joystick button. That actually decreases audio latency slightly, b...
Tommy wrote: I'm emulating early-'80s machines so the real experience would have been blurry but lag free
You must mean blurry in the spatial dimension (CRTs) -- rather than blurry in the temporal dimension (CRTs were blur-free for 60 frames per second).
For readers familiar with motion blur reduction strobe backlights -- non-strobed 60fps@120Hz doesn't reduce blur unless you add black frame insertion, and you can still do 60Hz with traditional strobing (a la BenQ XL2720Z), but most gaming monitors only offer CRT-like motion clarity at 120Hz because they don't want to expose end users to painful 60Hz flicker.
While everyone has their legitimate goals & preferences that I certainly respect..... For me, the motto is usually to give users a choice of which faithfulness priorities to target. That is useful because ULMB can actually make emulators look more faithful to original CRTs, but ULMB adds a slight amount of lag. But so do HLSL shaders and fuzzy-scanline renderers! More frametime lag means you're adding lag to become more faithful in a different area, in a pick-your-poison way. Einstein is relative -- accept less faithfulness in one area (lag) to improve faithfulness in another area (looks).
Tommy wrote: EDIT2: oh, and audio latency too if you tried to fit 60Hz to 120Hz by taking every other frame off. I think that, in summary, my perspective is that there are at least three latency factors at work here: input, video and audio, and I don't agree that any one trumps the other two.
Agreed. User needs choice.
But audio lag doesn't increase. It's visual lag being reduced (by the ultra-fast-scanout) to the point where audio may feel lagged relative to visual stimuli.
However, absolute audio lag never gets longer between joystick FIRE button and the audio. It's just photons hitting eyes sooner.
Yes, you're having to buffer and dejitter the audio, but the absolute time between Joystick FIRE button and the audio stimuli never, ever gets longer, right? Yep. Now you get it -- lag is only because photons hit the eyes sooner, thanks to the fast-scanout cheat.
But that's fixable (see below)
Remember some TVs have unavoidable input lag, so the technique of speeding up frame delivery can somewhat compensate for a laggy TV to be more faithful (lag-wise). So the fast-scanout-method can help some laggier 120Hz-compatible TVs overcome television-buffer lag a little bit. 120Hz doesn't always mean lower absolute lag. So you can use the fast-scan-beam-racing to compensate for the television handicap. And more closely align lag to original machine. The cheat compensated for the handicapped television. Faithfulness restored, and the audio lag actually realigns itself to the display-electronics-lagged photons (no audio lag!).
As Einstein says, it is all relative -- so personally, my approach is simply to be 100% compatible with all scanout velocities, if possible. That gives the widest flexibility to remain faithful to the original machine.
That covers unexpected future things thrown in our direction, e.g. a laggy 120Hz display that has a buffer on the monitor side, or a laggy 60Hz HDMI 2.1 Quick Frame Transport compatible display (where the new HDMI spec creates a fast-scanout "60Hz" cycle to help overcome other things like HDR display panel lag). 60Hz fixed-Hz modes aren't always slow-scanout; I've seen 60Hz signals with a big VBI too, sometimes.
Tomorrow in year 2020, you might use HDR to improve electron gun emulation (e.g. fringing artifacts, or making shadowmask dots brighter on blacker backgrounds), but find that HDR sometimes adds lag. Then, you know, you'll possibly be reconsidering the scanout velocity problem and suddenly finding yourself facing 60Hz HDMI 2.1 Quick Frame Transport (QFT). (Basically a 60Hz signal with humongous VBI sizes).
Some MAME arcade cabinet makers use a 31.5KHz VGA CRT to simulate 15.3KHz NTSC scanout by doing double-refresh (120Hz 240p works fine on a 31.5KHz VGA-only CRT). That creates a double-image effect during 60fps scrolling, as explained by this diagram (from my 1000Hz journey article). Recently, a fix came from software black frame insertion (that got added on my advice -- as a GroovyMAME patch a few years back; Calamity here in these forums would probably remember...).
That was originally done to improve LightBoost displays. Sometime afterwards, someone clever suggested enabling software-based black frame insertion with the arcade CRT! Very clever: black out the 2nd repeat refresh cycle to eliminate the double-image effect. Voila! Now it looks just like a perfect NTSC 15.3KHz CRT even though the VGA CRT can only do VGA 31.5KHz.
But it's a fast-scanout signal (1/120sec). Yup. Gotta beam race that fast-scanout if you want better "original-lag" faithfulness. Having audio 4ms behind video is a lesser evil than audio 30ms ahead of video....Isn't that more faithful indeed?
The audio lag never worsens relative to joystick input.
Now, say, you combine software BFI with fast-beam-racing to allow a low-lag 15.3KHz emulation on a 31.5KHz CRT -- you've simply reduced video lag to less than the original machine because of fast-scanout. Yep. Video lag less than the original arcade machine! Meaning audio is slightly lagged relative to the now too-fast video. But that's fixable, see below:
TIP: You can calibrate the chase distance between emuraster + realraster if you want a "video delay" ...
Nitpick fixed, eh? Could be a slider in an on-screen popup menu in your emulator.
(Remember, when rasterplotting on top of the existing emulator framebuffer, without clearing between emulator refresh cycles, we have a full refresh cycle minus one frameslice worth of jitter margin. That gives you an optional "video delay" adjustment, just by adjusting the chase margin between realraster + emuraster -- it's fully wraparound -- chase margin varies depending on variables but can be up to ~16ms of video delay adjustment.)
So sometimes deciding to be compatible with 1/120sec scanout actually improves faithfulness in many ways (by user configurable choice) when imperfect hardware is thrown in your direction. (Convinced yet? ...)
No worries -- if you only ever do 60Hz slow-scan beam racing, all good -- that's the most faithful way to do so. I agree. (We all have our own different approaches to how we give users a choice of abusing settings & modes to reproduce certain faithfulness aspects)
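As a rough illustration of the wraparound chase margin, here is a small Python sketch. The helper names are hypothetical, the 1125 total scanlines is just an assumed 1080p-style timing, and the VBI is treated as ordinary scanlines for simplicity.

```python
def max_chase_ms(refresh_hz, slices):
    """Largest usable chase margin: one refresh cycle minus one frameslice."""
    period_ms = 1000.0 / refresh_hz
    return period_ms - period_ms / slices


def emu_raster_target(real_line, chase_ms, refresh_hz, total_lines, slices):
    """Scanline the emulator raster should be at, given the real raster.

    The chase margin is clamped to the usable range, converted from
    milliseconds into scanlines, and wrapped around the end of the frame.
    """
    chase_ms = max(0.0, min(chase_ms, max_chase_ms(refresh_hz, slices)))
    lines_ahead = round(chase_ms / 1000.0 * refresh_hz * total_lines)
    return (real_line + lines_ahead) % total_lines


if __name__ == "__main__":
    print(f"max video delay @ 60Hz, 10 slices: {max_chase_ms(60, 10):.1f} ms")
    for slider_ms in (2.0, 4.0, 12.0):
        line = emu_raster_target(1000, slider_ms, 60, 1125, 10)
        print(f"chase {slider_ms:4.1f} ms -> emu raster at line {line}")
```

A "video delay" slider in an emulator menu would then just feed `chase_ms`; values past the end of the frame wrap into the next refresh cycle rather than breaking anything.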
From what you said, it should be easy to add beam racing to your specific emulator.
I'm looking forward to seeing more emulators implement beam racing techniques!