It's quite possible that RunAhead and beamracing won't be as useful together as I had expected, but it does still mean the two can be combined, just with slightly (or a lot) fewer benefits than I had thought.
Calamity wrote: In GroovyMAME, we've been routinely achieving next frame response since 2013, thanks to frame delay and direct vblank polling.
Which is fantastic. I'll take your word for it that it's now being done regularly.
Calamity wrote: The fact is that when emulating frame buffered systems, frame delay and beam racing are EQUIVALENT. It's only for beam-raced systems (e.g. Amiga) where beam racing makes a difference.
Agreed. That's where my current beamracing focus is: the 8-bit and 16-bit systems of the raster-interrupt and raster-coprocessor era.
Calamity wrote: The beauty of beam racing is that it's a more natural approach than frame delay. Besides of virtually no latency, it has other added benefits. It increases input granularity, which might be of help even for frame buffered systems (fight games: combos).
Hmmm, I didn't think of that part. Yes, this could still be a benefit there.
Calamity wrote: With all honesty, I believe run-ahead will prevail over beam racing, at least in the short term. I don't see both hybridizing. Run-ahead is easier and effective enough for the people. It's still the wrong approach, you already know my opinion. It will be an incentive for people to stick with crappy hardware.
RunAhead requires more performance than beam racing. Beam racing will work on Raspberry Pi systems: modern Android and Pi GPUs have the GPU memory bandwidth to do 4-to-10-slice beamracing, especially with low-resolution framebuffers designed for low-resolution screens (e.g. CRT outputs). Frameslice beam racing is mostly GPU-bandwidth dependent (unless you do full-screen shader re-renders per frameslice).
Beam racing is much easier for lower-performance systems at low frameslice counts, since the bandwidth of the 2017/2018 embedded GPUs far exceeds that of a few-year-old midrange desktop GPU, which is frankly amazing. Thanks to phone/tablet miniaturization, RAM is embedded into the GPU, which helped a lot. The highest-performing Android devices can spray more frameslices per second than the Intel 4000 GPU that did 6 frameslices with WinUAE.
Although many frameslices demand fast GPU memory, my findings are that it is very scalable all the way down to low-end systems if you use the generous-jittermargin technique. The raster can safely jitter by almost a refresh cycle with that technique, and wider race margins make it much more lowend-friendly.
That allows emulators to only need to execute in real time, with no surge execution. Lower latency on cheaper hardware.
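To put rough numbers on the frameslice counts and margins above, here is a minimal sketch. The function name is hypothetical, and the margin formula (one refresh period minus one frameslice) is my assumption about what "almost a refresh cycle" works out to, not a measured figure.

```python
def frameslice_stats(refresh_hz, slices_per_frame):
    """Back-of-envelope timing for frameslice beam racing.

    ASSUMPTION: with the generous-jittermargin technique, the tolerated
    raster jitter is taken to be one refresh period minus one frameslice.
    """
    refresh_period_ms = 1000.0 / refresh_hz
    slice_ms = refresh_period_ms / slices_per_frame
    return {
        # How many frameslices the GPU must deliver each second.
        "slices_per_second": refresh_hz * slices_per_frame,
        # Duration of one frameslice on the real display.
        "slice_ms": slice_ms,
        # Assumed safe jitter window (see docstring).
        "jitter_margin_ms": refresh_period_ms - slice_ms,
    }


stats = frameslice_stats(refresh_hz=60, slices_per_frame=6)
print(stats)
```

Under that assumption, fewer frameslices per refresh cost less GPU bandwidth and each slice takes longer, which is why low frameslice counts suit low-end hardware.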
Calamity wrote: RA's architecture makes it very difficult to implement beam racing. RA is made up of different kernels. Each kernel is designed differently. The interaction between frontend and kernels is based on full frames.
Actually, I think it's doable if we just take our time. I looked at the RetroArch API, and it mainly requires adding one optional raster callback function to the emulator modules that support beamracing. Basically, it would be called after every raster:
(1) Centralizes beam raced renderers (hides complexity)
(2) Same arguments as the full-frame-delivery callback
(3) Gives centralized beam raced renderers an early peek at frame buffer
(4) Can be ignored if you're doing full frame mechanisms
(5) Can be acted upon (every scanline for front buffer rendering, every X scanlines for frameslice rendering)
(6) Centralized beam raced renderers will do their own busywaits if needed before returning from the callback. This delays the emulator so it stays closer behind the real raster.
(7) Minimizes footprint of modification for some emulator modules (in some cases, as little as 5-line mods)
(8) The only responsibility of the emulator module is to call the callback after every raster. That's it. The central beamracer module handles the rest (deciding if it's time to frameslice, deciding on nanosleeps)
(9) The workflow stays compatible with future beamracing creativity (e.g. front buffer delivery, future line-based HLSL systems like AddRaster() CRT emulator workflows, future combo of RunAhead+beamracing if still desirable/wanted, VRR, 60Hz, 120Hz, or any workflow)
(10) Do it in a staged way.
.......Add the API as an optional per-raster callback function
.......Emulator modules that have this callback are the beamraceable modules
.......We can then begin with one module at a time.
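The callback idea in the list above can be sketched roughly as follows. This is only an illustration in Python, not the RetroArch API: `BeamRacer`, `on_raster`, `present_slice` and `wait_for_real_raster` are all hypothetical names, and a real version would be a C callback with the same arguments as the existing full-frame delivery callback.

```python
class BeamRacer:
    """Hypothetical central beam-racing module (not an existing RetroArch API)."""

    def __init__(self, total_scanlines, slices_per_frame, present_slice,
                 wait_for_real_raster=None):
        self.total_scanlines = total_scanlines
        # Ceiling division, so the last slice absorbs any remainder.
        self.slice_height = -(-total_scanlines // slices_per_frame)
        self.present_slice = present_slice
        # Placeholder for the busywait/nanosleep against the real raster.
        self.wait_for_real_raster = wait_for_real_raster or (lambda scanline: None)
        self.enabled = True
        self.slice_start = 0

    def on_raster(self, framebuffer, scanline):
        """Called by the emulator module after each finished scanline (0-based)."""
        done = scanline + 1
        end_of_frame = done >= self.total_scanlines
        slice_full = done - self.slice_start >= self.slice_height
        if self.enabled and (end_of_frame or slice_full):
            # Stay just behind the real raster, then deliver the new slice.
            self.wait_for_real_raster(self.slice_start)
            self.present_slice(framebuffer, self.slice_start, done)
            self.slice_start = done
        if end_of_frame:
            self.slice_start = 0  # next frame starts at the top


# The emulator module's only job: call on_raster() after every scanline.
presented = []
racer = BeamRacer(240, 4, lambda fb, first, last: presented.append((first, last)))
for line in range(240):
    racer.on_raster(None, line)
print(presented)  # [(0, 60), (60, 120), (120, 180), (180, 240)]
```

When `enabled` is false the callback does nothing, so a module could keep calling it unconditionally and let the frontend fall back to full-frame delivery.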
I imagine that we could begin with one module, like the one they suggested in their forums.
There is nothing stopping the core module from dynamically entering/exiting beamracing by ignoring raster callbacks and only acting upon the final full frame. One example is when the screen is rotated on a tablet computer, so that the real scan direction is mismatched from the emulator scan direction; beamracing can easily be disabled automatically when not in native screen orientation. Another is occasionally deciding to do a larger delay during scanline #1 to realign refresh cycles after misalignments, e.g. beam-racing 60 out of 75 refresh cycles (if, for whatever reason, the user kept the display at 75Hz). That is stuttery, but it is still lower-lag and prevents a beamracing failure at odd-Hz; it looks just like VSYNC ON stuttering at odd-Hz. For situations where beamracing is fully unwanted (e.g. offscreen RunAhead frames, wrong screen rotation, a windowed-mode switch that doesn't support beamracing), the callbacks would continue and simply return(). At other times, the callbacks can optionally be used instead to monitor emulator raster-rendering performance, to make inputdelay predictions a little more accurate, etc.
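As a rough sketch of those fallback decisions: the function names are hypothetical, and the accumulator below is just one possible way to pick which refresh cycles get a beam-raced frame when the display Hz is higher than the emulated Hz, not an existing implementation.

```python
def should_beamrace(native_orientation, offscreen_frame, mode_supports_beamracing):
    """Decide per frame whether raster callbacks are acted upon or ignored."""
    return native_orientation and not offscreen_frame and mode_supports_beamracing


def raced_refresh_cycles(emu_hz, display_hz, cycles):
    """Mark which display refresh cycles get a new beam-raced emulator frame.

    ASSUMPTION: a simple accumulator spreads emu_hz frames evenly over
    display_hz refresh cycles; the unmarked cycles repeat the previous
    frame (the stutter mentioned above).
    """
    marks = []
    acc = 0
    for _ in range(cycles):
        acc += emu_hz
        if acc >= display_hz:
            acc -= display_hz
            marks.append(True)   # race a new emulated frame on this cycle
        else:
            marks.append(False)  # wait out this cycle to realign
    return marks


one_second = raced_refresh_cycles(emu_hz=60, display_hz=75, cycles=75)
print(sum(one_second))  # 60 raced cycles out of 75
print(should_beamrace(native_orientation=False, offscreen_frame=False,
                      mode_supports_beamracing=True))  # False
```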
I have offered some technical help to RetroArch programmers for this. Two developers behind RetroArch have expressed tentative interest, and I have stressed that it is entirely up to them whether they ever want to implement beamracing.
I am thinking I'd like to put up a 3-figure BountySource prize for this, for any multi-emulator system that includes a large 8-bit footprint (MAME/NES/etc). We're latency-crazy people on Blur Busters anyway. Personal funds. Opinions? Gotchas/Problems? Etc? Interested?