Progress Report October 2022
Just like the nine months before it, October has slid gently into the rearview mirror and, by our estimates, shouldn’t be closer than it appears for at least another year.
NieR, Persona, Bayonetta, and a new contender for the coveted “why did this get a sequel?” award: Mario + Rabbids! While that last one is a bit of a horror story as far as emulating the damn thing goes, we’re thrilled with just how well most new titles ran this month with minimal or no fixes from the development team. Less time fixing jank means more time in the lab cooking up new features and improvements for your emulating pleasure.
Let’s take a pit stop for a moment and review our Patreon goals and incentives. As your monthly reminder, these features are not locked behind a paywall; all features mentioned below will be implemented eventually regardless. However, if a goal is reached, then priority is shifted to focus on implementing that feature straightaway.
Patreon Goals:
ARB Shaders - Goal reached in April 2021.
We’d like to provide an update on this goal before anything else this month. It has been about a year and a half since this goal was reached, and we believe that some transparency is deserved for those who donate to us.
To cut to the chase, work on ARB (Assembly) shaders has been put on hold indefinitely, until the core development team reassesses its value to the project. Back when these goals were being dreamt up in 2020, the landscape for Ryujinx was completely different; OpenGL was the sole graphical backend and shader stutter was the single largest issue that most users were facing. Over time, this changed. Shader caching, multithreaded shader compile and finally a full Vulkan backend have all but eliminated the need for assembly shaders as the path forward. Due to their being NVIDIA-specific and fundamentally very limited in their capabilities, the decision was therefore made to shift attention onto areas, like Vulkan, that would have a larger impact on everyone.
We would like to reiterate that this does not mean the implementation is being killed or otherwise removed from our longer-term roadmap, but the limited development resources we have are likely better spent elsewhere at the current moment. All we can do for now is apologize to our Patreon backers and promise that we have some awesome stuff to show you all before the year is out; stay tuned!
$2000/month - Texture Packs / Replacement Capabilities - getting close!
This will facilitate the replacement of in-game graphics textures which enables custom texture enhancements, alternate controller button graphics, and more.
ETA once the goal is sustained: ~3-4 weeks
$2500/month - One full-time developer - Not yet met.
This amount of monthly donations will allow the project's founder, gdkchan, to work full-time on developing Ryujinx. All our contributors currently only work on the project in their spare time!
$5000/month - Additional full-time developer - Not yet met.
This amount of monthly donations will allow an additional Ryujinx team developer to work full-time on the project.
3….2….1…. GO!
GPU:
To ease us into the GPU section let’s start with a classic from most comedy shows: a one-liner. The avid gaze of Xenoblade fans highlighted a discrepancy in Xenoblade Chronicles 2 between OpenGL and Vulkan in cutscene playback. We’re told that Rex doesn’t usually ask the barber for a back-and-sides fade but it rather suits him, does it not?
Either way, this was an inaccuracy and it turned out that, while we were creating a custom border color, there was no line of code to actually pass it to the driver. As such, all we can say is: our bad.
With the (then) upcoming release of a new Mario + Rabbids game there was a push to get the first Mario + Rabbids game actually booting in Vulkan on Nvidia. In a rare moment, AMD was not affected by this particular bug due to already being forced to use a different code path for blits. It just goes to show that in some cases, it helps to be so consistently buggy that developers write you a special path! For Nvidia, there was a DeviceLoss error being caused by some driver arguments being inverted and thus attempting to access data that was out of bounds. Flipping these parameters allows the game to finally boot on Nvidia Vulkan.
A recent regression from a refactor of the shader decoder had caused large graphical bugs to begin presenting themselves in both Sea of Solitude and Shadowrun Returns. Bindless elimination was failing to trigger due to the wrong instruction register being consistently used as the check flag. Fixing which register was being read resolves the regressions in both titles.
After some changes covered last month concerning the conversion of quads (which Vulkan does not support natively) to triangles, Luigi’s Mansion 3 was having trouble keeping its minimap in one piece.
The map is meant to have six vertices with the final two being ignored, thus forming a quad. Unfortunately this was rounded up, instead of ignored, and ultimately formed two quads instead of one. Fixing this calculation quickly resolved the minimap, along with some other miscellaneous issues found along the way.
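As a rough illustration of the rounding issue, the sketch below builds triangle-list indices for a quad list. The function name and the way each quad is split into two triangles are assumptions made for this example, not Ryujinx’s actual conversion code. What matters is that the quad count uses floor division, so leftover vertices are dropped rather than rounded up into an extra quad.

```python
def quads_to_triangle_indices(vertex_count):
    """Build triangle-list indices for a quad list of `vertex_count` vertices."""
    # Floor division: trailing vertices that do not complete a quad are ignored.
    quad_count = vertex_count // 4
    indices = []
    for quad in range(quad_count):
        base = quad * 4
        # Split each quad into two triangles that share the 0-2 diagonal.
        indices += [base, base + 1, base + 2, base, base + 2, base + 3]
    return indices
```

With six vertices, as in the minimap case, this yields the indices for a single quad and the last two vertices are never referenced.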
Mario Kart 8 Deluxe was behaving naughtily when using Vulkan and creating an unbounded number of graphics pipelines when blend constants were being used. The game seems to blend between colors at various stages of each track, and this was resulting in an inflated number of pipelines being generated. Limiting this behavior by using a reduced number of dynamic states reduces the cost of pipeline creation and can also reduce RAM usage by a small but noticeable amount.
Let’s talk tessellation once again. If you’re getting déjà vu then don’t worry; we also covered it last month! Shortly after the release of Bayonetta 3 it was almost immediately noticed that, shock-horror, AMD GPUs were crashing the title. The fix involved taking a simple look at how OpenGL and its shader language GLSL were handling things. Tessellation control shaders are always indexed in GLSL in the same way using ‘gl_InvocationID’, which was not being done on the Vulkan SPIR-V control shader. What was more irritating is that this oversight wasn’t causing validation errors, and only AMD Windows drivers seemed to care. Not Nvidia, not Intel, not even AMD on Linux using either their proprietary or open-source RADV driver. Regardless, the outputs are now indexed with the same ‘gl_InvocationID’ method and we haven’t seen any more complaints from the driver.
Have a break, here’s a quick-fire round.
- Allocating unmanaged strings for shaders is now actively avoided and has a cleaner implementation.
- Buffer texture storage will now update when its handle is reused in Vulkan. Possible fix for a bug in Unreal Engine 4 games where models would occasionally swap back to an old animation frame.
- Buffer ranges with a size of 0 are now handled correctly. Fixes an index out of range crash in Fire Emblem Warriors: Three Hopes.
- TextureStorage is now disposed only when views hit 0 on Vulkan. Attempts to address the memory leak in Super Mario Odyssey when using resolution scaling with Vulkan. May also reduce memory usage in other titles or on GPUs that alias textures often.
- Various issues with the Vulkan CacheByRange function were resolved. Fixed a regression that caused some broken or missing geometry in eBaseball Powerful Pro Yakyuu 2022, and possibly some other games that had issues with quads under Vulkan.
- Textures and their samplers are now both checked to see if they’ve been disposed before being updated on the texture binding manager. This resolves a crash in Crash Team Racing when using Vulkan. This does unfortunately kill the irony…
- Code generation for BRX shader instructions was improved which helps compilers generate more efficient and succinct outputs. Could help some drivers compile shaders slightly more efficiently, although no difference was seen for NVIDIA and AMD.
- The indirect buffer barrier was fixed in Vulkan which addresses some DeviceLoss crashes on newer Nvidia drivers in Monster Hunter Rise (not Sunbreak!).
- VK_EXT_debug_report usage was replaced with VK_EXT_debug_utils to conform to more modern debugging and error standards. May help improve Vulkan error reporting!
Rounding out the GPU section let’s speak in a language everyone understands. Performance!
NieR: Automata got a small improvement at release by passing SpanOrArray for Texture SetData to avoid a mass of copies that would bog frame rates, but otherwise still has some fairly large performance deficits to overcome in other areas.
The elephant in the room was actually presented by Mario + Rabbids Kingdom Battle; the first of the series, for those who don’t follow the franchise. By using a bitmap to track buffer modified flags instead of a MultiRegionHandle, games that bind HUGE buffers, in the region of 10MB and beyond, see enormous improvement as on paper the lookup becomes 64x faster. More games than expected actually exhibited this behavior so it ended up impacting a whole host of old and newer titles, including both Mario + Rabbids: Sparks of Hope and Bayonetta 3!
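To see where a figure like 64x can come from, here is a minimal sketch of bitmap-based modified tracking, assuming one bit per fixed-size page packed into 64-bit words. The class name, page size and methods are hypothetical and are not taken from Ryujinx. The idea is that a stretch of 64 clean pages can be skipped with a single word comparison instead of 64 individual checks.

```python
class DirtyBitmap:
    """Tracks modified pages of a buffer with one bit per page (hypothetical sketch)."""

    PAGE_SIZE = 4096       # assumed tracking granularity in bytes
    BITS_PER_WORD = 64

    def __init__(self, buffer_size):
        # Ceiling divisions: enough pages for the buffer, enough words for the pages.
        self.page_count = -(-buffer_size // self.PAGE_SIZE)
        self.words = [0] * -(-self.page_count // self.BITS_PER_WORD)

    def mark_dirty(self, offset, size):
        """Flag every page touched by the byte range [offset, offset + size)."""
        if size <= 0:
            return
        first = offset // self.PAGE_SIZE
        last = (offset + size - 1) // self.PAGE_SIZE
        for page in range(first, last + 1):
            self.words[page // self.BITS_PER_WORD] |= 1 << (page % self.BITS_PER_WORD)

    def take_dirty_pages(self):
        """Return the indices of all dirty pages in order and clear their flags."""
        dirty = []
        for index, word in enumerate(self.words):
            if word == 0:
                continue  # one comparison skips 64 clean pages
            while word:
                lowest = word & -word  # isolate the lowest set bit
                dirty.append(index * self.BITS_PER_WORD + lowest.bit_length() - 1)
                word ^= lowest
            self.words[index] = 0
        return dirty
```

Under these assumptions a 10MB buffer needs only 40 words, so a scan of a mostly clean buffer touches very little memory.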
This table quickly shortlists a few titles that we instantly saw large gains in. However, as showcased by Zombie Army 4, there are probably a swathe of niche titles that are impacted that just haven’t been found yet.
CPU:
October’s CPU section is certainly more concise than last month's blistering streak, but some of the changes here are just as interesting.
Owners of either Intel’s Ice Lake (or beyond) or AMD’s new Zen 4 CPUs will be interested in how any of the newer instruction sets these architectures support fare if and when Ryujinx can take advantage of them. The largest and most well-known of these is of course AVX-512, but there are a few other interesting instructions that bleeding edge architectures can exploit, including ‘Galois Field New Instructions’; GFNI for short. While originally intended for cryptography, they can heavily accelerate general-purpose bit-shuffling operations, which are of great use in emulation. Initial support for these instructions has thus been implemented into the recompiler and on paper is generating much improved assembly.
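For a feel of why a cryptography instruction is handy for bit shuffling, below is a small software model of what GF2P8AFFINEQB does to a single byte, following the bit-level definition in Intel’s documentation. The Python function name and the constant names are our own, and this is only an illustration rather than the code the recompiler emits. Each result bit is the parity of the input byte masked by one row of an 8x8 bit matrix, so choosing the matrix lets a single instruction perform an arbitrary bit permutation on every byte of a vector.

```python
def gf2p8affineqb_byte(matrix, x, imm8=0):
    """Software model of GF2P8AFFINEQB applied to one byte.

    `matrix` is a 64-bit integer holding eight row bytes. Result bit i is the
    parity of (row byte 7 - i AND x), XORed with bit i of imm8.
    """
    result = 0
    for i in range(8):
        row = (matrix >> (8 * (7 - i))) & 0xFF
        parity = bin(row & x).count("1") & 1
        result |= (parity ^ ((imm8 >> i) & 1)) << i
    return result


IDENTITY = 0x0102040810204080     # leaves each byte unchanged
BIT_REVERSE = 0x8040201008040201  # reverses the bit order within each byte
```

For example, `gf2p8affineqb_byte(BIT_REVERSE, 0x0F)` gives `0xF0`, a per-byte bit reversal that would otherwise take several shift and mask steps.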
We only have a couple of 32-bit implementations this month in the form of VCVTT and VCVTB. With these in place, Radiant Silvergun can finally head in-game and to no-one's surprise, it renders and plays great! Great, if you love retro titles that is.
Moving onto something a little more modern both in the instruction set and the games it affects, fast paths were added for… deep breath… A32: Vcvta_RM, Vrinta_RM and Vrinta_V and for A64: Fcvtas_Gp/S/V, Fcvtau_Gp/S/V and Frinta_S/V. Jargon aside, Super Smash Bros. Ultimate and Mario Strikers: Battle League both make extensive use of the 64-bit instructions included in these optimizations, with Mario Party Superstars possibly being impacted too. While there weren’t any obvious changes in our testing, the new fast paths could remove a bottleneck for lower-end CPUs or particularly tough emulation spots.
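What these instructions have in common is the “a” rounding mode: round to nearest, with ties going away from zero. The sketch below models that behavior for FRINTA and for a signed FCVTAS conversion. The Python function names are ours and the code only illustrates the semantics; it says nothing about how the fast paths themselves are implemented.

```python
import math


def frinta(x):
    """Round to an integral value, ties away from zero (models A64 FRINTA)."""
    if math.isnan(x) or math.isinf(x):
        return x
    truncated = math.trunc(x)
    # x - truncated is exact for doubles, so the tie check is reliable.
    if abs(x - truncated) >= 0.5:
        truncated += 1 if x > 0 else -1
    # copysign keeps the sign of negative inputs that round to zero.
    return math.copysign(float(truncated), x)


def fcvtas(x, bits=32):
    """Convert to a signed integer, ties away from zero, saturating (models A64 FCVTAS)."""
    if math.isnan(x):
        return 0  # NaN converts to zero
    lowest, highest = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if math.isinf(x):
        return highest if x > 0 else lowest
    return max(lowest, min(highest, int(frinta(x))))
```

Note the difference from Python’s built-in `round`, which sends ties to the nearest even value: `round(2.5)` is 2, while `frinta(2.5)` is 3.0.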
Mopping up some smaller changes before moving on: the rejit queue will no longer clear under certain edge conditions, and IDisposable (the interface used to tell .NET that something can be disposed) was added to the Unicorn CPU test module.
Kernel/Services:
The battle against accurately emulating Horizon OS and its seemingly endless services and oddities continues this month, with a few notable additions.
The aptly-named ‘fatal’ service finally saw the light of day after nearly five years; a staggering amount of time considering that, internally, it’s service number one. While not as important for an emulator (we can already glean all the crash information needed via loggers/debuggers), it’s essential for future implementations, such as full error applets and guest error handling.
A memory corruption in the BCAT and FS read methods that was causing a crash in SWORD ART ONLINE: Alicization Lycoris was fixed, and the game now progresses beyond the title screen and into gameplay. No other bugs with this title were immediately obvious (other than being a disgustingly low resolution), but as usual we’re sure you’ll all let us know!
Some other more general changes include:
- The state of NfcManager is now handled correctly and resolves the crashes Hyrule Warriors: Definitive Edition would experience when scanning an amiibo.
- SurfaceFlinger can now enqueue even after the process has exited. This prevents an exception crash that could occur on stopping emulation.
- ‘GetSessionCacheMode’ was implemented in the SSL services.
- Bound sockets are now checked before calling RecvFrom(). This fixes a crash that consistently occurred in Overpass if Guest Internet was enabled.
- ‘StopImageProcessorAsync’ was stubbed which prevents a crash in any title, such as Game Builder Garage, that may attempt to use the JoyCon IR camera.
- ‘SetRecordVolumeMuted’ was stubbed which avoids a crash in Bayonetta 3 in cutscenes.
Filesystem services are the central pillar which allow titles to boot, save their data and gracefully interact with the Switch in general. ‘OpenDataStorageWithProgramIndex’ was a service we had up to this point been missing, and its partial implementation allows both Rollercoaster Tycoon 3 and MLB The Show 22 to boot in-game. The change doesn't currently support accessing any data outside the current program index, but there is no infrastructure there regardless; when we eventually find a game or homebrew that makes use of that functionality then the service can be fully explored.
We do not condone playing RCT3, let alone the Switch port. Go out, buy Rollercoaster Tycoon 2 on PC and then get the OpenRCT2 patch. Thank me later.
This title still has issues. It runs extremely slowly on OpenGL and will crash on any GPUs with a local size of 1024 or less (any Nvidia GPU beyond 1000 series). If you have a 1000 or 900 series GPU then knock yourself out of the park with Vulkan.
Moving onto random bugs, a favorite of any software developer, we’ll start small and then work our way up. The old kernel memory allocation method would first try to find an empty region at random and allocate there; if that failed, it would fall back to a linear allocation. The issue was that the variable used to store the random address was also being used as temporary storage within the allocation loop, and as such wasn’t zero when the random allocation failed. This meant that the method could actually return an address in active use as though it were valid, causing a crash. In practice, the random allocation fails so infrequently that this isn’t a huge concern. Nipping this one in the bud reduces the crashes caused, which were more frequent in 32-bit titles such as ‘DoDonPachi Resurrection’ due to their smaller address space.
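The shape of that bug can be sketched as follows. This is our own simplified reading of the description, with hypothetical names and addresses counted in whole pages; it is not the kernel code itself. The corrected version keeps each random candidate in a separate variable and only commits it on success, so the result is still zero when the linear fallback needs to run.

```python
import random


def find_free_region(is_free, region_start, region_end, size, attempts=8, rng=random):
    """Return the start of a free block of `size` pages, or 0 if there is none.

    `region_start` must be greater than 0 because 0 is used as the failure value.
    """
    if region_end - region_start < size:
        return 0
    address = 0
    for _ in range(attempts):
        # The buggy shape stored this value straight into `address`,
        # leaving it non-zero even when every random attempt failed.
        candidate = rng.randrange(region_start, region_end - size + 1)
        if is_free(candidate, size):
            address = candidate  # only committed on success
            break
    if address == 0:
        # Linear fallback: walk the region from the start.
        for candidate in range(region_start, region_end - size + 1):
            if is_free(candidate, size):
                address = candidate
                break
    return address
```

In the buggy shape, a failed random phase would leave a stale, occupied address behind, the fallback would be skipped, and the caller would receive memory that was already in use.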
Let’s move away from the small-fry and onto a fully cooked tuna with side salad and complimentary open-bar, shall we? This month finally saw the death of some of the longest standing random graphics bugs, boot crashes and gameplay crashes possibly on record for Ryujinx.
It turns out the NvMap ID allocation service isn’t written with any level of normality: it uses an ever-incrementing counter for the ID. If you’re wondering “isn’t that super dumb because it could eventually overflow?”, you're not alone; what possessed Nintendo to intentionally create this potential point of failure is anyone's guess. It actually gets even worse because this allocation service increments by a value of 4 each time, effectively taking a quarter of the time for the counter to run out of valid IDs. However, in practice, this would require someone to leave their game running for months/years to constantly increment this ID to an overflow point. If anyone has a couple of years to kill and a spare Switch then may we suggest an experiment?
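A counter of that kind can be sketched in a few lines. The 32-bit ID space, the starting value and the class name below are assumptions made for illustration; only the “increment by 4, never reuse” behavior comes from the description above.

```python
class IncrementingIdAllocator:
    """Hands out IDs from an ever-incrementing counter (hypothetical sketch)."""

    STEP = 4
    LIMIT = 1 << 32  # assumed 32-bit ID space

    def __init__(self):
        self._next = self.STEP  # assumed first ID; 0 is treated as invalid here

    def allocate(self):
        if self._next >= self.LIMIT:
            raise OverflowError("ID space exhausted")
        new_id = self._next
        self._next += self.STEP
        return new_id

    def free(self, freed_id):
        # Freed IDs are never handed out again; the counter only moves forward.
        pass

    def remaining(self):
        """Number of IDs that can still be allocated."""
        return (self.LIMIT - self._next) // self.STEP
```

Under these assumptions a fresh allocator can hand out 2^30 - 1 IDs, a little over a billion, before running dry, which is why nobody is likely to hit the limit in a normal play session.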
Either way, as an emulator we have to match hardware behavior even if we think it’s stupid… Luckily there are benefits! Let’s talk about some of those bugs I mentioned earlier:
- Animal Crossing: New Horizons no longer crashes randomly on boot without a save file!
- Various random graphical glitches in Animal Crossing: New Horizons were resolved. There truly has never been a better time to start a new island.
- The Legend of Zelda: Breath of the Wild will no longer randomly crash. Not much more to say about this one except for how annoying it was. Yours truly rode around on a horse for over an hour to test this one was fully gone!
- Random crashes when entering/exiting Pokémon Centers in Pokémon Sword/Shield are also tentatively fixed. This one is still very hard to test but we couldn’t replicate it after around 15 minutes of walking back and forth through the door, and we’ve received no further reports. Sanity only goes so far.
MISC/GUI:
As seems customary, we’ll finish off some of the changes happening in the outer orbitals of Ryujinx. Not everything is about that try-hard low-level nerd stuff like GPUs and CPUs!
We communicated last month that the Avalonia test builds finally had their auto-updaters fixed, but this was part of some more widespread efforts to turn a few of the pop-ups and windows into overlay dialogs instead of dedicated windows.
The same treatment was given to the controller applet dialog to reduce issues on Windows and various Linux distros when displaying transparency.
On the topic of the controller dialog: for many years, users have reported that it was appearing even when they had a seemingly valid control configuration. This happened because the GUI signals that tell Horizon a user had ‘disconnected’ controllers were passing incorrect data about the input state and, as such, the emulated Switch still believed there were other players connected. Passing the correct filtered data through our HLE input system should prevent this from happening and save us all a lot of stress finding phantom controllers.
Some smaller blitz changes:
- Avalonia had its Italian and Polish translations updated.
- A bug where using the keyboard to scroll through timezones in Avalonia removed all options but the first was resolved.
- The ‘About’ window now displays its localizable title in Avalonia.
- Command-line arguments will no longer break after an update.
- Some mapping leaks were fixed on Linux due to invalid flag combinations.
- Array allocations in .Parse methods are now avoided.
One of Ryujinx’s earliest contributors, mageven, helped solve one of the more annoying issues our cheat system had this month, in the form of conditional inputs. Translated into English, that means cheats that require you to press a button combination. By correcting a simple logic error these types of cheats should work now!
Finishing with a quality-of-life change, support for volume hotkeys has also been added. Like the resolution scaling hotkeys before them they are, by default, not bound to any key press. ‘How do I use them then?’ you may be wondering. Our Avalonia UI has configurable hotkeys via a menu, but did you know all of our hotkeys have always been configurable anyway? Minus the GUI part of course.
If you don’t mind getting your hands dirty and wish to map any of these “unbound” hotkeys without going through the hassle of downloading other builds, then you can simply:
- File -> Open Ryujinx Folder.
- Open the Config.json file in a text editor of your choice.
- Find the “hotkeys” section and add/edit to your heart's content!
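If you would rather script the steps above than edit by hand, something along these lines should work. It is a minimal sketch that assumes Config.json is plain JSON with a top-level “hotkeys” object; the action and key names used in the usage note are hypothetical placeholders, so check the entries already present in your own file for the real names.

```python
import json


def bind_hotkey(config_path, action, key):
    """Set one entry in the "hotkeys" section of a JSON config file."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    # Create the section if it is missing, then add or overwrite the binding.
    hotkeys = config.setdefault("hotkeys", {})
    hotkeys[action] = key
    with open(config_path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=2)
    return hotkeys
```

A call such as `bind_hotkey("Config.json", "volume_up", "F9")` would then bind the placeholder action `volume_up` to the placeholder key `F9`. Taking a backup copy of the file first is a sensible precaution.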
Closing Words:
That’s all from us for October, but we have a sneaking suspicion that November is going to be one you should keep your eye on…
If any of you wonderful people reading this have an interest in helping develop on the cutting-edge of Switch emulation, then we’re always open for new contributors in our Discord or on our GitHub page! C# is the language and we’re told it’s somewhat like if C, Java and Microsoft all had premarital relations. If any of this sounds familiar, fun or something that could look cool on your GitHub page then we’d love to have you!
Until next time!