Progress Report June 2023
Wooooooahhh we’re halfway thereeee… 🎶
No matter how the year has gone so far, the midpoint always feels weird. Wasn’t it like January a couple weeks ago? Compared to May, no genre-defining blockbusters graced us with their presence, but Pikmin fans had a demo to sink their teeth into, the development team behind Danganronpa delivered a new entry, Master Detective Archives: RAIN CODE, and there were a few other titles that we’re sure folk who are more cultured in video games than us are aware of. Both titles mentioned above also had day 1 compatibility, which bodes nicely for the full release of Pikmin 4 in particular!
Enough chit, let’s chat.
For those who are sick of hearing about Tears of the Kingdom, we apologize. A couple of fixes spilled over into June, so we can all talk about it for another couple of paragraphs.
Intel GPU owners were not particularly thrilled that they were actually bottom class citizens behind Mac users when it came to booting this game. The Intel Vulkan driver has a rather hilarious bug in that, if a render barrier is placed after a return, it will simply hang. Removing these barriers on Intel drivers if the flow is potentially divergent allows TotK to finally head in-game.
Another check we now have to make on the long list of vendor-specific bugs…
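To make the idea a bit more concrete, here is a toy sketch of that kind of workaround. It is not the emulator's actual code: the flat list of `Op` objects, the `strip_barriers_after_return` helper and the `is_intel` flag are all made up for illustration, and a real shader translator works on a proper control flow graph rather than a list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    name: str  # toy opcode, e.g. "add", "return", "barrier" (hypothetical IR)


def strip_barriers_after_return(ops, is_intel):
    """Drop every barrier that follows a return, but only on Intel."""
    if not is_intel:
        return list(ops)

    kept = []
    seen_return = False
    for op in ops:
        if op.name == "return":
            seen_return = True  # flow is potentially divergent from here on
        if op.name == "barrier" and seen_return:
            continue  # skip the barrier that would hang the driver
        kept.append(op)
    return kept


shader = [Op("barrier"), Op("return"), Op("add"), Op("barrier")]
intel_names = [op.name for op in strip_barriers_after_return(shader, is_intel=True)]
other_names = [op.name for op in strip_barriers_after_return(shader, is_intel=False)]
```

Barriers before the return are left alone, and other vendors get the shader untouched, which is the whole point of keeping the check vendor-specific.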
And lastly for Zelda, many users across the board were noticing a random issue with gloom deposits, where the ground texture would get itself into a completely broken state.
The compressed 3D texture for gloom was undergoing an incorrect layout conversion when decompressed, causing the rather spindly mess users were seeing. On Vulkan, this issue was seemingly random depending on whether it was allowed to form copy dependencies or not, while in OpenGL it was consistently broken and hence very simple to reproduce. Fixing this layout conversion when the 3D depth value equals 1 resolves the issues in both backends. OpenGL once again coming in clutch, regardless of the haters.
Interestingly enough, this change also fixed a long-standing issue in Spiritfarer where the character and NPC sprites were entirely missing; bucking the usual trend of indie titles fixing AAA issues, not the other way around.
We mentioned last month that gdkchan had been working on a fairly huge refactor to the GPU emulator in order to remove as much backend-specific code as possible. The end goal is further unification of all the backends we currently target (OpenGL, Vulkan and Metal via MoltenVK), reducing our maintenance commitments and technical debt. Two changes to this end in June were the implementation of shader storage buffer operations, and of local/shared load/store and atomic shared operations, using the new global load/store methods.
With the rumours, trailer leak and then trailer of a Persona 3 remake, some folk finally noticed that Persona 4 Golden looked a little strange on Ryujinx when the SMAA filter was applied. The issue was a byproduct of some BGRA/RGBA swaps that occur when the filter is applied to the original image. There was an attempt to correct this for non-storage image operations, but it was failing in this case.
The fix here simply binds the original BGRA image as the storage image, which almost all vendors support, rather than creating an RGBA copy.
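For anyone wondering why a stray swap looks so wrong, here is a minimal sketch of what a BGRA/RGBA swap does to raw pixel bytes. The `swap_red_blue` helper is purely illustrative and has nothing to do with the emulator's code; it only shows that one unmatched swap exchanges red and blue, while a matched pair cancels out.

```python
def swap_red_blue(pixels: bytes) -> bytes:
    """Exchange channels 0 and 2 of every 4-byte pixel (BGRA <-> RGBA)."""
    if len(pixels) % 4 != 0:
        raise ValueError("expected 4 bytes per pixel")
    out = bytearray(pixels)
    out[0::4] = pixels[2::4]  # first channel takes the old third channel
    out[2::4] = pixels[0::4]  # third channel takes the old first channel
    return bytes(out)


# One opaque pixel with the first channel at full intensity.
pixel = bytes([255, 0, 0, 255])
swapped_once = swap_red_blue(pixel)          # channels exchanged
swapped_twice = swap_red_blue(swapped_once)  # back to the original
```

Working directly on the original BGRA image, as the fix does, avoids needing the copy and its swaps in the first place.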
Let’s move on to some cool stuff related to macOS. The refactoring mentioned above isn’t just for codebase aesthetics (we’ve got plenty of that coming later), but also to make annoying stuff… less annoying.
Using all those new shader operations, transform feedback emulation has finally been upstreamed and is used on any devices without native support, not just Apple silicon. This implementation is substantially cleaner than that which was used to ship macOS1 and comes in at just under 400 lines. Some very notable games will now run and render on our master builds for Mac.
Some GPU vendors do not support float64 shader operations, including both Apple and Intel. Use of these is relatively limited across the Switch library but there are a few instances where they’re used. For Intel, adding a mechanism to convert from float64 (double) operations to supported operations prevents a device loss in Tears of the Kingdom, while for Mac its most notable fix is to Rune Factory 4.
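As a rough illustration of the trade-off, the sketch below narrows doubles to 32-bit floats before operating on them. This assumes the fallback boils down to float32 arithmetic, which we have not spelled out above, so treat it as one possible reading; `narrow_to_float32` and `emulated_double_add` are hypothetical helper names.

```python
import struct


def narrow_to_float32(value: float) -> float:
    """Round a Python float (a double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def emulated_double_add(a: float, b: float) -> float:
    """Add two doubles using only float32 precision (assumed fallback)."""
    return narrow_to_float32(narrow_to_float32(a) + narrow_to_float32(b))


exact = 0.1 + 0.2
approx = emulated_double_add(0.1, 0.2)
error = abs(exact - approx)  # small, but not zero
```

Values that fit exactly in 32 bits, such as 0.5, survive untouched, while something like 0.1 picks up a tiny error. For the handful of games that use float64 at all, a small loss of precision beats a device loss.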
A final, much simplified upstreamed component of macOS1 is a workaround to avoid a stack overflow in SPIRV-Cross (the library used to convert SPIR-V shaders to Metal Shading Language). Due to the very deep nesting and recursion that SPIRV-Cross seems to use, the default stack size is simply not large enough for some games, notably Splatoon 3 and Mortal Kombat 11. In macOS1 this was countered by using a custom thread pool rather than the threading resources that .NET provides. This wasn’t ideal and was up there as one of the messier workarounds we had to settle for to get something out the door.
While gdk initially opened the pull request anyway, a user noted that the default stack size can be set as an environment variable within the application’s plist. This meant that we didn’t need to tell users to manually increase the stack size, and it meant we could avoid any workarounds! What could have been a 200-line, vendor-specific section of code was reduced to 7 lines of environment variable adjustment.
Moving on to some quick-fire changes:
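For the curious, the general shape of such a plist entry can be sketched with Python's standard plistlib. `LSEnvironment` is the standard Info.plist key for environment variables applied at launch; the variable name `EXAMPLE_DEFAULT_STACK_SIZE` and its value are placeholders, since the exact name and size are not given here.

```python
import plistlib

# Placeholder name and value: stand-ins for whatever the runtime expects.
STACK_VAR = "EXAMPLE_DEFAULT_STACK_SIZE"
STACK_VALUE = "0x200000"


def add_launch_env(info: dict, name: str, value: str) -> dict:
    """Return a copy of an Info.plist dict with one launch variable set."""
    updated = dict(info)
    env = dict(updated.get("LSEnvironment", {}))
    env[name] = value
    updated["LSEnvironment"] = env
    return updated


info = {"CFBundleName": "Ryujinx"}
updated = add_launch_env(info, STACK_VAR, STACK_VALUE)
plist_xml = plistlib.dumps(updated).decode("utf-8")
```

Printing `plist_xml` shows a small `LSEnvironment` dict holding one key and one string, which is roughly the handful of lines that replaced the custom thread pool.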
- Address space workarounds were implemented for userlands with fewer than 39 bits of available address space. Improves ARM64 support where the kernel uses bits 63 to 39.
- Support for Windows ARM64 was fixed in the CPU recompiler. A step closer to allowing Ryujinx to boot on Windows for ARM systems.
- A fast path for AES crypto instructions was added for the ARM64 JIT. Reduces the large hang on saving in games that use these crypto instructions such as Animal Crossing: New Horizons.
- The HID GetBusHandle service was stubbed, allowing newer versions of NES online and Starlink: Battle for Atlas to boot again.
- The auto-updater will now ignore files that the user has introduced to the directory. Allows ReShade and other tools which place hooks in the executable folder to remain after an update.
While all of that certainly is a lot, the majority of June was monopolized by one thing… DOTNET… FORMAT.
To give a little background, Ryujinx is of course open source and hence we get external contributions from lots of different people in a few areas of the project. While at surface level this sounds great, it isn’t that simple. Every time someone external offers a contribution, our core development staff can either continue working on whatever they’re doing, or spend hours reviewing these external changes to make sure that they do what they say they do, don’t break anything and also conform to the standards we expect of our codebase. No one working on Ryujinx currently does so full time, and asking them to devote the spare time they do have to what boils down to marking homework isn’t always the most appealing prospect.
This is 2023 though. Surely we can automate some of the more trivial stuff into a bot or something and let the reviewers actually focus on functionality, rather than having to leave hundreds of comments like “remove extra spacing” and “why did you add an extra line here”. The answer is yes, but there were a few things we needed to do first.
.NET has a nifty little built-in tool called dotnet format which, as you would expect, formats code to conform to the standard C# code style. The first time this was run it created a rather monstrous difference of over 30,000 lines of code that needed updating, changing so many parts of the emulator that it would basically be impossible to review as a homogeneous lump. The decision was made to format the codebase per-project and this 30K monster was split into around 50 different pull requests. Followers of our changelogs may have gotten rather bored of the repetitive line “Code cleanup. No expected changes in games”, but this is the explanation.
The final goal of all of this is to add a bot workflow that automatically reviews code style, and ultimately makes the review process easier for everyone involved. While this isn’t particularly flashy for a progress report, it took up a very large chunk of the month and we think it’s important to discuss. It’s something every open source project of a certain size has to think about: how to make the time balance of external contribution versus core development tip in everyone’s favour.
Closing words
Well after a rather word-heavy end, we won’t continue to prattle for very long.
As is standard, we’d like to thank everyone who supports us on Patreon, tests and contributes code via GitHub, and even those of you who give up some of your own time troubleshooting with others on our Discord. We hope we made it clear above, but time really is our most valuable resource. All of you are giving us more of it in various ways, and for that we’re forever grateful!
Until next time. Live on a prayer.