Progress Report June 2022
Halfway through 2022 already, and time sure flies when we have some good games to play! Speaking of games, some real bangers were released this month and we’re happy to say that most of them work, either out of the box or with some small workarounds. Despite coming from Koei Tecmo (a name all emulator developers fear), Fire Emblem Warriors: Three Hopes, both the demo and the full game, ran flawlessly on Day 1, and only Mario Strikers stole our thunder!
We try not to advertise games being playable if a mod is required to bypass the intro, but the option is there for anyone interested, and the rest of the game is running great.
Here’s to more awesome releases in the second half, and without further ado, let’s jump into our Patreon goals!
Patreon Goals:
Vulkan GPU Backend - still in progress.
A public test build is delivered and is available here!
ARB Shaders - Goal reached in April 2021.
Work is ongoing alongside Vulkan; please wait a little while longer until we are able to deliver this update in a state we are happy with.
ARB shaders will further reduce stuttering on the first run by improving the shader compilation speed on NVIDIA GPUs using the OpenGL API.
$2000/month - Texture Packs / Replacement Capabilities - almost there!
This will facilitate the replacement of in-game graphics textures which enables custom texture enhancements, alternate controller button graphics, and more.
ETA once the goal is sustained: ~3-4 weeks
$2500/month - One full-time developer - Not yet met.
This amount of monthly donations will allow the project's founder, gdkchan, to work full-time on developing Ryujinx. All our contributors currently only work on the project in their spare time!
$5000/month - Additional full-time developer - Not yet met.
This amount of monthly donations will allow an additional Ryujinx team developer to work full-time on the project.
Let’s get started.
GPU:
Switch emulation is not something any group or project has a monopoly over, and the Skyline team are certainly putting in their share of work to prove it! Co-lead developer Bylaws pushed a couple of fixes to Ryujinx this month: the first resolved a possible race condition, and the second fixed a long-standing bug in none other than The Elder Scrolls V: Skyrim. While possibly the most resilient game of all time to platform ports, it had never once got past the menus since it first booted in 2020. It turned out that a counter type used by Skyrim actually expects a semaphore (a synchronization primitive used to coordinate multi-threaded tasks) to be released, not just a constant zero.
While still not in a perfect state, it’s always cool to see some of the more visually complex Switch games rendering and actually performing remarkably well!
Not content with just fixing one of the best-selling games of all time, this change also completely resolves the abysmally slow speeds in Ys IX: Monstrum Nox and a black screen graphical issue in Giana Sisters: Twisted Dreams. If anyone wanted a more concrete idea of just how slow we’re talking, one very patient user noted of Ys IX: ‘It took me 5 hours to get ingame’.
Continuing this month on his war against shovelware titles using endless GPU black magic, gdkchan put the finishing touches in place to fully fix both Perky Little Things and Genkai Tokki Moero Crystal H. Both games would simply present black screens due to a missed case in how multisampled and non-multisampled textures were handled. By allowing non-multisampled textures to inherit the same data as multisampled textures, Ryujinx will no longer read garbage data if this condition occurs.
Of course GTMCH (I am not typing the full name again) isn’t looking much better. Unless you’re a huge fan of the color… Sepia Skin? Luckily a small fix to *deep breath* instanced indexed inline draws, try saying that one quickly, actually allows the game to render something other than a single color.
A follow-up change related to indexed draws fixes the lackluster performance some games using them could have. Sometimes the emulator would draw multiple times for the same result, which added a huge amount of render-time before a frame was ready. By passing the index count to only a single instance, the excess draws were eliminated as shown below (far right column is the draw count).
A video crash in the newly released LOOPERS was resolved by restricting the output rectangle to sit within a defined surface. Previously, if there was a mismatch between the input and output surfaces, any data outside of the input range was technically ‘undefined’ and would randomly crash with an access violation.
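To make the idea of restricting a rectangle to a surface concrete, here is a minimal sketch. The `Rect` type and `clamp_to_surface` function are hypothetical names made up for illustration and are not taken from the Ryujinx codebase; the sketch only assumes that the surface starts at (0, 0) and has a known width and height.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    # Hypothetical rectangle type: top-left corner plus size, in pixels.
    x: int
    y: int
    width: int
    height: int


def clamp_to_surface(rect: Rect, surface_width: int, surface_height: int) -> Rect:
    """Return the part of `rect` that lies inside the surface."""
    left = max(0, min(rect.x, surface_width))
    top = max(0, min(rect.y, surface_height))
    # The right/bottom edges can never end up before the left/top edges,
    # so a rectangle entirely outside the surface collapses to zero size.
    right = max(left, min(rect.x + rect.width, surface_width))
    bottom = max(top, min(rect.y + rect.height, surface_height))
    return Rect(left, top, right - left, bottom - top)


# A rectangle that hangs off the left and bottom edges of a 1280x720 surface.
clamped = clamp_to_surface(Rect(-16, 700, 1300, 100), 1280, 720)
```

After clamping, every pixel the rectangle covers is backed by real surface memory, so nothing outside the defined range is ever touched.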
Continuing on our theme of fixing visual novels, an adjustment was made to the draw texture fallback used on AMD and Intel GPUs so that certain games would not render their viewport upside down. As only NVIDIA supports the NV_draw_texture extension, Ryujinx needs to ignore the current ClipControl settings as they aren’t valid on non-NVIDIA GPUs.
Alright, for everyone who doesn’t care about visual novels, this month also had some changes and fixes for you too! The first of these affected one of the most hotly anticipated (and honestly disappointing) games of this year: the new Mario Strikers. Glossing over the as-yet-unfixed crash caused by the intro cinematic, the game mostly ran and rendered pretty well at launch, apart from the animated 3D crowd. gdkchan jumped to the rescue and added support for some new forms of depth-stencil render targets (array and 3D texture), alongside fixing a bug that caused Ryujinx to ignore render target clears. With both changes in place the crowds now actually render and gameplay isn’t so lonely!
A Hat in Time was another game that used to crash before the title screen, but weirdly enough only once the player had progressed a certain way through the story. A texture ID may not be valid when a shader compile occurs for a number of reasons, and so by checking this case before accessing the descriptor, we can avoid any unmapped memory crashes related to this.
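The "check before you access" pattern described above can be sketched as follows. This is only an illustration: `try_get_descriptor` and the list-backed pool are hypothetical stand-ins, and the real texture pool lives in emulated GPU memory rather than a Python list.

```python
from typing import Optional, Sequence


def try_get_descriptor(pool: Sequence[dict], texture_id: int) -> Optional[dict]:
    """Return the descriptor for `texture_id`, or None if the ID is not valid.

    `pool` is a hypothetical stand-in for the texture descriptor pool.
    """
    if 0 <= texture_id < len(pool):
        return pool[texture_id]
    # An out-of-range ID is reported to the caller instead of being
    # used to read memory that may not be mapped.
    return None


pool = [{"format": "RGBA8"}, {"format": "D24S8"}]
valid = try_get_descriptor(pool, 1)
missing = try_get_descriptor(pool, 99)
```

The caller can then fall back to a default when it gets `None`, rather than crashing on an invalid read.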
Shader Cache 2.0 has largely been a net positive for the playability of a lot of notorious games, but that doesn’t mean it didn’t come with some drawbacks. Due to the new shader specialization support, the specialization state needed to be checked on every draw. This sounds costly, and while overall it isn’t as bad as it first appears, there was a performance hit associated with it in multiple games, including Super Mario Odyssey and Xenoblade Chronicles: Definitive Edition. riperiperi took it upon himself to have a crack at optimizing texture binding and shader specialization checks.
SMO and XCDE saw their performance return to pre-new cache levels, and while BoTW is performance limited by other factors, and hence was almost identical, there was a fairly large drop in FIFO, which is indicative of the emulated GPU being less loaded. Once the other bottlenecks the game experiences are shifted, this should see a nice payoff in the future. Feel free to check any games that felt slower after the new cache, hopefully they’re back to normal or at least close!
Nothing this good comes for free and it came at the cost of breaking resolution scaling for a couple of hours, before the new texture binding method was updated to take scaling into account, and certain titles like Super Zangyura required accounting for a complete pool change in the cache.
CPU/KERNEL:
Our CPU section this month starts with some CS 101. For anyone not familiar with data types, and more specifically how numbers are stored, there are a lot of ways to do it: integer, short, long, float etc. Previously Ryujinx used an unsigned (cannot be negative) short to store the operand uses count, which can hold values up to 65535. If you try to store a value higher than this, you get what’s called an “integer overflow”, where the value wraps around and starts again from 0! Limiting yourself like this is mainly just best practice, as data types that store higher values usually cost more in terms of memory. Unfortunately, some games actually do require this extra range, and so the type was switched to an unsigned integer, which caps out at a fairly ridiculous 4294967295, so there is little chance of ever needing higher!
Taiko Risshiden V DX now heads in-game and potentially others too (the Switch has so many gaammmmesss!).
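For anyone who wants to see the wrap-around from the paragraph above in action, here is a small sketch using Python's `ctypes` fixed-width integers. The variable names are illustrative only; the emulator itself is written in C#, where the equivalent types are `ushort` and `uint`.

```python
import ctypes

# A 16-bit unsigned counter sitting at its maximum value.
uses16 = ctypes.c_uint16(65535)
uses16.value += 1            # one more use: the counter wraps around
wrapped = uses16.value       # back to 0

# The same count stored in a 32-bit unsigned integer has room to spare.
uses32 = ctypes.c_uint32(65535)
uses32.value += 1
widened = uses32.value       # 65536

# Largest value a 32-bit unsigned integer can hold.
UINT32_MAX = 2**32 - 1       # 4294967295
```

A counter that silently resets to zero makes the compiler believe an operand is unused when it is actually used tens of thousands of times, which is why the wider type matters here.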
Stopping emulation is currently a bit like playing Russian Roulette but with your task manager. The problem is that there isn’t a single cause of the issue, and as games have got more complicated and are doing different things, a lot of recent releases will deadlock on close. Two such problems were isolated and resolved in the CPU/Kernel space this month, one which was caused by an invalid access event while a memory mapping was taking place, and the second caused by a bit of a paradox! When the ‘TerminateProcess’ function is called it will try and kill all running threads. The issue here is that TerminateProcess itself is being triggered on a thread of its own. Has anyone spotted the issue yet? This bug prevented the thread that called TerminateProcess from being unscheduled, and deadlocked itself in an infinite cycle.
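The self-termination paradox can be illustrated with ordinary host threads. The sketch below is an analogy only and is not the emulated kernel's scheduling logic: the `Process` class and its methods are hypothetical, and the one point it demonstrates is that the thread performing the termination must be excluded from the set of threads it waits on.

```python
import threading


class Process:
    """Hypothetical process that owns a set of threads."""

    def __init__(self):
        self.stop = threading.Event()
        self.threads = []

    def spawn(self, target):
        thread = threading.Thread(target=target, args=(self,))
        self.threads.append(thread)
        return thread

    def terminate(self):
        # Ask every thread to exit, then wait for them to finish.
        self.stop.set()
        me = threading.current_thread()
        for thread in self.threads:
            # Waiting for ourselves to exit could never complete,
            # so the calling thread is skipped and simply returns last.
            if thread is not me:
                thread.join()


def worker(proc):
    proc.stop.wait()        # run until asked to stop


def killer(proc):
    proc.terminate()        # termination is triggered from inside the process


proc = Process()
workers = [proc.spawn(worker) for _ in range(3)]
killer_thread = proc.spawn(killer)
for thread in workers:
    thread.start()
killer_thread.start()
killer_thread.join(timeout=5)
all_exited = not any(thread.is_alive() for thread in proc.threads)
```

Without the `thread is not me` check, the terminating thread would end up waiting on itself, which is the same shape of problem as the deadlock described above.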
gdkchan closes us out of this section with a regression fix for the large memory aliasing change from a couple of months ago, which was causing memory crashes on Windows. These were most often triggered when attempting to launch or switch to a game after running another. Finally, the entire kernel memory allocator was rewritten to be a bit cleaner and more readable for our contributors and maintainers. There are no expected bug fixes or performance improvements here, but as always there may be some $5 JRPG that now boots. Remember kids, write clean code!
SERVICES:
Diablo II: Resurrected is a weirdly popular title that has been in limbo since the recent networking overhaul. After those changes, the game would crash on boot as the newer methods handled all read and write calls on the same thread, causing a deadlock if these were needed at the same time. By allowing the service to increase its thread count to 2, the game once again will consistently boot.
However, this fix wasn’t all that was needed to prevent other problems. Once you allow a process to be multi-threaded, you then need to handle the cases where one thread is processing while the other is trying to respond. This exact issue was causing other games that made use of the socket services, like Pokémon Sword/Shield, to crash on boot. The solution here was to return to a single-threaded approach for these requests, but to add a flag to prevent the blocking issue that caused Diablo to deadlock. In the future, returning to a multi-threaded approach will be the more accurate way to handle this, but the changes required to make everything play nicely would be large and time-consuming. For the time being this solution meets every game’s needs!
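As a rough illustration of how a single thread can service both reads and writes without stalling, the sketch below uses a non-blocking socket. This is an assumption about the general technique, not a description of the actual flag added to the service; `try_recv` is a hypothetical helper.

```python
import select
import socket

# Two connected sockets standing in for the game and its peer.
a, b = socket.socketpair()
a.setblocking(False)        # reads on `a` return immediately instead of waiting


def try_recv(sock, size=64):
    """Return received bytes, or None if nothing is available yet."""
    try:
        return sock.recv(size)
    except BlockingIOError:
        return None


# Nothing has been sent yet. A blocking read here would stall the only
# thread available, and the write below would never get a chance to run.
first = try_recv(a)

# The same thread is free to service the write...
b.sendall(b"ping")

# ...and, once the data has arrived, the read succeeds.
select.select([a], [], [], 5)
second = try_recv(a)

a.close()
b.close()
```

The trade-off is that the caller has to retry or poll when a read reports "nothing yet", which is simpler than making the whole service safe for two threads at once.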
‘TimeZoneRule’ in the system time services got some love this month, as the way it was used around the codebase was far from optimal and forced a copy to be made everywhere it appeared. Making it ‘blittable’ (giving it a common representation that requires no special handling between managed and unmanaged code) reduces JIT overhead in a few cases and gives a potential boost to any areas where this may have been a performance bottleneck. This was followed up with a minor bug fix and a fix for how time zones were displayed in the UI.
MISC:
Everyone’s favorite section of quickfire changes!
- All XAML files of the Avalonia project were formatted and elements should now be consistent and aligned.
- Simplified Chinese translations were added to the Avalonia project.
- Tying into last month, the build scripts for the project were finally switched to target exclusively Windows 10.
VULKAN PROGRESS:
As stated in previous reports, the work on Vulkan itself is mostly complete, and if you are a proud owner of an NVIDIA GPU, then it’s a damn fine experience! However, as outlined from the outset, one of the major goals of implementing a Vulkan backend is to make sure it plays (somewhat) nicely with both AMD and Intel’s graphics cards and drivers. This is not a small task. Certain bugs are occasionally limited right down to a single generation of graphics card, and when all the developers can do is guess because they don’t own the cards themselves, progress on this front has been difficult to say the least. Ironically enough, it’s actually Intel here who should take a bow, because while there have been some bugs, they tend to be consistent across architectures, a far cry from the frequency with which AMD seem to be able to conjure them.
But…. before I spin myself into another AMD hate rage, let’s look at some stuff that has been tracked down and fixed for you AMD people!
The final three issues were all Polaris-exclusive (anything RX 400/RX 500 and below), which was a real pain in the ass to track down. It turns out these cards just completely break 2D array textures with mipmaps when using the ‘VK_IMAGE_CREATE_CUBE_COMPATIBLE_BIT’ flag. By forcing a copy and not using this feature, the issue can be resolved at the cost of slightly slower cubemap creation. The performance hit is not significant and should be relatively unnoticeable in most games tested. Thanks, AMD!
There are of course about twenty thousand other bugs that Team Red™ have an exclusive monopoly over, but we are, hopefully, approaching the endgame.
CLOSING WORDS:
The first half of 2022 has gone and left us all too soon, but there sure are some killer games releasing in the second half! A new Xenoblade, Splatoon 3, Nier (somehow), PERSONA (!!!), a new Sonic game and yet another Pokémon. Should be an action-packed few months, eh? Once again, thank you to all our contributors for keeping us going over the years! It’s thanks to you guys that we can hopefully see all of the above working on Day 1.
As always, it’s the HR recruitment time of the report! If you know some C#, .NET, 3D graphics or low-level engineering, you too can help this year be as smooth and bug-free as possible. If that's all wizardry to you, then donating to our Patreon or being active in testing and bug-reporting really does help out a bunch.
See you all next time!