July/August Progress Report

August is over, and like July, it was packed with great updates. Let's take a look at the improvements that Ryujinx received these last two months!

GPU improvements

Support for inline index buffer data:

Inline index buffers are index buffers that are sent to the GPU directly on the command stream. While NVN never makes use of this functionality, it is required by the OpenGL API. gdkchan implemented support for the functionality, which fixed an issue where games had corrupt or non-existent index buffer data. This allows a few of those games to now render; one of which is NinNin Days.

This also fixed issues on a few homebrew applications that use Nouveau (open-source NVIDIA GPU driver for Linux, ported to the Switch).

Buffer to 3D texture copies:

Since its launch, Persona 5 Scramble had a litany of problems that made the game unplayable on Ryujinx. One of them was a GPU issue that prevented most of the 3D scenes in this game from being rendered properly, if at all. In the image below, we can see that the main menu is not properly rendered.

As you can see above, the van and some of the characters are completely black, while others have incorrect colors. The issue was caused by an incorrect 3D texture that was being used as a color remapping texture. To put it in simpler terms, the texture was basically a color palette. Since the palette was incorrect, the final render also had incorrect colors. The issue turned out to be an incorrect buffer to texture copy. Fortunately, it was pretty easy to fix! Now the main menu renders perfectly:

So, with this change, it should properly render in-game scenes too, right? Well...

Looks like we have a problem here.

After fixing the 3D texture issue, developer gdkchan decided to investigate the other GPU related issues affecting this game. It turns out the above issue was caused by the LEA (Load Effective Address) shader instruction not being implemented. There was also another problem related to the handling of bindless textures on the shader translator. One shader instruction and a small enhancement later, the game now renders much better:

Not perfect, but a step closer. There are also other recent improvements that affected this game, and we'll be talking about them later in the post.

Logical operations:

Xenoblade Chronicles: Definitive Edition had a number of issues on the emulator at launch, as we mentioned on our last progress report. One of them was a strange lighting issue that we can see on the screenshot below:

Xenoblade Chronicles: Definitive Edition - Before

This was caused by the lack of support for logical operations on the emulator, a GPU feature that is rarely used. Xenoblade Chronicles: Definitive Edition is one of the few titles that make use of it.

Luckily, implementing it was simple, and fixed this particular lighting issue:

Xenoblade Chronicles: Definitive Edition - After

This issue was fixed by riperiperi.
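As a rough illustration of what a framebuffer logical operation does, the sketch below combines an incoming (source) pixel with the pixel already in the framebuffer (destination) using a bitwise operation instead of blending. The function name and the small set of operations are assumptions chosen for this example, loosely named after a few of the logic op modes OpenGL exposes. It is not Ryujinx code.

```python
# Illustrative model of a per-pixel logical operation on 8-bit channels.
# The names below are hypothetical and only loosely mirror OpenGL's logic op modes.
LOGIC_OPS = {
    "copy":   lambda src, dst: src,
    "and":    lambda src, dst: src & dst,
    "or":     lambda src, dst: src | dst,
    "xor":    lambda src, dst: src ^ dst,
    "invert": lambda src, dst: ~dst,
}

def apply_logic_op(op, src_pixel, dst_pixel):
    """Combine two RGBA pixels (tuples of 0-255 ints) channel by channel."""
    combine = LOGIC_OPS[op]
    # Mask to 8 bits so that operations such as "invert" stay in range.
    return tuple(combine(s, d) & 0xFF for s, d in zip(src_pixel, dst_pixel))

src = (0xF0, 0x0F, 0xFF, 0xFF)
dst = (0xAA, 0xAA, 0x00, 0xFF)
print(apply_logic_op("xor", src, dst))  # (90, 165, 255, 0)
```

When a game relies on one of these modes and the emulator silently falls back to a plain copy, the framebuffer ends up with different values than on hardware, which is the kind of mismatch that can show up as wrong lighting.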

Depth buffer copy improvements:

For a long time, Pokémon Sword and Shield had an issue on this emulator where fog and other similar effects were not being rendered properly. This made some parts of the game very difficult to play.

On the screenshot below, you can see how fog previously rendered in this game:

The game also had an issue with depth of field blur, wherein blur was being applied to the wrong regions of the image, as you can see below:

The issue was caused by a broken copy of a depth texture. Developer gdkchan fixed the copy issue, making the fog render properly in this game, while also fixing the depth of field blur issue, as seen in the screenshot below:

Transform feedback:

In our last progress report, we talked about the most annoying graphical bug exhibited by Xenoblade Chronicles: Definitive Edition in the emulator since its launch. The issue was caused by the lack of transform feedback support and also affected, to some extent, Xenoblade Chronicles 2. Transform feedback can be used to write the output of the GPU vertex transformation stage to a buffer, in addition to rendering it to the framebuffer. The feature is essentially useless in today’s graphics with the advent of compute shaders, but since the Xenoblade engine is fairly dated, it still makes use of this feature.
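To make the idea more concrete, here is a minimal sketch of transform feedback in plain Python. The vertex stage transforms each vertex as usual, and when a feedback buffer is supplied, the transformed vertices are also recorded there so that a later pass can read them back. All names, and the `sway` transform in particular, are hypothetical and only serve the illustration.

```python
def run_vertex_stage(vertices, transform, feedback_buffer=None):
    """Apply `transform` to every vertex; optionally capture the results."""
    transformed = [transform(v) for v in vertices]
    if feedback_buffer is not None:
        # Transform feedback: the stage output is written to a buffer
        # in addition to continuing down the pipeline.
        feedback_buffer.extend(transformed)
    return transformed  # would go on to rasterization

def sway(vertex):
    """Hypothetical transform that shears x by half of the vertex height."""
    x, y, z = vertex
    return (x + 0.5 * y, y, z)

captured = []
run_vertex_stage([(0.0, 0.0, 0.0), (0.0, 2.0, 0.0)], sway, captured)
print(captured)  # [(0.0, 0.0, 0.0), (1.0, 2.0, 0.0)]
```

If the capture step is missing, any later draw that expects to read those vertices gets stale or empty data instead.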

On the screenshot below, we can see the broken grass on Xenoblade, which looks more like blades coming out of the ground.

With the transform feedback implementation made by gdkchan, the grass renders properly on this game:

With this fix, the game is now much more playable.

Point size and other parameters:

Most of the time, the GPU renders triangles. But it is also possible to do point rendering, where the GPU draws points on the destination framebuffer. Super Mario Odyssey uses this feature to generate fog effects in some kingdoms, such as the Cap Kingdom. The support for this mode in the emulator was incomplete, which caused issues with the fog. Thanks to the recent improvements made by mageven, it now renders and functions properly:

Before the fix, it was possible to walk into the fog. Now it functions properly, fading away as Mario approaches. Water ripples and other similar effects were also fixed.

Fast constant buffer updates:

Constant buffers are GPU buffers that hold values that will be accessed by the shader. They store parameters used by the shader. These buffers are read-only for the shader (which is why they are called "constant"), and they can't be modified during shader invocations. NVIDIA has a special command that allows updating data on any constant buffer. In the emulator, that was done one word (a 32-bit value) at a time, which is, as you can imagine, painfully slow. It was not a big issue for games that make little to no use of this functionality, but for some games that leverage it heavily to transfer large amounts of data, it caused a severe slowdown.

gdkchan optimized those transfers by copying whole chunks of data at a time, instead of one word at a time. This improved the performance of some games significantly. One of those games is Yume Nikki: Dream Diary; you can see the performance difference in the videos below.

Before:

Yume Nikki: Dream Diary - Before

After:

Yume Nikki: Dream Diary - After

Other games, like Diablo III: Eternal Collection, also had dramatically improved performance with this change.
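The difference between the two approaches can be sketched in a few lines of Python. Both functions below write the same 32-bit words into a buffer. The first does one small write per word, while the second packs everything once and copies it as a single chunk. The function names and buffer sizes are made up for this example and do not reflect the emulator's actual code.

```python
import struct

def update_word_by_word(buffer, offset, words):
    """Slow path: one separate write per 32-bit word."""
    for i, word in enumerate(words):
        struct.pack_into("<I", buffer, offset + 4 * i, word)

def update_in_chunks(buffer, offset, words):
    """Fast path: pack all words once, then copy them with a single slice assignment."""
    data = struct.pack(f"<{len(words)}I", *words)
    buffer[offset:offset + len(data)] = data

words = list(range(1024))
slow = bytearray(8192)
fast = bytearray(8192)
update_word_by_word(slow, 256, words)
update_in_chunks(fast, 256, words)
assert slow == fast  # same result, far fewer individual writes
```

The result is identical either way; the gain comes purely from replacing many tiny operations with one bulk copy.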

BGRA texture format:

OpenGL, unlike Vulkan, Direct3D, Metal, well... pretty much any other graphics API, unfortunately has no support for BGRA textures. For this reason, when games used a BGRA texture, the emulator would use an RGBA format instead. As expected, this caused issues in games that actually render something to the BGRA textures: the red and blue components would be swapped.

This caused many games to have incorrect colors, as you can see in the screenshots below:

We also mentioned those issues in our previous progress reports. gdkchan solved the issue by converting the texture to the correct format before use, as needed. The RGBA format is still used on OpenGL, but now shaders have special code to swap the components before they are written to the framebuffer. Additionally, for sampling, they are swapped using component swizzle, and for copies that should convert from BGRA to RGBA, it now uses a pixel buffer copy to perform the conversion. None of this is "free", of course; it comes at a performance cost. However, in our tests, the conversion had no noticeable performance impact on the games.
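The conversion itself boils down to swapping the first and third byte of every pixel. The following sketch shows that swap on raw 8-bit-per-channel pixel data on the CPU. It is only meant to illustrate the component swap, since the emulator performs it in shaders, through swizzles, or through pixel buffer copies as described above. The function name is hypothetical.

```python
def swap_red_blue(pixels):
    """Swap bytes 0 and 2 of every 4-byte pixel (BGRA <-> RGBA)."""
    if len(pixels) % 4 != 0:
        raise ValueError("pixel data must be a multiple of 4 bytes")
    out = bytearray(pixels)
    out[0::4] = pixels[2::4]  # first channel takes the old third channel
    out[2::4] = pixels[0::4]  # third channel takes the old first channel
    return bytes(out)

# Two pixels stored as B, G, R, A.
bgra = bytes([255, 0, 0, 255, 0, 128, 64, 255])
rgba = swap_red_blue(bgra)
print(list(rgba))  # [0, 0, 255, 255, 64, 128, 0, 255]
```

Applying the swap twice returns the original data, which is why the same operation works in both directions.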

Without further ado, below you can see how these games look with the BGRA issue fixed:

Alpha testing:

Lack of alpha testing caused transparency issues in many games. Objects that were supposed to be transparent were rendered with a solid color instead. The affected games included Astral Chain, Monster Hunter Generations Ultimate and Mega Man 11, to name a few. Below you can see a screenshot of Mega Man 11 that highlights the issue:

Alpha testing is supported on OpenGL; however, this feature was deprecated, which means that it may not be supported at all when using newer versions of OpenGL. For now, we implemented it using this legacy feature, and it should work on most driver/GPU configurations. We also plan to add an alternative implementation in case the legacy feature is not available. But for now, we can see that it improves rendering in those games:

Some of them are indistinguishable from a real Switch after this fix!
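For readers unfamiliar with the feature, alpha testing compares the alpha value of each fragment against a reference value and discards the fragment if the comparison fails. A minimal sketch follows. The comparison names are modeled on the usual set of GPU comparison functions, and everything else is an assumption made for the example.

```python
import operator

# Comparison functions: each takes (fragment_alpha, reference).
COMPARE = {
    "never":    lambda alpha, ref: False,
    "less":     operator.lt,
    "equal":    operator.eq,
    "lequal":   operator.le,
    "greater":  operator.gt,
    "notequal": operator.ne,
    "gequal":   operator.ge,
    "always":   lambda alpha, ref: True,
}

def alpha_test(fragments, func, reference):
    """Keep only the RGBA fragments whose alpha passes `alpha <func> reference`."""
    passes = COMPARE[func]
    return [f for f in fragments if passes(f[3], reference)]

fragments = [
    (1.0, 0.0, 0.0, 1.0),   # opaque red
    (0.0, 1.0, 0.0, 0.25),  # mostly transparent green
    (0.0, 0.0, 1.0, 0.0),   # fully transparent blue
]
print(alpha_test(fragments, "greater", 0.5))  # [(1.0, 0.0, 0.0, 1.0)]
```

Without the test, every fragment is kept, so the parts of a texture that were meant to be cut out are drawn as solid color.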

Other notable improvements:

gdkchan implemented a Macro JIT. Macros are small programs that can read and write GPU registers. They are uploaded to the GPU and can later be run by inserting macro calls into the command stream that is sent to the GPU. The Macro JIT speeds up the execution of those programs by translating them into code that the CPU can execute directly. This can give a modest speed boost to some games.

mageven improved OpenGL debugging support. It is now possible to enable debugging directly from the GUI, without special builds or code modifications. Errors and performance hints from the driver are printed to the console directly with it enabled. This improvement makes the life of developers a bit easier.
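The Macro JIT idea mentioned above can be illustrated with a toy example. The instruction set below (`set`, `add`, `copy`) is entirely made up and has nothing to do with the real macro format. The point is only the contrast between an interpreter, which decodes every instruction each time the macro runs, and a translator, which decodes once and produces a host function. Here the "host code" is a Python function, whereas a real JIT would emit native code.

```python
def interpret(program, regs):
    """Decode and execute each instruction every time the macro runs."""
    for op, dst, operand in program:
        if op == "set":
            regs[dst] = operand
        elif op == "add":
            regs[dst] = (regs[dst] + operand) & 0xFFFFFFFF
        elif op == "copy":
            regs[dst] = regs[operand]
        else:
            raise ValueError(f"unknown op {op!r}")
    return regs

def translate(program):
    """Decode the program once and return an equivalent Python function."""
    lines = ["def macro(regs):"]
    for op, dst, operand in program:
        if op == "set":
            lines.append(f"    regs[{dst!r}] = {operand!r}")
        elif op == "add":
            lines.append(f"    regs[{dst!r}] = (regs[{dst!r}] + {operand!r}) & 0xFFFFFFFF")
        elif op == "copy":
            lines.append(f"    regs[{dst!r}] = regs[{operand!r}]")
        else:
            raise ValueError(f"unknown op {op!r}")
    lines.append("    return regs")
    namespace = {}
    exec("\n".join(lines), namespace)  # compile the generated source
    return namespace["macro"]

program = [("set", 0x10, 5), ("add", 0x10, 7), ("copy", 0x20, 0x10)]
compiled = translate(program)  # done once, reused on every macro call
print(interpret(program, {}) == compiled({}), compiled({}))  # True {16: 12, 32: 12}
```

The translation cost is paid once per macro, so the benefit grows with how often a game calls the same macro.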

Resolution Scaling

Developer riperiperi added support for resolution scaling. Anyone who has used modern emulators should be aware of what this feature does. It increases the internal resolution that the game uses to render, improving the resolution and quality of the final image that is presented on the screen. This feature can be used to make Nintendo Switch games render at much higher resolutions, like 4K, something that the original console is not capable of doing.
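The arithmetic behind the feature is simple: the dimensions of the render targets are multiplied by the chosen scale factor. In the sketch below, the native resolutions are example values and the function name is hypothetical.

```python
def scaled_size(width, height, scale):
    """Return the render target size after applying a resolution scale factor."""
    return (max(1, round(width * scale)), max(1, round(height * scale)))

# Example native resolutions, chosen for illustration.
print(scaled_size(1280, 720, 3))   # (3840, 2160)
print(scaled_size(1920, 1080, 2))  # (3840, 2160)
```

Both examples land on 3840x2160, which is the resolution commonly referred to as 4K.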

Just look at how beautiful Xenoblade renders at 4K:

Be sure to check out the post we made about resolution scaling, if you haven't already: https://blog.ryujinx.org/introducing-resolution-scaling/

NVDEC & VIC

The NVIDIA Decoder, or just NVDEC, is the name of the NVIDIA video decoder that can be found on their desktop graphics cards, and also on the mobile Tegra SoC used by the Nintendo Switch. It is the piece of hardware responsible for video decoding on the console. While some games use software decoding (where all video decoding is done on the CPU), most games will use the hardware decoder since it is far more efficient.

Since the emulator had no NVDEC support before, videos were rendered completely black.

With NVDEC, they now render properly, as we can see in the screenshots below:

Yes, the last one is also a video! Thanks to NVDEC support, the title screen of Zelda: Link's Awakening also renders properly now. Super Mario Odyssey also makes use of videos in a few places. One example is the small control tutorials that pop up during the game to show how to perform certain actions. Before, they were rendered as black squares. Now, as you can see in the screenshot below, they render properly:

We made a blog post about NVDEC recently, so be sure to check it out if you haven't already: https://blog.ryujinx.org/introducing-nvdec-support/

CPU improvements

Fix PPTC on Windows 7:

A few months ago, we introduced a feature that we call "PPTC". This feature allows saving JIT generated CPU code on the disk. It is basically like a shader cache, but for the CPU. It allows reducing the load times of several games, since it no longer needs to spend time recompiling the ARM code to something the host CPU can execute (usually x64).
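Conceptually, this is a persistent cache keyed by the guest code. The sketch below shows the general pattern under simplified assumptions: the "translation" is a stand-in that merely reverses bytes, the cache is one file per code block, and the class and method names are invented for the example. The real PPTC format is not described in this post.

```python
import hashlib
import tempfile
from pathlib import Path

class TranslationCache:
    """Hypothetical on-disk cache of translated code, keyed by a hash of the guest code."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.translations = 0  # how many times the slow path ran

    def _translate(self, guest_code):
        # Placeholder for the expensive JIT step.
        self.translations += 1
        return b"HOST:" + guest_code[::-1]

    def get(self, guest_code):
        path = self.directory / (hashlib.sha256(guest_code).hexdigest() + ".bin")
        if path.exists():
            return path.read_bytes()  # cache hit: no recompilation needed
        host_code = self._translate(guest_code)
        path.write_bytes(host_code)
        return host_code

with tempfile.TemporaryDirectory() as cache_dir:
    first_run = TranslationCache(cache_dir)
    first_run.get(b"\x1f\x20\x03\xd5")
    second_run = TranslationCache(cache_dir)  # simulates restarting the emulator
    second_run.get(b"\x1f\x20\x03\xd5")
    print(first_run.translations, second_run.translations)  # 1 0
```

On the second run the translation step is skipped entirely, which is where the shorter load times come from.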

Some Windows 7 users reported that enabling the feature caused crashes. Thanks to the investigation conducted by LDj3SNuD, we discovered that an optimization was causing the issue. Due to the way memory allocation works on Windows 7, a very low memory address was being used for the guest page table, which caused an optimization to kick in, and later, the relocation step for the cache to fail. This caused the crash later on due to an invalid memory access.

LDj3SNuD fixed the issue. Windows 7 users can also enjoy PPTC now!

CRC32 support:

riperiperi added support for the CRC32 instructions in 32-bit mode, and improved the generated code for the implementation in 64-bit mode. These instructions are used to compute a checksum that is used to verify the integrity of data. Implementing these instructions allowed the game Monster Hunter Generations Ultimate to go in-game for the first time.

The game requires a save to get that far, though. There is still a problem preventing a new save from being created, which seems to be related to the software keyboard applet implementation. However, the game is fully playable with a save.
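For context on the CRC32 instructions discussed above, the snippet below is a plain software model of the standard CRC-32 checksum, processed one byte at a time and checked against Python's `zlib.crc32`. It illustrates the kind of per-byte update that dedicated CPU instructions accelerate. It is not the code generated by the emulator, and the function names are chosen for this example.

```python
import zlib

CRC32_POLY_REFLECTED = 0xEDB88320  # bit-reversed form of the polynomial 0x04C11DB7

def crc32_update_byte(crc, byte):
    """Fold one byte into a running CRC value, one bit at a time."""
    crc ^= byte
    for _ in range(8):
        crc = (crc >> 1) ^ CRC32_POLY_REFLECTED if crc & 1 else crc >> 1
    return crc

def crc32(data):
    """Standard CRC-32: initial value and final XOR are both 0xFFFFFFFF."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = crc32_update_byte(crc, byte)
    return crc ^ 0xFFFFFFFF

assert crc32(b"123456789") == zlib.crc32(b"123456789") == 0xCBF43926
```

A game can compare such a checksum against a stored value to detect corrupted data, which is why a missing instruction can stop it from progressing.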

More CPU instructions:

LDj3SNuD implemented a few missing instructions (FMAXV, FMINV, SSHL and USHL), required by games like Trine 4 and FATE Extella/LINK. Thanks to this, games like FATE Extella can also now go in-game, and as far as we can tell, it's fully playable:

He also implemented a few other 32-bit instructions (VADD and VSUB wide variants) that allowed games like Duke Nukem 3D: 20th Anniversary World Tour to progress further.

External contributor valx76 also implemented a few 32-bit instructions (VBIC, VTST and VSRA) that allowed a few games to boot further.

32-bit instruction fixes:

Developer LDj3SNuD fixed a bug on the VNEG 32-bit instruction. This fixed a black screen issue on a few games, including Blaster Master Zero, which now renders properly, as you can see on the screenshot below:

He also fixed a bug in some fused 32-bit instructions, among others, that allowed games like Duke Nukem 3D to go in-game.

Most notably, this also fixed a bug in Tokyo Mirage Sessions #FE Encore that caused the game to essentially hang in specific areas, namely one of the very first parts of the game. With this fix, the issue no longer occurs and the game can now be considered playable:

Other notable improvements:

External contributor FICTURE7 has been providing some minor fixes and optimizations to the JIT. They bring small but welcome speed improvements to the CPU emulation.

LDj3SNuD also implemented a few more CPU instructions, not mentioned here, required by a homebrew application.

HLE improvements

Internet connection:

In the last progress report, we mentioned the game Burnout Paradise Remastered. It was not able to go in-game before due to an unimplemented service function. The service seems to be used to check for an online connection. Developer mageven found a workaround that allows the affected games to work without the service (or an internet connection). This allows Burnout Paradise Remastered, Dead or Alive Xtreme 3 Scarlet and many others to go further, some being fully playable. Below you can see a screenshot of Burnout Paradise Remastered, which also renders properly now thanks to the BGRA fix:

Modding support:

mageven introduced support for game modding. This allows replacing game data (including files inside the ROM, and patching code) with custom data. This enables users to modify their games easily, and do things like inserting new characters, changing game mechanics, changing the FPS cap, disabling post-processing effects, using fan translations (enabling users to play games in languages they were not officially localized into), and the list goes on.

Prepo service update:

Thog implemented new functions introduced in the Play Report (prepo) service in the latest Nintendo system update (10.x), required by games like Animal Crossing after the 1.3.0 update. That game update added a wet suit and diving to the game.

AM, audio and eShop related service functions:

Ac_K implemented and stubbed several functions from the Applet Manager, audio and eShop services. This allowed a large number of games to boot further, some reaching in-game status, and even becoming playable! One of the games that was fixed by the implementation of the eShop functions was Pokémon Café Mix.

Libhac improvements:

Libhac is the library that Ryujinx uses for everything related to the filesystem (both the service that exposes filesystem-related operations to the application, and anything that performs filesystem access on the host). It is also written in C# and is maintained by Thealexbarney. He implemented save-related functions required by Mortal Kombat 11 (which now boots further and shows the title screen). He also implemented other improvements to the library and the emulator, including a fix for a crash that could occur when a game attempted to delete a save file.

Other notable improvements:

mageven improved logging performance. Now the log should have no observable performance penalty when disabled. Before, there was still some cost for the string formatting and the like, even if the log was never written.

gdkchan improved the kernel implementation, fixing a bug that caused a "ResLimitExceeded" error spam in a few games, and also improved the accuracy of some system calls, which now match what the Nintendo Switch kernel does.
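The logging improvement described above follows a common pattern: avoid building the log message unless it will actually be written. One way to sketch it is to pass a callable that produces the text, so that nothing is formatted when logging is off. The class and function names here are hypothetical, and the emulator may achieve the same effect differently.

```python
class Logger:
    """Hypothetical logger that defers message formatting until it is needed."""

    def __init__(self, enabled):
        self.enabled = enabled
        self.lines = []

    def debug(self, make_message):
        """`make_message` is a zero-argument callable that builds the text."""
        if not self.enabled:
            return  # nothing is formatted when logging is disabled
        self.lines.append(make_message())

calls = {"count": 0}

def describe_state():
    """Stands in for an expensive string formatting operation."""
    calls["count"] += 1
    return "state: " + ", ".join(str(i) for i in range(5))

Logger(enabled=False).debug(describe_state)
print(calls["count"])  # 0

active = Logger(enabled=True)
active.debug(describe_state)
print(calls["count"], active.lines)  # 1 ['state: 0, 1, 2, 3, 4']
```

With eager formatting, `describe_state()` would run in both cases, even though the disabled logger throws the result away.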

New audio renderer

The highlight for August was the new audio renderer, a complete implementation made by Thog, codenamed Amadeus (a reference to the famous composer Wolfgang Amadeus Mozart), which is the result of several months of reverse engineering and development.

The new audio renderer brought audio improvements to numerous games, with a large number of audio bugs fixed: audio looping forever, not stopping when it should, missing effects or audio missing entirely (as was the case for some voices on Guilty Gear XX Accent Core Plus R, and music on Hatsune Miku Project DIVA Mega 39's), just to name a few of the issues that were fixed.

Perhaps more surprisingly, the new audio renderer not only fixed audio issues but also resolved other issues that, at least at first glance, did not appear to be related to audio at all. These included broken dance animations in Super Mario Odyssey, a softlock after the first video in Persona 5 Scramble, and a softlock at the end of the Marina and Pearl intro announcement in Splatoon 2.

Thanks to NVDEC, the video cutscenes (of which the game has quite a lot) also work, and the game is basically playable, with only minor bugs left to be fixed.

We made a blog post about this new audio implementation; be sure to check it out if you haven't already: https://blog.ryujinx.org/introducing-complete-audio-rendering-support-amadeus/

GUI improvements

Software Keyboard dialog:

The Nintendo Switch OS has an applet called Software Keyboard, or just "swkbd". It is a touch dialog used to input text. Ryujinx lacked a proper UI for this, so it had a hardcoded text that was returned every time the game invoked this applet. This led to everything that required text input being called "Ryujinx". Thanks to mageven this is no longer an issue, as a dialog box was implemented that allows the user to type in the text they want (usually a prompt from the game to enter a character name or location).

This also allowed some games to progress further. For example, The Caligula Effect requires the player name to have at most 6 characters while the default text had 7, so it would always fail at this step. Now that submitting a custom name is possible, the game can reach in-game status!

Another game that was helped by this change was PriPara: All Idol Perfect Stage. It's a game only released in Japan that doesn't seem to like non-Japanese names. Being able to input custom names solves the issue, and the game is now playable:

And, of course, this helps several other games, including games like Pokémon, enabling players to give custom names to their Pokémon (and also to the main character, naturally).

Other notable improvements:

External contributor SeraUX added the ability to add multiple game directories at once. Before, it was only possible to add one at a time. This same contributor also added better PPTC cache management options to the GUI, allowing the user to open the folder where the cache files are stored, and also to purge (delete) the cache. This can be quite useful if there is some problem caused by PPTC that needs manual user intervention.

XploitR added an option to select the audio backend used. It is now possible to select "OpenAL" or "SoundIO". Before, it would use "OpenAL" by default, or "SoundIO" if the former was not installed. The latter does not work with all games, but it usually sounds better in the games where it does work. In the future we plan to fix these issues and have only one audio backend. But for now, this change allows the user to select the backend that gives them the best audio experience.

mageven improved the sorting of time zones on the settings window. This makes it easier for a user to find the correct time zone on the list.

Closing words and what's next

That's all for today! These last two months were packed full of interesting updates and we have more coming soon. As always, we would like to thank all our patrons and contributors; your support means a lot to us!

If you have been following our progress closely, you probably already know what's coming, but I'll mention it anyway. We have been working to enable multiplayer with players from all over the world, using the Switch local play functionality. The feature is still in beta, so if you are interested, be sure to check it out and report any problems on our Discord!

If you can write code, know C# and find emulators as fascinating as we do, feel free to stop by our Discord. Ask questions and contribute some code; new contributors are always welcome!