Gather once more at the three-quarter mark, for beyond this point, your days become dark.
September was an unusually quiet month for our friends publishing games, with the only releases of note being a rather underwhelming Baten Kaitos remaster and a questionable Mortal Kombat 1. At least the latter managed to bring some comedy to its otherwise haunting visuals.
Beyond those we’ve got stuff to chat about from all over the project: LDN, Mac improvements and a boatload of service work.
Hop on down.
Our first stop is at the port of Delfino, where shines lag is the name of the game. The title screen of Super Mario Sunshine (part of the 3D All-Stars collection) heavily stresses a couple of our buffer conversion shaders, specifically those converting the stride. Stride is effectively the distance between elements of the same ‘type’ in a vertex specification or other dataset. If you made a shopping list such as: Tomatoes, 1, Bread, 1, Apples, 6, each item sits two entries after the previous one, so the ‘stride’ between your items is two, or in a computer, however much memory an item and its quantity take up together!
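For the curious, here is a minimal Python sketch of the shopping-list idea. It is purely illustrative and is not the emulator's conversion shader: the helper names `read_strided` and `convert_stride` are made up for this example, and stride is measured from the start of one record to the start of the next (two bytes here, one for the item and one for its quantity).

```python
def read_strided(buffer: bytes, offset: int, stride: int, size: int, count: int) -> list:
    """Read `count` elements of `size` bytes, each starting `stride` bytes after the previous one."""
    return [buffer[offset + i * stride : offset + i * stride + size] for i in range(count)]


def convert_stride(buffer: bytes, old_stride: int, new_stride: int, size: int, count: int) -> bytes:
    """Repack elements into a buffer with a different stride, zero-filling any extra gap."""
    out = bytearray(new_stride * count)
    for i, element in enumerate(read_strided(buffer, 0, old_stride, size, count)):
        out[i * new_stride : i * new_stride + size] = element
    return bytes(out)


# Interleaved "shopping list": a 1-byte item id followed by a 1-byte quantity.
shopping = bytes([10, 1, 20, 1, 30, 6])  # ids 10, 20, 30 with quantities 1, 1, 6

# Both columns share the same stride (2 bytes) but start at different offsets.
item_ids = read_strided(shopping, offset=0, stride=2, size=1, count=3)
quantities = read_strided(shopping, offset=1, stride=2, size=1, count=3)

# Repack the 2-byte records so that each one starts on a 4-byte boundary.
aligned = convert_stride(shopping, old_stride=2, new_stride=4, size=2, count=3)
```

Every element has to be read and written again to change the stride, which is why a conversion like this gets expensive when a large buffer needs it repeatedly.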
Either way, to not bore everyone with theory, we need to convert some buffer formats that SMS uses into something a little more sensible for your real GPU via compute shaders. Nvidia does not need these conversions and works perfectly fine without them, but AMD (and Nvidia, when forced to use the conversion) struggles significantly.
This is what writing 230MB in compute each time a buffer needs conversion looks like!
To reduce the impact of this insanity, we can instead device map the converted vertex buffers (as they’re only ever accessed from GPU) and also allow the conversion shaders themselves to scale the work-group size. This lends itself well to most dedicated GPUs that have more cores to work with. Even together the problem is not entirely eliminated, but the difference is still stark.
On the topic of rough performance, Mortal Kombat 1 was an unexpected thorn early in September, as even our users with the highest-end systems were struggling to reach the native frame cap of 60FPS.
Further inspection and profiling revealed that MK1 was creating over 100 buffer textures that would all overlap at once. MK1 exposed a corner case in the buffer cache implementation where many buffer textures could be created as a view of fellow overlapping buffer ranges. If all of this is jargon, the basic outcome is that the scenario that the buffer cache was checking for is fundamentally impossible, and as such just a waste of its, and your, time.
A nice 46% improvement can be seen once this has been corrected. The game also seems to have a maximum framerate cap, so the value here is probably higher!
Not content with just the one game being affected, a second issue where the texture lookup array was being resized on every lookup return was corrected, yielding some nice gains in some FIFO limited UE4 titles and coincidentally improving frametime stability in Mortal Kombat 11.
The real winner though is R-TYPE FINAL 2, which sees a staggering 7.5x improvement (a 650% increase), going from 8FPS all the way to its engine cap of 60FPS. If a side-scrolling space shooter is just what you were dying to play, then now is the time.
September also marked the arrival of a Baten Kaitos I&II remaster, the first of which, if you’re interested in trivia, holds the longest 100% speedrun world record, clocking in at fourteen real-world days.
As is apparently only fitting, it brought with it a whole new service class: `ngc`. This was a bit of a mystery for a couple of days as no one could actually tell what it did. BKI&II seemed to register the service but never actually made any calls to it. Upon further inspection however, it seems that in firmware 16.0.0 Nintendo have moved their profanity and general input filtering checks into a service of their own.
NGC, “No Good Content”, seems to have taken over the role that used to be provided by the general firmware word blacklist, which has been used since the 3DS/Wii U days and comes in at close to 5,000 lines for us.
There are four parts to this new service:
- GetContentVersion - Simply grabs the version of the bad word dictionary to use from a firmware file `version.dat`.
- Check - These methods actually perform the heuristics on any text to determine words or strings to flag. There is a common dictionary of terms to always flag, and then a per-region specific dictionary that can check specific strings that are problematic in certain regions.
- Mask - This method replaces any bad words within a string with asterisks (*) up to the first 512 characters; beyond this the string will not be processed. Other than that there is a rather crude email-address check, and new abilities to both ‘normalize’ text according to Unicode standards and transform a string into `canonical` format.
- Reload - What it says on the tin. Unmounts and remounts the system archives. Unknown use, possibly just a failsafe.
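
To give a rough feel for the Mask behaviour described above, here is a small Python sketch. It is an illustration under stated assumptions, not the firmware's or our implementation: the function name `mask`, the placeholder word list and the case-insensitive substring matching are all invented for the example, and it assumes that text past the 512-character limit is simply left untouched.

```python
MASK_LIMIT = 512  # only the first 512 characters are processed, as described above


def _fold(text: str) -> str:
    """Lowercase character by character, keeping the length unchanged so indices line up."""
    return "".join(c.lower() if len(c.lower()) == 1 else c for c in text)


def mask(text: str, bad_words: set, limit: int = MASK_LIMIT) -> str:
    """Replace flagged words found in the first `limit` characters with asterisks."""
    head, tail = text[:limit], text[limit:]
    folded = _fold(head)
    chars = list(head)
    for word in bad_words:
        needle = _fold(word)
        if not needle:
            continue  # an empty entry would match everywhere
        start = folded.find(needle)
        while start != -1:
            end = start + len(needle)
            chars[start:end] = "*" * len(needle)
            start = folded.find(needle, end)
    # Assumption: anything beyond the limit is returned unchanged.
    return "".join(chars) + tail


# Placeholder dictionary; the real lists are far longer (and far ruder).
flagged = {"turnip", "darn"}
masked = mask("Darn it, you absolute Turnip", flagged)
```

Here `masked` comes out as `**** it, you absolute ******`. A real filter would also need the normalization step mentioned above so that look-alike characters cannot dodge the check.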
On the whole, quite a lot for basic word checking. We can’t show the list of generated terms and sub-strings in the various dictionaries for obvious reasons, but some of them are… imaginative!
Two of our already implemented services, `lbl`, which (prior to firmware 10.0.0) controlled the backlight and screen services, and `wlan`, which manages the general LAN services, were both moved to our new horizon project. We highlighted this project when it was first added, but the core premise is that the way we originally handled a lot of service implementations had a number of key flaws. As we’ve implemented a lot of services over the last 5 years, however, this is happening piecemeal, with services being migrated over time. More about this specific change was covered in the first progress report of this year.
Alas, we obviously cannot continue without answering the question. Can it run Crysis?!
The answer, prior to September, was a resounding no! Luckily it isn’t some GPU insanity, or a one-use, custom CPU instruction, just some network checks… honestly quite disappointing considering the game’s legacy. By stubbing the remaining unsupported BSD socket options, everyone's cult classic PC killer can actually get back to business.
We haven’t had to touch the audio services in a while, but September blessed us with the release of Ys X: Nordics which amplified a few issues with the implementation of the compressor effect in the audio renderer. There is a whole list of small inaccuracies that were cleaned up, and this allowed the title to find its voice.
Super Bomberman R 2 is the final title this month that poked holes in our fake software Switch with its use of brand new services of the `friend` class. As almost all of these types of services are useless without a connection to Nintendo, they could easily be stubbed, allowing the game to be fully playable! Albeit a little blurry.
If any of you are developing a game, please let people disable anti-aliasing filters!
Way back in 2021, when everyone was stuck inside and begging for some multiplayer, we released a preview build of a feature more generally called LDN. In reality that is just the name of the services that handle the Switch’s local wireless functionalities and is something to be reverse engineered and implemented just like any other.
The initial preview got pretty popular and was “good enough” for most people who wanted multiplayer, but it was not really in a clean or accurate enough state for us to merge into the main codebase. We never really intended a couple of years to pass, but stuff happens and priorities change.
Now at the end of 2023, there is once again some focus on getting all of this organized. The initial ldn:u, INetworkClient interface and DisabledLdnClient implementations were finalized, which is a huge piece of the puzzle, even if they don’t yet provide any of the framework for actually making use of the local wireless features. Stay tuned for the follow-up work which will implement the actual bridging of these “local” connections over a more useful target… the internet for instance.
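As a loose illustration of why an interface plus a "disabled" implementation is a useful first step, here is a Python sketch of the general pattern. Only the names INetworkClient and DisabledLdnClient come from the work described above; every method name and behaviour below is hypothetical and does not reflect the real code.

```python
from abc import ABC, abstractmethod


class INetworkClient(ABC):
    """Hypothetical shape of a local-wireless backend; method names are assumptions."""

    @abstractmethod
    def scan(self) -> list:
        """Return the names of nearby sessions."""

    @abstractmethod
    def connect(self, network: str) -> bool:
        """Try to join a session, returning True on success."""


class DisabledLdnClient(INetworkClient):
    """Does nothing: no sessions are ever found and nothing can be joined."""

    def scan(self) -> list:
        return []

    def connect(self, network: str) -> bool:
        return False


def join_first_available(client: INetworkClient) -> bool:
    """Service-side code only sees the interface, never the concrete backend."""
    networks = client.scan()
    if not networks:
        return False
    return client.connect(networks[0])


joined = join_first_available(DisabledLdnClient())
```

With the disabled client, `joined` is always `False`, so a game just sees an empty lobby. The appeal of the pattern is that a future backend which really bridges connections could be dropped in without touching the code that calls the interface.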
For our macOS users, specifically those on M1/M2 chipsets, there were a number of titles that could very easily get stuck on boot, on loading screens, or almost anywhere else in gameplay. The issue was isolated to some skipped VCPU interrupts that now have their own dedicated VTimer in order to periodically interrupt execution if the full call is missed. This allows titles such as Persona 5: Strikers, Bravely Default and Life is Strange: True Colors to be playable beyond a brief two-minute session.
To round off this month, let’s jump through some quick-fire changes to the miscellaneous side of emulation.
- Async keyword usage was refactored around the Avalonia project to remove dependence on `async void` in favor of `async Task`. This also allows the game-list to load without blocking the program launch once again.
- macOS releases now contain a Headless workflow for those who prefer to run their emulators via command line, or via a launcher. Those can be found on our GitHub release page.
- Additional settings were added to the aforementioned headless builds, such as scaling filter selection, anti-aliasing options and an option to make use of SDL’s Exclusive Fullscreen mode, offering the lowest possible latency for those who don’t need a GUI.
As September has already faded, we hope that everyone will have an excellent autumn period. We ourselves are juggling a number of different tasks, some of which are finally catching up with us! Our resident LDN fiend is deep in the weeds of finally getting all our multiplayer work collected, and those with a Steam Deck are pulling their hair out trying to fix whatever quarrel Avalonia and Gamescope are fighting over.
As usual we’d like to thank all our supporters, no matter what form that may take. If you’d like to chip in in any way you can, we always welcome code contributions on GitHub, donations to our Patreon and helping fellow users via our Discord. Until next time friends!