I recently got around to improving the performance of Relight, a simulator for a smartphone game, which includes both an interactive gameplay simulator and a Monte Carlo bulk simulator that runs thousands of simulations and outputs statistics for a given configuration. This application is largely written in Kotlin/Multiplatform and features a Kotlin/JS frontend client and both Kotlin/JVM and Kotlin/JS backends which contain the actual logic. In this article, I will be covering improvements to the Kotlin/JS portions.
All of the improvements I made were based on profiling, using both browser profilers (Chrome and Firefox) to identify what specifically affected the JS backend's performance, as well as the IntelliJ profiler for identifying hot paths in general. Doing so was essential to ensuring focus on the areas which had the most impact and revealing non-obvious sources of overhead, especially those specific to Kotlin/JS.
General Kotlin/JS Improvements
In this section, I list the optimizations I made which apply more generally to Kotlin/JS applications and not just Relight. These changes also primarily benefit bulk simulations and not interactive simulations, where the performance overhead comes from other places.
Use native Sets
Relight makes heavy use of HashSet objects, particularly for the implementation of buffs, which need to be queried by the type of effect they have and support removal in an arbitrary order. After profiling bulk simulations, it became clear that a lot of time was spent on HashSet-related operations (contains(), etc.), making them clear targets for optimization.
JavaScript provides a native Set type, though it does not support configurable equality comparators. For String instances, equality is by their content, and for other objects it is by reference identity, which matched how Relight used its sets. To integrate the native Set into Kotlin/JS code, I wrote wrappers around it which implemented the respective Kotlin interfaces.
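Such a wrapper might look like the following sketch; the external declaration and class names here are illustrative, not Relight's actual code:

```kotlin
// Minimal external declaration for the browser's native Set.
@JsName("Set")
external class JsSet<T>() {
    val size: Int
    fun add(value: T): JsSet<T>
    fun has(value: T): Boolean
    fun delete(value: T): Boolean
    fun forEach(callback: (value: T) -> Unit)
    fun clear()
}

// Kotlin-facing wrapper exposing MutableSet via AbstractMutableSet.
class NativeSet<T> : AbstractMutableSet<T>() {
    private val backing = JsSet<T>()

    override val size: Int get() = backing.size

    override fun contains(element: T): Boolean = backing.has(element)

    override fun add(element: T): Boolean {
        if (backing.has(element)) return false
        backing.add(element)
        return true
    }

    override fun remove(element: T): Boolean = backing.delete(element)

    // Snapshot-based iteration keeps the sketch simple; removals through
    // this iterator do not propagate back to the native Set.
    override fun iterator(): MutableIterator<T> {
        val snapshot = ArrayList<T>(size)
        backing.forEach { snapshot.add(it) }
        return snapshot.iterator()
    }
}
```

With the wrapper in place, call sites that only depend on the MutableSet interface need no changes.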
Use Doubles where possible
For some operations, particularly damage calculation, there was a risk of integer overflow, so I had previously switched from the Int type to the Long type for calculations in several places. On the JVM this was not a big deal and the performance impact was minimal. However, it turns out that Longs have a fairly noticeable performance cost in Kotlin/JS.
JavaScript has two built-in numeric types: Number, which is a 64-bit (double-precision) floating point type, and BigInt, which is an arbitrary-precision integer type. The 32-bit Kotlin Int type fits within a Number, so its implementation is relatively simple. However, the 64-bit Kotlin Long type does not fit, so it is implemented as its own class, which has a greater performance impact than Int does, since the implementation of arithmetic operations is more complex and involves two Number instances under the hood.
In the case of damage calculations, the requirements were that the numeric type be large enough to store intermediate values and that intermediate results have integer values. The Kotlin Long type would be a good fit if not for the aforementioned performance issues. On the other hand, the Kotlin Double type compiles directly to the JavaScript Number type, incurring essentially no additional overhead compared to other numeric types, and a Double can exactly represent every integer up to 2^53, which left plenty of headroom. Flooring after operations sufficed to ensure intermediate results were integers. I wrote an infix extension function for floor division for notational convenience.
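The helper might look something like this (the function name and the damage terms are hypothetical, but the shape follows the description above):

```kotlin
import kotlin.math.floor

// Hypothetical infix floor-division helper: divides and floors in one
// step so intermediate damage values stay integral.
infix fun Double.fdiv(divisor: Double): Double = floor(this / divisor)

// Example usage with made-up damage terms:
fun damage(base: Double, attackPercent: Double): Double =
    (base * attackPercent) fdiv 100.0
```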
Combined with the Set changes, end performance for bulk simulations was about 3-5x where it had started (completing an equal number of simulation iterations took about one third to one fifth as much time as before).
Avoid Kotlin Serialization
The official Kotlin Serialization (kotlinx.serialization) library is flexible and convenient, but its JSON decoding carries noticeable overhead in Kotlin/JS; the browser's native JSON.parse() is much faster at the cost of feature richness. To make use of it, rather than defining serializable data classes, I declared external interfaces describing the shape of the data and cast the output of JSON.parse() to them; serializing likewise just means calling JSON.stringify() on the result. Using the native JSON functions noticeably reduced the time spent on (de)serialization.
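In Kotlin/JS, this approach can be sketched as follows; the interface and its fields are invented for illustration:

```kotlin
// External interface describing the expected JSON shape. No class is
// generated; the parsed JS object is simply viewed through this interface.
external interface BulkResult {
    val iterations: Int
    val averageScore: Double
}

// kotlin.js.JSON wraps the browser's native JSON object, so these calls
// compile down to plain JSON.parse / JSON.stringify.
fun parseResult(raw: String): BulkResult = JSON.parse(raw)

fun serializeResult(result: BulkResult): String = JSON.stringify(result)
```

The trade-off is that nothing validates the incoming JSON against the interface; a malformed payload only surfaces when a field is accessed.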
Relight Specific Improvements
In addition to the preceding optimizations which are applicable to Kotlin/JS applications in general, I also made some improvements more specific to the functionality of Relight. Coincidentally, these all applied to the interactive simulation mode.
Cache formatted strings
Even after excluding the performance overhead of any client-side display functionality, interactive simulations ran much slower than bulk simulations: typical devices can run hundreds of individual simulations per second in bulk, while interactive simulations ran orders of magnitude more slowly. Profiling revealed why: log strings. Formatting log strings, which bulk simulations skip for all but one iteration, was by far the largest cost in interactive simulations.
Interactive simulations work by storing the sequence of actions (commands) previously taken and re-running the simulation each time a new action is taken. Storing just the sequence of actions makes implementing save-state and undo functionality quite straightforward and lightweight. However, this meant doing the expensive work of formatting log strings each time a new action was taken.
The solution was to cache log strings and skip computing them when it was unnecessary. Since taking another action only added more entries to the end of the existing log, doing so was as simple as skipping as many log calls as there were in the previous log and reusing the existing entries. Adding undo functionality just meant storing the history of the length of the log at each point in time and truncating the log to match its previous length whenever an undo happened.
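The scheme above can be sketched as a small class (the names are hypothetical, not Relight's actual API): the formatter lambda is only invoked for entries past the end of the cache, and undo truncates to a previously recorded length.

```kotlin
// Caches formatted log entries across simulation replays.
class CachedLog {
    private val entries = mutableListOf<String>()
    private var cursor = 0

    // Called before each re-run of the stored action sequence.
    fun beginReplay() { cursor = 0 }

    // Skips formatting entirely while cached entries remain.
    fun log(format: () -> String) {
        if (cursor >= entries.size) entries.add(format())
        cursor++
    }

    // Undo support: roll the log back to an earlier recorded length.
    fun truncateTo(length: Int) {
        while (entries.size > length) entries.removeAt(entries.size - 1)
    }

    val lines: List<String> get() = entries
}
```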
Avoid recreating identical HTML elements
In the client webpage, interactive simulations display a log of past events, which appears as a long list of <div> elements. Originally, all of these elements were cleared and recreated whenever the log was updated, which is obviously inefficient. Retaining the log entries that did not need to change after each update substantially reduced the overhead of updating the displayed log.
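The retained-element update can be sketched like this with the browser DOM APIs available to Kotlin/JS (function and parameter names are mine, not Relight's); it assumes existing entries never change in place, which holds for an append-and-truncate log:

```kotlin
import kotlinx.browser.document
import org.w3c.dom.Element

// Append <div>s only for new entries and truncate only when the log
// shrank (e.g. after an undo); unchanged entries keep their DOM nodes.
fun updateLogView(container: Element, entries: List<String>) {
    // Remove surplus children if the new log is shorter.
    while (container.childElementCount > entries.size) {
        container.lastElementChild?.let { container.removeChild(it) }
    }
    // Create elements only for the entries not yet displayed.
    for (i in container.childElementCount until entries.size) {
        val div = document.createElement("div")
        div.textContent = entries[i]
        container.appendChild(div)
    }
}
```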