published on June 29, 2026 in devlog

Pruning the Past - Development Log #536

In this week's devlog Michi talks about tackling a performance issue related to the database.


Michi (molp)

Summer has us in its grip. Europe is reeling from a heat wave. Thinking is hard in this heat, but we made some progress nonetheless. Before I dive into it, let me remind you that PRO licenses are 15% off until the end of the month! There is not much time left.

In last week's devlog #535 I talked about dynamic inflation adjustments. I did start working on this, but there is nothing to see yet, so I'll postpone that to another devlog.

Instead, I want to quickly highlight what else I worked on this week. If you are a regular reader of the devlogs, you know that I often look into performance issues and try to fix them. The release of the last maintenance update uncovered a performance degradation in our data persistence system. That is a bit unusual, so it piqued my interest.

A quick reminder: we don't have a relational data model where entity data is stored in rows of tables, but rather an event-based one. Each "thing" in the game is an entity, and for each entity we store a stream of events that alter its state. We take a snapshot every 10k events, which greatly helps with restoring the state from the database. Only the events after the latest snapshot have to be applied to get to the current state.
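To make the snapshot idea a bit more concrete, here is a minimal sketch in Python. It is not our actual persistence code: the `EventStream` class, the counter-style state and the in-memory lists are all made up for illustration. Only the interval of 10k events comes from the paragraph above.

```python
SNAPSHOT_INTERVAL = 10_000  # from the devlog: one snapshot every 10k events


class EventStream:
    """Hypothetical in-memory stand-in for one entity's persisted event stream.

    The "state" is just a counter and every event is a delta that is added
    to it. Real entities are far more complex; this only shows the mechanics.
    """

    def __init__(self, snapshot_interval=SNAPSHOT_INTERVAL):
        self.snapshot_interval = snapshot_interval
        self.events = []      # event deltas, in the order they were appended
        self.snapshots = {}   # number of events covered -> state at that point
        self._state = 0       # live state, kept so snapshots are cheap to take

    def append(self, delta):
        self.events.append(delta)
        self._state += delta
        seq = len(self.events)
        if seq % self.snapshot_interval == 0:
            self.snapshots[seq] = self._state

    def restore(self, use_snapshot=True):
        """Rebuild the state; returns (state, number_of_events_replayed)."""
        seq, state = 0, 0
        if use_snapshot and self.snapshots:
            seq = max(self.snapshots)        # latest snapshot
            state = self.snapshots[seq]
        tail = self.events[seq:]             # only events after the snapshot
        for delta in tail:
            state += delta
        return state, len(tail)
```

With a snapshot interval of 10 and 25 events, `restore()` replays only the last 5 events, while `restore(use_snapshot=False)` replays all 25 and arrives at the same state.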

The current universe is now 1952 days old. In the past, being generous with the size or number of events wasn't a problem. Now, some entities have collected so many events that the database runs into issues. The pacemaker is a prime example. It is a central entity in the system that reminds other entities to check whether they need to update their state. If a company starts a new production order, for example, the production behavior goes through the list of all production orders, finds the one that will finish next and sends a simple message to the pacemaker: wake me up once we reach this specific timestamp. The pacemaker persists these "ticks", as we call them. It now has well over 720 million of them. The pacemaker also uses snapshots, but I noticed that sometimes they would not be persisted. That was an easy fix, but the events themselves, even the ones before a successful snapshot, are never deleted. The main reason is that if we ever introduce an update that renders the snapshots invalid, we can simply ignore them and restore the entity from the full event stream. If we had deleted the events, that would not be possible, and fixing such an issue sounds like a nightmare.
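For the curious, the wake-up mechanism described above could be sketched roughly like this. This is a simplified illustration, not the real pacemaker: the `Pacemaker` class, its method names, the `next_completion` helper and the use of a heap are assumptions made for the example.

```python
import heapq


def next_completion(order_finish_times):
    """Hypothetical helper: the timestamp of the order that finishes next."""
    return min(order_finish_times)


class Pacemaker:
    """Illustrative pacemaker that stores ticks as (timestamp, entity_id)."""

    def __init__(self):
        self._ticks = []  # min-heap, so the earliest tick is always first

    def schedule(self, entity_id, wake_at):
        # "Wake me up once we reach this specific timestamp."
        heapq.heappush(self._ticks, (wake_at, entity_id))

    def due(self, now):
        """Remove and return the entities whose tick is at or before `now`."""
        woken = []
        while self._ticks and self._ticks[0][0] <= now:
            _, entity_id = heapq.heappop(self._ticks)
            woken.append(entity_id)
        return woken
```

A production behavior with orders finishing at 300, 120 and 200 would call `schedule("company-a", next_completion([300, 120, 200]))` and be woken once `due()` is called with a timestamp of 120 or later.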

We have now reached the point where we cannot ignore having millions of events for a single entity anymore. So we decided to implement a new system that keeps not only the latest snapshot, but also every snapshot within a retention window of 10 days. Every event before the oldest retained snapshot gets deleted, creating some breathing room for the database. In case we run into an issue, we can fall back on an older snapshot. In the beginning we will not roll this out to every type of entity, but only to a few selected ones, like the pacemaker. The change is deployed on the test server and I am currently monitoring whether we run into any issues.
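The pruning rule can be sketched as follows. Again, this is only an illustration under a few assumptions: the `Snapshot` type, the `prune` function and the convention that a snapshot with sequence number N covers events 1 to N are invented for the example. Only the 10-day window and the "keep the latest snapshot" rule come from the description above.

```python
from dataclasses import dataclass

RETENTION_DAYS = 10       # from the devlog
SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class Snapshot:
    seq: int         # assumed: sequence number of the last event it covers
    taken_at: float  # unix timestamp of when the snapshot was persisted


def prune(snapshots, event_seqs, now, retention_days=RETENTION_DAYS):
    """Return (kept_snapshots, kept_event_seqs) after applying retention."""
    if not snapshots:
        # Without a snapshot there is nothing to fall back on: delete nothing.
        return [], list(event_seqs)

    cutoff = now - retention_days * SECONDS_PER_DAY
    latest = max(snapshots, key=lambda s: s.seq)

    # Keep every snapshot inside the window, and always the latest one.
    kept = sorted(
        (s for s in snapshots if s.taken_at >= cutoff or s == latest),
        key=lambda s: s.seq,
    )

    # Events covered by the oldest retained snapshot are no longer needed.
    oldest_seq = kept[0].seq
    kept_events = [seq for seq in event_seqs if seq > oldest_seq]
    return kept, kept_events
```

Note that the latest snapshot is kept even if it is older than the window, so an entity that has been quiet for a while can still be restored.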

I admit, that was pretty technical... but sometimes game development is just that. Thanks for reading if you made it this far :D

As always, we'd love to hear what you think: join us on Discord or the forums!

Happy trading!