published on April 14, 2025 in devlog
Michi talks about the recent performance issues with two commodity exchange brokers and how he fixed them.
Michi (molp)
In software development, the trouble with performance problems is that they show up when you least expect them and have other plans for the day. It's no different for Prosperous Universe.
Last week players notified me that the commodity exchange broker for STL fuel SF.NC1 was slow and sometimes wouldn't load at all. Since this broker handles an essential commodity, I stopped what I was doing and started looking into it. It quickly turned out that the broker was consuming too much memory. It had grown so large that the snapshot persister refused to write it back to the database. This caused the broker to restart frequently, and while restarting it wasn't available for business.
The brokers work by accepting orders from players and keeping them in their order book. When two orders match, the trading partners are notified, money and commodities are exchanged, and the order is removed from the order book. The order is not deleted immediately, though, because the traders might want to have a look at it later. Naturally, the list of orders a broker holds grows over time. This is exactly what happened with this broker as well.
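The post doesn't show the actual broker code, but the idea can be sketched in a few lines of Python. All names here (`Order`, `Broker`, `place`) are illustrative, not the real implementation; the key detail is that matched orders are only *marked* deleted, not removed:

```python
from dataclasses import dataclass, field

@dataclass
class Order:
    company: str           # who placed the order
    side: str              # "buy" or "sell"
    price: int
    deleted: bool = False  # soft-delete flag; order stays around for review

@dataclass
class Broker:
    book: list = field(default_factory=list)

    def place(self, order: Order) -> None:
        # Try to match the new order against the opposite side of the book.
        for other in self.book:
            if other.deleted or other.side == order.side:
                continue
            buy, sell = (order, other) if order.side == "buy" else (other, order)
            if buy.price >= sell.price:
                # Trading partners are notified and money/commodities are
                # exchanged (omitted here). Both orders are only *marked*
                # deleted so the traders can still look at them later.
                other.deleted = True
                order.deleted = True
                self.book.append(order)
                return
        self.book.append(order)
```

Because nothing is ever physically removed at match time, the book can only grow, which is exactly the memory problem described above.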
We had run into this exact problem before, which is why we have two mechanisms to reduce the number of orders. The first one simply goes over all orders once in a while and deletes the ones marked as deleted. This gets rid of a small percentage of the orders.
We can't delete the filled ones right away because, as I said before, the players might want to have a look at them. To automatically delete filled orders (mechanism number two) we introduced a cut-off time: after 90 days, the players' companies automatically delete any orders older than the cut-off.
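The two mechanisms together amount to a periodic sweep over the book. A minimal sketch, assuming orders carry `deleted`, `filled`, and `created` fields (the field names are my own, not the game's):

```python
from datetime import datetime, timedelta

CUT_OFF = timedelta(days=90)  # the cut-off time discussed above

def sweep(book: list, now: datetime) -> list:
    """Mechanism 1: drop orders already marked as deleted.
    Mechanism 2: drop filled orders older than the cut-off time.
    Open (unfilled) orders are never touched."""
    return [
        order for order in book
        if not order["deleted"]
        and not (order["filled"] and now - order["created"] > CUT_OFF)
    ]
```

Shrinking `CUT_OFF` from 90 to 60 days should make the second condition bite more often, which is why that was the first fix attempted.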
So, to fix the current performance problem, I reduced the cut-off time from 90 to 60 days and moved on. Just a few days later, another broker ran into the same issue. It was clear that reducing the cut-off time wouldn't do the trick. After analyzing a heap dump, I noticed that the reduced cut-off time hadn't really done much at all. Something was off. So I dug through the orders of one of the brokers and quickly noticed plenty of order data that was two-plus years old. This shouldn't be possible!
After quite a bit of staring at the code, it finally dawned on me: the deletion of the orders was working, but it didn't delete the accompanying buys and sells data. Each buy and sell is identified by an id that was used when creating the corresponding transaction between the broker and the company. To prevent data corruption, we have a check in place that first verifies the transaction is complete before deleting any order data. The transactions are under strict housekeeping, though, because there are so many of them: their cut-off time is less than an hour.
So every time an order got deleted, the buys and sells data remained. There was no data left about the transaction that created them, and instead of interpreting that as "go ahead, you're safe to delete the data", we interpreted it as "no can do". After fixing that, the brokers deleted an incredible amount of data and are up and running again. The two offending brokers SF.NC1 and DW.NC1 saw a > 90% reduction in size!
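The bug boils down to how a missing transaction is interpreted. A hypothetical sketch of the safety check (function and field names are mine, not the actual codebase):

```python
def can_delete_trade_data(transaction) -> bool:
    """Buggy check: a missing transaction (already housekept away,
    since transactions live less than an hour) was treated the same
    as an incomplete one, so trade data was never deleted."""
    if transaction is None:
        return False  # "no can do" -- the bug
    return transaction["complete"]

def can_delete_trade_data_fixed(transaction) -> bool:
    """Fixed check: if the transaction is gone, it must have completed
    long ago, so the buys/sells data is safe to delete."""
    if transaction is None:
        return True  # "go ahead, you're safe to delete the data"
    return transaction["complete"]
```

With the fix, the huge backlog of orphaned buys and sells finally qualified for deletion, hence the > 90% size drop.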
Maybe you’ve already seen it on the forums: lowstrife took a close look at the market makers and put a lot of work and number crunching into it. Counterpoint validated his work and adjusted where necessary. He's now preparing a patch, and we're planning to release these changes in May.
As always, we'd love to hear what you think: join us on Discord or the forums!
Happy trading!