
March 23, 2025

How We Cut Our Stockfish API Response Time from 15 Seconds to Under 400ms

Our Stockfish analysis API was taking anywhere from 2 to 15 seconds to respond. We had no idea why — until backend analytics showed us exactly where the time was going.

behind-the-scenes · performance · backend · development

When you're building a real-time chess analysis tool, latency isn't just a nice-to-have metric — it's the product. If a user makes a move and waits three seconds for an engine suggestion, the experience breaks. If they wait fifteen seconds, they close the tab.

For most of ChessSolve's early life, our Stockfish API responses were taking somewhere between 2,000ms and 15,000ms. We knew it was slow. We didn't know why.

The Problem With Guessing

Our API endpoint takes a board position and returns the best move from Stockfish. The pipeline has a few moving parts: parsing the FEN, spawning or reusing a Stockfish process, waiting for analysis at the requested depth, formatting the response. Any one of those steps could be the bottleneck — or all of them could be, depending on the request.
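To make that concrete, here's a simplified sketch of what the per-request version looked like. It's illustrative rather than our production code, and it assumes a stockfish binary on the PATH:

```ts
import { spawn } from "node:child_process";
import readline from "node:readline";

// Simplified sketch of the original per-request pipeline.
// (A production version would also do the UCI `uci`/`isready` handshake.)
async function analyze(fen: string, depth: number): Promise<string> {
  const engine = spawn("stockfish");            // fresh process per request
  const lines = readline.createInterface({ input: engine.stdout });

  engine.stdin.write(`position fen ${fen}\n`);  // send the parsed position
  engine.stdin.write(`go depth ${depth}\n`);    // analyze to the requested depth

  // Wait for the engine's "bestmove ..." line, then tear the process down.
  for await (const line of lines) {
    if (line.startsWith("bestmove")) {
      engine.kill();
      return line.split(" ")[1];                // e.g. "e2e4"
    }
  }
  throw new Error("engine exited without a bestmove");
}
```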

Without visibility into the backend, we were guessing. We added some console.log timestamps, deployed, and checked logs manually. It worked, but it was slow to iterate and gave us no aggregate view, just individual request traces buried in a wall of output.
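The ad-hoc version amounted to something like this (illustrative; parseFen and the route shape are stand-ins for our actual handlers, and analyze is the sketch above):

```ts
// Manual timing: wrap each stage in timestamps and read the logs by hand.
// Every request produces its own trace lines, but there is no aggregate view.
app.post("/analyze", async (req, res) => {
  const t0 = Date.now();
  const position = parseFen(req.body.fen);      // hypothetical FEN helper
  console.log(`parse: ${Date.now() - t0}ms`);

  const t1 = Date.now();
  const bestMove = await analyze(position, req.body.depth);
  console.log(`engine: ${Date.now() - t1}ms`);

  res.json({ bestMove });
});
```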

What we needed was a clear picture of response time distribution across all requests: were most requests fast with a long tail of outliers? Or was everything slow? The answer changes what you fix first.

Getting Backend Visibility

We added Statvisor to the backend. It tracks response times, request volumes, and error rates across our API routes — no custom event configuration needed, just instrumentation at the route level. Within a day we had a real response time distribution in front of us.
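We won't reproduce Statvisor's setup here, but conceptually, route-level instrumentation boils down to a single timing middleware. Here's a hand-rolled Express sketch of the idea; recordMetric is a hypothetical sink, not Statvisor's API:

```ts
import express from "express";

const app = express();

// Hypothetical sink; a real tool batches these and ships them off-process.
function recordMetric(route: string, status: number, ms: number): void {
  /* ... */
}

// One middleware times every request against its route, so a response time
// distribution accumulates automatically, with no per-endpoint setup.
app.use((req, res, next) => {
  const start = process.hrtime.bigint();
  res.on("finish", () => {
    const ms = Number(process.hrtime.bigint() - start) / 1e6;
    recordMetric(req.route?.path ?? req.path, res.statusCode, ms);
  });
  next();
});
```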

The picture was immediately clear: it wasn't a long tail problem. A large portion of requests were genuinely slow — 5–10 seconds regularly, with spikes past 15 seconds. And the pattern pointed directly at Stockfish process startup time.

What Was Actually Slow

Each API request was spawning a fresh Stockfish process, waiting for it to initialize, sending the position, waiting for the engine to reach the target depth, then killing the process. At low depth settings this was tolerable. At higher depths, combined with cold process startup, the time ballooned unpredictably.

The fix was persistent Stockfish processes with a simple pool — keep a process warm and ready, send positions to it, reuse it across requests. Process startup went from happening on every request to happening once at server boot.
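In sketch form, the fix moves the spawn out of the request path. Here's the single-worker core of the idea (restart and error handling omitted; the pool and queue come next):

```ts
import { spawn } from "node:child_process";
import readline from "node:readline";

// One warm engine, spawned once at server boot instead of once per request.
const engine = spawn("stockfish");
const lines = readline.createInterface({ input: engine.stdout });

// Resolve with the move from the engine's next "bestmove ..." line.
function nextBestMove(): Promise<string> {
  return new Promise((resolve) => {
    const onLine = (line: string) => {
      if (line.startsWith("bestmove")) {
        lines.off("line", onLine);
        resolve(line.split(" ")[1]);
      }
    };
    lines.on("line", onLine);
  });
}

// Assumes one request in flight at a time; concurrent callers need the
// queue described below.
export async function analyzeWarm(fen: string, depth: number): Promise<string> {
  engine.stdin.write(`position fen ${fen}\ngo depth ${depth}\n`);
  return nextBestMove();
}
```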

The second issue the analytics surfaced was queue behavior under concurrent load. When multiple requests arrived simultaneously, they were all trying to spin up processes at the same time, creating contention. A request queue with a small pool of persistent workers solved this — requests wait for a free worker instead of competing for resources.
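And a minimal version of the queue-plus-pool shape, again illustrative (worker restarts, timeouts, and error handling omitted):

```ts
import { spawn } from "node:child_process";
import readline from "node:readline";

// A warm engine wrapper: one persistent process, one in-flight job at a time.
class Engine {
  private proc = spawn("stockfish");
  private lines = readline.createInterface({ input: this.proc.stdout });

  bestMove(fen: string, depth: number): Promise<string> {
    this.proc.stdin.write(`position fen ${fen}\ngo depth ${depth}\n`);
    return new Promise((resolve) => {
      const onLine = (line: string) => {
        if (line.startsWith("bestmove")) {
          this.lines.off("line", onLine);
          resolve(line.split(" ")[1]);
        }
      };
      this.lines.on("line", onLine);
    });
  }
}

type Job = { fen: string; depth: number; resolve: (move: string) => void };

const POOL_SIZE = 4;                       // illustrative; tune to your cores
const idle: Engine[] = Array.from({ length: POOL_SIZE }, () => new Engine());
const queue: Job[] = [];

// Pair waiting jobs with free workers; requests queue instead of competing.
function dispatch(): void {
  while (idle.length > 0 && queue.length > 0) {
    const engine = idle.pop()!;
    const job = queue.shift()!;
    engine.bestMove(job.fen, job.depth).then((move) => {
      job.resolve(move);
      idle.push(engine);                   // return the worker to the pool
      dispatch();
    });
  }
}

export function analyze(fen: string, depth: number): Promise<string> {
  return new Promise((resolve) => {
    queue.push({ fen, depth, resolve });
    dispatch();
  });
}
```

The pool size is a real knob: each Stockfish process eats a core while searching, so a pool much larger than the machine's core count just reintroduces the contention the queue was meant to remove.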

The Result

After both changes, backend analytics showed the new distribution clearly:

  • Typical response time: 200–400ms
  • P95 under 600ms
  • Spikes above 1,000ms became rare and isolated

That's roughly a 10–40x improvement depending on where in the old distribution you were measuring from. The user-facing difference is obvious — analysis feels instant rather than delayed.

What Good Backend Analytics Actually Shows You

The value wasn't in a single metric. It was in having a continuous picture of how the API was performing across all requests, so we could:

  • Confirm the problem was real and widespread (not just us imagining it)
  • Identify which part of the pipeline was responsible
  • Verify the fix actually worked after deployment (the distribution shifted immediately and visibly)
  • Catch regressions early if they appear

None of that is possible with ad-hoc log inspection. You need aggregate data over time, and you need it to be automatic — not something you set up manually for each endpoint.

Statvisor gave us that without adding much overhead to the stack. For a small team working on a performance-sensitive product, having backend response time visibility from day one would have saved us weeks of operating blind.

If you're building an API with real latency requirements and you're not measuring it yet, start now. The data will tell you something you don't already know.

