Platform API and processing
May 7, 2025 11:42 CEST
May 7, 2025 09:42 UTC
[Investigating] We are investigating elevated failure rates and processing delays.
We're focusing on stabilizing the system first, and will then proceed to reset failed documents.
May 7, 2025 11:59 CEST
May 7, 2025 09:59 UTC
[Monitoring] The system seems to have stabilized, and we'll increase throughput again to work through the processing backlog.
May 7, 2025 13:49 CEST
May 7, 2025 11:49 UTC
[Monitoring] We're still throttling to avoid overloads while processing the backlog. We've also disabled some write-ahead caching of ground truth data to reduce pressure on certain components, and are looking into more ways to throttle upstream.
As for a cause, we've seen slowly increasing response times on some database queries, which we're looking into. The database looks healthy otherwise.
May 7, 2025 14:14 CEST
May 7, 2025 12:14 UTC
[Monitoring] The processing backlog has been worked through, and we'll reset the documents that failed over the past few hours so they get processed too.
Formulation of preventative measures is ongoing.
May 7, 2025 14:53 CEST
May 7, 2025 12:53 UTC
[Monitoring] The system has been behaving well since 14:25 CEST. Failed documents have been reset.
We'll continue to monitor closely and look into the root cause.
May 8, 2025 08:13 CEST
May 8, 2025 06:13 UTC
[Resolved] No further issues. The upstream throttling implemented to stabilize the system will remain in place to reduce the likelihood of this issue recurring.