Forecasting rate limits in claumon

June 2026 · Go · MIT-licensed

claumon is a small, fast dashboard for Claude Code - a single binary, zero config, one browser tab. It shows rate-limit gauges, per-session token and cost breakdowns, and historical trends. This page covers one piece: the forecaster behind the rate-limit gauges.

Figure: the session-window forecast, showing projected utilization at reset with an 80% credible interval.

The forecaster reads the time series of utilization snapshots in each open rate-limit window (session, weekly, per-model weekly) and produces, per gauge, a point estimate of utilization at reset, an 80% credible interval, and a median ETA to a threshold (with its own interval) when that threshold is reachable before reset.

The design goal is honesty under uncertainty: the gauge never promises more headroom than the data supports, and the interval tells you how much to trust the number.

The model

Four pieces carry the forecast. The full derivation, calibration, and a worked example live in the spec.

The path law. Inside the open window, utilization accumulates as a Gamma process - a non-decreasing jump process. Conditioned on a mean rate $r$, the increment over a horizon $s$ is

$$u(t_{\text{now}} + s) - u_{\text{now}} \sim \operatorname{Gamma}\left(\text{shape} = \frac{r^{2} s}{\sigma_{\text{session}}^{2}},\ \text{scale} = \frac{\sigma_{\text{session}}^{2}}{r}\right)$$

so its mean and variance grow linearly in time:

$$\mathbb{E}\left[u(t_{\text{now}} + s) - u_{\text{now}}\right] = r s, \qquad \operatorname{Var}\left[u(t_{\text{now}} + s) - u_{\text{now}}\right] = \sigma_{\text{session}}^{2}\, s$$

Every increment is non-negative, so simulated paths are monotone and floored at the value now.
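The path law can be sketched in a few lines of Go. This is an illustration under stated assumptions, not claumon's implementation: the function names, the fixed time grid, the seed handling, and the numbers in `main` are all hypothetical, and the Gamma sampler is a hand-rolled Marsaglia-Tsang routine because Go's standard library does not ship one. The rate $r$ is held fixed and assumed positive.

```go
package main

import (
	"fmt"
	"math"
	"math/rand"
)

// sampleGamma draws from Gamma(shape, scale) with the Marsaglia-Tsang method,
// using the U^(1/shape) boost when shape < 1. Both arguments must be positive.
func sampleGamma(rng *rand.Rand, shape, scale float64) float64 {
	if shape < 1 {
		u := rng.Float64()
		return sampleGamma(rng, shape+1, scale) * math.Pow(u, 1/shape)
	}
	d := shape - 1.0/3.0
	c := 1 / math.Sqrt(9*d)
	for {
		x := rng.NormFloat64()
		v := 1 + c*x
		if v <= 0 {
			continue
		}
		v = v * v * v
		u := rng.Float64()
		if math.Log(u) < 0.5*x*x+d-d*v+d*math.Log(v) {
			return d * v * scale
		}
	}
}

// simulatePaths returns n paths of steps+1 points each, starting at uNow and
// covering `horizon` time units at a fixed mean rate r > 0 (hypothetical helper).
// Each step adds a Gamma(r^2*dt/sigmaSq, sigmaSq/r) increment, so the step has
// mean r*dt and variance sigmaSq*dt.
func simulatePaths(seed int64, n, steps int, uNow, r, sigmaSq, horizon float64) [][]float64 {
	rng := rand.New(rand.NewSource(seed))
	dt := horizon / float64(steps)
	shape := r * r * dt / sigmaSq
	scale := sigmaSq / r
	paths := make([][]float64, n)
	for i := range paths {
		p := make([]float64, steps+1)
		p[0] = uNow
		for k := 1; k <= steps; k++ {
			p[k] = p[k-1] + sampleGamma(rng, shape, scale) // increment is never negative
		}
		paths[i] = p
	}
	return paths
}

func main() {
	// Hypothetical inputs: 30% used now, 0.05 per hour, 4 hours to reset.
	paths := simulatePaths(1, 20000, 8, 0.30, 0.05, 0.002, 4)
	var sum float64
	for _, p := range paths {
		sum += p[len(p)-1]
	}
	fmt.Printf("mean terminal utilization: %.3f (theory %.3f)\n",
		sum/float64(len(paths)), 0.30+0.05*4)
}
```

Because every step adds a non-negative draw, each simulated path is non-decreasing by construction, and the sample mean at the horizon should land close to $u_{\text{now}} + r s$.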

The rate, by empirical Bayes. The unknown rate $r$ combines the recent OLS slope ($r_{\text{OLS}}$, with standard error $\text{SE}_{\text{OLS}}$) and a prior fit on past sessions (mean $\mu_{0}$, variance $\tau_{0}^{2}$) through a normal-normal conjugate update:

$$\frac{1}{\tau_{\text{post}}^{2}} = \frac{1}{\tau_{0}^{2}} + \frac{1}{\text{SE}_{\text{OLS}}^{2}}, \qquad r_{\text{post}} = \tau_{\text{post}}^{2} \left( \frac{\mu_{0}}{\tau_{0}^{2}} + \frac{r_{\text{OLS}}}{\text{SE}_{\text{OLS}}^{2}} \right)$$

The prior dominates early, when the slope is noisy; the data takes over as snapshots accumulate.
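The conjugate update is a precision-weighted average, which a short Go sketch makes concrete. The function name, parameter names, and example numbers are illustrative assumptions rather than claumon's API; both variances are assumed strictly positive.

```go
package main

import "fmt"

// posteriorRate combines a prior N(mu0, tau0Sq) on the rate with an OLS slope
// rOLS (standard error seOLS) through the normal-normal conjugate update.
// Hypothetical helper: tau0Sq and seOLS must be strictly positive.
func posteriorRate(mu0, tau0Sq, rOLS, seOLS float64) (rPost, tauPostSq float64) {
	precPrior := 1 / tau0Sq         // prior precision
	precData := 1 / (seOLS * seOLS) // precision of the OLS slope
	tauPostSq = 1 / (precPrior + precData)
	rPost = tauPostSq * (mu0*precPrior + rOLS*precData)
	return rPost, tauPostSq
}

func main() {
	// Early in the window: the slope is noisy, so the posterior stays near the prior mean.
	rEarly, vEarly := posteriorRate(0.10, 0.01, 0.30, 0.4)
	// Later: the slope is tight, so the posterior moves toward the OLS estimate.
	rLate, vLate := posteriorRate(0.10, 0.01, 0.30, 0.02)
	fmt.Printf("early: r_post=%.4f tau_post^2=%.6f\n", rEarly, vEarly)
	fmt.Printf("late:  r_post=%.4f tau_post^2=%.6f\n", rLate, vLate)
}
```

With the same prior and the same slope, shrinking the standard error from 0.4 to 0.02 moves the posterior rate from near the prior mean of 0.10 to near the OLS slope of 0.30.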

The forecast and its spread. The point forecast extrapolates the posterior rate to reset,

$$F = u_{\text{now}} + r_{\text{post}}\, \Delta t_{\text{rem}}$$

and the law of total variance splits its spread into rate uncertainty (quadratic in the remaining horizon) and path noise (linear in it):

$$\sigma_{F}^{2} = \underbrace{\Delta t_{\text{rem}}^{2}\, \max\left(\tau_{\text{post}}^{2}, \bar{\tau}^{2}\right)}_{\text{rate uncertainty}} + \underbrace{\Delta t_{\text{rem}}\, \sigma_{\text{session}}^{2}}_{\text{path noise}}$$

Figure: the forecast's spread is the sum of two pieces. Path noise grows linearly with the time to reset and dominates at short horizons, while rate uncertainty grows quadratically and takes over as more time remains. The marked point is where they contribute equally.
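The two terms and their crossover are easy to compute directly. The sketch below is illustrative only: the function names and numbers are hypothetical, and it reads $\bar{\tau}^{2}$ as a lower bound on the rate variance, which is how the max in the formula acts. Setting the two terms equal gives a crossover horizon of $\sigma_{\text{session}}^{2} / \max(\tau_{\text{post}}^{2}, \bar{\tau}^{2})$.

```go
package main

import (
	"fmt"
	"math"
)

// forecastVariance returns the two terms of sigma_F^2 for a remaining horizon
// dtRem: rate uncertainty (quadratic in dtRem) and path noise (linear in dtRem).
// tauFloorSq plays the role of the floor inside the max (assumed reading).
func forecastVariance(dtRem, tauPostSq, tauFloorSq, sigmaSq float64) (rate, path float64) {
	tauSq := math.Max(tauPostSq, tauFloorSq)
	rate = dtRem * dtRem * tauSq
	path = dtRem * sigmaSq
	return rate, path
}

// crossoverHorizon solves dt^2 * tauSq = dt * sigmaSq for dt > 0.
func crossoverHorizon(tauPostSq, tauFloorSq, sigmaSq float64) float64 {
	return sigmaSq / math.Max(tauPostSq, tauFloorSq)
}

func main() {
	// Hypothetical values; time is in hours.
	const tauPostSq, tauFloorSq, sigmaSq = 0.0001, 0.0004, 0.002
	for _, dt := range []float64{1, 5, 10} {
		rate, path := forecastVariance(dt, tauPostSq, tauFloorSq, sigmaSq)
		fmt.Printf("dt=%4.1f  rate=%.4f  path=%.4f\n", dt, rate, path)
	}
	fmt.Printf("crossover at dt=%.2f\n", crossoverHorizon(tauPostSq, tauFloorSq, sigmaSq))
}
```

With these numbers the floor is active (it exceeds the posterior variance), path noise is the larger term below the crossover horizon, and rate uncertainty is the larger term above it.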

The reported 80% interval is read off the Monte Carlo terminal quantiles rather than $\sigma_{F}$, so it keeps the right skew and the floor at the current value instead of forcing a symmetric Gaussian band.
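Reading an interval off simulated terminal values amounts to taking empirical quantiles. The sketch below assumes an equal-tailed interval (10th to 90th percentile) with linear interpolation between order statistics; both choices, along with the function names and the sample values, are assumptions made for illustration and may differ from what the spec prescribes.

```go
package main

import (
	"fmt"
	"sort"
)

// quantile returns the q-th empirical quantile (0 <= q <= 1) of samples, using
// linear interpolation between order statistics. It sorts a copy, so the input
// is left untouched. samples must be non-empty.
func quantile(samples []float64, q float64) float64 {
	s := append([]float64(nil), samples...)
	sort.Float64s(s)
	pos := q * float64(len(s)-1)
	lo := int(pos)
	if lo >= len(s)-1 {
		return s[len(s)-1]
	}
	frac := pos - float64(lo)
	return s[lo] + frac*(s[lo+1]-s[lo])
}

// interval80 returns an equal-tailed 80% interval (assumed convention).
func interval80(terminal []float64) (lo, hi float64) {
	return quantile(terminal, 0.10), quantile(terminal, 0.90)
}

func main() {
	// Hypothetical right-skewed terminal utilizations; a real run would use
	// thousands of Monte Carlo draws.
	terminal := []float64{0.42, 0.43, 0.44, 0.45, 0.46, 0.48, 0.50, 0.53, 0.58, 0.66, 0.80}
	lo, hi := interval80(terminal)
	fmt.Printf("median %.2f, 80%% interval [%.2f, %.2f]\n", quantile(terminal, 0.5), lo, hi)
}
```

On skewed draws like these the upper bound sits further from the median than the lower bound does, which a symmetric band built from $\sigma_{F}$ could not express.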

The ETA. Because the paths are monotone, a threshold $C_{\text{thr}}$ is crossed once and stays crossed, so the first-passage time is well defined:

$$T^{*} = \inf \left\{ t > t_{\text{now}} : u(t) \geq C_{\text{thr}} \right\}$$

The mean trajectory gives an early deterministic anchor,

$$\tilde{T}^{*} = t_{\text{now}} + \frac{C_{\text{thr}} - u_{\text{now}}}{r_{\text{post}}}$$

but the reported median ETA and its interval come from the same Monte Carlo. The Gamma skew pushes the median past this anchor, while near-zero rate draws - which may never reach the threshold before reset - stretch the upper tail.
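Monotonicity also makes the first-passage lookup cheap: on a simulated path the crossing index can be found by binary search. The sketch below shows the deterministic anchor and a per-path first-passage lookup. The function names, the time grid, and the sample path are hypothetical, times are measured as offsets from now rather than absolute timestamps, and the grid time returned is an upper bound on the true crossing time within that step.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// anchorETA returns the deterministic anchor as an offset from now:
// (cThr - uNow) / rPost. It returns 0 if the threshold is already reached and
// +Inf if the rate is not positive (assumed conventions).
func anchorETA(uNow, cThr, rPost float64) float64 {
	if uNow >= cThr {
		return 0
	}
	if rPost <= 0 {
		return math.Inf(1)
	}
	return (cThr - uNow) / rPost
}

// firstPassage returns the first grid time at which a non-decreasing path
// reaches cThr. ok is false when the path never reaches it before reset.
// Binary search is valid only because the path is monotone.
func firstPassage(times, path []float64, cThr float64) (eta float64, ok bool) {
	i := sort.Search(len(path), func(k int) bool { return path[k] >= cThr })
	if i == len(path) {
		return 0, false
	}
	return times[i], true
}

func main() {
	// Hypothetical grid (hours from now) and one simulated path.
	times := []float64{0, 0.5, 1.0, 1.5, 2.0}
	path := []float64{0.60, 0.64, 0.71, 0.83, 0.95}

	fmt.Printf("anchor: %.2f h\n", anchorETA(0.60, 0.80, 0.10))
	if eta, ok := firstPassage(times, path, 0.80); ok {
		fmt.Printf("this path crosses 0.80 by %.1f h\n", eta)
	}
	if _, ok := firstPassage(times, path, 0.99); !ok {
		fmt.Println("this path never reaches 0.99 before reset")
	}
}
```

Collecting `firstPassage` results across all simulated paths yields the sample from which a median and interval can be read. How paths that never cross are counted is left to the spec; one option is to treat them as crossing at infinity, so they only ever lengthen the upper tail.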

The method is defined by a normative, versioned spec that fixes the generative model and its calibration, so conforming implementations must match. Retired versions are archived alongside a changelog.

View on GitHub · Download the spec (PDF)