Alerts

Alerts are threshold rules evaluated on a recurring interval against a rolling window of your traffic. When a metric breaches its threshold, the rule starts firing and sends an email; when the metric recovers, the rule returns to ok and sends a resolve email. A rule that stays firing re-notifies on a cooldown.

Anatomy of a rule

Field	Description
Name	A label for the rule (1–200 chars).
Metric	What to measure — see the table below.
Condition	Comparison operator: `>`, `≥`, `<`, `≤`.
Threshold	The number the metric is compared against.
Window	The rolling lookback: 5 min, 15 min, 1 hour, or 24 hours.
Filters	Optional — narrow to a `modelId` and/or `agentName`.
Notify	An email address to notify.
Enabled	Toggle the rule on/off without deleting it.
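
As a rough illustration of the fields above, a rule could be modeled and checked as in the sketch below. The class name, attribute names, and window keys (`5m`, `15m`, `1h`, `24h`) are hypothetical and are not the product's API; only the constraints (name length, the four operators, the four windows) come from the table.

```python
from dataclasses import dataclass
from typing import List, Optional

# Allowed values taken from the table above; the identifiers are illustrative.
OPERATORS = {">", "≥", "<", "≤"}
WINDOW_SECONDS = {"5m": 300, "15m": 900, "1h": 3600, "24h": 86400}


@dataclass
class AlertRule:
    name: str
    metric: str
    condition: str
    threshold: float
    window: str
    notify: str
    model_id: Optional[str] = None    # optional filter
    agent_name: Optional[str] = None  # optional filter
    enabled: bool = True

    def validate(self) -> List[str]:
        """Return a list of problems; an empty list means the rule looks valid."""
        problems = []
        if not 1 <= len(self.name) <= 200:
            problems.append("name must be 1-200 characters")
        if self.condition not in OPERATORS:
            problems.append("condition must be one of >, ≥, <, ≤")
        if self.window not in WINDOW_SECONDS:
            problems.append("window must be 5 min, 15 min, 1 hour, or 24 hours")
        if "@" not in self.notify:
            # Deliberately loose check; real email validation is out of scope here.
            problems.append("notify must be an email address")
        return problems


rule = AlertRule(
    name="High error rate",
    metric="error_rate",
    condition=">",
    threshold=0.05,
    window="15m",
    notify="oncall@example.com",
)
print(rule.validate())
```

Returning a list of problems rather than raising on the first one lets a form surface every invalid field at once.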

Metrics

Metric	Meaning
`cost`	Total cost (USD) over the window.
`latency_p50` / `latency_p95` / `latency_p99`	Step-latency quantiles (ms).
`ttft_p95`	Time-to-first-token, p95 (ms).
`error_rate`	Fraction of spans that errored (0–1).
`token_usage`	Total tokens over the window.
`request_count`	Span count over the window.
`eval_avg_score`	Average score for a chosen eval.
`eval_pass_rate`	Pass ratio (0–1) for a chosen eval.

The two eval_* metrics require selecting a specific eval: the rule is then evaluated against that eval's results rather than raw span metrics.
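
To make the span-based metrics concrete, the sketch below computes a few of them over a rolling window. The `Span` fields are hypothetical, and the nearest-rank percentile is an assumption; this page does not state which quantile method the product uses.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Span:
    timestamp: float   # seconds since epoch (hypothetical field names throughout)
    latency_ms: float
    errored: bool
    tokens: int
    cost_usd: float


def percentile(values: List[float], q: int) -> Optional[float]:
    """Nearest-rank percentile for q in 1..100; None when there is no data."""
    if not values:
        return None
    ordered = sorted(values)
    rank = math.ceil(q * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def compute_metrics(spans: List[Span], now: float, window_seconds: float) -> Dict[str, Optional[float]]:
    # Keep only spans inside the rolling lookback window ending at `now`.
    recent = [s for s in spans if now - window_seconds < s.timestamp <= now]
    count = len(recent)
    latencies = [s.latency_ms for s in recent]
    return {
        "request_count": count,
        "error_rate": sum(s.errored for s in recent) / count if count else None,
        "latency_p50": percentile(latencies, 50),
        "latency_p95": percentile(latencies, 95),
        "token_usage": sum(s.tokens for s in recent),
        "cost": sum(s.cost_usd for s in recent),
    }


spans = [
    Span(100, 999, True, 10, 0.01),    # outside a 300 s window ending at t=1000
    Span(900, 100, False, 10, 0.01),
    Span(950, 200, False, 10, 0.01),
    Span(990, 300, True, 10, 0.01),
    Span(1000, 400, False, 10, 0.01),
]
metrics = compute_metrics(spans, now=1000, window_seconds=300)
print(metrics)
```

With the sample data, four spans fall inside the window, so `error_rate` is 1 of 4 and the old span does not affect any value.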

How evaluation works

A background evaluator sweeps every enabled rule on an interval (ALERT_EVAL_INTERVAL_MS, default 60s). For each rule it queries the metric over the window and compares it to the threshold:

ok → firing: the threshold is breached. Records a fired event (with the observed value and threshold) and emails the rule's Notify address.
firing → ok: the metric has recovered. Records a resolved event and sends a resolve email.
still firing: re-notifies at most once per ALERT_RENOTIFY_MS (default 1 hour) so you aren't paged every minute.

Each transition is stored as an alert event, giving every rule an auditable history of when it fired and resolved.
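
The transitions described above can be sketched as a small state machine. This is a minimal illustration, not the product's implementation: `RuleState`, `evaluate`, and the event dictionary shape are hypothetical, and the sketch assumes a re-notification sends an email without recording a new event, since only transitions are described as stored.

```python
import operator
from dataclasses import dataclass, field
from typing import List, Optional

COMPARATORS = {">": operator.gt, "≥": operator.ge, "<": operator.lt, "≤": operator.le}
RENOTIFY_MS = 60 * 60 * 1000  # mirrors the documented ALERT_RENOTIFY_MS default (1 hour)


@dataclass
class RuleState:
    status: str = "ok"                      # "ok" or "firing"
    last_notified_ms: Optional[int] = None
    events: List[dict] = field(default_factory=list)


def evaluate(state: RuleState, condition: str, threshold: float, value: float,
             now_ms: int, renotify_ms: int = RENOTIFY_MS) -> Optional[str]:
    """Apply one evaluation pass; return the notification to send, or None."""
    breached = COMPARATORS[condition](value, threshold)
    if breached and state.status == "ok":
        # ok -> firing: record the observed value and threshold, then notify.
        state.status = "firing"
        state.last_notified_ms = now_ms
        state.events.append({"type": "fired", "value": value,
                             "threshold": threshold, "at_ms": now_ms})
        return "fired"
    if not breached and state.status == "firing":
        # firing -> ok: record the recovery, then notify.
        state.status = "ok"
        state.last_notified_ms = None
        state.events.append({"type": "resolved", "at_ms": now_ms})
        return "resolved"
    if breached and now_ms - state.last_notified_ms >= renotify_ms:
        # Still firing and the cooldown has elapsed: notify again, no new event.
        state.last_notified_ms = now_ms
        return "renotify"
    return None


state = RuleState()
for now_ms, value in [(0, 0.10), (60_000, 0.10), (3_600_000, 0.10), (3_660_000, 0.01)]:
    print(now_ms, evaluate(state, ">", 0.05, value, now_ms))
```

Keeping the cooldown timestamp on the rule state means each sweep stays stateless apart from what it reads from and writes to that record.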

Alert emails require email to be configured on the deployment (RESEND_API_KEY). Without it, rules still evaluate and change state, but no notifications are sent. See self-hosting configuration.
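
For self-hosters, the three settings named on this page could be read as in the sketch below. The environment variable names and defaults come from the text above; `AlertConfig` and `load_alert_config` are hypothetical names, and treating an empty RESEND_API_KEY as "not configured" is an assumption.

```python
import os
from typing import Mapping, NamedTuple


class AlertConfig(NamedTuple):
    eval_interval_ms: int   # how often enabled rules are swept
    renotify_ms: int        # minimum gap between emails for a rule that stays firing
    email_enabled: bool     # False means rules still evaluate but send nothing


def load_alert_config(env: Mapping[str, str]) -> AlertConfig:
    return AlertConfig(
        eval_interval_ms=int(env.get("ALERT_EVAL_INTERVAL_MS", 60_000)),
        renotify_ms=int(env.get("ALERT_RENOTIFY_MS", 3_600_000)),
        email_enabled=bool(env.get("RESEND_API_KEY")),
    )


config = load_alert_config(os.environ)
if not config.email_enabled:
    print("RESEND_API_KEY is not set: rules will change state but no emails are sent")
```

Passing the environment in as a mapping, rather than reading `os.environ` inside the function, keeps the loader easy to exercise with test values.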

List view

The Alerts page lists every rule with its metric, condition, threshold, window, current status (an ok or firing badge), and an inline enable switch. Plans cap how many rules an org can create; the Usage panel shows your current count against the limit.
