Alerts are threshold rules evaluated continuously against a rolling window of your traffic. When a metric breaches its threshold, the rule starts firing and sends an email; when it recovers it returns to ok and sends a resolve email. A rule that stays firing re-notifies on a cooldown.
Anatomy of a rule
| Field | Description |
|---|---|
| Name | A label for the rule (1–200 chars). |
| Metric | What to measure — see the table below. |
| Condition | Comparison operator: >, ≥, <, ≤. |
| Threshold | The number the metric is compared against. |
| Window | The rolling lookback: 5 min, 15 min, 1 hour, or 24 hours. |
| Filters | Optional — narrow to a modelId and/or agentName. |
| Notify | An email address to notify. |
| Enabled | Toggle the rule on/off without deleting it. |
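
To make the field constraints concrete, here is a minimal Python sketch of a rule as a validated record. The class name `AlertRule`, its attribute names, and the representation of the window in minutes are assumptions for illustration, not the product's actual schema or API; only the allowed values come from the table above.

```python
from dataclasses import dataclass
from typing import Optional

# Allowed values taken from the table above; the identifiers are hypothetical.
CONDITIONS = {">", ">=", "<", "<="}
WINDOWS_MIN = {5, 15, 60, 1440}  # 5 min, 15 min, 1 hour, 24 hours


@dataclass
class AlertRule:
    name: str
    metric: str
    condition: str
    threshold: float
    window_min: int
    notify: str
    model_id: Optional[str] = None    # optional filter
    agent_name: Optional[str] = None  # optional filter
    enabled: bool = True

    def __post_init__(self) -> None:
        # Reject values the table does not allow.
        if not 1 <= len(self.name) <= 200:
            raise ValueError("name must be 1-200 characters")
        if self.condition not in CONDITIONS:
            raise ValueError(f"unsupported condition: {self.condition}")
        if self.window_min not in WINDOWS_MIN:
            raise ValueError(f"unsupported window: {self.window_min} min")


# Example: alert when p95 latency for one agent exceeds 2000 ms over 15 minutes.
rule = AlertRule(
    name="High p95 latency",
    metric="latency_p95",
    condition=">",
    threshold=2000.0,
    window_min=15,
    notify="oncall@example.com",
    agent_name="support-bot",
)
```

Keeping `enabled` as a plain flag mirrors the table: a rule can be switched off without being deleted.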
Metrics
| Metric | Meaning |
|---|---|
| cost | Total cost (USD) over the window. |
| latency_p50 / latency_p95 / latency_p99 | Step-latency quantiles (ms). |
| ttft_p95 | Time-to-first-token, p95 (ms). |
| error_rate | Fraction of spans that errored (0–1). |
| token_usage | Total tokens over the window. |
| request_count | Span count over the window. |
| eval_avg_score | Average score for a chosen eval. |
| eval_pass_rate | Pass ratio (0–1) for a chosen eval. |
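
As a rough illustration of how several of these metrics could be derived from the spans inside a window, consider the sketch below. The `Span` record, its field names, and the nearest-rank quantile are all assumptions; the service may store spans differently and use another quantile estimator.

```python
import math
from dataclasses import dataclass


@dataclass
class Span:
    # Hypothetical span record; field names are assumptions for this sketch.
    latency_ms: float
    cost_usd: float
    tokens: int
    errored: bool


def quantile_nearest_rank(values: list[float], q: float) -> float:
    """Nearest-rank quantile; the service may use a different estimator."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q * len(ordered)))
    return ordered[rank - 1]


def window_metrics(spans: list[Span]) -> dict[str, float]:
    """Aggregate a non-empty list of spans that already fall inside the window."""
    if not spans:
        raise ValueError("no spans in window")
    return {
        "cost": sum(s.cost_usd for s in spans),
        "latency_p95": quantile_nearest_rank([s.latency_ms for s in spans], 0.95),
        "error_rate": sum(s.errored for s in spans) / len(spans),
        "token_usage": sum(s.tokens for s in spans),
        "request_count": len(spans),
    }


# Ten spans with latencies 100..1000 ms; only the slowest one errored.
spans = [
    Span(latency_ms=100.0 * i, cost_usd=0.01, tokens=50, errored=(i == 10))
    for i in range(1, 11)
]
metrics = window_metrics(spans)
```

Note that `error_rate` is a fraction between 0 and 1, so a threshold of 5% would be entered as 0.05.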
The two `eval_*` metrics require selecting a specific eval: the rule scores against that eval's results rather than raw span metrics.
How evaluation works
A background evaluator sweeps every enabled rule on an interval (`ALERT_EVAL_INTERVAL_MS`, default 60s). For each rule it queries the metric over the window and compares it to the threshold:
- ok → firing: threshold breached. Records a `fired` event (with the observed value and threshold) and emails the channel.
- firing → ok: metric recovered. Records a `resolved` event and emails.
- still firing: re-notifies at most once per `ALERT_RENOTIFY_MS` (default 1 hour) so you aren't paged every minute.
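
The three transitions above can be sketched as a small state machine. This is an illustrative model only, not the evaluator's actual code: `RuleState`, `evaluate`, and the `"renotified"` label are hypothetical names, and the cooldown is assumed to be measured from the last notification sent.

```python
import operator
from dataclasses import dataclass
from typing import Optional

# Comparison operators from the rule's Condition field.
OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt, "<=": operator.le}

ALERT_RENOTIFY_MS = 60 * 60 * 1000  # default 1 hour, per the text above


@dataclass
class RuleState:
    # Hypothetical per-rule state kept between sweeps.
    status: str = "ok"                      # "ok" or "firing"
    last_notified_ms: Optional[int] = None


def evaluate(state: RuleState, value: float, condition: str,
             threshold: float, now_ms: int) -> Optional[str]:
    """Apply one sweep; return the action taken, or None if nothing is sent."""
    breached = OPS[condition](value, threshold)
    if state.status == "ok" and breached:
        state.status, state.last_notified_ms = "firing", now_ms
        return "fired"
    if state.status == "firing" and not breached:
        state.status, state.last_notified_ms = "ok", None
        return "resolved"
    if state.status == "firing" and now_ms - state.last_notified_ms >= ALERT_RENOTIFY_MS:
        state.last_notified_ms = now_ms
        return "renotified"
    return None


# Simulate sweeps of an error_rate > 0.05 rule at selected minutes.
state = RuleState()
minute = 60_000
actions = [
    evaluate(state, 0.02, ">", 0.05, 0 * minute),   # below threshold
    evaluate(state, 0.09, ">", 0.05, 1 * minute),   # breach
    evaluate(state, 0.08, ">", 0.05, 2 * minute),   # still firing, inside cooldown
    evaluate(state, 0.08, ">", 0.05, 61 * minute),  # cooldown elapsed
    evaluate(state, 0.01, ">", 0.05, 62 * minute),  # recovered
]
```

In this model a rule that stays in breach sends one email when it starts firing and then at most one per cooldown period, however often the sweep runs.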

