Evals score your real traffic automatically. An eval is a rule that samples matching traces or spans after ingest and scores them — either with a fast, deterministic code check or with an LLM judge. Scores show up on the trace they came from, roll into the Overview pass-rate KPI, and can drive alerts.

Creating an eval

Evals are created in a short wizard:
  1. Target: what to run on. Choose the level: trace (score the whole run) or span (score individual steps). Optionally filter by agent and trace name and, for span-level evals, by span type and model.
  2. Check — what to verify. Pick a preset (below).
  3. Score — how to score. For LLM judges, pick a judge model; for parameterized checks, set the parameter (a substring, pattern, or max length). Set a sample rate (1%–100%) to control how much matching traffic is scored.
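
The targeting and sampling steps above can be pictured as a filter followed by a random draw. The following is a minimal sketch under stated assumptions, not Foglamp's implementation: `EvalTarget`, `matches`, `should_score`, and the keys of the `item` dictionary are hypothetical names chosen for illustration.

```python
import random
from dataclasses import dataclass
from typing import Optional


@dataclass
class EvalTarget:
    """Hypothetical shape of an eval's target filters (not Foglamp's schema)."""
    level: str                        # "trace" or "span"
    agent: Optional[str] = None
    trace_name: Optional[str] = None
    span_type: Optional[str] = None   # only consulted for span-level evals
    model: Optional[str] = None       # only consulted for span-level evals


def matches(target: EvalTarget, item: dict) -> bool:
    """True if the item is at the target level and passes every filter that is set."""
    if item.get("level") != target.level:
        return False
    filters = {"agent": target.agent, "trace_name": target.trace_name}
    if target.level == "span":
        filters.update({"span_type": target.span_type, "model": target.model})
    # A filter left as None means "match anything".
    return all(want is None or item.get(key) == want for key, want in filters.items())


def should_score(target: EvalTarget, item: dict, sample_rate_pct: float, rng=random) -> bool:
    """Score a matching item with probability sample_rate_pct / 100."""
    if not 1 <= sample_rate_pct <= 100:
        raise ValueError("sample rate must be between 1 and 100 percent")
    return matches(target, item) and rng.random() * 100 < sample_rate_pct
```

Under these assumptions, a 100% sample rate scores every matching item, while a 10% rate scores roughly one in ten, which is one way to keep judge spend bounded on high-volume traffic.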

Code checks

Code checks are deterministic, free, and run with no external calls:
| Preset | Checks |
| --- | --- |
| No PII | Output is free of emails, phone numbers, SSNs, cards, IPs. |
| No secret leak | Output contains no API-key / token / private-key shapes. |
| Valid JSON | Output parses as JSON. |
| No refusal | Output isn’t a refusal. |
| Non-empty | Output isn’t empty. |
| Max length | Output is within a character budget. |
| Contains / Excludes text | Output does (or doesn’t) contain a substring. |
| Regex match | Output matches a pattern. |
| Tool args valid | (span-only) A tool call’s input is a valid JSON object. |
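
To make the idea of a deterministic check concrete, here is a rough sketch of how a few of the presets above could be written as plain functions. The function names are hypothetical, and details such as whether whitespace-only output counts as empty, or whether a regex must match the whole output or only part of it, are assumptions rather than documented Foglamp behavior.

```python
import json
import re


def non_empty(output: str) -> bool:
    # Assumption: whitespace-only output is treated as empty.
    return bool(output.strip())


def valid_json(output: str) -> bool:
    try:
        json.loads(output)
    except (TypeError, ValueError):  # JSONDecodeError is a ValueError subclass
        return False
    return True


def max_length(output: str, limit: int) -> bool:
    return len(output) <= limit


def contains_text(output: str, substring: str) -> bool:
    return substring in output


def excludes_text(output: str, substring: str) -> bool:
    return substring not in output


def regex_match(output: str, pattern: str) -> bool:
    # Assumption: a match anywhere in the output counts.
    return re.search(pattern, output) is not None


def tool_args_valid(tool_input: str) -> bool:
    # Valid only if the input parses as JSON and the top level is an object.
    try:
        parsed = json.loads(tool_input)
    except (TypeError, ValueError):
        return False
    return isinstance(parsed, dict)
```

Each function takes text in and returns a boolean, with no network access and no randomness, so the same output always gets the same verdict.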

LLM judges

Judges send the input/output to a model that returns a 1–5 score or a pass/fail verdict with a reason. Presets cover relevance, helpfulness, coherence, conciseness, instruction-following, completeness, toxicity/safety, tool selection, and RAG-oriented checks (faithfulness, context relevance, correctness vs. a reference).
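
As an illustration of the two result shapes described above, the sketch below validates a judge reply that carries either a 1–5 score or a pass/fail verdict, plus a reason. The JSON field names (`score`, `passed`, `reason`) and the `JudgeResult` type are assumptions made for this example; the actual format a judge returns is not specified here.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass
class JudgeResult:
    """Hypothetical container for a judge's answer."""
    reason: str
    score: Optional[int] = None     # 1-5 when the judge grades on a scale
    passed: Optional[bool] = None   # set when the judge gives a verdict


def parse_judge_reply(reply: str) -> JudgeResult:
    """Parse an assumed JSON reply and reject anything outside the two shapes."""
    data = json.loads(reply)
    if not isinstance(data, dict):
        raise ValueError("judge reply must be a JSON object")
    reason = str(data.get("reason", ""))
    if "score" in data:
        score = data["score"]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(score, bool) or not isinstance(score, int) or not 1 <= score <= 5:
            raise ValueError("score must be an integer from 1 to 5")
        return JudgeResult(reason=reason, score=score)
    if isinstance(data.get("passed"), bool):
        return JudgeResult(reason=reason, passed=data["passed"])
    raise ValueError("judge reply has neither a score nor a verdict")
```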
LLM judges are bring-your-own-key: add a provider key (see Provider keys below) before creating one. An eval with no usable key shows the status “needs key” and doesn’t score until a key is added. The set of available judge models is defined by the deployment.

Provider keys

The Provider Keys page stores the LLM-provider API keys your judges use, encrypted at rest and scoped per project. Keys are write-only: once saved, the value is never shown again, and the page only indicates which providers are configured. Saving a key for a provider that already has one replaces it (an upsert); a key can also be deleted.
Provider-key encryption requires FOGLAMP_SECRETS_KEY (32+ chars) to be set on the server. Without it, the page shows “Encryption not configured” and judge evals can’t run.
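
One way to produce a value that meets the 32-character minimum is Python's standard `secrets` module. This is only a sketch: the helper name `generate_secrets_key` is hypothetical, and whether Foglamp expects a particular character set or encoding for `FOGLAMP_SECRETS_KEY` is not stated here, so treat the URL-safe format as an assumption.

```python
import secrets


def generate_secrets_key(num_bytes: int = 32) -> str:
    """Return a random URL-safe string built from num_bytes of entropy."""
    key = secrets.token_urlsafe(num_bytes)
    # The documented requirement is 32+ characters; fail loudly if a small
    # num_bytes would produce something shorter.
    if len(key) < 32:
        raise ValueError("generated key is shorter than 32 characters")
    return key


if __name__ == "__main__":
    # Print a line that can be pasted into the server's environment.
    print(f"FOGLAMP_SECRETS_KEY={generate_secrets_key()}")
```

Keep the generated value somewhere safe, such as a secret manager, rather than in the repository.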

Eval detail

Opening an eval shows its recent activity: scored count, average score, pass rate, and judge spend over the last 7 days, plus a table of recent scores (target, pass/fail or numeric score, the reason, and when). Each enabled eval also has a status — ok, needs key, or error — and an inline toggle.
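
The summary figures above are simple aggregates over recent scores. The sketch below shows one plausible way to compute them; the `Score` record, its field names, and the choice to average only numeric scores and to compute pass rate only over pass/fail verdicts are assumptions for illustration, not Foglamp's documented formulas.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Score:
    """Hypothetical record of one scored trace or span."""
    created_at: datetime
    passed: Optional[bool] = None    # pass/fail verdict, if any
    value: Optional[float] = None    # numeric score, if any
    judge_cost_usd: float = 0.0      # zero for code checks


def summarize(scores: list, now: datetime, window_days: int = 7) -> dict:
    """Aggregate scores created within the last window_days before now."""
    cutoff = now - timedelta(days=window_days)
    recent = [s for s in scores if s.created_at >= cutoff]
    verdicts = [s.passed for s in recent if s.passed is not None]
    values = [s.value for s in recent if s.value is not None]
    return {
        "scored_count": len(recent),
        # True counts as 1 and False as 0 when summed.
        "pass_rate": sum(verdicts) / len(verdicts) if verdicts else None,
        "average_score": sum(values) / len(values) if values else None,
        "judge_spend_usd": sum(s.judge_cost_usd for s in recent),
    }
```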