A gateway that can't show you what happened is a black box. v1.8 ships the four-pillar observability surface — request logs, external callbacks, alerts, and a data policy that puts you in control of retention.

Request logs

/{org}/observability is a paginated log viewer over LiteLLM_SpendLogs. Filter by model, status, virtual key, time range. Expand any row to see the request body, response body, token counts, cost, latency, and which guardrails ran. 90-day retention on Tier 1–3; configurable longer on Enterprise.

Free-text search hits prompt content (subject to your data policy — see below).

Callbacks — ship logs to your stack

/{org}/observability/callbacks configures where we forward request logs in addition to retaining them ourselves. Five sinks supported on day one:

Langfuse — full trace, prompt template metadata, A/B variant included
Datadog — structured logs with cost / latency / token metrics
Amazon S3 — raw JSONL for your data lake, partitioned by date
Slack — high-signal events only (errors, budget breaches), never per-request spam
Generic HTTP webhook — your endpoint, your schema, we POST one batch every 10 seconds

Stored in nemo.logging_callbacks. RLS-scoped, secrets encrypted at rest.

Alerts — eight types, one toggle

/{org}/observability/alerts lets you switch on alerting for: LLM errors, budget threshold, budget exceeded, daily-spend anomaly, slow response, outage, key compromise heuristic, and guardrail block burst. Each alert routes through your configured notification channels (email / Slack / Teams / webhook — see notification settings).

Stored in nemo.alerting_settings. Defaults are conservative — we'd rather miss a soft signal than wake you up at 3 a.m. for nothing.

Data policy — your call, not ours

/{org}/observability/data-policy picks one of four modes:

Zero logging — store no prompt or completion content. Metadata only (model, tokens, cost, status).
Metadata only — same as zero, kept for naming consistency.
Full logging — store prompt + completion verbatim. Default for new orgs.
PII-redacted — store prompt + completion with Presidio redaction applied. Useful for healthcare / fintech orgs that can't justify "full" but want enough content to debug.

The policy applies before the log is persisted, not at read time — so even a compromised viewer role can't pull data the policy excluded.