NemoRouter v1.0 is publicly available. This is the first generally available release of our managed LLM gateway: one API key, one bill, every feature unlocked on every tier. It is built on the open-source LiteLLM proxy core, with a custom multi-tenancy, billing, and governance layer on top.

What ships at v1.0

Models

18 models are live on Google Vertex AI (Gemini, Imagen, Veo, embeddings) on day one. OpenAI, Anthropic, AWS Bedrock, Azure OpenAI, Cohere, Mistral, and Meta integrations ship in the weeks after launch. Every new model lands in days, not quarters, because we extend the open-source core rather than fork it.

Smart routing

Pick a routing strategy: usage, latency, cost, simple-shuffle, or least-busy. Configure fallback chains that transparently retry on backup models when a provider fails. Set retry counts, timeouts, and cooldowns per org. Use tag-based filtering for capability-aware routing (vision, code, long-context). Every routing decision is captured in observability.
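
To make the fallback behavior concrete, here is a minimal sketch of a fallback chain with per-model retries. It is illustrative only: `call_with_fallbacks`, `ProviderError`, and the model names are hypothetical and not part of the NemoRouter API, and the gateway does this work server-side.

```python
from typing import Callable, Optional, Sequence, Tuple


class ProviderError(Exception):
    """Hypothetical error raised when a provider call fails."""


def call_with_fallbacks(
    models: Sequence[str],
    send: Callable[[str], str],
    retries_per_model: int = 1,
) -> Tuple[str, str]:
    """Try each model in order and return (model_used, response).

    Each model gets 1 + retries_per_model attempts before the next
    model in the chain is tried.
    """
    last_error: Optional[Exception] = None
    for model in models:
        for _ in range(1 + retries_per_model):
            try:
                return model, send(model)
            except ProviderError as exc:
                last_error = exc
    raise ProviderError(f"all models in the chain failed: {last_error}")


def flaky_send(model: str) -> str:
    # Stand-in for a real provider call: the primary is always down.
    if model == "primary-model":
        raise ProviderError("provider outage")
    return f"response from {model}"


used, reply = call_with_fallbacks(["primary-model", "backup-model"], flaky_send)
print(used)  # backup-model
```

The caller only sees the final response, which is what "transparently" means here. A real router would also apply timeouts and cooldowns between attempts, which this sketch leaves out.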

AI guardrails

Five guardrails on every request, included on every plan from day one — no Enterprise paywall:

PII redaction (Microsoft Presidio)
Prompt-injection detection on adversarial corpora
API-key + secret scanning on prompts and completions
Abuse blocking
Response scanning

Configurable scope: organization > team > virtual key, with override semantics.
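
One way to read the scope hierarchy is sketched below. The override direction is an assumption, since the exact semantics are not spelled out here: this sketch lets a more specific scope override a less specific one. `resolve_setting` and the setting names are hypothetical.

```python
from typing import Mapping, Optional

# Least specific to most specific, following "organization > team > virtual key".
SCOPES = ("organization", "team", "virtual_key")


def resolve_setting(
    name: str, configs: Mapping[str, Mapping[str, bool]]
) -> Optional[bool]:
    """Return the effective value of a guardrail setting, or None if unset.

    ASSUMPTION: a more specific scope overrides a less specific one.
    """
    value = None
    for scope in SCOPES:
        scope_config = configs.get(scope, {})
        if name in scope_config:
            value = scope_config[name]  # later (more specific) scopes win
    return value


configs = {
    "organization": {"pii_redaction": True, "response_scanning": True},
    "team": {"response_scanning": False},
    "virtual_key": {},
}
effective = {
    name: resolve_setting(name, configs)
    for name in ("pii_redaction", "response_scanning")
}
print(effective)  # {'pii_redaction': True, 'response_scanning': False}
```

A stricter policy, where an organization-level guardrail cannot be switched off further down, would need a different merge rule than the one shown.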

Prompt management

Server-side prompt templates with versioning, A/B testing, Jinja2 variables, and per-template cost tracking. Run experiments deterministically with traffic-split variants — same hash, same variant, every time.
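
Deterministic traffic splitting is commonly done by hashing a stable identifier into the range [0, 1) and mapping it onto cumulative variant weights. The sketch below shows that general technique. It is not necessarily how NemoRouter implements it, and `pick_variant`, the SHA-256 choice, and the identifiers are assumptions made for illustration.

```python
import hashlib
from typing import Sequence, Tuple


def pick_variant(
    unit_id: str, experiment: str, variants: Sequence[Tuple[str, float]]
) -> str:
    """Map (experiment, unit_id) to a variant name.

    `variants` holds (name, weight) pairs whose weights sum to 1.0.
    The same inputs always yield the same variant.
    """
    digest = hashlib.sha256(f"{experiment}:{unit_id}".encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer and scale it to [0, 1).
    point = int.from_bytes(digest[:8], "big") / 2**64
    cumulative = 0.0
    for name, weight in variants:
        cumulative += weight
        if point < cumulative:
            return name
    return variants[-1][0]  # guard against floating-point rounding


variants = [("control", 0.5), ("treatment", 0.5)]
first = pick_variant("user-42", "greeting-test", variants)
second = pick_variant("user-42", "greeting-test", variants)
print(first == second)  # True
```

Including the experiment name in the hash input keeps assignments independent across experiments, so a user who lands in "control" for one test is not systematically in "control" for the next.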

Team management & budgets

Multi-org support with role-based access (Owner, Admin, Member, Viewer). Per-team and per-virtual-key spend visibility. Hard and soft budget limits at any scope. Invitation flow with single-org-per-user enforcement.

Credits as the product

Buy credits from $1. Tier 1 (PAYG, 4% platform fee) — $5 free credits on signup, no card required. Tier 2 ($100/mo, 2% fee). Tier 3 ($1,200/yr, 0% fee). Enterprise (0% fee, custom). Every dollar you commit goes to credits — platform fee is added on top, never deducted. Reserve+settle pattern under Postgres advisory locks means denied requests cost zero credits.
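
The two billing ideas above can be sketched in a few lines: the fee is added on top of the credits you buy, and a request reserves credits before it runs and settles to the actual cost afterwards. This is an in-memory illustration under stated assumptions. `checkout_total` and `CreditLedger` are hypothetical names, the fee is assumed to be a simple percentage of the credit amount, and the real system relies on Postgres advisory locks rather than a Python object.

```python
from decimal import Decimal


def checkout_total(credits: Decimal, fee_rate: Decimal) -> Decimal:
    """Amount charged when the platform fee is added on top of the credits bought."""
    return (credits * (Decimal("1") + fee_rate)).quantize(Decimal("0.01"))


class CreditLedger:
    """In-memory stand-in for a reserve+settle ledger (no database, no locking)."""

    def __init__(self, balance: Decimal) -> None:
        self.balance = balance  # credits not currently held
        self.reserved = Decimal("0")

    def reserve(self, estimate: Decimal) -> bool:
        """Hold the estimated cost up front; refuse if it would overdraw."""
        if estimate > self.balance:
            return False  # request denied, nothing is charged
        self.balance -= estimate
        self.reserved += estimate
        return True

    def settle(self, estimate: Decimal, actual: Decimal) -> None:
        """Release the hold and charge only the actual cost."""
        if actual > estimate:
            raise ValueError("sketch assumes actual cost never exceeds the hold")
        self.reserved -= estimate
        self.balance += estimate - actual


print(checkout_total(Decimal("100"), Decimal("0.04")))  # 104.00

ledger = CreditLedger(Decimal("5.00"))
if ledger.reserve(Decimal("0.30")):
    ledger.settle(Decimal("0.30"), Decimal("0.12"))
print(ledger.balance)  # 4.88
print(ledger.reserve(Decimal("10.00")))  # False
```

At the Tier 1 rate of 4%, buying $100 of credits is charged as $104.00 and the full $100 becomes credits. The denied reservation at the end leaves the balance untouched, which is the "denied requests cost zero credits" property.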

Observability

Request logs with model/status/time filtering and expandable detail rows. Logging callbacks to Langfuse, Datadog, S3, and Slack. Configurable per-org data policy: zero-logging, metadata-only, full-logging, or PII-redacted. 90-day request log retention; longer on Enterprise.

Security architecture (built-in, not gated)

Postgres Row-Level Security on every Nemo table — tenant isolation enforced at the database, not the application. Customer LLM traffic uses only virtual keys (sk-nemo-…), never master keys. Hashed virtual-key storage; plaintext shown once. Encryption: TLS 1.2+ in transit (HSTS preloaded), AES-256 at rest. Immutable audit trail on every administrative action. Reserve+settle credit ledger means negative balances are blocked at the database layer.
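
Hashed key storage with "plaintext shown once" usually follows the pattern sketched below: generate the key, persist only its hash, and verify later by hashing whatever the caller presents. The `sk-nemo-` prefix comes from the text above. The function names, the key length, and the use of SHA-256 are assumptions for illustration, not a description of NemoRouter's internals.

```python
import hashlib
import hmac
import secrets
from typing import Tuple


def issue_virtual_key() -> Tuple[str, str]:
    """Return (plaintext_key, stored_hash). Only the hash would be persisted."""
    plaintext = "sk-nemo-" + secrets.token_urlsafe(32)
    stored_hash = hashlib.sha256(plaintext.encode("utf-8")).hexdigest()
    return plaintext, stored_hash


def verify_virtual_key(presented: str, stored_hash: str) -> bool:
    """Hash the presented key and compare it to the stored hash in constant time."""
    candidate = hashlib.sha256(presented.encode("utf-8")).hexdigest()
    return hmac.compare_digest(candidate, stored_hash)


key, key_hash = issue_virtual_key()  # show `key` to the user once, store `key_hash`
print(verify_virtual_key(key, key_hash))        # True
print(verify_virtual_key(key + "x", key_hash))  # False
```

Because only the hash is stored, a leaked database row cannot be replayed as a working key, and a lost key has to be rotated rather than recovered.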

Compliance posture

SOC 2-aligned controls (active). ISO 27001-aligned controls (active). GDPR-compliant. HIPAA-eligible (BAA available on Enterprise). PCI delegated to Stripe. Underlying infrastructure (Google Cloud Run, Supabase) is SOC 2 Type II certified.

Get started

pip install nemoroutersdk      # or use the OpenAI SDK with our base URL