Inference, but actually cheap.

4× cheaper than OpenAI / Anthropic.
Same wire shape.

QGRE serves Qwen3 models on consumer Blackwell hardware. Drop-in replacement for OpenAI Chat Completions andAnthropic Messages — point your existing SDK at a different base URL and you're done. Pay with PIX, OXXO, or any major card.

Cost reduction
vs OpenAI / Anthropic at comparable quality
Wire-compatible
2
OpenAI + Anthropic SDKs work unmodified
Payment methods
PIX · OXXO · Boleto · cards
LATAM-first checkout

Built so it gets out of your way

Drop-in OpenAI compat

Set base_url to api.qgre.com/v1. Streaming, max_tokens, temperature, stop sequences — everything works.

Drop-in Anthropic compat

Same trick at api.qgre.com/anthropic. The router translates Messages JSON ↔ Chat Completions internally; SSE event shapes match.

Cost-tier routing

Pass model: "auto" + X-QGRE-Cost-Tier: cheap. The router picks the cheapest healthy upstream — your bill follows the workload shape, not a flat rate.

Prepaid credits, $5 minimum

No surprise invoices. Auto-topup when balance drops below a threshold; cancel any time via the Stripe portal.

Outbound webhooks

Subscribe to usage.completed events. Standard-Webhooks-signed (HMAC-SHA256), retries, replay, audit log — all of it managed via Svix.

Live tok/s + KV reuse

The chat demo shows real-time tok/s and KV-cache prefix reuse. The numbers don't lie — that's what makes the price possible.

Pricing

Free demo plan. $20/mo dev tier, $99/mo pro with 99.5% SLA, plus a coding-pro flat rate for predictable cost in currency-volatile markets.

See full plans →