New: Vision & Audio Support Available

Intelligence infrastructure for the next generation.

Wizzx provides the high-performance primitives you need to build reliable AI products. Sub-50ms latency, global edge caching, and simple composability.

Documentation
example.ts
// Initialize the client with your secure key
const wizzx = new WizzxClient({
  key: process.env.WIZZX_KEY,
  region: 'us-east-1'
});

// Stream response with a single call
const stream = await wizzx.chat.completions.create({
  model: 'gemini-2.5-flash',
  messages: [{ role: 'user', content: 'Analyze this data...' }],
  stream: true,
});

for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0].delta.content);
}

Designed for scale

Everything you need to move from prototype to production without changing your stack.

Global Edge Network

Intelligent routing sends requests to the nearest available data center, ensuring sub-50ms latency worldwide.

Model Orchestration

Route tasks to the cheapest capable model automatically.

Zero-Trust Security

SOC 2-compliant infrastructure with per-request isolation.

Unified API Surface

Switch between models (Flash, Pro, Ultra) by changing a single string parameter. No code rewrites required.
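In practice, "changing a single string" means the rest of the request payload stays identical across models. A minimal sketch, assuming the OpenAI-style request shape from the `example.ts` snippet above (the exact model IDs here are placeholders, not confirmed names):

```typescript
// Placeholder model IDs -- substitute the real IDs from your dashboard.
type ModelId = 'gemini-2.5-flash' | 'gemini-2.5-pro' | 'wizzx-ultra';

interface ChatRequest {
  model: ModelId;
  messages: { role: 'user' | 'assistant'; content: string }[];
}

// Build the same request against two different models: only the
// `model` string changes; the rest of the payload is untouched.
function buildRequest(model: ModelId, prompt: string): ChatRequest {
  return { model, messages: [{ role: 'user', content: prompt }] };
}

const flash = buildRequest('gemini-2.5-flash', 'Summarize this report.');
const pro = buildRequest('gemini-2.5-pro', 'Summarize this report.');
```

Because the type system constrains `model` to known IDs, swapping tiers is a one-token diff that the compiler checks for you.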

Real-time Streaming

Native support for Server-Sent Events (SSE).
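For readers unfamiliar with SSE: the server emits `data:` lines over a long-lived HTTP response. A minimal illustrative parser, assuming an OpenAI-style stream where each event is a `data: <json>` line terminated by a `data: [DONE]` sentinel (this is a common convention, not a documented Wizzx wire format):

```typescript
// Minimal SSE body parser (illustrative; assumes `data: <json>` events
// and a `data: [DONE]` end-of-stream sentinel).
function parseSSE(raw: string): string[] {
  const events: string[] = [];
  for (const line of raw.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue;  // skip blanks and comments
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') break;             // end-of-stream sentinel
    events.push(payload);
  }
  return events;
}

const body = 'data: {"delta":"Hel"}\n\ndata: {"delta":"lo"}\n\ndata: [DONE]\n';
const text = parseSSE(body)
  .map((e) => JSON.parse(e).delta)
  .join('');
```

The SDK's `for await` loop shown above hides this parsing; the sketch only shows what travels on the wire.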

Cost Analytics

Track spend with token-level granularity.

Drop-in SDKs

Type-safe libraries for Node, Python, and Go.

Integration

One API to rule them all

Stop managing multiple provider keys, varying response schemas, and retry logic. Wizzx unifies the best models under a single, typed interface.

  • Unified Type Safety across all models
  • Automatic retry with exponential backoff
  • Response caching at the edge
  • PII redaction before inference
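The retry behavior above is the standard exponential-backoff pattern. A client-side sketch of the same idea (the attempt count and delays are illustrative, not Wizzx's actual values):

```typescript
// Retry a flaky async call with exponential backoff: wait base,
// 2x base, 4x base, ... between attempts, then rethrow the last error.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts - 1) break;  // out of attempts
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError;
}
```

Doubling the delay on each attempt gives an overloaded upstream time to recover instead of hammering it at a fixed interval.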

Transparent pricing

Pay only for what you use. No hidden fees.

Hobby

$0/month

Billed monthly

  • 5,000 requests / month
  • 3 concurrent connections
  • Community Discord
  • Flash Model Access
Recommended

Startup

$49/month

Billed monthly

  • 500,000 requests / month
  • 50 concurrent connections
  • Email Support < 4hr
  • Fine-tuning (3 models)
  • 99.9% SLA

Scale

$399/month

Billed monthly

  • Unlimited requests
  • Dedicated GPU clusters
  • Slack Connect Channel
  • Custom Model Training
  • 99.99% Uptime SLA

FAQs

Which models do you support?

We support all major AI models, including OpenAI GPT-4o, GPT-4, Claude 3.5 Sonnet, Claude 3 Opus, Gemini Pro, Llama 3, Mistral, and more. New models are added regularly as they become available.

How does pricing work?

We use a simple pay-as-you-go model. You only pay for the tokens you consume, with no monthly minimums or hidden fees. Prices vary by model and are transparently displayed in your dashboard.

Is my data secure?

Yes, security is our top priority. All API keys are encrypted at rest and in transit. We never store your request content, and you can rotate or revoke keys instantly from your dashboard.

What are the rate limits?

Rate limits depend on your plan. The free tier allows 60 requests per minute. Paid plans start at 1,000 RPM and can scale to unlimited on enterprise plans. All limits are clearly documented in your dashboard.
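If you want to stay under a per-minute cap on the client side rather than waiting for 429 responses, a token bucket is the usual approach. A sketch (the 60 RPM figure matches the free-tier limit quoted above; the class and its parameters are illustrative, not part of any SDK):

```typescript
// Client-side token bucket: capacity tokens, refilled continuously.
// For 60 RPM, use capacity 60 and refillPerMs = 60 / 60_000.
class TokenBucket {
  private tokens: number;
  private lastRefill: number;
  private capacity: number;
  private refillPerMs: number;

  constructor(capacity: number, refillPerMs: number, now: number = Date.now()) {
    this.capacity = capacity;
    this.refillPerMs = refillPerMs;
    this.tokens = capacity;       // start full
    this.lastRefill = now;
  }

  // Returns true if a request may be sent right now.
  tryAcquire(now: number = Date.now()): boolean {
    const elapsed = now - this.lastRefill;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillPerMs);
    this.lastRefill = now;
    if (this.tokens >= 1) {
      this.tokens -= 1;
      return true;
    }
    return false;
  }
}
```

Passing the clock in explicitly (the `now` parameter) keeps the bucket deterministic and easy to test.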

Can I use Wizzx in production?

Absolutely! Our infrastructure is designed for production use with a 99.9% uptime SLA for paid plans. We handle load balancing, failover, and auto-scaling so you can focus on building your product.

How do I get started?

Simply sign up with your email or Google account, grab your API key from the dashboard, and start making requests. Our API is OpenAI-compatible, so you can use existing SDKs with minimal code changes.
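OpenAI compatibility means an OpenAI-style request body works unchanged; only the base URL and key differ. A sketch of the raw HTTP shape (the base URL below is a placeholder, not the real endpoint; use the one from your dashboard):

```typescript
// Placeholder endpoint -- substitute the URL from your dashboard.
const BASE_URL = 'https://api.example.com/v1';

// Builds a fetch()-ready request in the OpenAI chat-completions shape.
function buildChatRequest(apiKey: string, model: string, prompt: string) {
  return {
    url: `${BASE_URL}/chat/completions`,     // same path as OpenAI's API
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,   // same bearer-token auth scheme
        'Content-Type': 'application/json',
      },
      body: JSON.stringify({
        model,
        messages: [{ role: 'user', content: prompt }],
      }),
    },
  };
}

// Usage: const { url, init } = buildChatRequest(process.env.WIZZX_KEY!, 'gemini-2.5-flash', 'Hello');
// const res = await fetch(url, init);
```

Because the path, headers, and body match the OpenAI convention, existing OpenAI SDKs that accept a custom base URL can point here without code rewrites.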

Do you offer support?

Yes! Free-tier users get community support via Discord. Paid plans include email support with a 24-hour response time. Enterprise customers get dedicated support with custom SLAs.

Can I switch between models?

Yes, switching models is as simple as changing the model parameter in your API request. Our unified API means you don't need to rewrite code when trying different models.