Now supporting goff-code-1.0

An AI layer designed
for real usage.

Goff sits between your application and model providers, handling routing, usage tracking, and cost control so you can focus on shipping.

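// createOpenAI is exported by the AI SDK's OpenAI provider package.
import { createOpenAI } from '@ai-sdk/openai';

// OpenAI-compatible client pointed at the Goff endpoint.
export const goffy = createOpenAI({
  apiKey: process.env.GOFFY_API_KEY!,
  baseURL: 'https://api.goffy.ai/v1',
});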
LINEAR · REPLICATE · VERCEL · SCALE · ANTHROPIC
Platform

Infrastructure that
stays out of your way.

One SDK. Any model. Full visibility into cost and latency. Built for teams shipping AI to production.

Multi-model access

OpenAI, Anthropic, Llama, Mistral — one endpoint. Switch providers with a config change, not a refactor.
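
Because the endpoint is OpenAI-compatible, a provider switch can come down to changing one string. The sketch below assumes the model ID lives in application config; `AppConfig`, `buildChatRequest`, and the second model ID are hypothetical names used for illustration, not part of Goff's SDK.

```typescript
// Request body in the OpenAI-compatible chat completions shape.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

interface ChatRequest {
  model: string;
  messages: ChatMessage[];
}

// Hypothetical app config: the only place a model ID appears.
interface AppConfig {
  chatModel: string;
}

function buildChatRequest(config: AppConfig, prompt: string): ChatRequest {
  return {
    model: config.chatModel,
    messages: [{ role: 'user', content: prompt }],
  };
}

// Switching models means editing config, not call sites.
const current: AppConfig = { chatModel: 'goff-code-1.0' };
const alternative: AppConfig = { chatModel: 'example-provider/example-model' }; // placeholder ID

const requestA = buildChatRequest(current, 'Summarize this diff.');
const requestB = buildChatRequest(alternative, 'Summarize this diff.');
```

Both requests share the same message payload; only the `model` field differs.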

Global edge routing

Requests hit the nearest region. Typical time to first token (TTFT) is under 200ms. P99 latency is tracked per model.

Per-user rate limits

Set token budgets per user, per key, per project. Hard limits, soft limits, alerts — your call.
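
One way to picture the hard limit, soft limit, and alert distinction is as a three-way decision made before each request. This is a minimal sketch of that idea, not Goff's implementation; `TokenBudget`, `checkBudget`, and the limit values are assumptions made for illustration.

```typescript
type BudgetDecision = 'allow' | 'warn' | 'block';

// Hypothetical budget shape: crossing softLimit triggers an alert,
// crossing hardLimit rejects the request.
interface TokenBudget {
  softLimit: number;
  hardLimit: number;
}

function checkBudget(
  tokensUsed: number,
  tokensRequested: number,
  budget: TokenBudget,
): BudgetDecision {
  const projected = tokensUsed + tokensRequested;
  if (projected > budget.hardLimit) return 'block';
  if (projected > budget.softLimit) return 'warn';
  return 'allow';
}

// Example: a per-user budget with an alert at 800k tokens and a cap at 1M.
const userBudget: TokenBudget = { softLimit: 800_000, hardLimit: 1_000_000 };

const underBudget = checkBudget(100_000, 5_000, userBudget);
const nearLimit = checkBudget(900_000, 50_000, userBudget);
const overLimit = checkBudget(990_000, 20_000, userBudget);
```

The same check could be applied at each scope (user, key, project) with a separate budget per scope.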

Native streaming

SSE out of the box. Consistent chunk format across providers. Graceful error propagation.
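
A consistent chunk format means one parser can serve every provider. The sketch below assumes the stream mirrors the OpenAI chat completions format (`data:` lines carrying JSON with `choices[].delta.content`, terminated by `data: [DONE]`); the page promises a consistent format but does not spell it out, so treat the chunk shape as an assumption. `collectText` is a hypothetical helper.

```typescript
// Assumed chunk shape, modeled on OpenAI-style streaming responses.
interface ChatChunk {
  choices: { delta: { content?: string } }[];
}

// Concatenate the text deltas from a complete SSE response body.
function collectText(sseBody: string): string {
  let text = '';
  for (const line of sseBody.split('\n')) {
    const trimmed = line.trim();
    if (!trimmed.startsWith('data:')) continue; // skip blank lines and comments
    const payload = trimmed.slice('data:'.length).trim();
    if (payload === '[DONE]') break; // end-of-stream sentinel
    const chunk = JSON.parse(payload) as ChatChunk;
    text += chunk.choices[0]?.delta.content ?? '';
  }
  return text;
}

// Sample stream body for illustration only.
const sampleStream = [
  'data: {"choices":[{"delta":{"content":"Hel"}}]}',
  '',
  'data: {"choices":[{"delta":{"content":"lo"}}]}',
  '',
  'data: [DONE]',
  '',
].join('\n');

const streamedText = collectText(sampleStream);
```

A production client would parse incrementally as bytes arrive rather than from a complete body; the line handling stays the same.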

Workflow

Integrate once.
Route anywhere.

1

Drop-in SDK

OpenAI-compatible interface. Point your existing client at Goff's base URL and API key. No other code changes required.

2

Model routing

Specify the model in your request. Goff routes it to the provider offering the best balance of latency and cost.

3

Automatic failover

Rate limits and provider outages are handled automatically: failed requests fail over to another provider instead of dropping.
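
Steps 2 and 3 can be pictured together: rank the providers for a model, then walk down the ranking until one succeeds. The sketch below is an illustration under stated assumptions, not Goff's routing logic. The scoring formula, the retryable status codes, and the names `ProviderStats`, `rankProviders`, and `callWithFailover` are all hypothetical.

```typescript
interface ProviderStats {
  name: string;
  p50LatencyMs: number;
  usdPer1kTokens: number;
}

interface ProviderResponse {
  status: number;
  body: string;
}

type CallProvider = (provider: string) => Promise<ProviderResponse>;

// Assumed retryable statuses: rate limiting (429) and upstream failures (5xx).
const RETRYABLE_STATUS = new Set<number>([429, 500, 502, 503, 504]);

// Rank by a weighted mix of normalized latency and cost; lower score is better.
function rankProviders(candidates: ProviderStats[], latencyWeight = 0.5): ProviderStats[] {
  const maxLatency = Math.max(...candidates.map((c) => c.p50LatencyMs)) || 1;
  const maxCost = Math.max(...candidates.map((c) => c.usdPer1kTokens)) || 1;
  const score = (c: ProviderStats): number =>
    latencyWeight * (c.p50LatencyMs / maxLatency) +
    (1 - latencyWeight) * (c.usdPer1kTokens / maxCost);
  return [...candidates].sort((a, b) => score(a) - score(b));
}

// Try providers in order; move on when one is rate limited, errors, or throws.
async function callWithFailover(
  providers: string[],
  call: CallProvider,
): Promise<{ provider: string; response: ProviderResponse }> {
  let lastFailure = 'no providers configured';
  for (const provider of providers) {
    try {
      const response = await call(provider);
      if (!RETRYABLE_STATUS.has(response.status)) return { provider, response };
      lastFailure = `${provider} returned status ${response.status}`;
    } catch (err) {
      lastFailure = err instanceof Error ? err.message : String(err);
    }
  }
  throw new Error(`all providers failed: ${lastFailure}`);
}

// Made-up numbers for illustration.
const stats: ProviderStats[] = [
  { name: 'provider-a', p50LatencyMs: 120, usdPer1kTokens: 0.03 },
  { name: 'provider-b', p50LatencyMs: 300, usdPer1kTokens: 0.01 },
  { name: 'provider-c', p50LatencyMs: 150, usdPer1kTokens: 0.012 },
];
const ranked = rankProviders(stats).map((p) => p.name);
```

With equal weights, the sample data ranks `provider-c` first because it is close to the best on both latency and cost. Note that this sketch still throws when every provider fails, so callers need a final error path.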

API
POST /v1/chat/completions
Active
Model A
goff-code-1.0
Response time: 142ms
Pricing Philosophy

Simple pricing.
No surprises.

Pay for what you use. Token-level tracking. Full cost visibility from day one.

Developer

$29/mo

For side projects and early-stage products.

Up to 1M tokens/mo
Access to goff-code-1.0
Standard support
Single project
Most Popular

Pro

$149/mo

For teams shipping AI features to production.

Up to 10M tokens/mo
Access to goff-code-1.0
Priority support
Unlimited projects
Custom rate limits

Enterprise

Custom

For organizations with compliance and scale requirements.

Unlimited tokens
Access to goff-code-1.0
24/7 SLA support
SOC 2 compliance
Dedicated VPC

SOC 2 compliant infrastructure. 99.9% uptime SLA.
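
To compare the two fixed-price tiers, it can help to work out the effective rate at full usage. The sketch below uses only the Developer and Pro figures listed above; `Plan`, `usdPerMillionTokens`, and `allowanceUsed` are hypothetical helpers, and the calculation assumes the full monthly allowance is consumed.

```typescript
interface Plan {
  name: string;
  monthlyUsd: number;
  includedTokens: number;
}

const developer: Plan = { name: 'Developer', monthlyUsd: 29, includedTokens: 1_000_000 };
const pro: Plan = { name: 'Pro', monthlyUsd: 149, includedTokens: 10_000_000 };

// Effective price per million tokens if the whole allowance is used.
function usdPerMillionTokens(plan: Plan): number {
  return plan.monthlyUsd / (plan.includedTokens / 1_000_000);
}

// Fraction of the monthly allowance consumed so far (1.0 means fully used).
function allowanceUsed(plan: Plan, tokensUsed: number): number {
  return tokensUsed / plan.includedTokens;
}

const developerRate = usdPerMillionTokens(developer); // 29 / 1
const proRate = usdPerMillionTokens(pro); // 149 / 10
const proUsage = allowanceUsed(pro, 2_500_000); // a quarter of the Pro allowance
```

At full usage, Pro works out to roughly half the per-token rate of Developer.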

Ready to optimize your AI infrastructure?

Stop wrestling with provider-specific quirks. Get a unified, high-performance gateway for your entire AI stack today.

14ms
Router overhead
99.99%
Uptime SLA
25+
Models supported
Scalability