Create OpenAI-Compatible Gateway

User Intent

"How do I use OpenRouter or Vercel AI Gateway with Graphlit to access multiple AI models through one unified API?"

Operation

SDK Method: createSpecification() with custom OpenAI endpoint

Use Case: Access multiple model providers via OpenAI-compatible gateways


What Are AI Gateways?

AI gateways provide a unified, OpenAI-compatible API that routes requests to multiple AI providers. Instead of managing separate API keys and code for each provider (OpenAI, Anthropic, Google, etc.), you configure one gateway endpoint and access hundreds of models.

Key Benefits:

  • Unified API: One integration for 200+ models

  • Cost Optimization: Compare and route to cheapest providers

  • Automatic Fallbacks: Retry failed requests with alternative providers

  • Observability: Track usage, costs, and performance

  • No Vendor Lock-in: Switch models without code changes

This guide covers two popular gateways:

  • OpenRouter - 200+ models, cost tracking, open-source access

  • Vercel AI Gateway - Enterprise observability, caching, Vercel ecosystem integration


OpenRouter Configuration

Complete Code Example (TypeScript)
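The code block for this example did not survive extraction; below is a minimal sketch. The endpoint, key, and modelName values come from this guide, but the string enum spellings ("COMPLETION", "OPEN_AI", "CUSTOM") and the ambient process declaration are assumptions to verify against your installed graphlit-client version.

```typescript
// Ambient declaration so the sketch compiles without @types/node;
// Node provides the real process object at runtime.
declare const process: { env: Record<string, string | undefined> };

// Sketch of a createSpecification() input for OpenRouter. The string
// enums stand in for the SDK's generated types (SpecificationTypes,
// ModelServiceTypes, OpenAiModels) -- verify the spellings locally.
const openRouterSpec = {
  name: "OpenRouter Claude 4.5 Sonnet",
  type: "COMPLETION",        // SpecificationTypes.Completion
  serviceType: "OPEN_AI",    // ModelServiceTypes.OpenAi
  openAI: {
    model: "CUSTOM",         // OpenAiModels.Custom -- required for gateways
    endpoint: "https://openrouter.ai/api/v1",
    key: process.env.OPENROUTER_API_KEY,       // gateway key, not a provider key
    modelName: "anthropic/claude-4.5-sonnet",  // provider/model format
  },
};

// With the Graphlit client (graphlit-client), roughly:
//   const client = new Graphlit();
//   const result = await client.createSpecification(openRouterSpec);
```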


SDK Adaptation Notes

Python

The Python SDK mirrors this call; the operation is typically exposed as create_specification() with snake_case field names (e.g., model_name). Verify against your installed graphlit-client version.

C#

The C# SDK uses PascalCase types and properties; the operation is typically CreateSpecification(). Verify against your installed package.


Step-by-Step: OpenRouter Setup

Step 1: Get OpenRouter API Key

  1. Sign up or log in at https://openrouter.ai

  2. Navigate to API Keys in your dashboard

  3. Create a new API key

  4. Store it securely: OPENROUTER_API_KEY=sk-or-v1-...

Step 2: Choose Your Model

Browse available models at https://openrouter.ai/models

Model naming format: provider/model

Popular models:

  • anthropic/claude-4.5-sonnet - Best for RAG, citations ($3/$15 per M tokens)

  • google/gemini-2.5-flash - Fast, 1M context ($0.075/$0.30 per M tokens)

  • openai/gpt-4o - Balanced performance ($2.50/$10 per M tokens)

  • meta-llama/llama-3.3-70b-instruct - Open source, cost-effective ($0.59/$0.59 per M tokens)

  • deepseek/deepseek-chat - Extremely cheap ($0.14/$0.28 per M tokens)

Step 3: Create Specification

Use the code examples above to create your specification with:

  • endpoint: https://openrouter.ai/api/v1

  • key: Your OpenRouter API key

  • modelName: Your chosen model in provider/model format

Step 4: Use in Graphlit

The specification works with all Graphlit conversation methods:

  • promptConversation() - Synchronous Q&A

  • streamAgent() - Streaming with tool calling

  • promptAgent() - Synchronous with tool calling
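As a hypothetical usage sketch, the created specification can be referenced by id when prompting. The exact promptConversation() parameter order and names vary by SDK version, so check your client's typings; the types below are stand-ins.

```typescript
// Hypothetical call shape: pass the specification by id when prompting.
type SpecRef = { id: string };
type PromptClient = {
  promptConversation: (
    prompt: string,
    conversationId?: string,
    spec?: SpecRef,
  ) => Promise<unknown>;
};

async function askWithSpec(client: PromptClient, prompt: string, specId: string) {
  // undefined conversationId starts a new conversation in this sketch
  return client.promptConversation(prompt, undefined, { id: specId });
}
```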


OpenRouter Model Examples

Example 1: Claude 4.5 Sonnet (Best for RAG)

Best for: RAG with accurate citations, document analysis, complex reasoning

Example 2: Gemini 2.5 Flash (Fast + Long Context)

Best for: Fast responses, long documents (1M context), cost optimization

Example 3: Llama 3.3 70B (Open Source)

Best for: Open-source option, good balance of quality and cost

Example 4: DeepSeek Chat (Ultra-Cheap)

Best for: High-volume use cases, development/testing, cost-sensitive applications


OpenRouter Benefits

1. Cost Optimization

Compare pricing across providers and route to the cheapest option:
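As an illustration, the per-token prices quoted earlier in this guide can drive a simple cheapest-model lookup. The quality tiers here are hypothetical rankings, not OpenRouter data.

```typescript
// Prices are the USD per million input/output tokens listed above;
// the tier numbers are a made-up quality ranking for illustration.
const catalog = [
  { name: "deepseek/deepseek-chat", inPerM: 0.14, outPerM: 0.28, tier: 1 },
  { name: "google/gemini-2.5-flash", inPerM: 0.075, outPerM: 0.3, tier: 2 },
  { name: "meta-llama/llama-3.3-70b-instruct", inPerM: 0.59, outPerM: 0.59, tier: 2 },
  { name: "anthropic/claude-4.5-sonnet", inPerM: 3, outPerM: 15, tier: 3 },
];

// Cheapest model (by combined input+output price) meeting a minimum tier.
function cheapestAtTier(minTier: number) {
  return [...catalog]
    .filter((m) => m.tier >= minTier)
    .sort((a, b) => a.inPerM + a.outPerM - (b.inPerM + b.outPerM))[0];
}
```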

2. Access to 200+ Models

One API key gives you access to:

  • All major providers (OpenAI, Anthropic, Google, Meta, Mistral, Cohere, etc.)

  • Latest models (GPT-5, Claude 4.5, Gemini 2.5)

  • Open-source models (Llama, Qwen, Mixtral)

  • Specialized models (coding, vision, multilingual)

3. Automatic Fallbacks

OpenRouter can automatically retry a failed request against an alternative provider hosting the same model, so transient provider outages are absorbed by the gateway rather than surfacing as errors.

4. No Provider Lock-in

Switch models by changing one parameter:
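For example, only the modelName string varies between providers; the endpoint and key stay the same OpenRouter gateway values. ("CUSTOM" stands in for the SDK's OpenAiModels.Custom enum.)

```typescript
// Everything except modelName stays constant across providers when
// routing through OpenRouter.
function openRouterModel(modelName: string, key: string) {
  return {
    model: "CUSTOM",  // OpenAiModels.Custom
    endpoint: "https://openrouter.ai/api/v1",
    key,
    modelName,
  };
}

const claude = openRouterModel("anthropic/claude-4.5-sonnet", "sk-or-...");
const llama = openRouterModel("meta-llama/llama-3.3-70b-instruct", "sk-or-...");
```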


Vercel AI Gateway Configuration

Complete Code Example (TypeScript)
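The code block for this example was also lost in extraction; below is a minimal sketch with the same shape as the OpenRouter configuration, changing only the endpoint, key, and modelName. Enum spellings and the ambient process declaration are assumptions to check against your graphlit-client version.

```typescript
// Ambient declaration so the sketch compiles without @types/node;
// Node provides the real process object at runtime.
declare const process: { env: Record<string, string | undefined> };

// Sketch of a createSpecification() input for Vercel AI Gateway.
// Enum spellings are assumptions -- check your SDK's generated types.
const vercelGatewaySpec = {
  name: "Vercel AI Gateway GPT-5",
  type: "COMPLETION",        // SpecificationTypes.Completion
  serviceType: "OPEN_AI",    // ModelServiceTypes.OpenAi
  openAI: {
    model: "CUSTOM",         // OpenAiModels.Custom
    endpoint: "https://ai-gateway.vercel.sh/v1",
    key: process.env.VERCEL_AI_GATEWAY_KEY,  // AI Gateway key from Vercel settings
    modelName: "openai/gpt-5",               // provider/model format
  },
};

// Usage mirrors the OpenRouter example:
//   const client = new Graphlit();
//   const result = await client.createSpecification(vercelGatewaySpec);
```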


SDK Adaptation Notes: Vercel

Python

Same adaptation as for OpenRouter above; only the endpoint and key values change.

C#

Same adaptation as for OpenRouter above; only the endpoint and key values change.


Step-by-Step: Vercel AI Gateway Setup

Step 1: Get Vercel AI Gateway API Key

  1. In the Vercel dashboard, navigate to your project or create one

  2. Go to Settings → AI Gateway

  3. Enable AI Gateway and create an API key

  4. Store it securely: VERCEL_AI_GATEWAY_KEY=...

Alternative: Use Vercel OIDC token for automatic authentication in Vercel deployments

Step 2: Choose Your Model

Vercel AI Gateway supports models in provider/model format:

Popular models:

  • anthropic/claude-sonnet-4 - Claude Sonnet 4

  • openai/gpt-5 - Latest GPT model

  • google/gemini-2.5-flash - Gemini Flash

  • openai/gpt-4.1-mini - GPT-4.1 Mini

Step 3: Create Specification

Use the code examples above with:

  • endpoint: https://ai-gateway.vercel.sh/v1

  • key: Your Vercel AI Gateway API key

  • modelName: Your model in provider/model format

Step 4: Monitor in Vercel Dashboard

All requests are automatically logged in your Vercel AI Gateway dashboard:

  • Request/response logs

  • Token usage and costs

  • Latency metrics

  • Cache hit rates


Vercel AI Gateway Model Examples

Example 1: Claude with Caching

Benefit: Significant cost savings on repeated queries

Example 2: GPT-5 with Observability

Benefit: Enterprise-grade monitoring and analytics

Example 3: OIDC Token (Vercel Deployments)

Benefit: No API key management needed in Vercel environment


Vercel AI Gateway Benefits

1. Enterprise Observability

Full visibility into AI usage:

  • Request/response logs

  • Token usage by model, user, endpoint

  • Latency P50/P95/P99 metrics

  • Cost tracking and alerts

  • Error rates and debugging

2. Response Caching

Identical repeated requests can be served from the gateway's cache instead of being re-billed by the provider.

Savings: Up to 90% cost reduction on repeated queries

3. Multi-Provider Routing

Requests can automatically fail over to an alternative provider when the primary provider errors or times out.

4. Vercel Ecosystem Integration

Seamless integration with:

  • Vercel deployments (OIDC tokens)

  • Edge functions

  • Vercel KV/Postgres

  • Analytics dashboard


Configuration Comparison

| Feature | OpenRouter | Vercel AI Gateway |
|---|---|---|
| Endpoint | https://openrouter.ai/api/v1 | https://ai-gateway.vercel.sh/v1 |
| Model Format | provider/model | provider/model |
| API Key | OpenRouter dashboard | Vercel AI Gateway settings |
| Models Available | 200+ models | Major providers |
| Pricing | Pay per token | Pay per token |
| Response Caching | No | Yes (automatic) |
| Observability | Basic (via dashboard) | Advanced (Vercel console) |
| Analytics | Token usage, costs | Full request logs, latency, errors |
| Fallbacks | Provider-level | Multi-provider routing |
| OIDC Support | No | Yes (Vercel deployments) |
| Best For | Model variety, cost optimization | Enterprise, observability, caching |


Common Use Cases

Use Case 1: Cost Optimization with OpenRouter

Route to cheaper models for simple queries:
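One hypothetical shape: keep two pre-created specifications and pick one per request with a crude complexity heuristic. The specification ids and the heuristic below are illustrative, not part of the Graphlit API.

```typescript
// Ids of two specifications created beforehand (hypothetical values).
const cheapSpecId = "spec-deepseek";   // deepseek/deepseek-chat
const premiumSpecId = "spec-claude";   // anthropic/claude-4.5-sonnet

// Crude complexity heuristic -- replace with whatever signal you trust.
function specForPrompt(prompt: string): string {
  const complex =
    prompt.length > 500 || /\b(analyze|compare|summarize)\b/i.test(prompt);
  return complex ? premiumSpecId : cheapSpecId;
}
```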

Savings: Up to 95% cost reduction vs always using premium models

Use Case 2: Enterprise Observability with Vercel

Routing production traffic through the gateway captures per-request logs, token costs, and latency in the Vercel dashboard with no extra instrumentation.

Benefit: Complete visibility for production monitoring and optimization

Use Case 3: Multi-Model Access

Because both gateways share the provider/model naming convention, switching models is a one-line modelName change in the specification.

Use Case 4: Development vs Production

Use OpenRouter for development, Vercel for production:
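A sketch of the environment switch. The environment variable names follow this guide; the NODE_ENV convention and the ambient process declaration are assumptions.

```typescript
// Compiles without @types/node; Node provides process at runtime.
declare const process: { env: Record<string, string | undefined> };

const isProd = process.env.NODE_ENV === "production";

// Development -> OpenRouter; production -> Vercel AI Gateway.
const gateway = isProd
  ? { endpoint: "https://ai-gateway.vercel.sh/v1", key: process.env.VERCEL_AI_GATEWAY_KEY }
  : { endpoint: "https://openrouter.ai/api/v1", key: process.env.OPENROUTER_API_KEY };
```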


Common Issues & Solutions

Issue: "Model not found" error

Cause: Incorrect model name format or model not available on gateway

Solution:

  • Verify model name uses provider/model format (e.g., anthropic/claude-4.5-sonnet)

  • Check model availability:

    • OpenRouter: https://openrouter.ai/models

    • Vercel: Available models list in Vercel docs

  • Ensure no typos in provider or model name

Issue: Authentication errors

Cause: Using wrong API key or provider key instead of gateway key

Solution:

  • For OpenRouter: Use API key from OpenRouter dashboard (starts with sk-or-)

  • For Vercel: Use AI Gateway key from Vercel settings (not your OpenAI/Anthropic keys)

  • Gateway keys are different from provider keys

  • Check key is stored correctly in environment variables

Issue: Different responses than direct provider

Cause: Gateways may have different default parameters

Solution:

  • Pin generation parameters (temperature, top-p, completion token limit) explicitly in your specification instead of relying on gateway defaults

  • Reproduce the query against the direct provider with identical parameters before assuming the gateway changed behavior

Issue: Rate limit errors

Cause: Gateway rate limits are separate from provider limits

Solution:

  • OpenRouter: Check your plan limits in dashboard

  • Vercel: Review rate limits in AI Gateway settings

  • Consider upgrading plan or implementing request queuing

  • Use multiple API keys for higher throughput

Issue: Higher latency than expected

Cause: Gateway adds routing overhead

Solution:

  • OpenRouter: ~50-100ms overhead

  • Vercel: ~20-50ms overhead (with caching benefits)

  • For ultra-low latency, use direct provider connection

  • Use Vercel caching to reduce latency on repeated queries

Issue: Costs higher than expected

Cause: Not utilizing caching or routing to cheaper models

Solution (OpenRouter):

  • Route routine traffic to cheaper models (e.g., deepseek/deepseek-chat or meta-llama/llama-3.3-70b-instruct) and reserve premium models for complex queries

Solution (Vercel):

  • Keep repeated prompts byte-identical so they hit the response cache, and use the dashboard's cost breakdown to find expensive call patterns

Issue: model enum field confusion

Cause: Unclear which model enum value to use with custom endpoints

Solution:

Always use OpenAiModels.Custom when configuring OpenRouter, Vercel AI Gateway, or any other external endpoint. The modelName field determines which model is actually used.




Production Patterns

Pattern 1: Multi-Tier Routing (OpenRouter)

Route each request to the cheapest specification that meets its quality bar, escalating to premium models only for complex queries (see Use Case 1 above).

Pattern 2: Cached Production (Vercel)

Serve production traffic through Vercel AI Gateway so repeated identical requests hit the response cache and every call is logged for cost and latency review.

Pattern 3: Multi-Gateway Fallback
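A generic sketch of the fallback wrapper (the helper name is mine): try the primary gateway-backed call, and retry once through the other gateway if it throws.

```typescript
// Try the primary gateway-backed call; if it throws (outage, rate
// limit), retry once through the fallback gateway.
async function withGatewayFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch {
    return await fallback();
  }
}

// Usage sketch: each closure would prompt with a different specification,
// one configured for OpenRouter and one for Vercel AI Gateway.
```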


Tags

[openrouter, vercel, ai-gateway, custom-endpoint, multi-model, specification, openai-compatible, cost-optimization, observability]
