Create OpenAI-Compatible Gateway

User Intent

"How do I use OpenRouter or Vercel AI Gateway with Graphlit to access multiple AI models through one unified API?"

Operation

SDK Method: createSpecification() with custom OpenAI endpoint

Use Case: Access multiple model providers via OpenAI-compatible gateways


What Are AI Gateways?

AI gateways provide a unified, OpenAI-compatible API that routes requests to multiple AI providers. Instead of managing separate API keys and code for each provider (OpenAI, Anthropic, Google, etc.), you configure one gateway endpoint and access hundreds of models.

Key Benefits:

  • Unified API: One integration for 200+ models

  • Cost Optimization: Compare and route to cheapest providers

  • Automatic Fallbacks: Retry failed requests with alternative providers

  • Observability: Track usage, costs, and performance

  • No Vendor Lock-in: Switch models without code changes

This guide covers two popular gateways:

  • OpenRouter - 200+ models, cost tracking, open-source access

  • Vercel AI Gateway - Enterprise observability, caching, Vercel ecosystem integration


OpenRouter Configuration

Complete Code Example (TypeScript)

import { Graphlit } from 'graphlit-client';
import { 
  SpecificationTypes, 
  ModelServiceTypes,
  OpenAiModels 
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create OpenRouter specification
const openRouterSpec = await graphlit.createSpecification({
  name: 'Claude via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,  // Use Custom for external endpoints
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'anthropic/claude-4.5-sonnet',  // Actual model used
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});

console.log(`OpenRouter spec created: ${openRouterSpec.createSpecification.id}`);

// Use in conversation
const conversation = await graphlit.createConversation({
  name: 'OpenRouter Chat',
  specification: { id: openRouterSpec.createSpecification.id }
});

const answer = await graphlit.promptConversation({
  prompt: 'Explain semantic memory in one sentence',
  id: conversation.createConversation.id
});

console.log(answer.message.message);

SDK Adaptation Notes

Python

from graphlit import Graphlit
from graphlit_api.input_types import (
    SpecificationInput,
    OpenAiModelPropertiesInput,
    EntityReferenceInput
)
from graphlit_api.enums import (
    SpecificationTypes,
    ModelServiceTypes,
    OpenAiModels
)
import os

graphlit = Graphlit()

# Create OpenRouter specification (snake_case)
spec_input = SpecificationInput(
    name="Claude via OpenRouter",
    type=SpecificationTypes.COMPLETION,
    service_type=ModelServiceTypes.OPEN_AI,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.CUSTOM,  # Use CUSTOM for external endpoints
        endpoint="https://openrouter.ai/api/v1",
        key=os.environ.get('OPENROUTER_API_KEY'),
        model_name="anthropic/claude-4.5-sonnet",  # Actual model
        temperature=0.2,
        completion_token_limit=4000
    )
)

response = await graphlit.client.create_specification(spec_input)
spec_id = response.create_specification.id

print(f"OpenRouter spec created: {spec_id}")

# Use in conversation
answer = await graphlit.client.prompt_conversation(
    prompt="Explain semantic memory in one sentence",
    specification=EntityReferenceInput(id=spec_id)
)

print(answer.message.message)

C#

using Graphlit;
using Graphlit.Models;

var client = new Graphlit();

// Create OpenRouter specification (PascalCase)
var specInput = new SpecificationInput {
    Name = "Claude via OpenRouter",
    Type = SpecificationTypes.Completion,
    ServiceType = ModelServiceTypes.OpenAi,
    OpenAI = new OpenAIModelPropertiesInput {
        Model = OpenAiModels.Custom,  // Use Custom for external endpoints
        Endpoint = "https://openrouter.ai/api/v1",
        Key = Environment.GetEnvironmentVariable("OPENROUTER_API_KEY"),
        ModelName = "anthropic/claude-4.5-sonnet",  // Actual model
        Temperature = 0.2f,
        CompletionTokenLimit = 4000
    }
};

var response = await client.CreateSpecification(specInput);
var specId = response.CreateSpecification.Id;

Console.WriteLine($"OpenRouter spec created: {specId}");

// Use in conversation
var answer = await client.PromptConversation(
    prompt: "Explain semantic memory in one sentence",
    specification: new EntityReferenceInput { Id = specId }
);

Console.WriteLine(answer.Message.Message);

Step-by-Step: OpenRouter Setup

Step 1: Get OpenRouter API Key

  1. Sign up or log in at openrouter.ai

  2. Navigate to API Keys in your dashboard

  3. Create a new API key

  4. Store it securely: OPENROUTER_API_KEY=sk-or-v1-...

Step 2: Choose Your Model

Browse available models at https://openrouter.ai/models

Model naming format: provider/model

Popular models:

  • anthropic/claude-4.5-sonnet - Best for RAG, citations ($3/$15 per M tokens)

  • google/gemini-2.5-flash - Fast, 1M context ($0.075/$0.30 per M tokens)

  • openai/gpt-4o - Balanced performance ($2.50/$10 per M tokens)

  • meta-llama/llama-3.3-70b-instruct - Open source, cost-effective ($0.59/$0.59 per M tokens)

  • deepseek/deepseek-chat - Extremely cheap ($0.14/$0.28 per M tokens)
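Since every gateway model name must follow the provider/model format, it can help to validate names before creating a specification. The helper below is a hypothetical utility (not part of the Graphlit SDK), shown as a minimal sketch:

```typescript
// Hypothetical helper: validate the gateway "provider/model" naming format
// before passing it to createSpecification(). Not part of the Graphlit SDK.
function parseModelName(modelName: string): { provider: string; model: string } {
  const parts = modelName.split('/');
  if (parts.length !== 2 || !parts[0] || !parts[1]) {
    throw new Error(`Expected "provider/model" format, got: "${modelName}"`);
  }
  return { provider: parts[0], model: parts[1] };
}
```

Catching a malformed name locally gives a clearer error than the gateway's "model not found" response.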

Step 3: Create Specification

Use the code examples above to create your specification with:

  • endpoint: https://openrouter.ai/api/v1

  • key: Your OpenRouter API key

  • modelName: Your chosen model in provider/model format

Step 4: Use in Graphlit

The specification works with all Graphlit conversation methods:

  • promptConversation() - Synchronous Q&A

  • streamAgent() - Streaming with tool calling

  • promptAgent() - Synchronous with tool calling


OpenRouter Model Examples

Example 1: Claude 4.5 Sonnet (Best for RAG)

const claudeSpec = await graphlit.createSpecification({
  name: 'Claude 4.5 Sonnet via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'anthropic/claude-4.5-sonnet',
    temperature: 0.1,  // Very factual for RAG
    completionTokenLimit: 4000
  }
});

Best for: RAG with accurate citations, document analysis, complex reasoning

Example 2: Gemini 2.5 Flash (Fast + Long Context)

const geminiSpec = await graphlit.createSpecification({
  name: 'Gemini 2.5 Flash via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'google/gemini-2.5-flash',
    temperature: 0.3,
    completionTokenLimit: 8000  // Can handle long responses
  }
});

Best for: Fast responses, long documents (1M context), cost optimization

Example 3: Llama 3.3 70B (Open Source)

const llamaSpec = await graphlit.createSpecification({
  name: 'Llama 3.3 70B via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'meta-llama/llama-3.3-70b-instruct',
    temperature: 0.5,
    completionTokenLimit: 3000
  }
});

Best for: Open-source option, good balance of quality and cost

Example 4: DeepSeek Chat (Ultra-Cheap)

const deepseekSpec = await graphlit.createSpecification({
  name: 'DeepSeek Chat via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'deepseek/deepseek-chat',
    temperature: 0.3,
    completionTokenLimit: 2000
  }
});

Best for: High-volume use cases, development/testing, cost-sensitive applications


OpenRouter Benefits

1. Cost Optimization

Compare pricing across providers and route to the cheapest option:

// Multi-specification strategy
const specs = {
  cheap: createOpenRouterSpec('deepseek/deepseek-chat'),      // $0.14/$0.28 per M
  balanced: createOpenRouterSpec('meta-llama/llama-3.3-70b-instruct'), // $0.59/$0.59 per M
  premium: createOpenRouterSpec('anthropic/claude-4.5-sonnet') // $3/$15 per M
};

// Route based on query complexity
const specId = queryComplexity === 'high' 
  ? specs.premium.id 
  : specs.cheap.id;
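The per-million-token prices in the comments above translate directly into per-request costs. A small sketch (the prices here mirror the ones listed in this guide and are illustrative; check current gateway pricing):

```typescript
// Illustrative per-million-token prices (input/output) from the tiers above.
// Prices change over time -- treat these numbers as examples only.
const pricing: Record<string, { inputPerM: number; outputPerM: number }> = {
  'deepseek/deepseek-chat': { inputPerM: 0.14, outputPerM: 0.28 },
  'meta-llama/llama-3.3-70b-instruct': { inputPerM: 0.59, outputPerM: 0.59 },
  'anthropic/claude-4.5-sonnet': { inputPerM: 3.0, outputPerM: 15.0 },
};

// Estimate the USD cost of a single request given token counts.
function estimateCost(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricing[model];
  if (!p) throw new Error(`No pricing entry for ${model}`);
  return (inputTokens / 1_000_000) * p.inputPerM
       + (outputTokens / 1_000_000) * p.outputPerM;
}
```

For a typical 2,000-in / 500-out request, the premium tier costs roughly 30x the cheap tier, which is why complexity-based routing pays off.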

2. Access to 200+ Models

One API key gives you access to:

  • All major providers (OpenAI, Anthropic, Google, Meta, Mistral, Cohere, etc.)

  • Latest models (GPT-5, Claude 4.5, Gemini 2.5)

  • Open-source models (Llama, Qwen, Mixtral)

  • Specialized models (coding, vision, multilingual)

3. Automatic Fallbacks

OpenRouter handles provider outages automatically:

// If Claude is down, OpenRouter can fall back to GPT-4o automatically
const spec = await graphlit.createSpecification({
  name: 'Claude with Fallback',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'anthropic/claude-4.5-sonnet',
    // OpenRouter handles fallbacks automatically
  }
});

4. No Provider Lock-in

Switch models by changing one parameter:

// Switch from Claude to GPT without code changes
openAI: {
  modelName: 'openai/gpt-4o'  // Was: 'anthropic/claude-4.5-sonnet'
}

Vercel AI Gateway Configuration

Complete Code Example (TypeScript)

import { Graphlit } from 'graphlit-client';
import { 
  SpecificationTypes, 
  ModelServiceTypes,
  OpenAiModels 
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create Vercel AI Gateway specification
const vercelSpec = await graphlit.createSpecification({
  name: 'Claude via Vercel Gateway',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,  // Use Custom for external endpoints
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,  // Or VERCEL_OIDC_TOKEN
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});

console.log(`Vercel Gateway spec created: ${vercelSpec.createSpecification.id}`);

// Use in conversation (with automatic caching and observability)
const answer = await graphlit.promptConversation({
  prompt: 'Explain the key benefits of semantic memory',
  specification: { id: vercelSpec.createSpecification.id }
});

console.log(answer.message.message);
// Response is cached and logged in Vercel dashboard

SDK Adaptation Notes: Vercel

Python

from graphlit import Graphlit
from graphlit_api.input_types import (
    SpecificationInput,
    OpenAiModelPropertiesInput,
    EntityReferenceInput
)
from graphlit_api.enums import (
    SpecificationTypes,
    ModelServiceTypes,
    OpenAiModels
)
import os

graphlit = Graphlit()

# Create Vercel AI Gateway specification
spec_input = SpecificationInput(
    name="Claude via Vercel Gateway",
    type=SpecificationTypes.COMPLETION,
    service_type=ModelServiceTypes.OPEN_AI,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.CUSTOM,
        endpoint="https://ai-gateway.vercel.sh/v1",
        key=os.environ.get('VERCEL_AI_GATEWAY_KEY'),
        model_name="anthropic/claude-sonnet-4",
        temperature=0.2,
        completion_token_limit=4000
    )
)

response = await graphlit.client.create_specification(spec_input)
spec_id = response.create_specification.id

print(f"Vercel Gateway spec created: {spec_id}")

C#

using Graphlit;
using Graphlit.Models;

var client = new Graphlit();

// Create Vercel AI Gateway specification
var specInput = new SpecificationInput {
    Name = "Claude via Vercel Gateway",
    Type = SpecificationTypes.Completion,
    ServiceType = ModelServiceTypes.OpenAi,
    OpenAI = new OpenAIModelPropertiesInput {
        Model = OpenAiModels.Custom,
        Endpoint = "https://ai-gateway.vercel.sh/v1",
        Key = Environment.GetEnvironmentVariable("VERCEL_AI_GATEWAY_KEY"),
        ModelName = "anthropic/claude-sonnet-4",
        Temperature = 0.2f,
        CompletionTokenLimit = 4000
    }
};

var response = await client.CreateSpecification(specInput);
var specId = response.CreateSpecification.Id;

Console.WriteLine($"Vercel Gateway spec created: {specId}");

Step-by-Step: Vercel AI Gateway Setup

Step 1: Get Vercel AI Gateway API Key

  1. Navigate to your project in the Vercel dashboard (or create one)

  2. Go to Settings → AI Gateway

  3. Enable AI Gateway and create an API key

  4. Store it securely: VERCEL_AI_GATEWAY_KEY=...

Alternative: Use Vercel OIDC token for automatic authentication in Vercel deployments

Step 2: Choose Your Model

Vercel AI Gateway supports models in provider/model format:

Popular models:

  • anthropic/claude-sonnet-4 - Claude Sonnet 4

  • openai/gpt-5 - Latest GPT model

  • google/gemini-2.5-flash - Gemini Flash

  • openai/gpt-4.1-mini - GPT-4.1 Mini

Step 3: Create Specification

Use the code examples above with:

  • endpoint: https://ai-gateway.vercel.sh/v1

  • key: Your Vercel AI Gateway API key

  • modelName: Your model in provider/model format

Step 4: Monitor in Vercel Dashboard

All requests are automatically logged in your Vercel AI Gateway dashboard:

  • Request/response logs

  • Token usage and costs

  • Latency metrics

  • Cache hit rates


Vercel AI Gateway Model Examples

Example 1: Claude with Caching

const cachedClaudeSpec = await graphlit.createSpecification({
  name: 'Cached Claude via Vercel',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.1,
    completionTokenLimit: 3000
  }
});
// Repeated queries are cached automatically by Vercel

Benefit: Significant cost savings on repeated queries

Example 2: GPT-5 with Observability

const gpt5Spec = await graphlit.createSpecification({
  name: 'GPT-5 via Vercel Gateway',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'openai/gpt-5',
    temperature: 0.3,
    completionTokenLimit: 4000
  }
});
// All requests logged in Vercel dashboard with full observability

Benefit: Enterprise-grade monitoring and analytics

Example 3: OIDC Token (Vercel Deployments)

// When deployed on Vercel, use OIDC token
const vercelOidcSpec = await graphlit.createSpecification({
  name: 'Claude with OIDC',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_OIDC_TOKEN,  // Automatic in Vercel deployments
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});

Benefit: No API key management needed in Vercel environment


Vercel AI Gateway Benefits

1. Enterprise Observability

Full visibility into AI usage:

  • Request/response logs

  • Token usage by model, user, endpoint

  • Latency P50/P95/P99 metrics

  • Cost tracking and alerts

  • Error rates and debugging
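The P50/P95/P99 latency figures above are simple order statistics. As a generic illustration (not a Vercel API), they can be recomputed from raw latency samples with a nearest-rank percentile:

```typescript
// Nearest-rank percentile over raw latency samples (ms).
// Generic sketch for interpreting P50/P95/P99 -- not a Vercel API.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error('no samples');
  const sorted = [...samples].sort((a, b) => a - b);        // ascending copy
  const rank = Math.ceil((p / 100) * sorted.length);        // 1-based rank
  return sorted[Math.max(0, rank - 1)];
}
```

P95 then reads as "95% of requests completed at or below this latency", which is usually a better SLO target than the mean.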

2. Response Caching

Automatic caching reduces costs:

// First request: hits provider, costs money
const answer1 = await graphlit.promptConversation({
  prompt: 'What is semantic memory?',
  specification: { id: vercelSpec.id }
});

// Second request: cache hit, free
const answer2 = await graphlit.promptConversation({
  prompt: 'What is semantic memory?',
  specification: { id: vercelSpec.id }
});

Savings: Up to 90% cost reduction on repeated queries
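Vercel's cache lives at the gateway, but the hit/miss behavior shown above can be sketched with a tiny client-side memoization wrapper (illustrative only; `callModel` is a placeholder for any gateway request, and real caches also need expiry):

```typescript
// Minimal response cache keyed by prompt -- illustrates the cache-hit
// behavior above. callModel is a placeholder for any gateway call.
function makeCachedPrompt(callModel: (prompt: string) => string) {
  const cache = new Map<string, string>();
  let misses = 0;
  const prompt = (p: string): string => {
    const hit = cache.get(p);
    if (hit !== undefined) return hit;   // cache hit: no provider call, no cost
    misses++;
    const answer = callModel(p);         // cache miss: pay for the provider call
    cache.set(p, answer);
    return answer;
  };
  return { prompt, getMisses: () => misses };
}
```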

3. Multi-Provider Routing

Automatic fallbacks across providers:

// Vercel tries providers in order until one succeeds
const spec = await graphlit.createSpecification({
  // Abbreviated: also set name, type, serviceType, key, and model: OpenAiModels.Custom as in the full examples
  openAI: {
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    modelName: 'anthropic/claude-sonnet-4',
    // If Anthropic is down, Vercel can route to alternative providers
  }
});

4. Vercel Ecosystem Integration

Seamless integration with:

  • Vercel deployments (OIDC tokens)

  • Edge functions

  • Vercel KV/Postgres

  • Analytics dashboard


Configuration Comparison

| Feature | OpenRouter | Vercel AI Gateway |
| --- | --- | --- |
| Endpoint | https://openrouter.ai/api/v1 | https://ai-gateway.vercel.sh/v1 |
| Model Format | provider/model | provider/model |
| API Key | OpenRouter dashboard | Vercel AI Gateway settings |
| Models Available | 200+ models | Major providers |
| Pricing | Pay per token | Pay per token |
| Response Caching | No | Yes (automatic) |
| Observability | Basic (via dashboard) | Advanced (Vercel console) |
| Analytics | Token usage, costs | Full request logs, latency, errors |
| Fallbacks | Provider-level | Multi-provider routing |
| OIDC Support | No | Yes (Vercel deployments) |
| Best For | Model variety, cost optimization | Enterprise, observability, caching |


Common Use Cases

Use Case 1: Cost Optimization with OpenRouter

Route to cheaper models for simple queries:

// Create multiple specs for different complexity levels
const specs = {
  simple: await createOpenRouterSpec('deepseek/deepseek-chat'),       // $0.14/$0.28 per M
  medium: await createOpenRouterSpec('meta-llama/llama-3.3-70b-instruct'),    // $0.59/$0.59 per M
  complex: await createOpenRouterSpec('anthropic/claude-4.5-sonnet')  // $3/$15 per M
};

// Route based on query analysis
function getSpecForQuery(query: string) {
  if (query.length < 50) return specs.simple.id;
  if (needsReasoning(query)) return specs.complex.id;  // needsReasoning: your own classifier (placeholder)
  return specs.medium.id;
}

const answer = await graphlit.promptConversation({
  prompt: userQuery,
  specification: { id: getSpecForQuery(userQuery) }
});

Savings: Up to 95% cost reduction vs always using premium models
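The `needsReasoning` check in the snippet above is left to the application. One possible heuristic (purely illustrative; tune the cue words and thresholds to your traffic) keys on length and reasoning cue words:

```typescript
// Illustrative heuristic for routing queries to model tiers.
// Cue words and thresholds are examples, not a recommendation.
function classifyQuery(query: string): 'simple' | 'medium' | 'complex' {
  const cues = ['why', 'compare', 'analyze', 'explain', 'prove', 'step by step'];
  const lower = query.toLowerCase();
  if (cues.some((c) => lower.includes(c))) return 'complex'; // reasoning cue -> premium tier
  if (query.length < 50) return 'simple';                    // short lookup -> cheap tier
  return 'medium';
}
```

A lightweight classifier like this is imprecise, but even rough routing captures most of the savings because simple queries dominate typical traffic.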

Use Case 2: Enterprise Observability with Vercel

Track all AI usage in production:

// Create Vercel Gateway spec
const enterpriseSpec = await graphlit.createSpecification({
  name: 'Production Claude',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});

// All requests automatically logged in Vercel dashboard:
// - User ID, timestamp, latency
// - Token usage, cost per request
// - Error rates, debugging info
// - Cache hit rates, cost savings

Benefit: Complete visibility for production monitoring and optimization

Use Case 3: Multi-Model Access

Easy switching between models:

// Create specs for different providers
const models = {
  claude: await createSpec('openrouter', 'anthropic/claude-4.5-sonnet'),
  gpt: await createSpec('openrouter', 'openai/gpt-4o'),
  gemini: await createSpec('openrouter', 'google/gemini-2.5-flash'),
  llama: await createSpec('openrouter', 'meta-llama/llama-3.3-70b-instruct')
};

// Route based on task
function getModelForTask(task: string) {
  if (task === 'rag') return models.claude.id;       // Best citations
  if (task === 'creative') return models.gpt.id;     // Best creativity
  if (task === 'fast') return models.gemini.id;      // Fastest
  if (task === 'opensource') return models.llama.id; // No vendor lock-in
  return models.claude.id;                           // Default for unknown tasks
}

Use Case 4: Development vs Production

Use OpenRouter for development, Vercel for production:

const isDevelopment = process.env.NODE_ENV === 'development';

const spec = await graphlit.createSpecification({
  name: isDevelopment ? 'Dev: DeepSeek' : 'Prod: Claude',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: isDevelopment 
      ? 'https://openrouter.ai/api/v1'        // Cheap models for dev
      : 'https://ai-gateway.vercel.sh/v1',    // Observability for prod
    key: isDevelopment 
      ? process.env.OPENROUTER_API_KEY 
      : process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: isDevelopment 
      ? 'deepseek/deepseek-chat'              // $0.14 per M tokens
      : 'anthropic/claude-sonnet-4',          // Best quality
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});

Common Issues & Solutions

Issue: "Model not found" error

Cause: Incorrect model name format or model not available on gateway

Solution:

  • Verify model name uses provider/model format (e.g., anthropic/claude-4.5-sonnet)

  • Check model availability:

    • OpenRouter: https://openrouter.ai/models

    • Vercel: Available models list in Vercel docs

  • Ensure no typos in provider or model name

Issue: Authentication errors

Cause: Using wrong API key or provider key instead of gateway key

Solution:

  • For OpenRouter: Use API key from OpenRouter dashboard (starts with sk-or-)

  • For Vercel: Use AI Gateway key from Vercel settings (not your OpenAI/Anthropic keys)

  • Gateway keys are different from provider keys

  • Check key is stored correctly in environment variables

Issue: Different responses than direct provider

Cause: Gateways may have different default parameters

Solution:

// Explicitly set all parameters
openAI: {
  endpoint: 'https://openrouter.ai/api/v1',
  modelName: 'anthropic/claude-4.5-sonnet',
  temperature: 0.2,        // Don't rely on defaults
  completionTokenLimit: 4000,
  probability: 0.9         // Nucleus sampling (top-p)
}

Issue: Rate limit errors

Cause: Gateway rate limits are separate from provider limits

Solution:

  • OpenRouter: Check your plan limits in dashboard

  • Vercel: Review rate limits in AI Gateway settings

  • Consider upgrading plan or implementing request queuing

  • Use multiple API keys for higher throughput
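Request queuing usually pairs with retrying on exponential backoff. The delay schedule itself is a pure function, sketched generically below (delays and cap are example values, not gateway-specific limits):

```typescript
// Exponential backoff schedule: delay in ms before each retry attempt,
// doubling from baseMs and capped at maxMs. Example values, not gateway limits.
function backoffDelaysMs(retries: number, baseMs = 500, maxMs = 8000): number[] {
  const delays: number[] = [];
  for (let i = 0; i < retries; i++) {
    delays.push(Math.min(baseMs * 2 ** i, maxMs));
  }
  return delays;
}
```

A retry loop would sleep for each delay in turn (ideally with jitter added) before giving up, which smooths bursts below the gateway's rate limit instead of hammering it.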

Issue: Higher latency than expected

Cause: Gateway adds routing overhead

Solution:

  • OpenRouter: ~50-100ms overhead

  • Vercel: ~20-50ms overhead (with caching benefits)

  • For ultra-low latency, use direct provider connection

  • Use Vercel caching to reduce latency on repeated queries

Issue: Costs higher than expected

Cause: Not utilizing caching or routing to cheaper models

Solution (OpenRouter):

// Route to cheaper models when appropriate
const cheapSpec = createSpec('deepseek/deepseek-chat');  // vs claude

Solution (Vercel):

// Caching is applied automatically by Vercel; no configuration needed
// Monitor the Vercel dashboard for cache hit rates

Issue: model enum field confusion

Cause: Unclear which model enum value to use with custom endpoints

Solution:

openAI: {
  model: OpenAiModels.Custom,  // Always use Custom for external gateways
  modelName: 'anthropic/claude-4.5-sonnet'  // This determines actual model
}

Always use OpenAiModels.Custom when configuring OpenRouter, Vercel AI Gateway, or any other external endpoint. The modelName field determines which model is actually used.




Production Patterns

Pattern 1: Multi-Tier Routing (OpenRouter)

// Create tiered specifications
const tiers = {
  free: await createOpenRouterSpec('deepseek/deepseek-chat'),
  paid: await createOpenRouterSpec('anthropic/claude-4.5-sonnet')
};

// Route based on user plan
function getSpecForUser(userId: string) {
  const user = getUserPlan(userId);
  return user.plan === 'premium' ? tiers.paid.id : tiers.free.id;
}

Pattern 2: Cached Production (Vercel)

// Production spec with caching
const productionSpec = await graphlit.createSpecification({
  name: 'Production RAG',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.1  // Low temp for consistent caching
  }
});

Pattern 3: Multi-Gateway Fallback

// Primary: Vercel (for caching)
// Fallback: OpenRouter (if Vercel is down)
let answer;
try {
  answer = await promptWithSpec(vercelSpec);
} catch (error) {
  console.log('Vercel gateway down, falling back to OpenRouter');
  answer = await promptWithSpec(openRouterSpec);
}

Tags

[openrouter, vercel, ai-gateway, custom-endpoint, multi-model, specification, openai-compatible, cost-optimization, observability]
