# Create OpenAI-Compatible Gateway

## User Intent

"How do I use OpenRouter or Vercel AI Gateway with Graphlit to access multiple AI models through one unified API?"

## Operation

**SDK Method**: `createSpecification()` with custom OpenAI endpoint\
**Use Case**: Access multiple model providers via OpenAI-compatible gateways

***

## What Are AI Gateways?

AI gateways provide a unified, OpenAI-compatible API that routes requests to multiple AI providers. Instead of managing separate API keys and code for each provider (OpenAI, Anthropic, Google, etc.), you configure one gateway endpoint and access hundreds of models.

**Key Benefits**:

* **Unified API**: One integration for 200+ models
* **Cost Optimization**: Compare providers and route to the cheapest one
* **Automatic Fallbacks**: Retry failed requests with alternative providers
* **Observability**: Track usage, costs, and performance
* **No Vendor Lock-in**: Switch models without code changes
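
"OpenAI-compatible" means the gateway accepts the same request shape as OpenAI's `/chat/completions` endpoint, so one client works against every provider behind it. Graphlit builds and sends this request for you; the sketch below only illustrates why a single endpoint can front many providers (the endpoint URL and model slug are the only things that change):

```typescript
// Minimal sketch of the OpenAI-compatible chat request body.
// Graphlit constructs this internally; shown here for illustration only.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

function buildChatRequest(model: string, messages: ChatMessage[], temperature = 0.2) {
  return {
    model,        // e.g. 'anthropic/claude-4.5-sonnet' on OpenRouter
    messages,
    temperature
  };
}

const body = buildChatRequest('anthropic/claude-4.5-sonnet', [
  { role: 'user', content: 'Hello' }
]);
// POST this JSON to `${gatewayEndpoint}/chat/completions` with your gateway key
```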

This guide covers two popular gateways:

* **OpenRouter** - 200+ models, cost tracking, open-source access
* **Vercel AI Gateway** - Enterprise observability, caching, Vercel ecosystem integration

***

## OpenRouter Configuration

### Complete Code Example (TypeScript)

```typescript
import { Graphlit } from 'graphlit-client';
import { 
  SpecificationTypes, 
  ModelServiceTypes,
  OpenAiModels 
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create OpenRouter specification
const openRouterSpec = await graphlit.createSpecification({
  name: 'Claude via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,  // Use Custom for external endpoints
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'anthropic/claude-4.5-sonnet',  // Actual model used
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});

console.log(`OpenRouter spec created: ${openRouterSpec.createSpecification.id}`);

// Use in conversation
const conversation = await graphlit.createConversation({
  name: 'OpenRouter Chat',
  specification: { id: openRouterSpec.createSpecification.id }
});

const answer = await graphlit.promptConversation({
  prompt: 'Explain semantic memory in one sentence',
  id: conversation.createConversation.id
});

console.log(answer.message.message);
```

***

## SDK Adaptation Notes

### Python

```python
from graphlit import Graphlit
from graphlit_api.input_types import (
    SpecificationInput,
    OpenAiModelPropertiesInput,
    EntityReferenceInput
)
from graphlit_api.enums import (
    SpecificationTypes,
    ModelServiceTypes,
    OpenAiModels
)
import os

graphlit = Graphlit()

# Create OpenRouter specification (snake_case)
spec_input = SpecificationInput(
    name="Claude via OpenRouter",
    type=SpecificationTypes.COMPLETION,
    service_type=ModelServiceTypes.OPEN_AI,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.CUSTOM,  # Use CUSTOM for external endpoints
        endpoint="https://openrouter.ai/api/v1",
        key=os.environ.get('OPENROUTER_API_KEY'),
        model_name="anthropic/claude-4.5-sonnet",  # Actual model
        temperature=0.2,
        completion_token_limit=4000
    )
)

response = await graphlit.client.create_specification(spec_input)
spec_id = response.create_specification.id

print(f"OpenRouter spec created: {spec_id}")

# Use in conversation
answer = await graphlit.client.prompt_conversation(
    prompt="Explain semantic memory in one sentence",
    specification=EntityReferenceInput(id=spec_id)
)

print(answer.message.message)
```

### C\#

```csharp
using Graphlit;
using Graphlit.Models;

var client = new Graphlit();

// Create OpenRouter specification (PascalCase)
var specInput = new SpecificationInput {
    Name = "Claude via OpenRouter",
    Type = SpecificationTypes.Completion,
    ServiceType = ModelServiceTypes.OpenAi,
    OpenAI = new OpenAIModelPropertiesInput {
        Model = OpenAiModels.Custom,  // Use Custom for external endpoints
        Endpoint = "https://openrouter.ai/api/v1",
        Key = Environment.GetEnvironmentVariable("OPENROUTER_API_KEY"),
        ModelName = "anthropic/claude-4.5-sonnet",  // Actual model
        Temperature = 0.2f,
        CompletionTokenLimit = 4000
    }
};

var response = await client.CreateSpecification(specInput);
var specId = response.CreateSpecification.Id;

Console.WriteLine($"OpenRouter spec created: {specId}");

// Use in conversation
var answer = await client.PromptConversation(
    prompt: "Explain semantic memory in one sentence",
    specification: new EntityReferenceInput { Id = specId }
);

Console.WriteLine(answer.Message.Message);
```

***

## Step-by-Step: OpenRouter Setup

### Step 1: Get OpenRouter API Key

1. Visit <https://openrouter.ai>
2. Sign up or log in
3. Navigate to **API Keys** in your dashboard
4. Create a new API key
5. Store it securely: `OPENROUTER_API_KEY=sk-or-v1-...`

### Step 2: Choose Your Model

Browse available models at <https://openrouter.ai/models>

**Model naming format**: `provider/model`

**Popular models**:

* `anthropic/claude-4.5-sonnet` - Best for RAG, citations ($3/$15 per M tokens)
* `google/gemini-2.5-flash` - Fast, 1M context ($0.075/$0.30 per M tokens)
* `openai/gpt-4o` - Balanced performance ($2.50/$10 per M tokens)
* `meta-llama/llama-3.3-70b-instruct` - Open source, cost-effective ($0.59/$0.59 per M tokens)
* `deepseek/deepseek-chat` - Extremely cheap ($0.14/$0.28 per M tokens)
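
To compare models concretely, you can estimate per-request cost from the per-million-token prices above. A small sketch (prices are illustrative and change over time; check the OpenRouter models page for current rates):

```typescript
// Rough per-request cost estimator from per-million-token prices.
// Prices below mirror the list above and will drift; treat as illustrative.
const pricing: Record<string, { input: number; output: number }> = {
  'anthropic/claude-4.5-sonnet': { input: 3.0, output: 15.0 },
  'deepseek/deepseek-chat': { input: 0.14, output: 0.28 }
};

function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const p = pricing[model];
  if (!p) throw new Error(`No pricing entry for ${model}`);
  return (inputTokens * p.input + outputTokens * p.output) / 1_000_000;
}

// A 2,000-token prompt with a 500-token answer:
estimateCostUsd('anthropic/claude-4.5-sonnet', 2000, 500); // ≈ $0.0135
estimateCostUsd('deepseek/deepseek-chat', 2000, 500);      // ≈ $0.00042
```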

### Step 3: Create Specification

Use the code examples above to create your specification with:

* `endpoint`: `https://openrouter.ai/api/v1`
* `key`: Your OpenRouter API key
* `modelName`: Your chosen model in `provider/model` format
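
A malformed `modelName` is the most common cause of "model not found" errors, so it can help to sanity-check the `provider/model` shape before creating a specification. A minimal check (shape only — it cannot tell you whether the gateway actually hosts that model):

```typescript
// Validate the `provider/model` slug shape before calling createSpecification.
// This catches missing provider prefixes, not unavailable models.
function isValidModelSlug(slug: string): boolean {
  return /^[a-z0-9][a-z0-9._-]*\/[a-z0-9][a-z0-9._-]*$/i.test(slug);
}

isValidModelSlug('anthropic/claude-4.5-sonnet'); // true
isValidModelSlug('claude-4.5-sonnet');           // false — missing provider prefix
isValidModelSlug('anthropic/');                  // false — missing model name
```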

### Step 4: Use in Graphlit

The specification works with all Graphlit conversation methods:

* `promptConversation()` - Synchronous Q\&A
* `streamAgent()` - Streaming with tool calling
* `promptAgent()` - Synchronous with tool calling

***

## OpenRouter Model Examples

### Example 1: Claude 4.5 Sonnet (Best for RAG)

```typescript
const claudeSpec = await graphlit.createSpecification({
  name: 'Claude 4.5 Sonnet via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'anthropic/claude-4.5-sonnet',
    temperature: 0.1,  // Very factual for RAG
    completionTokenLimit: 4000
  }
});
```

**Best for**: RAG with accurate citations, document analysis, complex reasoning

### Example 2: Gemini 2.5 Flash (Fast + Long Context)

```typescript
const geminiSpec = await graphlit.createSpecification({
  name: 'Gemini 2.5 Flash via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'google/gemini-2.5-flash',
    temperature: 0.3,
    completionTokenLimit: 8000  // Can handle long responses
  }
});
```

**Best for**: Fast responses, long documents (1M context), cost optimization

### Example 3: Llama 3.3 70B (Open Source)

```typescript
const llamaSpec = await graphlit.createSpecification({
  name: 'Llama 3.3 70B via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'meta-llama/llama-3.3-70b-instruct',
    temperature: 0.5,
    completionTokenLimit: 3000
  }
});
```

**Best for**: Open-source option, good balance of quality and cost

### Example 4: DeepSeek Chat (Ultra-Cheap)

```typescript
const deepseekSpec = await graphlit.createSpecification({
  name: 'DeepSeek Chat via OpenRouter',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'deepseek/deepseek-chat',
    temperature: 0.3,
    completionTokenLimit: 2000
  }
});
```

**Best for**: High-volume use cases, development/testing, cost-sensitive applications

***

## OpenRouter Benefits

### 1. Cost Optimization

Compare pricing across providers and route to the cheapest option:

```typescript
// Multi-specification strategy
// (createOpenRouterSpec is a small helper you define around createSpecification)
const specs = {
  cheap: await createOpenRouterSpec('deepseek/deepseek-chat'),                 // $0.14/$0.28 per M
  balanced: await createOpenRouterSpec('meta-llama/llama-3.3-70b-instruct'),  // $0.59/$0.59 per M
  premium: await createOpenRouterSpec('anthropic/claude-4.5-sonnet')          // $3/$15 per M
};

// Route based on query complexity (queryComplexity comes from your own heuristic)
const specId = queryComplexity === 'high'
  ? specs.premium.id
  : specs.cheap.id;
```

### 2. Access to 200+ Models

One API key gives you access to:

* All major providers (OpenAI, Anthropic, Google, Meta, Mistral, Cohere, etc.)
* Latest models (GPT-5, Claude 4.5, Gemini 2.5)
* Open-source models (Llama, Qwen, Mixtral)
* Specialized models (coding, vision, multilingual)

### 3. Automatic Fallbacks

OpenRouter handles provider outages automatically:

```typescript
// If Claude is down, OpenRouter can fall back to GPT-4o automatically
const spec = await graphlit.createSpecification({
  name: 'Claude with Fallback',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://openrouter.ai/api/v1',
    key: process.env.OPENROUTER_API_KEY,
    modelName: 'anthropic/claude-4.5-sonnet'
    // OpenRouter handles fallbacks automatically
  }
});
```

### 4. No Provider Lock-in

Switch models by changing one parameter:

```typescript
// Switch from Claude to GPT without code changes
openAI: {
  modelName: 'openai/gpt-4o'  // Was: 'anthropic/claude-4.5-sonnet'
}
```

***

## Vercel AI Gateway Configuration

### Complete Code Example (TypeScript)

```typescript
import { Graphlit } from 'graphlit-client';
import { 
  SpecificationTypes, 
  ModelServiceTypes,
  OpenAiModels 
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create Vercel AI Gateway specification
const vercelSpec = await graphlit.createSpecification({
  name: 'Claude via Vercel Gateway',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,  // Use Custom for external endpoints
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,  // Or VERCEL_OIDC_TOKEN
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});

console.log(`Vercel Gateway spec created: ${vercelSpec.createSpecification.id}`);

// Use in conversation (with automatic caching and observability)
const answer = await graphlit.promptConversation({
  prompt: 'Explain the key benefits of semantic memory',
  specification: { id: vercelSpec.createSpecification.id }
});

console.log(answer.message.message);
// Response is cached and logged in Vercel dashboard
```

***

## SDK Adaptation Notes: Vercel

### Python

```python
from graphlit import Graphlit
from graphlit_api.input_types import (
    SpecificationInput,
    OpenAiModelPropertiesInput,
    EntityReferenceInput
)
from graphlit_api.enums import (
    SpecificationTypes,
    ModelServiceTypes,
    OpenAiModels
)
import os

graphlit = Graphlit()

# Create Vercel AI Gateway specification
spec_input = SpecificationInput(
    name="Claude via Vercel Gateway",
    type=SpecificationTypes.COMPLETION,
    service_type=ModelServiceTypes.OPEN_AI,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.CUSTOM,
        endpoint="https://ai-gateway.vercel.sh/v1",
        key=os.environ.get('VERCEL_AI_GATEWAY_KEY'),
        model_name="anthropic/claude-sonnet-4",
        temperature=0.2,
        completion_token_limit=4000
    )
)

response = await graphlit.client.create_specification(spec_input)
spec_id = response.create_specification.id

print(f"Vercel Gateway spec created: {spec_id}")
```

### C\#

```csharp
using Graphlit;
using Graphlit.Models;

var client = new Graphlit();

// Create Vercel AI Gateway specification
var specInput = new SpecificationInput {
    Name = "Claude via Vercel Gateway",
    Type = SpecificationTypes.Completion,
    ServiceType = ModelServiceTypes.OpenAi,
    OpenAI = new OpenAIModelPropertiesInput {
        Model = OpenAiModels.Custom,
        Endpoint = "https://ai-gateway.vercel.sh/v1",
        Key = Environment.GetEnvironmentVariable("VERCEL_AI_GATEWAY_KEY"),
        ModelName = "anthropic/claude-sonnet-4",
        Temperature = 0.2f,
        CompletionTokenLimit = 4000
    }
};

var response = await client.CreateSpecification(specInput);
var specId = response.CreateSpecification.Id;

Console.WriteLine($"Vercel Gateway spec created: {specId}");
```

***

## Step-by-Step: Vercel AI Gateway Setup

### Step 1: Get Vercel AI Gateway API Key

1. Visit <https://vercel.com>
2. Navigate to your project or create one
3. Go to **Settings** → **AI Gateway**
4. Enable AI Gateway and create an API key
5. Store it securely: `VERCEL_AI_GATEWAY_KEY=...`

**Alternative**: Use Vercel OIDC token for automatic authentication in Vercel deployments

### Step 2: Choose Your Model

Vercel AI Gateway supports models in `provider/model` format:

**Popular models**:

* `anthropic/claude-sonnet-4` - Claude Sonnet 4
* `openai/gpt-5` - Latest GPT model
* `google/gemini-2.5-flash` - Gemini Flash
* `openai/gpt-4.1-mini` - GPT-4.1 Mini

### Step 3: Create Specification

Use the code examples above with:

* `endpoint`: `https://ai-gateway.vercel.sh/v1`
* `key`: Your Vercel AI Gateway API key
* `modelName`: Your model in `provider/model` format

### Step 4: Monitor in Vercel Dashboard

All requests are automatically logged in your Vercel AI Gateway dashboard:

* Request/response logs
* Token usage and costs
* Latency metrics
* Cache hit rates

***

## Vercel AI Gateway Model Examples

### Example 1: Claude with Caching

```typescript
const cachedClaudeSpec = await graphlit.createSpecification({
  name: 'Cached Claude via Vercel',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.1,
    completionTokenLimit: 3000
  }
});
// Repeated queries are cached automatically by Vercel
```

**Benefit**: Significant cost savings on repeated queries

### Example 2: GPT-5 with Observability

```typescript
const gpt5Spec = await graphlit.createSpecification({
  name: 'GPT-5 via Vercel Gateway',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'openai/gpt-5',
    temperature: 0.3,
    completionTokenLimit: 4000
  }
});
// All requests logged in Vercel dashboard with full observability
```

**Benefit**: Enterprise-grade monitoring and analytics

### Example 3: OIDC Token (Vercel Deployments)

```typescript
// When deployed on Vercel, use OIDC token
const vercelOidcSpec = await graphlit.createSpecification({
  name: 'Claude with OIDC',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_OIDC_TOKEN,  // Automatic in Vercel deployments
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});
```

**Benefit**: No API key management needed in Vercel environment

***

## Vercel AI Gateway Benefits

### 1. Enterprise Observability

Full visibility into AI usage:

* Request/response logs
* Token usage by model, user, endpoint
* Latency P50/P95/P99 metrics
* Cost tracking and alerts
* Error rates and debugging

### 2. Response Caching

Automatic caching reduces costs:

```typescript
// First request: hits provider, costs money
const answer1 = await graphlit.promptConversation({
  prompt: 'What is semantic memory?',
  specification: { id: vercelSpec.id }
});

// Second request: cache hit, free
const answer2 = await graphlit.promptConversation({
  prompt: 'What is semantic memory?',
  specification: { id: vercelSpec.id }
});
```

**Savings**: Up to 90% cost reduction on repeated queries
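
The savings depend directly on your cache hit rate. A back-of-envelope estimate, assuming cache hits cost nothing (the `hitRate` value comes from your Vercel dashboard):

```typescript
// Effective spend with response caching, assuming cached responses are free.
// hitRate is the fraction of requests served from cache (0 to 1).
function effectiveCostUsd(costPerRequestUsd: number, requests: number, hitRate: number): number {
  if (hitRate < 0 || hitRate > 1) throw new Error('hitRate must be in [0, 1]');
  return costPerRequestUsd * requests * (1 - hitRate);
}

// 10,000 requests at $0.01 each with a 60% cache hit rate:
effectiveCostUsd(0.01, 10_000, 0.6); // ≈ $40 instead of $100
```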

### 3. Multi-Provider Routing

Automatic fallbacks across providers:

```typescript
// Vercel tries providers in order until one succeeds
const spec = await graphlit.createSpecification({
  name: 'Claude via Vercel with Routing',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'anthropic/claude-sonnet-4'
    // If Anthropic is down, Vercel can route to alternative providers
  }
});
```

### 4. Vercel Ecosystem Integration

Seamless integration with:

* Vercel deployments (OIDC tokens)
* Edge functions
* Vercel KV/Postgres
* Analytics dashboard

***

## Configuration Comparison

| Feature              | OpenRouter                       | Vercel AI Gateway                  |
| -------------------- | -------------------------------- | ---------------------------------- |
| **Endpoint**         | `https://openrouter.ai/api/v1`   | `https://ai-gateway.vercel.sh/v1`  |
| **Model Format**     | `provider/model`                 | `provider/model`                   |
| **API Key**          | OpenRouter dashboard             | Vercel AI Gateway settings         |
| **Models Available** | 200+ models                      | Major providers                    |
| **Pricing**          | Pay per token                    | Pay per token                      |
| **Response Caching** | No                               | Yes (automatic)                    |
| **Observability**    | Basic (via dashboard)            | Advanced (Vercel console)          |
| **Analytics**        | Token usage, costs               | Full request logs, latency, errors |
| **Fallbacks**        | Provider-level                   | Multi-provider routing             |
| **OIDC Support**     | No                               | Yes (Vercel deployments)           |
| **Best For**         | Model variety, cost optimization | Enterprise, observability, caching |

***

## Common Use Cases

### Use Case 1: Cost Optimization with OpenRouter

Route to cheaper models for simple queries:

```typescript
// Create multiple specs for different complexity levels
const specs = {
  simple: await createOpenRouterSpec('deepseek/deepseek-chat'),       // $0.14/$0.28 per M
  medium: await createOpenRouterSpec('meta-llama/llama-3.3-70b-instruct'), // $0.59/$0.59 per M
  complex: await createOpenRouterSpec('anthropic/claude-4.5-sonnet')  // $3/$15 per M
};

// Route based on query analysis (needsReasoning is your own classifier)
function getSpecForQuery(query: string) {
  if (query.length < 50) return specs.simple.id;
  if (needsReasoning(query)) return specs.complex.id;
  return specs.medium.id;
}

const answer = await graphlit.promptConversation({
  prompt: userQuery,
  specification: { id: getSpecForQuery(userQuery) }
});
```

**Savings**: Up to 95% cost reduction vs always using premium models

### Use Case 2: Enterprise Observability with Vercel

Track all AI usage in production:

```typescript
// Create Vercel Gateway spec
const enterpriseSpec = await graphlit.createSpecification({
  name: 'Production Claude',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});

// All requests automatically logged in Vercel dashboard:
// - User ID, timestamp, latency
// - Token usage, cost per request
// - Error rates, debugging info
// - Cache hit rates, cost savings
```

**Benefit**: Complete visibility for production monitoring and optimization

### Use Case 3: Multi-Model Access

Easy switching between models:

```typescript
// Create specs for different providers
const models = {
  claude: await createSpec('openrouter', 'anthropic/claude-4.5-sonnet'),
  gpt: await createSpec('openrouter', 'openai/gpt-4o'),
  gemini: await createSpec('openrouter', 'google/gemini-2.5-flash'),
  llama: await createSpec('openrouter', 'meta-llama/llama-3.3-70b-instruct')
};

// Route based on task
function getModelForTask(task: string) {
  if (task === 'creative') return models.gpt.id;     // Best creativity
  if (task === 'fast') return models.gemini.id;      // Fastest
  if (task === 'opensource') return models.llama.id; // No vendor lock-in
  return models.claude.id;                           // Default ('rag'): best citations
}
```

### Use Case 4: Development vs Production

Use OpenRouter for development, Vercel for production:

```typescript
const isDevelopment = process.env.NODE_ENV === 'development';

const spec = await graphlit.createSpecification({
  name: isDevelopment ? 'Dev: DeepSeek' : 'Prod: Claude',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: isDevelopment 
      ? 'https://openrouter.ai/api/v1'        // Cheap models for dev
      : 'https://ai-gateway.vercel.sh/v1',    // Observability for prod
    key: isDevelopment 
      ? process.env.OPENROUTER_API_KEY 
      : process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: isDevelopment 
      ? 'deepseek/deepseek-chat'              // $0.14 per M tokens
      : 'anthropic/claude-sonnet-4',          // Best quality
    temperature: 0.2,
    completionTokenLimit: 4000
  }
});
```

***

## Common Issues & Solutions

### Issue: "Model not found" error

**Cause**: Incorrect model name format or model not available on gateway

**Solution**:

* Verify model name uses `provider/model` format (e.g., `anthropic/claude-4.5-sonnet`)
* Check model availability:
  * OpenRouter: <https://openrouter.ai/models>
  * Vercel: Available models list in Vercel docs
* Ensure no typos in provider or model name

### Issue: Authentication errors

**Cause**: Using wrong API key or provider key instead of gateway key

**Solution**:

* For **OpenRouter**: Use API key from OpenRouter dashboard (starts with `sk-or-`)
* For **Vercel**: Use AI Gateway key from Vercel settings (not your OpenAI/Anthropic keys)
* Gateway keys are different from provider keys
* Check key is stored correctly in environment variables

### Issue: Different responses than direct provider

**Cause**: Gateways may have different default parameters

**Solution**:

```typescript
// Explicitly set all parameters
openAI: {
  endpoint: 'https://openrouter.ai/api/v1',
  modelName: 'anthropic/claude-4.5-sonnet',
  temperature: 0.2,        // Don't rely on defaults
  completionTokenLimit: 4000,
  probability: 0.9
}
```

### Issue: Rate limit errors

**Cause**: Gateway rate limits are separate from provider limits

**Solution**:

* OpenRouter: Check your plan limits in dashboard
* Vercel: Review rate limits in AI Gateway settings
* Consider upgrading plan or implementing request queuing
* Use multiple API keys for higher throughput
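
If you implement request queuing yourself, exponential backoff is the standard pattern for gateway 429 responses. A minimal sketch (not part of the Graphlit SDK; jitter is omitted to keep the schedule deterministic, but you should add it in production):

```typescript
// Exponential backoff schedule: 500ms, 1s, 2s, 4s, ... capped at 30s.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30_000): number {
  return Math.min(baseMs * 2 ** attempt, maxMs);
}

// Retry wrapper for any async call that may hit gateway rate limits.
async function withRetries<T>(fn: () => Promise<T>, maxAttempts = 5): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
  throw lastError;
}
```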

### Issue: Higher latency than expected

**Cause**: Gateway adds routing overhead

**Solution**:

* OpenRouter: \~50-100ms overhead
* Vercel: \~20-50ms overhead (with caching benefits)
* For ultra-low latency, use direct provider connection
* Use Vercel caching to reduce latency on repeated queries

### Issue: Costs higher than expected

**Cause**: Not utilizing caching or routing to cheaper models

**Solution** (OpenRouter):

```typescript
// Route to cheaper models when appropriate
const cheapSpec = createSpec('deepseek/deepseek-chat');  // vs claude
```

**Solution** (Vercel):

```typescript
// Caching is applied automatically by Vercel;
// monitor the dashboard for cache hit rates
```

### Issue: `model` enum field confusion

**Cause**: Unclear which model enum value to use with custom endpoints

**Solution**:

```typescript
openAI: {
  model: OpenAiModels.Custom,  // Always use Custom for external gateways
  modelName: 'anthropic/claude-4.5-sonnet'  // This determines actual model
}
```

Always use `OpenAiModels.Custom` when configuring OpenRouter, Vercel AI Gateway, or any other external endpoint. The `modelName` field determines which model is actually used.

***

## Related Use Cases

* [Create Custom Model Specification](/api-guides/use-cases/specifications/specification-create-custom-model.md) - Basic model configuration
* [Create Completion Specification](/api-guides/use-cases/specifications/specification-create-completion.md) - LLM parameters and settings
* [Create Embedding Specification](/api-guides/use-cases/specifications/specification-create-embedding.md) - Vector embedding configuration

***

## Related Documentation

* [Specifications Reference](/platform/specifications.md) - Complete specs documentation
* [OpenRouter Models](https://openrouter.ai/models) - Browse 200+ available models
* [Vercel AI Gateway Docs](https://vercel.com/docs/ai-gateway) - Official Vercel documentation
* [OpenRouter Pricing](https://openrouter.ai/docs#pricing) - Model pricing comparison

***

## Production Patterns

### Pattern 1: Multi-Tier Routing (OpenRouter)

```typescript
// Create tiered specifications
const tiers = {
  free: await createOpenRouterSpec('deepseek/deepseek-chat'),
  paid: await createOpenRouterSpec('anthropic/claude-4.5-sonnet')
};

// Route based on user plan
function getSpecForUser(userId: string) {
  const user = getUserPlan(userId);
  return user.plan === 'premium' ? tiers.paid.id : tiers.free.id;
}
```

### Pattern 2: Cached Production (Vercel)

```typescript
// Production spec with caching
const productionSpec = await graphlit.createSpecification({
  name: 'Production RAG',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Custom,
    endpoint: 'https://ai-gateway.vercel.sh/v1',
    key: process.env.VERCEL_AI_GATEWAY_KEY,
    modelName: 'anthropic/claude-sonnet-4',
    temperature: 0.1  // Low temp for consistent caching
  }
});
```

### Pattern 3: Multi-Gateway Fallback

```typescript
// Primary: Vercel (for caching)
// Fallback: OpenRouter (if Vercel down)
// (promptWithSpec is your own wrapper around graphlit.promptConversation)
let answer;
try {
  answer = await promptWithSpec(vercelSpec);
} catch (error) {
  console.log('Vercel gateway down, falling back to OpenRouter');
  answer = await promptWithSpec(openRouterSpec);
}
```
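
The pattern above can be factored into a reusable helper so every call site gets the same fallback behavior. A sketch (`promptWithSpec` is a hypothetical wrapper around `graphlit.promptConversation` with a given specification id, not an SDK method):

```typescript
// Generic two-gateway fallback: try the primary, fall back on any error.
async function withGatewayFallback<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>
): Promise<T> {
  try {
    return await primary();
  } catch (error) {
    console.warn('Primary gateway failed, using fallback:', error);
    return fallback();
  }
}

// Usage (promptWithSpec is your own wrapper, shown for illustration):
// const answer = await withGatewayFallback(
//   () => promptWithSpec(vercelSpec),
//   () => promptWithSpec(openRouterSpec)
// );
```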

***

## Tags

\[openrouter, vercel, ai-gateway, custom-endpoint, multi-model, specification, openai-compatible, cost-optimization, observability]


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.graphlit.dev/api-guides/use-cases/specifications/specification-create-openai-compatible.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
