# Create Completion Model

## Specification: Create Completion Model

### User Intent

"I want to configure which LLM to use for RAG conversations or extraction."

### Operation

* **SDK Method**: `graphlit.createSpecification()` with completion type
* **GraphQL**: `createSpecification` mutation
* **Entity Type**: Specification
* **Common Use Cases**: Configure RAG model, set extraction model, customize LLM parameters

### TypeScript (Canonical)

```typescript
import { Graphlit } from 'graphlit-client';
import {
  ModelServiceTypes,
  OpenAiModels,
  SpecificationInput,
  SpecificationTypes
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create GPT-4o specification for RAG
const specificationInput: SpecificationInput = {
  name: 'GPT-4o for RAG',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.1,
    probability: 0.2,
    completionTokenLimit: 4000
  }
};

const response = await graphlit.createSpecification(specificationInput);
const specId = response.createSpecification.id;

console.log(`Specification created: ${specId}`);

// Use specification in conversation
const conversation = await graphlit.createConversation({
  name: 'RAG Chat',
  specification: { id: specId }
});

// Or use specification in promptConversation
const answer = await graphlit.promptConversation({
  prompt: 'Explain the API',
  specification: { id: specId }
});

console.log(answer.message.message);
```

**Python**:
```python
# Create specification (snake_case)
spec_input = SpecificationInput(
    name="GPT-4o for RAG",
    type=SpecificationTypes.Completion,
    service_type=ModelServiceTypes.OpenAi,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.Gpt4O_128K,
        temperature=0.1,
        probability=0.2,
        completion_token_limit=4000
    )
)

response = await graphlit.createSpecification(spec_input)
spec_id = response.create_specification.id

# Use in conversation
answer = await graphlit.promptConversation(
    prompt="Explain the API",
    specification=EntityReferenceInput(id=spec_id)
)
```

**C#**:
```csharp
using Graphlit;

var client = new Graphlit();

// Create specification (PascalCase)
var specInput = new SpecificationInput {
    Name = "GPT-4o for RAG",
    Type = SpecificationTypes.Completion,
    ServiceType = ModelServiceTypes.OpenAi,
    OpenAI = new OpenAIModelPropertiesInput {
        Model = OpenAiModels.Gpt4O_128K,
        Temperature = 0.1f,
        Probability = 0.2f,
        CompletionTokenLimit = 4000
    }
};

var response = await client.CreateSpecification(specInput);
var specId = response.CreateSpecification.Id;

// Use in conversation
var answer = await client.PromptConversation(
    prompt: "Explain the API",
    specification: new EntityReferenceInput { Id = specId }
);
```

### Parameters

#### SpecificationInput (Required)

* **`name`** (string): Specification name
* **`type`** (SpecificationTypes): Must be `COMPLETION`
* **`serviceType`** (ModelServiceTypes): Model provider
  * `OPEN_AI` - OpenAI models
  * `ANTHROPIC` - Anthropic Claude models
  * `GOOGLE` - Google Gemini models
  * `GROQ` - Groq (fast inference)
  * `MISTRAL` - Mistral models
  * `COHERE` - Cohere models
  * `DEEPSEEK` - DeepSeek models

#### Provider-Specific Configuration

**OpenAI** (`openAI`):

* **`model`** (OpenAiModels): Model name
  * `GPT_4O` - Best overall (recommended)
  * `GPT_4O_MINI` - Faster, cheaper
  * `O1` - Reasoning model
* **`temperature`** (float): Randomness (0-2, default 0.5)
* **`probability`** (float): Top-p sampling (0-1, default 1)
* **`completionTokenLimit`** (int): Max response tokens

**Anthropic** (`anthropic`):

* **`model`** (AnthropicModels): Model name
  * `CLAUDE_3_7_SONNET` - Best balance (recommended)
  * `CLAUDE_3_7_OPUS` - Most capable
  * `CLAUDE_3_5_HAIKU` - Fastest

**Google** (`google`):

* **`model`** (GoogleModels): Model name
  * `GEMINI_2_0_FLASH` - Fast, good quality
  * `GEMINI_2_0_PRO` - Most capable
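
Each `serviceType` is paired with a matching provider block (`OPEN_AI` with `openAI`, `ANTHROPIC` with `anthropic`, `GOOGLE` with `google`), and a mismatch is easy to introduce when switching providers. The sketch below checks that pairing using local string types rather than the SDK enums. `ServiceType`, `SpecDraft`, and `hasMatchingProviderBlock` are illustrative names, and only pairings that appear on this page (including `GROQ` with `groq`) are covered.

```typescript
// Local stand-ins for the SDK enum values; only pairings shown on this page are included.
type ServiceType = 'OPEN_AI' | 'ANTHROPIC' | 'GOOGLE' | 'GROQ';
type ProviderKey = 'openAI' | 'anthropic' | 'google' | 'groq';

const providerKeyFor: Record<ServiceType, ProviderKey> = {
  OPEN_AI: 'openAI',
  ANTHROPIC: 'anthropic',
  GOOGLE: 'google',
  GROQ: 'groq',
};

const allProviderKeys: ProviderKey[] = ['openAI', 'anthropic', 'google', 'groq'];

// Hypothetical draft shape: a service type plus optional provider blocks.
type SpecDraft = { serviceType: ServiceType } & Partial<Record<ProviderKey, object>>;

// True when exactly one provider block is present and it matches serviceType.
function hasMatchingProviderBlock(draft: SpecDraft): boolean {
  const expected = providerKeyFor[draft.serviceType];
  const present = allProviderKeys.filter((key) => draft[key] !== undefined);
  return present.length === 1 && present[0] === expected;
}
```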

### Response

```typescript
{
  createSpecification: {
    id: string;                            // Specification ID
    name: string;                          // Specification name
    state: EntityState;                    // ENABLED
    type: SpecificationTypes;              // COMPLETION
    serviceType: ModelServiceTypes;        // Provider
    openAI?: OpenAIModelProperties;        // OpenAI config
    anthropic?: AnthropicModelProperties;  // Anthropic config
    google?: GoogleModelProperties;        // Google config
  }
}
```
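
Because the ID is the only field most callers need, it can help to extract it in one place and fail loudly if it is missing. This is a sketch under assumptions: `CreateSpecificationResponse` is a hypothetical, trimmed-down view of the shape above (with the fields treated as nullable), and `requireSpecificationId` is not an SDK function.

```typescript
// Hypothetical minimal view of the response shape documented above.
interface CreateSpecificationResponse {
  createSpecification?: { id?: string | null; name?: string | null } | null;
}

// Returns the new specification ID, or throws with a clear message if it is absent.
function requireSpecificationId(response: CreateSpecificationResponse): string {
  const id = response.createSpecification?.id;
  if (!id) {
    throw new Error('createSpecification returned no specification ID');
  }
  return id;
}
```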

### Developer Hints

#### Completion vs Other Specification Types

| Type          | Purpose              | Used By                             |
| ------------- | -------------------- | ----------------------------------- |
| `COMPLETION`  | RAG conversations    | `promptConversation`, `streamAgent` |
| `EXTRACTION`  | Entity extraction    | Extraction workflows                |
| `PREPARATION` | PDF/audio processing | Preparation workflows               |
| `EMBEDDING`   | Vector embeddings    | Content ingestion                   |

**Important**: Use `COMPLETION` for RAG conversations, not for workflows.
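
One way to catch a wrong specification type early is a guard that runs before the specification is attached to a conversation. The sketch below uses a local string union in place of the SDK's `SpecificationTypes` enum; `SpecSummary` and `assertCompletionSpec` are hypothetical names.

```typescript
// Local stand-in for the specification types listed in the table above.
type SpecType = 'COMPLETION' | 'EXTRACTION' | 'PREPARATION' | 'EMBEDDING';

// Hypothetical summary of a stored specification.
interface SpecSummary {
  id: string;
  type: SpecType;
}

// Throws unless the specification is a COMPLETION spec.
function assertCompletionSpec(spec: SpecSummary): void {
  if (spec.type !== 'COMPLETION') {
    throw new Error(
      `Specification ${spec.id} has type ${spec.type}; RAG conversations expect COMPLETION`,
    );
  }
}
```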

#### Temperature Settings by Use Case

```typescript
// Factual Q&A (low temperature)
const factualSpec = {
  temperature: 0.1,  // Very deterministic
  probability: 0.2   // Focused on top tokens
};

// Creative writing (high temperature)
const creativeSpec = {
  temperature: 1.0,  // More random
  probability: 0.9   // Broader token selection
};

// Balanced (default)
const balancedSpec = {
  temperature: 0.5,  // Medium randomness
  probability: 0.8   // Reasonably focused
};
```
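
The presets above can be kept in one lookup so that call sites name a use case instead of repeating numbers. This is a minimal sketch: `UseCase`, `samplingPresets`, and `samplingFor` are illustrative names, and the values are copied from the presets above.

```typescript
type UseCase = 'factual' | 'creative' | 'balanced';

interface Sampling {
  temperature: number;
  probability: number;
}

// Values copied from the presets above.
const samplingPresets: Record<UseCase, Sampling> = {
  factual: { temperature: 0.1, probability: 0.2 },
  creative: { temperature: 1.0, probability: 0.9 },
  balanced: { temperature: 0.5, probability: 0.8 },
};

// Returns a fresh object: the preset first, then any caller overrides on top.
function samplingFor(useCase: UseCase, overrides: Partial<Sampling> = {}): Sampling {
  return { ...samplingPresets[useCase], ...overrides };
}
```

The result can be spread into a provider block, for example `openAI: { model, ...samplingFor('factual') }`.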

#### Choosing the Right Model

**Best for RAG Accuracy**:

* **Claude Sonnet 3.7** - Best citation accuracy
* **GPT-4o** - Great balance of speed/quality
* **Gemini 2.0 Flash** - Fast, good quality, lower cost

**Best for Speed**:

* **GPT-4o-mini** - Fastest OpenAI model
* **Claude Haiku 3.5** - Fastest Anthropic model
* **Groq** - Ultra-fast inference (various models)

**Best for Cost**:

* **GPT-4o-mini** - Cheapest capable model
* **Gemini 2.0 Flash** - Free tier available
* **Claude Haiku 3.5** - Low cost, good quality
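
The recommendations above can be expressed as an ordered lookup, which is useful when only some providers are enabled in a given project. This is a sketch only: the labels are plain strings taken from the lists above (not SDK enum values), and `Priority` and `recommend` are hypothetical names.

```typescript
type Priority = 'accuracy' | 'speed' | 'cost';

// Model labels taken from the lists above, in the order they are listed.
const recommendations: Record<Priority, readonly string[]> = {
  accuracy: ['Claude Sonnet 3.7', 'GPT-4o', 'Gemini 2.0 Flash'],
  speed: ['GPT-4o-mini', 'Claude Haiku 3.5', 'Groq'],
  cost: ['GPT-4o-mini', 'Gemini 2.0 Flash', 'Claude Haiku 3.5'],
};

// Returns the first recommendation for the priority that is also available.
// When `available` is omitted, every label is treated as available.
function recommend(priority: Priority, available?: readonly string[]): string | undefined {
  for (const label of recommendations[priority]) {
    if (!available || available.indexOf(label) !== -1) {
      return label;
    }
  }
  return undefined;
}
```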

#### Reusable Specifications

```typescript
// Create once, reuse across conversations
const defaultSpec = await graphlit.createSpecification({
  name: 'Default RAG',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.1,
    completionTokenLimit: 2000
  }
});

// Use in multiple conversations
const conv1 = await graphlit.createConversation({
  name: 'Support Chat',
  specification: { id: defaultSpec.createSpecification.id }
});

const conv2 = await graphlit.createConversation({
  name: 'Product Q&A',
  specification: { id: defaultSpec.createSpecification.id }
});
```
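
"Create once" can also be enforced in-process by memoizing the creation call, so that concurrent requests do not each create their own copy. The sketch below assumes `createSpec` is any async function that resolves to a specification ID (in practice, a thin wrapper around `graphlit.createSpecification`); `CreateSpec` and `memoizeByName` are hypothetical names, not SDK APIs.

```typescript
// Any async function that creates a specification and resolves to its ID.
type CreateSpec = (name: string) => Promise<string>;

// Wraps a creator so each name is created at most once per process.
// The in-flight promise is cached, so concurrent callers share one request.
function memoizeByName(createSpec: CreateSpec): CreateSpec {
  const cache = new Map<string, Promise<string>>();
  return (name) => {
    let pending = cache.get(name);
    if (!pending) {
      pending = createSpec(name).catch((error) => {
        cache.delete(name); // do not cache failures, so a later call can retry
        throw error;
      });
      cache.set(name, pending);
    }
    return pending;
  };
}
```

This cache lives only for the lifetime of the process; persisting the ID is still needed to reuse a specification across restarts.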

### Variations

#### 1. Basic GPT-4o Specification

Simplest completion spec:

```typescript
const spec = await graphlit.createSpecification({
  name: 'GPT-4o',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K
  }
});
```

#### 2. Claude Sonnet for High Accuracy

Best citation accuracy:

```typescript
const spec = await graphlit.createSpecification({
  name: 'Claude Sonnet 3.7',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Anthropic,
  anthropic: {
    model: AnthropicModels.Claude_3_7Sonnet,
    temperature: 0.1,
    completionTokenLimit: 4000
  }
});
```

#### 3. Budget-Friendly with GPT-4o-mini

Lower cost:

```typescript
const spec = await graphlit.createSpecification({
  name: 'GPT-4o-mini',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4OMini_128K,
    temperature: 0.2,
    completionTokenLimit: 1000
  }
});
```

#### 4. Groq for Ultra-Fast Inference

Fastest responses:

```typescript
const spec = await graphlit.createSpecification({
  name: 'Groq Llama 3.3',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Groq,
  groq: {
    model: GroqModels.Llama_3_3_70B,
    temperature: 0.1
  }
});
```

#### 5. Gemini for Cost Efficiency

Google's models:

```typescript
const spec = await graphlit.createSpecification({
  name: 'Gemini 2.0 Flash',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Google,
  google: {
    model: GoogleModels.Gemini_2_0Flash,
    temperature: 0.2
  }
});
```

#### 6. Long-Form Responses

Higher token limit:

```typescript
const spec = await graphlit.createSpecification({
  name: 'GPT-4o Long Form',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.3,
    completionTokenLimit: 8000  // Longer responses
  }
});
```

### Common Issues

**Issue**: `Specification not found` error\
**Solution**: Verify that the specification ID is correct and that the specification has not been deleted. Persist IDs (for example, in a database) so they can be reused.

**Issue**: Wrong specification type used\
**Solution**: Use `COMPLETION` for RAG, not `EXTRACTION` or `PREPARATION`. Check the `type` parameter.

**Issue**: Responses too short\
**Solution**: Increase `completionTokenLimit`. The default may be too low for long-form responses.

**Issue**: Responses too random/inconsistent\
**Solution**: Lower `temperature` (0.1-0.3 for factual responses). Lower `probability` for more focused outputs.

**Issue**: Model not available error\
**Solution**: Check that the model enum value matches an available model. Some models require special access.

### Production Example

**Multi-model strategy**:

```typescript
// Create specifications for different use cases
const specs = {
  // High accuracy for customer support
  support: await graphlit.createSpecification({
    name: 'Claude Sonnet - Support',
    type: SpecificationTypes.Completion,
    serviceType: ModelServiceTypes.Anthropic,
    anthropic: {
      model: AnthropicModels.Claude_3_7Sonnet,
      temperature: 0.1,
      completionTokenLimit: 2000
    }
  }),
  
  // Fast for internal queries
  internal: await graphlit.createSpecification({
    name: 'GPT-4o-mini - Internal',
    type: SpecificationTypes.Completion,
    serviceType: ModelServiceTypes.OpenAi,
    openAI: {
      model: OpenAiModels.Gpt4OMini_128K,
      temperature: 0.2,
      completionTokenLimit: 1000
    }
  })
};

// Use appropriate spec based on context
const answer = await graphlit.promptConversation({
  prompt: userQuestion,
  specification: { id: isCustomerFacing ? specs.support.createSpecification.id : specs.internal.createSpecification.id }
});
```
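
The routing decision in the last call can be pulled into a small pure function, which keeps the long ternary out of the call site and makes the rule easy to unit test. This is a sketch: `SpecRegistry`, `RequestContext`, and `specificationFor` are hypothetical names, and the registry is assumed to hold the IDs returned by the two `createSpecification` calls above.

```typescript
// Hypothetical registry of specification IDs, keyed by use case.
interface SpecRegistry {
  support: string;  // high-accuracy spec for customer-facing traffic
  internal: string; // faster, cheaper spec for internal queries
}

interface RequestContext {
  isCustomerFacing: boolean;
}

// Returns the reference object to pass as `specification`.
function specificationFor(context: RequestContext, registry: SpecRegistry): { id: string } {
  return { id: context.isCustomerFacing ? registry.support : registry.internal };
}
```

A call site would then pass `specification: specificationFor({ isCustomerFacing }, registry)`.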

**Reusable specification pattern**:

```typescript
// Create default specification once
const defaultSpec = await graphlit.createSpecification({
  name: 'Default GPT-4o',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.1,
    probability: 0.2,
    completionTokenLimit: 2000
  }
});

// Store ID for reuse
await db.config.set('default_spec_id', defaultSpec.createSpecification.id);

// Reuse in all conversations
const specId = await db.config.get('default_spec_id');

const answer = await graphlit.promptConversation({
  prompt: question,
  specification: { id: specId }
});
```
