Create Completion Model

User Intent

"I want to configure which LLM model to use for RAG conversations or extraction"

Operation

  • SDK Method: graphlit.createSpecification() with completion type

  • GraphQL: createSpecification mutation

  • Entity Type: Specification

  • Common Use Cases: Configure RAG model, set extraction model, customize LLM parameters

TypeScript (Canonical)

import { Graphlit } from 'graphlit-client';
import { ModelServiceTypes, OpenAiModels, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create GPT-4o specification for RAG
const specificationInput: SpecificationInput = {
  name: 'GPT-4o for RAG',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.1,
    probability: 0.2,
    completionTokenLimit: 4000
  }
};

const response = await graphlit.createSpecification(specificationInput);
const specId = response.createSpecification.id;

console.log(`Specification created: ${specId}`);

// Use specification in conversation
const conversation = await graphlit.createConversation({
  name: 'RAG Chat',
  specification: { id: specId }
});

// Or use specification in promptConversation
const answer = await graphlit.promptConversation({
  prompt: 'Explain the API',
  specification: { id: specId }
});

console.log(answer.message.message);

Python

# Create specification (snake_case properties)
spec_input = SpecificationInput(
    name="GPT-4o for RAG",
    type=SpecificationTypes.Completion,
    service_type=ModelServiceTypes.OpenAi,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.Gpt4O_128K,
        temperature=0.1,
        probability=0.2,
        completion_token_limit=4000
    )
)

response = await graphlit.createSpecification(spec_input)
spec_id = response.create_specification.id

# Use specification in conversation
answer = await graphlit.promptConversation(
    prompt="Explain the API",
    specification=EntityReferenceInput(id=spec_id)
)


**C#**:
```csharp
using Graphlit;

var graphlit = new Graphlit();

// Create specification (PascalCase)
var specInput = new SpecificationInput {
    Name = "GPT-4o for RAG",
    Type = SpecificationTypes.Completion,
    ServiceType = ModelServiceTypes.OpenAi,
    OpenAI = new OpenAIModelPropertiesInput {
        Model = OpenAiModels.Gpt4O_128K,
        Temperature = 0.1f,
        Probability = 0.2f,
        CompletionTokenLimit = 4000
    }
};

var response = await graphlit.CreateSpecification(specInput);
var specId = response.CreateSpecification.Id;

// Use in conversation
var answer = await graphlit.PromptConversation(
    prompt: "Explain the API",
    specification: new EntityReferenceInput { Id = specId }
);
```

Parameters

SpecificationInput (Required)

  • name (string): Specification name

  • type (SpecificationTypes): Must be COMPLETION

  • serviceType (ModelServiceTypes): Model provider

    • OPEN_AI - OpenAI models

    • ANTHROPIC - Anthropic Claude models

    • GOOGLE - Google Gemini models

    • GROQ - Groq (fast inference)

    • MISTRAL - Mistral models

    • COHERE - Cohere models

    • DEEPSEEK - DeepSeek models

Provider-Specific Configuration

OpenAI (openAI):

  • model (OpenAiModels): Model name

    • GPT_4O - Best overall (recommended)

    • GPT_4O_MINI - Faster, cheaper

    • O1 - Reasoning model

  • temperature (float): Randomness (0-2, default 0.5)

  • probability (float): Top-p sampling (0-1, default 1)

  • completionTokenLimit (int): Max response tokens

Anthropic (anthropic):

  • model (AnthropicModels): Model name

    • CLAUDE_3_7_SONNET - Best balance (recommended)

    • CLAUDE_3_7_OPUS - Most capable

    • CLAUDE_3_5_HAIKU - Fastest

Google (google):

  • model (GoogleModels): Model name

    • GEMINI_2_0_FLASH - Fast, good quality

    • GEMINI_2_0_PRO - Most capable
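The sampling parameters above have fixed valid ranges (temperature 0-2, probability 0-1). A minimal sketch of a clamping helper you might run before building a specification input; the `SamplingParams` interface and `clampSampling` function are hypothetical, not part of the SDK:

```typescript
// Hypothetical helper (not an SDK call): pull sampling parameters
// back into the documented ranges before sending them to the API.
interface SamplingParams {
  temperature?: number;   // valid range 0-2
  probability?: number;   // top-p, valid range 0-1
}

function clampSampling(params: SamplingParams): SamplingParams {
  const clamp = (value: number, min: number, max: number) =>
    Math.min(max, Math.max(min, value));
  return {
    temperature:
      params.temperature !== undefined ? clamp(params.temperature, 0, 2) : undefined,
    probability:
      params.probability !== undefined ? clamp(params.probability, 0, 1) : undefined,
  };
}

// Example: an out-of-range temperature is pulled back into 0-2
const safe = clampSampling({ temperature: 3.5, probability: 0.2 });
// safe.temperature === 2, safe.probability === 0.2
```

Clamping client-side gives a clearer failure mode than letting the API reject an out-of-range value.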

Response

{
  createSpecification: {
    id: string;                            // Specification ID
    name: string;                          // Specification name
    state: EntityState;                    // ENABLED
    type: SpecificationTypes;              // COMPLETION
    serviceType: ModelServiceTypes;        // Provider
    openAI?: OpenAIModelProperties;        // OpenAI config
    anthropic?: AnthropicModelProperties;  // Anthropic config
    google?: GoogleModelProperties;        // Google config
  }
}

Developer Hints

Completion vs Other Specification Types

| Type | Purpose | Used By |
|------|---------|---------|
| COMPLETION | RAG conversations | promptConversation, streamAgent |
| EXTRACTION | Entity extraction | Extraction workflows |
| PREPARATION | PDF/audio processing | Preparation workflows |
| EMBEDDING | Vector embeddings | Content ingestion |

Important: Use COMPLETION for RAG conversations, not for workflows.

Temperature Settings by Use Case

// Factual Q&A (low temperature)
const factualSpec = {
  temperature: 0.1,  // Very deterministic
  probability: 0.2   // Focused on top tokens
};

// Creative writing (high temperature)
const creativeSpec = {
  temperature: 1.0,  // More random
  probability: 0.9   // Broader token selection
};

// Balanced (default)
const balancedSpec = {
  temperature: 0.5,  // Medium randomness
  probability: 0.8   // Reasonably focused
};

Choosing the Right Model

Best for RAG Accuracy:

  • Claude Sonnet 3.7 - Best citation accuracy

  • GPT-4o - Great balance of speed/quality

  • Gemini 2.0 Flash - Fast, good quality, lower cost

Best for Speed:

  • GPT-4o-mini - Fastest OpenAI model

  • Claude Haiku 3.5 - Fastest Anthropic model

  • Groq - Ultra-fast inference (various models)

Best for Cost:

  • GPT-4o-mini - Cheapest capable model

  • Gemini 2.0 Flash - Free tier available

  • Claude Haiku 3.5 - Low cost, good quality
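The recommendations above can be captured as a simple lookup keyed by what you are optimizing for. This is an illustrative pattern, not an SDK feature; the model and service-type strings echo the lists above:

```typescript
// Illustrative mapping from optimization priority to a provider/model
// pair, following the recommendations above. Not part of the SDK.
type Priority = 'accuracy' | 'speed' | 'cost';

const modelByPriority: Record<Priority, { serviceType: string; model: string }> = {
  accuracy: { serviceType: 'ANTHROPIC', model: 'CLAUDE_3_7_SONNET' },
  speed:    { serviceType: 'OPEN_AI',   model: 'GPT_4O_MINI' },
  cost:     { serviceType: 'GOOGLE',    model: 'GEMINI_2_0_FLASH' },
};

function pickModel(priority: Priority) {
  return modelByPriority[priority];
}

// Example: optimizing for citation accuracy selects Claude Sonnet
const choice = pickModel('accuracy');
// choice.model === 'CLAUDE_3_7_SONNET'
```

In a real application you would map these strings onto the generated enums (`ModelServiceTypes`, `AnthropicModels`, etc.) when building the `SpecificationInput`.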

Reusable Specifications

// Create once, reuse across conversations
const defaultSpec = await graphlit.createSpecification({
  name: 'Default RAG',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.1,
    completionTokenLimit: 2000
  }
});

// Use in multiple conversations
const conv1 = await graphlit.createConversation({
  name: 'Support Chat',
  specification: { id: defaultSpec.createSpecification.id }
});

const conv2 = await graphlit.createConversation({
  name: 'Product Q&A',
  specification: { id: defaultSpec.createSpecification.id }
});

Variations

1. Basic GPT-4o Specification

Simplest completion spec:

const spec = await graphlit.createSpecification({
  name: 'GPT-4o',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K
  }
});

2. Claude Sonnet for High Accuracy

Best citation accuracy:

const spec = await graphlit.createSpecification({
  name: 'Claude Sonnet 3.7',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Anthropic,
  anthropic: {
    model: AnthropicModels.Claude_3_7Sonnet,
    temperature: 0.1,
    maxTokens: 4000
  }
});

3. Budget-Friendly with GPT-4o-mini

Lower cost:

const spec = await graphlit.createSpecification({
  name: 'GPT-4o-mini',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4OMini_128K,
    temperature: 0.2,
    completionTokenLimit: 1000
  }
});

4. Groq for Ultra-Fast Inference

Fastest responses:

const spec = await graphlit.createSpecification({
  name: 'Groq Llama 3.3',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Groq,
  groq: {
    model: GroqModels.Llama_3_3_70B,
    temperature: 0.1
  }
});

5. Gemini for Cost Efficiency

Google's models:

const spec = await graphlit.createSpecification({
  name: 'Gemini 2.0 Flash',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Google,
  google: {
    model: GoogleModels.Gemini_2_0_Flash,
    temperature: 0.2
  }
});

6. Long-Form Responses

Higher token limit:

const spec = await graphlit.createSpecification({
  name: 'GPT-4o Long Form',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.3,
    completionTokenLimit: 8000  // Longer responses
  }
});

Common Issues

Issue: Specification not found error
Solution: Verify the specification ID is correct and check that it wasn't deleted. Store IDs in a database.

Issue: Wrong specification type used
Solution: Use COMPLETION for RAG, not EXTRACTION or PREPARATION. Check the type parameter.

Issue: Responses too short
Solution: Increase completionTokenLimit. The default may be too low for long-form responses.

Issue: Responses too random or inconsistent
Solution: Lower temperature (0.1-0.3 for factual responses) and lower probability for more focused outputs.

Issue: Model not available error
Solution: Check that the model name enum matches an available model. Some models require special access.
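Several of these issues can be caught before the API call with a pre-flight check. A minimal sketch; the `SpecDraft` shape and `lintCompletionSpec` function are hypothetical, and the thresholds are heuristics drawn from the guidance above, not SDK rules:

```typescript
// Hypothetical pre-flight lint (not an SDK call) covering the
// common issues above before createSpecification is invoked.
interface SpecDraft {
  type?: string;
  temperature?: number;
  completionTokenLimit?: number;
}

function lintCompletionSpec(spec: SpecDraft): string[] {
  const problems: string[] = [];
  if (spec.type !== 'COMPLETION') {
    problems.push('type must be COMPLETION for RAG conversations');
  }
  if (spec.temperature !== undefined && spec.temperature > 0.3) {
    problems.push('temperature above 0.3 may give inconsistent factual answers');
  }
  if (spec.completionTokenLimit !== undefined && spec.completionTokenLimit < 1000) {
    problems.push('completionTokenLimit under 1000 may truncate long-form responses');
  }
  return problems;
}

// A well-formed factual-RAG draft passes cleanly
const issues = lintCompletionSpec({ type: 'COMPLETION', temperature: 0.1, completionTokenLimit: 2000 });
// issues.length === 0
```

Surfacing these as a list of strings lets you log all problems at once instead of failing on the first one.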

Production Example

Multi-model strategy:

// Create specifications for different use cases
const specs = {
  // High accuracy for customer support
  support: await graphlit.createSpecification({
    name: 'Claude Sonnet - Support',
    type: SpecificationTypes.Completion,
    serviceType: ModelServiceTypes.Anthropic,
    anthropic: {
      model: AnthropicModels.Claude_3_7Sonnet,
      temperature: 0.1,
      maxTokens: 2000
    }
  }),
  
  // Fast for internal queries
  internal: await graphlit.createSpecification({
    name: 'GPT-4o-mini - Internal',
    type: SpecificationTypes.Completion,
    serviceType: ModelServiceTypes.OpenAi,
    openAI: {
      model: OpenAiModels.Gpt4OMini_128K,
      temperature: 0.2,
      completionTokenLimit: 1000
    }
  })
};

// Use appropriate spec based on context
const answer = await graphlit.promptConversation({
  prompt: userQuestion,
  specification: { id: isCustomerFacing ? specs.support.createSpecification.id : specs.internal.createSpecification.id }
});

Reusable specification pattern:

// Create default specification once
const defaultSpec = await graphlit.createSpecification({
  name: 'Default GPT-4o',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.1,
    probability: 0.2,
    completionTokenLimit: 2000
  }
});

// Store ID for reuse
await db.config.set('default_spec_id', defaultSpec.createSpecification.id);

// Reuse in all conversations
const specId = await db.config.get('default_spec_id');

const answer = await graphlit.promptConversation({
  prompt: question,
  specification: { id: specId }
});
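The store-and-reuse pattern above can be factored into a small get-or-create helper with the storage abstracted away. A sketch under stated assumptions: the `SpecStore` interface and `createSpec` callback are hypothetical seams, not SDK types — in practice `createSpec` would wrap `graphlit.createSpecification(...)` and `SpecStore` your database:

```typescript
// Sketch of a memoized get-or-create for specification IDs.
// SpecStore and the createSpec callback are hypothetical abstractions.
interface SpecStore {
  get(key: string): Promise<string | undefined>;
  set(key: string, id: string): Promise<void>;
}

async function getOrCreateSpecId(
  store: SpecStore,
  key: string,
  createSpec: () => Promise<string>,
): Promise<string> {
  const cached = await store.get(key);
  if (cached) return cached;        // reuse the stored specification ID
  const id = await createSpec();    // create exactly once on first use
  await store.set(key, id);
  return id;
}
```

Every conversation then calls `getOrCreateSpecId(store, 'default_spec_id', ...)` and only the first call actually creates a specification.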
