Create Completion Model
Specification: Create Completion Model
User Intent
"I want to configure which LLM model to use for RAG conversations or extraction"
Operation
SDK Method: `graphlit.createSpecification()` with completion type
GraphQL: `createSpecification` mutation
Entity Type: Specification
Common Use Cases: Configure RAG model, set extraction model, customize LLM parameters
**TypeScript (Canonical)**:

```typescript
import { Graphlit } from 'graphlit-client';
import { ModelServiceTypes, OpenAiModels, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';
const graphlit = new Graphlit();
// Create GPT-4o specification for RAG
const specificationInput: SpecificationInput = {
name: 'GPT-4o for RAG',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Gpt4O_128K,
temperature: 0.1,
probability: 0.2,
completionTokenLimit: 4000
}
};
const response = await graphlit.createSpecification(specificationInput);
const specId = response.createSpecification.id;
console.log(`Specification created: ${specId}`);
// Use specification in conversation
const conversation = await graphlit.createConversation({
name: 'RAG Chat',
specification: { id: specId }
});
// Or use specification in promptConversation
const answer = await graphlit.promptConversation({
prompt: 'Explain the API',
specification: { id: specId }
});
console.log(answer.promptConversation.message.message);
```

**Python**:

```python
from graphlit import Graphlit
from graphlit_api import *  # input types and enums (SpecificationInput, etc.)

graphlit = Graphlit()

# Create specification (snake_case)
spec_input = SpecificationInput(
    name="GPT-4o for RAG",
    type=SpecificationTypes.Completion,
    service_type=ModelServiceTypes.OpenAi,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.Gpt4O_128K,
        temperature=0.1,
        probability=0.2,
        completion_token_limit=4000
    )
)

response = await graphlit.create_specification(spec_input)
spec_id = response.create_specification.id

# Use in conversation
answer = await graphlit.prompt_conversation(
    prompt="Explain the API",
    specification=EntityReferenceInput(id=spec_id)
)
```

**C#**:
```csharp
using Graphlit;
var graphlit = new Graphlit();
// Create specification (PascalCase)
var specInput = new SpecificationInput {
Name = "GPT-4o for RAG",
Type = SpecificationTypes.Completion,
ServiceType = ModelServiceTypes.OpenAi,
OpenAI = new OpenAIModelPropertiesInput {
Model = OpenAiModels.Gpt4O_128K,
Temperature = 0.1f,
Probability = 0.2f,
CompletionTokenLimit = 4000
}
};
var response = await graphlit.CreateSpecification(specInput);
var specId = response.CreateSpecification.Id;
// Use in conversation
var answer = await graphlit.PromptConversation(
prompt: "Explain the API",
specification: new EntityReferenceInput { Id = specId }
);
```

Parameters
SpecificationInput (Required)
- `name` (string): Specification name
- `type` (SpecificationTypes): Must be `COMPLETION`
- `serviceType` (ModelServiceTypes): Model provider
  - `OPEN_AI` - OpenAI models
  - `ANTHROPIC` - Anthropic Claude models
  - `GOOGLE` - Google Gemini models
  - `GROQ` - Groq (fast inference)
  - `MISTRAL` - Mistral models
  - `COHERE` - Cohere models
  - `DEEPSEEK` - DeepSeek models
Provider-Specific Configuration
**OpenAI** (`openAI`):

- `model` (OpenAiModels): Model name
  - `GPT_4O` - Best overall (recommended)
  - `GPT_4O_MINI` - Faster, cheaper
  - `O1` - Reasoning model
- `temperature` (float): Randomness (0-2, default 0.5)
- `probability` (float): Top-p sampling (0-1, default 1)
- `completionTokenLimit` (int): Max response tokens
**Anthropic** (`anthropic`):

- `model` (AnthropicModels): Model name
  - `CLAUDE_3_7_SONNET` - Best balance (recommended)
  - `CLAUDE_3_7_OPUS` - Most capable
  - `CLAUDE_3_5_HAIKU` - Fastest
**Google** (`google`):

- `model` (GoogleModels): Model name
  - `GEMINI_2_0_FLASH` - Fast, good quality
  - `GEMINI_2_0_PRO` - Most capable
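
Each configuration sets exactly one provider block, and it must match `serviceType`. A minimal sketch pairing `ModelServiceTypes.Anthropic` with the `anthropic` block (enum values taken from the examples later in this guide):

```typescript
import {
  AnthropicModels,
  ModelServiceTypes,
  SpecificationInput,
  SpecificationTypes,
} from 'graphlit-client/dist/generated/graphql-types';

// serviceType selects the provider...
const anthropicInput: SpecificationInput = {
  name: 'Claude Sonnet for RAG',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Anthropic,
  anthropic: {
    // ...and the matching provider block carries its settings
    model: AnthropicModels.Claude_3_7Sonnet,
    temperature: 0.1,
  },
};
```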
Response
```typescript
{
createSpecification: {
id: string; // Specification ID
name: string; // Specification name
state: EntityState; // ENABLED
type: SpecificationTypes; // COMPLETION
serviceType: ModelServiceTypes; // Provider
openAI?: OpenAIModelProperties; // OpenAI config
anthropic?: AnthropicModelProperties; // Anthropic config
google?: GoogleModelProperties; // Google config
}
}
```

Developer Hints
Completion vs Other Specification Types

| Type | Purpose | Used by |
| --- | --- | --- |
| `COMPLETION` | RAG conversations | promptConversation, streamAgent |
| `EXTRACTION` | Entity extraction | Extraction workflows |
| `PREPARATION` | PDF/audio processing | Preparation workflows |
| `EMBEDDING` | Vector embeddings | Content ingestion |

Important: Use `COMPLETION` specifications for RAG conversations; workflows use the `EXTRACTION`, `PREPARATION`, and `EMBEDDING` types instead.
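
For contrast, a hedged sketch of creating an `EXTRACTION` specification for use by an extraction workflow, assuming `createSpecification` accepts the other `SpecificationTypes` values in the same shape:

```typescript
// Hedged sketch: an EXTRACTION-type specification, referenced by extraction
// workflows rather than by promptConversation.
const extractionSpec = await graphlit.createSpecification({
  name: 'GPT-4o for Extraction',
  type: SpecificationTypes.Extraction, // not Completion
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.1, // deterministic output suits extraction
  },
});
```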
Temperature Settings by Use Case
```typescript
// Note: these fragments show provider-level settings (e.g., the openAI block),
// not complete SpecificationInput objects.

// Factual Q&A (low temperature)
const factualSpec = {
temperature: 0.1, // Very deterministic
probability: 0.2 // Focused on top tokens
};
// Creative writing (high temperature)
const creativeSpec = {
temperature: 1.0, // More random
probability: 0.9 // Broader token selection
};
// Balanced (default)
const balancedSpec = {
temperature: 0.5, // Medium randomness
probability: 0.8 // Reasonably focused
};
```

Choosing the Right Model
Best for RAG Accuracy:

1. Claude Sonnet 3.7 - Best citation accuracy
2. GPT-4o - Great balance of speed/quality
3. Gemini 2.0 Flash - Fast, good quality, lower cost

Best for Speed:

1. GPT-4o-mini - Fastest OpenAI model
2. Claude Haiku 3.5 - Fastest Anthropic model
3. Groq - Ultra-fast inference (various models)

Best for Cost:

1. GPT-4o-mini - Cheapest capable model
2. Gemini 2.0 Flash - Free tier available
3. Claude Haiku 3.5 - Low cost, good quality
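
If you pick a model per request, a small lookup table keeps these recommendations in one place. A minimal, illustrative sketch (the `RAG_PRESETS` map and its keys are not part of the SDK; the enums come from the examples in this guide):

```typescript
import {
  AnthropicModels,
  GoogleModels,
  ModelServiceTypes,
  OpenAiModels,
  SpecificationInput,
  SpecificationTypes,
} from 'graphlit-client/dist/generated/graphql-types';

// Illustrative presets mapping the guidance above to ready-to-create inputs.
const RAG_PRESETS: Record<'accuracy' | 'speed' | 'cost', SpecificationInput> = {
  accuracy: {
    name: 'Claude Sonnet 3.7 - Accuracy',
    type: SpecificationTypes.Completion,
    serviceType: ModelServiceTypes.Anthropic,
    anthropic: { model: AnthropicModels.Claude_3_7Sonnet, temperature: 0.1 },
  },
  speed: {
    name: 'GPT-4o-mini - Speed',
    type: SpecificationTypes.Completion,
    serviceType: ModelServiceTypes.OpenAi,
    openAI: { model: OpenAiModels.Gpt4OMini_128K, temperature: 0.2 },
  },
  cost: {
    name: 'Gemini 2.0 Flash - Cost',
    type: SpecificationTypes.Completion,
    serviceType: ModelServiceTypes.Google,
    google: { model: GoogleModels.Gemini_2_0_Flash, temperature: 0.2 },
  },
};

// Usage: const spec = await graphlit.createSpecification(RAG_PRESETS.accuracy);
```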
Reusable Specifications
```typescript
// Create once, reuse across conversations
const defaultSpec = await graphlit.createSpecification({
name: 'Default RAG',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Gpt4O_128K,
temperature: 0.1,
completionTokenLimit: 2000
}
});
// Use in multiple conversations
const conv1 = await graphlit.createConversation({
name: 'Support Chat',
specification: { id: defaultSpec.createSpecification.id }
});
const conv2 = await graphlit.createConversation({
name: 'Product Q&A',
specification: { id: defaultSpec.createSpecification.id }
});
```

Variations

1. Basic GPT-4o Specification

Simplest completion spec:

```typescript
const spec = await graphlit.createSpecification({
name: 'GPT-4o',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Gpt4O_128K
}
});
```

2. Claude Sonnet for High Accuracy

Best citation accuracy:

```typescript
const spec = await graphlit.createSpecification({
name: 'Claude Sonnet 3.7',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.Anthropic,
anthropic: {
model: AnthropicModels.Claude_3_7Sonnet,
temperature: 0.1,
maxTokens: 4000
}
});
```

3. Budget-Friendly with GPT-4o-mini

Lower cost:

```typescript
const spec = await graphlit.createSpecification({
name: 'GPT-4o-mini',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Gpt4OMini_128K,
temperature: 0.2,
completionTokenLimit: 1000
}
});
```

4. Groq for Ultra-Fast Inference

Fastest responses:

```typescript
const spec = await graphlit.createSpecification({
name: 'Groq Llama 3.3',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.Groq,
groq: {
model: GroqModels.Llama_3_3_70B,
temperature: 0.1
}
});
```

5. Gemini for Cost Efficiency

Google's models:

```typescript
const spec = await graphlit.createSpecification({
name: 'Gemini 2.0 Flash',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.Google,
google: {
model: GoogleModels.Gemini_2_0_Flash,
temperature: 0.2
}
});
```

6. Long-Form Responses

Higher token limit:

```typescript
const spec = await graphlit.createSpecification({
name: 'GPT-4o Long Form',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Gpt4O_128K,
temperature: 0.3,
completionTokenLimit: 8000 // Longer responses
}
});
```

Common Issues
Issue: Specification not found error
Solution: Verify specification ID is correct. Check it wasn't deleted. Store IDs in database.
Issue: Wrong specification type used
Solution: Use COMPLETION for RAG, not EXTRACTION or PREPARATION. Check type parameter.
Issue: Responses too short
Solution: Increase completionTokenLimit. Default may be too low for long-form responses.
Issue: Responses too random/inconsistent
Solution: Lower temperature (0.1-0.3 for factual responses). Lower probability for more focused outputs.
Issue: Model not available error
Solution: Check model name enum matches available models. Some models require special access.
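
For the specification-not-found case, one hedged pattern is to verify the stored ID before use and recreate the spec if it no longer resolves. This sketch assumes the SDK exposes a `getSpecification` query that mirrors the mutation naming used in this guide; verify the method and response shape against the generated client:

```typescript
import { Graphlit } from 'graphlit-client';
import { ModelServiceTypes, OpenAiModels, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

// Hedged sketch: resolve a stored specification ID, recreating the spec if
// the stored ID no longer exists (e.g., it was deleted).
async function resolveSpecId(graphlit: Graphlit, storedId?: string): Promise<string> {
  if (storedId) {
    try {
      // Assumed query method; response shape mirrors createSpecification above.
      const found = await graphlit.getSpecification(storedId);
      if (found.specification?.id) {
        return found.specification.id;
      }
    } catch {
      // Fall through and recreate below.
    }
  }
  const created = await graphlit.createSpecification({
    name: 'Default GPT-4o',
    type: SpecificationTypes.Completion,
    serviceType: ModelServiceTypes.OpenAi,
    openAI: { model: OpenAiModels.Gpt4O_128K, temperature: 0.1 },
  });
  return created.createSpecification.id;
}
```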
Production Example
Multi-model strategy:
```typescript
// Create specifications for different use cases
const specs = {
// High accuracy for customer support
support: await graphlit.createSpecification({
name: 'Claude Sonnet - Support',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.Anthropic,
anthropic: {
model: AnthropicModels.Claude_3_7Sonnet,
temperature: 0.1,
maxTokens: 2000
}
}),
// Fast for internal queries
internal: await graphlit.createSpecification({
name: 'GPT-4o-mini - Internal',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Gpt4OMini_128K,
temperature: 0.2,
completionTokenLimit: 1000
}
})
};
// Use appropriate spec based on context
const answer = await graphlit.promptConversation({
prompt: userQuestion,
specification: {
  id: isCustomerFacing
    ? specs.support.createSpecification.id
    : specs.internal.createSpecification.id
}
});
```

Reusable specification pattern:

```typescript
// Create default specification once
const defaultSpec = await graphlit.createSpecification({
name: 'Default GPT-4o',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Gpt4O_128K,
temperature: 0.1,
probability: 0.2,
completionTokenLimit: 2000
}
});
// Store ID for reuse
await db.config.set('default_spec_id', defaultSpec.createSpecification.id);
// Reuse in all conversations
const specId = await db.config.get('default_spec_id');
const answer = await graphlit.promptConversation({
prompt: question,
specification: { id: specId }
});
```