Create Completion Model
User Intent
"I want to configure which LLM model to use for RAG conversations or extraction"
Operation
SDK Method: graphlit.createSpecification() with completion type
GraphQL: createSpecification mutation
Entity Type: Specification
Common Use Cases: Configure RAG model, set extraction model, customize LLM parameters
TypeScript (Canonical)
import { Graphlit } from 'graphlit-client';
import { ModelServiceTypes, OpenAiModels, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';
const graphlit = new Graphlit();
// Create GPT-4o specification for RAG
const specificationInput: SpecificationInput = {
name: 'GPT-4o for RAG',
type: SpecificationTypes.Completion,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Gpt4O_128K,
temperature: 0.1,
probability: 0.2,
completionTokenLimit: 4000
}
};
const response = await graphlit.createSpecification(specificationInput);
const specId = response.createSpecification.id;
console.log(`Specification created: ${specId}`);
// Use specification in conversation
const conversation = await graphlit.createConversation({
name: 'RAG Chat',
specification: { id: specId }
});
// Or use specification in promptConversation
const answer = await graphlit.promptConversation({
prompt: 'Explain the API',
specification: { id: specId }
});
console.log(answer.message.message);
Python
# Create specification (snake_case)
spec_input = SpecificationInput(
    name="GPT-4o for RAG",
    type=SpecificationTypes.Completion,
    service_type=ModelServiceTypes.OpenAi,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.Gpt4O_128K,
        temperature=0.1,
        probability=0.2,
        completion_token_limit=4000
    )
)
response = await graphlit.createSpecification(spec_input)
spec_id = response.create_specification.id

# Use in conversation
answer = await graphlit.promptConversation(
    prompt="Explain the API",
    specification=EntityReferenceInput(id=spec_id)
)
Parameters
SpecificationInput (Required)
- name (string): Specification name
- type (SpecificationTypes): Must be COMPLETION
- serviceType (ModelServiceTypes): Model provider
  - OPEN_AI - OpenAI models
  - ANTHROPIC - Anthropic Claude models
  - GOOGLE - Google Gemini models
  - GROQ - Groq (fast inference)
  - MISTRAL - Mistral models
  - COHERE - Cohere models
  - DEEPSEEK - DeepSeek models
Provider-Specific Configuration
OpenAI (openAI):
- model (OpenAiModels): Model name
  - GPT_4O - Best overall (recommended)
  - GPT_4O_MINI - Faster, cheaper
  - O1 - Reasoning model
- temperature (float): Randomness (0-2, default 0.5)
- probability (float): Top-p sampling (0-1, default 1)
- completionTokenLimit (int): Max response tokens
Anthropic (anthropic):
- model (AnthropicModels): Model name
  - CLAUDE_3_7_SONNET - Best balance (recommended)
  - CLAUDE_3_7_OPUS - Most capable
  - CLAUDE_3_5_HAIKU - Fastest
Google (google):
- model (GoogleModels): Model name
  - GEMINI_2_0_FLASH - Fast, good quality
  - GEMINI_2_0_PRO - Most capable
Response
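The mutation returns the created specification; the examples in this guide rely only on its id. A minimal sketch of the shape (the id value is a placeholder):

```typescript
// Minimal shape of the createSpecification response used by the examples;
// the id value shown is a placeholder, not a real specification ID
const response = {
  createSpecification: {
    id: 'spec-1234-example'
  }
};

// Later calls reference the specification by this ID
const specId = response.createSpecification.id;
```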
Developer Hints
Completion vs Other Specification Types
| Type | Purpose | Used By |
| --- | --- | --- |
| COMPLETION | RAG conversations | promptConversation, streamAgent |
| EXTRACTION | Entity extraction | Extraction workflows |
| PREPARATION | PDF/audio processing | Preparation workflows |
| EMBEDDING | Vector embeddings | Content ingestion |
Important: Use COMPLETION for RAG conversations, not for workflows.
Temperature Settings by Use Case
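As a rough guide (these presets are illustrative, not official Graphlit defaults): use a low temperature for factual RAG answers, and raise it only when varied output is desirable. A sketch:

```typescript
// Assumed temperature presets by use case (illustrative only;
// the SDK's documented default is 0.5)
const temperatureByUseCase: Record<string, number> = {
  ragFactualQA: 0.1,    // precise, citation-heavy answers
  summarization: 0.3,   // mostly deterministic output
  generalChat: 0.5,     // balanced (matches the documented default)
  creativeWriting: 0.9, // more varied, exploratory output
};
```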
Choosing the Right Model
Best for RAG Accuracy:
Claude Sonnet 3.7 - Best citation accuracy
GPT-4o - Great balance of speed/quality
Gemini 2.0 Flash - Fast, good quality, lower cost
Best for Speed:
GPT-4o-mini - Fastest OpenAI model
Claude Haiku 3.5 - Fastest Anthropic model
Groq - Ultra-fast inference (various models)
Best for Cost:
GPT-4o-mini - Cheapest capable model
Gemini 2.0 Flash - Free tier available
Claude Haiku 3.5 - Low cost, good quality
Reusable Specifications
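A specification is created once and can then be referenced by ID from any number of conversations. A sketch following the canonical example above (top-level await assumes an ESM context):

```typescript
import { Graphlit } from 'graphlit-client';
import { ModelServiceTypes, OpenAiModels, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create once (e.g. at startup or deploy time) and persist the ID
const response = await graphlit.createSpecification({
  name: 'Shared RAG Spec',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: { model: OpenAiModels.Gpt4O_128K, temperature: 0.1 }
});
const specId = response.createSpecification.id;

// Reuse the same ID across many conversations
const chatA = await graphlit.createConversation({ name: 'Chat A', specification: { id: specId } });
const chatB = await graphlit.createConversation({ name: 'Chat B', specification: { id: specId } });
```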
Variations
1. Basic GPT-4o Specification
Simplest completion spec:
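A sketch following the canonical example above (import path and enum names taken from it); omitted parameters fall back to the documented defaults:

```typescript
import { ModelServiceTypes, OpenAiModels, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const basicSpec: SpecificationInput = {
  name: 'Basic GPT-4o',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K // defaults apply for temperature, probability, token limit
  }
};
```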
2. Claude Sonnet for High Accuracy
Best citation accuracy:
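A sketch assuming the AnthropicModels enum member for Claude 3.7 Sonnet follows the naming convention of the OpenAI enums in the canonical example:

```typescript
import { AnthropicModels, ModelServiceTypes, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const claudeSpec: SpecificationInput = {
  name: 'Claude 3.7 Sonnet for Accuracy',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Anthropic,
  anthropic: {
    model: AnthropicModels.Claude_3_7Sonnet, // enum member name assumed; verify in the generated types
    temperature: 0.1, // low temperature for citation accuracy
    completionTokenLimit: 4000
  }
};
```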
3. Budget-Friendly with GPT-4o-mini
Lower cost:
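A sketch using the GPT-4o-mini enum member shown in the Python example above:

```typescript
import { ModelServiceTypes, OpenAiModels, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const budgetSpec: SpecificationInput = {
  name: 'GPT-4o-mini Budget',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4OMini_128K, // cheaper and faster than GPT-4o
    temperature: 0.3,
    completionTokenLimit: 2000
  }
};
```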
4. Groq for Ultra-Fast Inference
Fastest responses:
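A sketch; the GroqModels member shown is a guess at the naming convention, so check the generated types for the models actually available:

```typescript
import { GroqModels, ModelServiceTypes, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const groqSpec: SpecificationInput = {
  name: 'Groq Fast Inference',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Groq,
  groq: {
    model: GroqModels.Llama_3_3_70B, // placeholder; verify against GroqModels in the generated types
    temperature: 0.3
  }
};
```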
5. Gemini for Cost Efficiency
Google's models:
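A sketch assuming the GoogleModels enum member for Gemini 2.0 Flash follows the naming convention of the OpenAI enums in the canonical example:

```typescript
import { GoogleModels, ModelServiceTypes, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const geminiSpec: SpecificationInput = {
  name: 'Gemini 2.0 Flash',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.Google,
  google: {
    model: GoogleModels.Gemini_2_0Flash, // enum member name assumed; verify in the generated types
    temperature: 0.3
  }
};
```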
6. Long-Form Responses
Higher token limit:
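A sketch following the canonical example, raising completionTokenLimit above the 4000 used there:

```typescript
import { ModelServiceTypes, OpenAiModels, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const longFormSpec: SpecificationInput = {
  name: 'GPT-4o Long-Form',
  type: SpecificationTypes.Completion,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K,
    temperature: 0.5,
    completionTokenLimit: 8000 // raise the cap for long-form answers
  }
};
```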
Common Issues
Issue: Specification not found error
Solution: Verify specification ID is correct. Check it wasn't deleted. Store IDs in database.
Issue: Wrong specification type used
Solution: Use COMPLETION for RAG, not EXTRACTION or PREPARATION. Check type parameter.
Issue: Responses too short
Solution: Increase completionTokenLimit. Default may be too low for long-form responses.
Issue: Responses too random/inconsistent
Solution: Lower temperature (0.1-0.3 for factual responses). Lower probability for more focused outputs.
Issue: Model not available error
Solution: Check model name enum matches available models. Some models require special access.
Production Example
Multi-model strategy:
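One way to sketch a multi-model strategy: route each request profile to a provider/model pairing from the "Choosing the Right Model" lists above. Plain strings are used here for illustration; map them onto the SDK enums when building the actual SpecificationInput:

```typescript
// Route a request profile to a provider/model pairing drawn from the
// accuracy/speed/cost recommendations above (plain strings, illustrative)
type Profile = 'accuracy' | 'speed' | 'cost';

function specConfigFor(profile: Profile) {
  switch (profile) {
    case 'accuracy':
      return { serviceType: 'ANTHROPIC', model: 'CLAUDE_3_7_SONNET', temperature: 0.1 };
    case 'speed':
      return { serviceType: 'OPEN_AI', model: 'GPT_4O_MINI', temperature: 0.3 };
    case 'cost':
      return { serviceType: 'GOOGLE', model: 'GEMINI_2_0_FLASH', temperature: 0.3 };
  }
}
```

In practice you would create one specification per profile up front and route requests to the matching specification ID.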
Reusable specification pattern:
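A sketch of the pattern: create the specification once, cache its ID, and pass that ID to every conversation. The helper below is hypothetical (not part of the SDK); the creation callback is injected so the cache logic stays independent of the Graphlit client:

```typescript
// Hypothetical helper (not a Graphlit API): cache a specification ID per key
// so createSpecification runs at most once per configuration
async function getOrCreateSpecId(
  cache: Map<string, string>,
  key: string,
  createSpec: () => Promise<string> // e.g. () => graphlit.createSpecification(input).then(r => r.createSpecification.id)
): Promise<string> {
  const cached = cache.get(key);
  if (cached !== undefined) return cached;
  const id = await createSpec();
  cache.set(key, id);
  return id;
}
```

In production, back the cache with a database rather than an in-memory Map so specification IDs survive restarts (see the "Specification not found" issue above).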