Create Embedding Model
User Intent
"I want to configure which embedding model to use for vector search"
Operation
SDK Method: graphlit.createSpecification() with embedding type
GraphQL: createSpecification mutation
Entity Type: Specification
Common Use Cases: Configure vector embeddings, customize semantic search, optimize retrieval quality
TypeScript (Canonical)
import { Graphlit } from 'graphlit-client';
import { ModelServiceTypes, OpenAiModels, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';
const graphlit = new Graphlit();
// Create embedding specification
const specificationInput: SpecificationInput = {
name: 'OpenAI text-embedding-3-large',
type: SpecificationTypes.Embedding,
serviceType: ModelServiceTypes.OpenAi,
openAI: {
model: OpenAiModels.Embedding_3Large
}
};
const response = await graphlit.createSpecification(specificationInput);
const specId = response.createSpecification.id;
console.log(`Embedding specification created: ${specId}`);
// Use specification during content ingestion
await graphlit.ingestUri(
'https://docs.example.com/page.html',
undefined, // name
undefined, // id
undefined, // identifier
true, // isSynchronous
undefined, // workflow
undefined, // collections
undefined, // observations
{ id: specId } // embedding specification
);
// Content will be embedded with the specified model
Python
# Create embedding specification (snake_case)
spec_input = SpecificationInput(
    name="OpenAI text-embedding-3-large",
    type=SpecificationTypes.Embedding,
    service_type=ModelServiceTypes.OpenAi,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.Embedding_3Large
    )
)

response = await graphlit.client.create_specification(spec_input)
spec_id = response.create_specification.id

# Use during ingestion
await graphlit.client.ingest_uri(
    uri="https://docs.example.com/page.html",
    is_synchronous=True,
    embedding=EntityReferenceInput(id=spec_id)
)
Parameters
SpecificationInput (Required)
name (string): Specification name
type (SpecificationTypes): Must be EMBEDDING
serviceType (ModelServiceTypes): Model provider
OPEN_AI - OpenAI embedding models
ANTHROPIC - Voyage embeddings (via Anthropic)
COHERE - Cohere embedding models
MISTRAL - Mistral embedding models
JINA_AI - Jina AI embeddings
Provider-Specific Configuration
OpenAI (openAI):
model (OpenAiModels): Embedding model
TEXT_EMBEDDING_3_LARGE - Best quality (recommended)
TEXT_EMBEDDING_3_SMALL - Faster, lower cost
TEXT_EMBEDDING_ADA_002 - Legacy model
Cohere (cohere):
model (CohereModels): Embedding model
EMBED_ENGLISH_V3 - English text
EMBED_MULTILINGUAL_V3 - Multi-language
Voyage (voyage):
model (VoyageModels): Embedding model
VOYAGE_3_LARGE - Highest quality
VOYAGE_3 - Balanced
Response
Developer Hints
Embedding Model Impacts Search Quality
Important: The embedding model determines semantic search quality. Better embeddings = better RAG retrieval.
When to Specify Embedding Model
Use Custom Embedding Spec When:
You need specific embedding dimensions
Optimizing for cost vs quality
Using non-default provider (Cohere, Voyage)
Multi-language content (use multilingual models)
Use Default When:
Standard English content
Not sure which model to use
Getting started / prototyping
Choosing Embedding Model
Best for Quality:
OpenAI text-embedding-3-large - Best overall (recommended)
Voyage 3 Large - Excellent quality
Cohere Embed v3 - Good for specific domains
Best for Cost:
OpenAI text-embedding-3-small - Good balance
Jina AI v2 - Free tier available
Best for Multi-Language:
Cohere Embed Multilingual v3 - Best multi-language
OpenAI text-embedding-3-large - Good multi-language support
Embedding Dimensions
Different models have different dimensions:
text-embedding-3-large: 3072 dimensions
text-embedding-3-small: 1536 dimensions
text-embedding-ada-002: 1536 dimensions
Important: You cannot change embedding models after content is ingested. The model used during ingestion is permanent for that content.
Variations
1. Basic OpenAI Large Embedding
Highest quality (recommended):
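A minimal sketch of the specification input, written with the raw GraphQL enum values listed under Parameters above (with the SDK you would use the generated SpecificationTypes / ModelServiceTypes / OpenAiModels enums instead):

```typescript
// Embedding spec for OpenAI text-embedding-3-large, as plain data.
const largeEmbeddingSpec = {
  name: "OpenAI text-embedding-3-large",
  type: "EMBEDDING",        // SpecificationTypes.Embedding in the SDK
  serviceType: "OPEN_AI",   // ModelServiceTypes.OpenAi in the SDK
  openAI: {
    model: "TEXT_EMBEDDING_3_LARGE", // 3072 dimensions, best quality
  },
};

// Pass this object to graphlit.createSpecification(...) to create the spec.
```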
2. Budget-Friendly Small Embedding
Lower cost:
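The same shape with the small model swapped in; only the model enum value changes (raw GraphQL values shown, per the Parameters section above):

```typescript
// Budget-friendly variant: 1536 dimensions, lower cost per token.
const smallEmbeddingSpec = {
  name: "OpenAI text-embedding-3-small",
  type: "EMBEDDING",
  serviceType: "OPEN_AI",
  openAI: {
    model: "TEXT_EMBEDDING_3_SMALL",
  },
};
```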
3. Cohere for Multi-Language
Best for non-English:
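A sketch of the Cohere multilingual variant, again using the raw enum values from the Parameters section (the provider block moves from openAI to cohere):

```typescript
// Multilingual embedding spec for non-English or mixed-language content.
const multilingualSpec = {
  name: "Cohere Embed Multilingual v3",
  type: "EMBEDDING",
  serviceType: "COHERE",
  cohere: {
    model: "EMBED_MULTILINGUAL_V3",
  },
};
```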
4. Voyage for High Accuracy
Alternative high-quality option:
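A sketch of the Voyage variant. Note one assumption: the Parameters section above lists Voyage embeddings under the ANTHROPIC service type while the provider config key is voyage, so the serviceType value below is a guess to verify against your SDK version:

```typescript
// High-quality Voyage embedding spec (serviceType value is an assumption).
const voyageSpec = {
  name: "Voyage 3 Large",
  type: "EMBEDDING",
  serviceType: "VOYAGE", // may be "ANTHROPIC" in some SDK versions; see Parameters
  voyage: {
    model: "VOYAGE_3_LARGE",
  },
};
```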
5. Project-Wide Default Embedding
Set default for all content:
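One way to apply an embedding spec project-wide is to reference it from the project settings rather than on each ingest call. The update shape below is a hypothetical sketch, not a confirmed API; check your project settings schema before using it:

```typescript
// Hypothetical sketch: reference the embedding spec at the project level so
// all ingested content uses it by default. The field names here are
// assumptions, not a confirmed Graphlit API shape.
const projectUpdate = {
  embeddings: {
    textSpecification: { id: "spec-id-from-createSpecification" },
  },
};

// Then something like: await graphlit.updateProject(projectUpdate);
// (call name and shape are assumptions)
```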
6. Domain-Specific Embeddings
Different models for different content types:
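One simple way to route content types to different specs is a lookup table keyed by domain. The domain names and spec IDs below are illustrative placeholders; create the specs once, then pick the right ID at ingest time:

```typescript
// Map content domains to embedding spec IDs (placeholder IDs).
const embeddingSpecByDomain: Record<string, string> = {
  "docs-en": "spec-openai-large-id",            // English docs
  "docs-multilingual": "spec-cohere-multilingual-id", // non-English docs
};

function embeddingSpecFor(domain: string): { id: string } {
  // Fall back to the English spec when the domain is unknown, so every
  // domain consistently resolves to exactly one embedding model.
  const id = embeddingSpecByDomain[domain] ?? embeddingSpecByDomain["docs-en"];
  return { id };
}

// e.g. pass embeddingSpecFor("docs-multilingual") as the embedding
// argument to graphlit.ingestUri(...).
```

Keeping the mapping in one place also guards against the mixed-embeddings problem described under Common Issues.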
Common Issues
Issue: Search quality degraded after changing embedding model
Solution: You can't change embeddings for existing content. You must re-ingest all content with the new model.
Issue: Specification not found error
Solution: Verify the specification ID is correct. Check the type is EMBEDDING, not COMPLETION.
Issue: Multi-language search not working well
Solution: Use a multilingual embedding model (e.g. Cohere Embed Multilingual v3) rather than an English-only model.
Issue: High embedding costs
Solution: Use text-embedding-3-small instead of large. The quality difference is small for many use cases.
Issue: Inconsistent search results
Solution: Ensure all content uses the same embedding model. Mixed embeddings cause poor retrieval.
Production Example
Project-wide embedding configuration:
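A common production pattern is to create the embedding spec once and reuse its ID for all ingestion. The helper below is an illustrative sketch: the client method names mirror the operations shown on this page (createSpecification plus a specification query), but treat the exact query shape as an assumption:

```typescript
// Illustrative helper: look up an embedding spec by name, creating it on
// first use so every ingest call shares one spec ID. The SpecClient
// interface is a stand-in for the Graphlit client; method shapes are
// assumptions modeled on the calls shown earlier on this page.
interface SpecClient {
  querySpecifications(filter: { name: string }): Promise<{
    specifications: { results: { id: string; name: string }[] };
  }>;
  createSpecification(input: object): Promise<{
    createSpecification: { id: string };
  }>;
}

async function getOrCreateEmbeddingSpec(
  client: SpecClient,
  name: string
): Promise<string> {
  // Reuse an existing spec with this name if one exists.
  const existing = await client.querySpecifications({ name });
  const match = existing.specifications.results.find((s) => s.name === name);
  if (match) return match.id;

  // Otherwise create it (raw GraphQL enum values shown).
  const created = await client.createSpecification({
    name,
    type: "EMBEDDING",
    serviceType: "OPEN_AI",
    openAI: { model: "TEXT_EMBEDDING_3_LARGE" },
  });
  return created.createSpecification.id;
}
```

Creating the spec once per environment avoids accumulating duplicate specifications and keeps all content on one embedding model.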
Multi-environment embedding strategy:
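A multi-environment strategy can be as simple as choosing the model enum per environment. The mapping below is a policy sketch, not an SDK requirement; staging deliberately matches production so search results are comparable (remember that changing models requires re-ingesting, per the note above):

```typescript
// Pick an embedding model per environment: cheaper in dev, best in prod.
type Env = "development" | "staging" | "production";

function embeddingModelFor(env: Env): string {
  switch (env) {
    case "production":
      return "TEXT_EMBEDDING_3_LARGE"; // best retrieval quality for real users
    case "staging":
      return "TEXT_EMBEDDING_3_LARGE"; // match prod so search behavior is comparable
    default:
      return "TEXT_EMBEDDING_3_SMALL"; // lower cost while iterating in dev
  }
}
```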