Create Embedding Model
Specification: Create Embedding Model
User Intent
"I want to configure which embedding model to use for vector search"
Operation
SDK Method:
SDK Method: `graphlit.createSpecification()` with embedding type
GraphQL: `createSpecification` mutation
Entity Type: Specification
Common Use Cases: Configure vector embeddings, customize semantic search, optimize retrieval quality
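For reference, the raw `createSpecification` mutation follows this general shape. The selection set below is an illustrative sketch, not the full schema; consult the GraphQL API reference for all available fields.

```graphql
mutation CreateSpecification($specification: SpecificationInput!) {
  createSpecification(specification: $specification) {
    id
    name
    state
    type
    serviceType
  }
}
```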
**TypeScript (Canonical)**:

```typescript
import { Graphlit } from 'graphlit-client';
import { ModelServiceTypes, OpenAiModels, SpecificationInput, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create embedding specification
const specificationInput: SpecificationInput = {
  name: 'OpenAI text-embedding-3-large',
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Embedding_3Large
  }
};

const response = await graphlit.createSpecification(specificationInput);
const specId = response.createSpecification.id;

console.log(`Embedding specification created: ${specId}`);

// Use specification during content ingestion
await graphlit.ingestUri(
  'https://docs.example.com/page.html',
  undefined, // name
  undefined, // id
  undefined, // identifier
  true,      // isSynchronous
  undefined, // workflow
  undefined, // collections
  undefined, // observations
  { id: specId } // embedding specification
);

// Content will be embedded with the specified model
```

**Python**:

```python
# Create embedding specification (snake_case)
spec_input = SpecificationInput(
    name="OpenAI text-embedding-3-large",
    type=SpecificationTypes.Embedding,
    service_type=ModelServiceTypes.OpenAi,
    open_ai=OpenAiModelPropertiesInput(
        model=OpenAiModels.Embedding_3Large
    )
)

response = await graphlit.client.create_specification(spec_input)
spec_id = response.create_specification.id

# Use during ingestion
await graphlit.client.ingest_uri(
    uri="https://docs.example.com/page.html",
    is_synchronous=True,
    embedding=EntityReferenceInput(id=spec_id)
)
```
**C#**:
```csharp
using Graphlit;
var graphlit = new Graphlit();
// Create embedding specification (PascalCase)
var specInput = new SpecificationInput {
Name = "OpenAI text-embedding-3-large",
Type = SpecificationTypes.Embedding,
ServiceType = ModelServiceTypes.OpenAi,
OpenAI = new OpenAIModelPropertiesInput {
Model = OpenAiModels.Embedding_3Large
}
};
var response = await graphlit.CreateSpecification(specInput);
var specId = response.CreateSpecification.Id;
// Use during ingestion
await graphlit.IngestUri(
uri: "https://docs.example.com/page.html",
isSynchronous: true,
embedding: new EntityReferenceInput { Id = specId }
);
```

Parameters
SpecificationInput (Required)

- `name` (string): Specification name
- `type` (SpecificationTypes): Must be `EMBEDDING`
- `serviceType` (ModelServiceTypes): Model provider
  - `OPEN_AI` - OpenAI embedding models
  - `ANTHROPIC` - Voyage embeddings (via Anthropic)
  - `COHERE` - Cohere embedding models
  - `MISTRAL` - Mistral embedding models
  - `JINA_AI` - Jina AI embeddings
Provider-Specific Configuration
OpenAI (`openAI`):

- `model` (OpenAiModels): Embedding model
  - `TEXT_EMBEDDING_3_LARGE` - Best quality (recommended)
  - `TEXT_EMBEDDING_3_SMALL` - Faster, lower cost
  - `TEXT_EMBEDDING_ADA_002` - Legacy model
Cohere (`cohere`):

- `model` (CohereModels): Embedding model
  - `EMBED_ENGLISH_V3` - English text
  - `EMBED_MULTILINGUAL_V3` - Multi-language
Voyage (`voyage`):

- `model` (VoyageModels): Embedding model
  - `VOYAGE_3_LARGE` - Highest quality
  - `VOYAGE_3` - Balanced
Response
```typescript
{
  createSpecification: {
    id: string;                     // Specification ID
    name: string;                   // Specification name
    state: EntityState;             // ENABLED
    type: SpecificationTypes;       // EMBEDDING
    serviceType: ModelServiceTypes; // Provider
    openAI?: OpenAIModelProperties; // OpenAI config
  }
}
```

Developer Hints
Embedding Model Impacts Search Quality
Important: The embedding model determines semantic search quality. Better embeddings = better RAG retrieval.
```typescript
// Higher quality (recommended for production)
const largeEmbedding = {
  model: OpenAiModels.Embedding_3Large
};

// Lower cost (good for testing)
const smallEmbedding = {
  model: OpenAiModels.Embedding_3Small
};
```

When to Specify Embedding Model
Use Custom Embedding Spec When:

- You need specific embedding dimensions
- Optimizing for cost vs quality
- Using a non-default provider (Cohere, Voyage)
- Multi-language content (use multilingual models)
Use Default When:

- Standard English content
- Not sure which model to use
- Getting started / prototyping
```typescript
// Default (no embedding spec) - uses OpenAI text-embedding-3-small
await graphlit.ingestUri(uri, undefined, undefined, undefined, true);

// Custom embedding spec - uses specified model
await graphlit.ingestUri(
  uri, undefined, undefined, undefined, true,
  undefined, undefined, undefined,
  { id: embeddingSpecId }
);
```

Choosing Embedding Model
Best for Quality:

- OpenAI text-embedding-3-large - Best overall (recommended)
- Voyage 3 Large - Excellent quality
- Cohere Embed v3 - Good for specific domains

Best for Cost:

- OpenAI text-embedding-3-small - Good balance
- Jina AI v2 - Free tier available

Best for Multi-Language:

- Cohere Embed Multilingual v3 - Best multi-language
- OpenAI text-embedding-3-large - Good multi-language support
Embedding Dimensions
Different models produce vectors with different dimensions:

- text-embedding-3-large: 3072 dimensions
- text-embedding-3-small: 1536 dimensions
- text-embedding-ada-002: 1536 dimensions
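Dimension count drives vector storage and per-query similarity cost. A rough back-of-the-envelope sketch, assuming float32 vectors (4 bytes per dimension; the chunk count is illustrative):

```typescript
// Rough storage estimate for stored embeddings: float32 = 4 bytes/dimension.
function vectorStorageBytes(dimensions: number, chunkCount: number): number {
  return dimensions * 4 * chunkCount;
}

// For a corpus of 10,000 chunks:
const largeMiB = vectorStorageBytes(3072, 10_000) / (1024 * 1024); // ≈ 117 MiB
const smallMiB = vectorStorageBytes(1536, 10_000) / (1024 * 1024); // ≈ 58.6 MiB
```

Halving the dimensions halves both storage and the dot-product work per query, which is part of why the small model is cheaper to operate at scale.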
Important: You cannot change embedding models after content is ingested. The model used during ingestion is permanent for that content.
Variations
1. Basic OpenAI Large Embedding
Highest quality (recommended):
```typescript
const spec = await graphlit.createSpecification({
  name: 'OpenAI Large',
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Embedding_3Large
  }
});
```

2. Budget-Friendly Small Embedding
Lower cost:
```typescript
const spec = await graphlit.createSpecification({
  name: 'OpenAI Small',
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Embedding_3Small
  }
});
```

3. Cohere for Multi-Language
Best for non-English:
```typescript
const spec = await graphlit.createSpecification({
  name: 'Cohere Multilingual',
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.Cohere,
  cohere: {
    model: CohereModels.Embed_Multilingual_V3
  }
});
```

4. Voyage for High Accuracy
Alternative high-quality option:
```typescript
const spec = await graphlit.createSpecification({
  name: 'Voyage 3 Large',
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.Voyage,
  voyage: {
    model: VoyageModels.Voyage_3_Large
  }
});
```

5. Project-Wide Default Embedding
Set default for all content:
```typescript
// Create specification
const defaultEmbedding = await graphlit.createSpecification({
  name: 'Default Embeddings',
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Embedding_3Large
  }
});

// Store ID for reuse
const embeddingSpecId = defaultEmbedding.createSpecification.id;

// Use for all ingestion
await graphlit.ingestUri(
  uri, undefined, undefined, undefined, true,
  undefined, undefined, undefined,
  { id: embeddingSpecId }
);
```

6. Domain-Specific Embeddings
Different models for different content types:
```typescript
// Technical documentation
const technicalSpec = await graphlit.createSpecification({
  name: 'Technical Docs',
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Embedding_3Large
  }
});

// Customer support (lower cost)
const supportSpec = await graphlit.createSpecification({
  name: 'Support Content',
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Embedding_3Small
  }
});

// Ingest with appropriate spec
await graphlit.ingestUri(
  technicalDocUri,
  undefined, undefined, undefined, true,
  undefined, undefined, undefined,
  { id: technicalSpec.createSpecification.id }
);
```

Common Issues
Issue: Search quality degraded after changing embedding model
Solution: You can't change embeddings for existing content. You must re-ingest all content with the new model.

Issue: Specification not found error
Solution: Verify the specification ID is correct, and check that its type is EMBEDDING, not COMPLETION.

Issue: Multi-language search not working well
Solution: Use a multilingual embedding model (e.g. Cohere Embed Multilingual v3) rather than an English-centric one.

Issue: High embedding costs
Solution: Use text-embedding-3-small instead of large. The quality difference is small for many use cases.

Issue: Inconsistent search results
Solution: Ensure all content uses the same embedding model. Mixed embeddings cause poor results.
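To put the cost trade-off in numbers, here is a back-of-the-envelope sketch. The per-token prices below are illustrative assumptions, not taken from this document; check your provider's current pricing before relying on them.

```typescript
// Illustrative (assumed) prices in USD per 1M input tokens.
const PRICE_PER_MILLION_TOKENS = {
  'text-embedding-3-large': 0.13,
  'text-embedding-3-small': 0.02,
} as const;

function embeddingCostUSD(
  model: keyof typeof PRICE_PER_MILLION_TOKENS,
  totalTokens: number
): number {
  return (totalTokens / 1_000_000) * PRICE_PER_MILLION_TOKENS[model];
}

// One-time cost of embedding a 50M-token corpus:
const largeCost = embeddingCostUSD('text-embedding-3-large', 50_000_000); // ≈ $6.50
const smallCost = embeddingCostUSD('text-embedding-3-small', 50_000_000); // ≈ $1.00
```

The absolute numbers stay modest, but the ~6.5x ratio holds at any corpus size, which is why the small model is a common default for high-volume, lower-stakes content.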
Production Example
Project-wide embedding configuration:
```typescript
// Create embedding spec once during setup
async function setupProject() {
  const embeddingSpec = await graphlit.createSpecification({
    name: 'Project Default Embeddings',
    type: SpecificationTypes.Embedding,
    serviceType: ModelServiceTypes.OpenAi,
    openAI: {
      model: OpenAiModels.Embedding_3Large
    }
  });

  // Store ID in database/config
  await db.config.set('embedding_spec_id', embeddingSpec.createSpecification.id);
  return embeddingSpec.createSpecification.id;
}

// Use in all ingestion
async function ingestContent(uri: string) {
  const embeddingSpecId = await db.config.get('embedding_spec_id');

  return await graphlit.ingestUri(
    uri,
    undefined, undefined, undefined, true,
    undefined, undefined, undefined,
    { id: embeddingSpecId }
  );
}
```

Multi-environment embedding strategy:
```typescript
// Different embeddings for dev/prod
const embeddingSpecs = {
  development: await graphlit.createSpecification({
    name: 'Dev - Small Embeddings',
    type: SpecificationTypes.Embedding,
    serviceType: ModelServiceTypes.OpenAi,
    openAI: {
      model: OpenAiModels.Embedding_3Small // Lower cost
    }
  }),
  production: await graphlit.createSpecification({
    name: 'Prod - Large Embeddings',
    type: SpecificationTypes.Embedding,
    serviceType: ModelServiceTypes.OpenAi,
    openAI: {
      model: OpenAiModels.Embedding_3Large // Best quality
    }
  })
};

// Use based on environment
const specId = process.env.NODE_ENV === 'production'
  ? embeddingSpecs.production.createSpecification.id
  : embeddingSpecs.development.createSpecification.id;

await graphlit.ingestUri(
  uri, undefined, undefined, undefined, true,
  undefined, undefined, undefined,
  { id: specId }
);
```