Vector Search Explained
User Intent
"How does semantic/vector search work in Graphlit?"
Operation
SDK Method: queryContents() with searchType: SearchTypes.Vector
GraphQL: queryContents query
Common Use Cases: Semantic similarity, concept search, "find similar documents"
How Vector Search Works
Vector search finds content based on semantic meaning, not just keyword matches. Content is converted to high-dimensional vectors (embeddings), and similarity is measured by distance between vectors.
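To make the idea of distance between vectors concrete, here is a minimal sketch of cosine similarity, the measure that the Similarity Scoring section names as the basis for relevance. The three-dimensional vectors are made-up toy values, not real model output; actual embeddings have hundreds or thousands of dimensions. The function name and the zero-vector convention are assumptions of this sketch.

```typescript
// Cosine similarity between two equal-length vectors.
// Returns a value in [-1, 1]; higher means the vectors point in more similar directions.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same number of dimensions");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    return 0; // convention for zero vectors (an assumption of this sketch)
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional "embeddings" (illustrative values only).
const queryVector = [0.9, 0.1, 0.0];
const mlPaperVector = [0.8, 0.2, 0.1];
const recipeVector = [0.0, 0.1, 0.9];

console.log(cosineSimilarity(queryVector, mlPaperVector)); // close to 1
console.log(cosineSimilarity(queryVector, recipeVector)); // close to 0
```

Two texts about the same topic should produce vectors that point in nearly the same direction, so their score approaches 1 even when they share no keywords.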
TypeScript (Canonical)
import { Graphlit } from 'graphlit-client';
import { ContentTypes, FileTypes, ModelServiceTypes, SearchTypes, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Pure vector search
const results = await graphlit.queryContents({
  search: "machine learning research papers",
  searchType: SearchTypes.Vector,
  limit: 10
});

console.log(`Found ${results.contents.results.length} semantically similar results`);

results.contents.results.forEach((content, index) => {
  console.log(`\n${index + 1}. ${content.name}`);
  console.log(`   Relevance: ${content.relevance}`);
  console.log(`   Type: ${content.type}`);

  // Show matching text chunk
  if (content.pages && content.pages.length > 0) {
    const topChunk = content.pages[0].chunks?.[0];
    if (topChunk) {
      console.log(`   Match: "${topChunk.text.substring(0, 100)}..."`);
    }
  }
});

// Find similar documents to a specific one
const similar = await graphlit.queryContents({
  filter: {
    similarContents: [{ id: 'content-id' }]
  },
  limit: 5
});

console.log(`\nDocuments similar to content-id:`);
similar.contents.results.forEach(content => {
  console.log(`- ${content.name} (relevance: ${content.relevance})`);
});

The Vector Search Pipeline
1. Ingestion Time (Embedding Creation)
2. Query Time (Semantic Search)
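The two stages can be sketched with a small in-memory index. This is an illustration of the general technique, not Graphlit's implementation: ingestChunks, searchIndex, and toyEmbed are hypothetical names, and the toy embedder simply counts a few vocabulary words so the example runs without calling a model. Vectors are scaled to unit length at ingestion so that a plain dot product at query time equals cosine similarity.

```typescript
// Hypothetical in-memory sketch of the two pipeline stages.
type Embedder = (text: string) => number[];

interface IndexedChunk {
  text: string;
  vector: number[]; // unit-length embedding
}

// Scale a vector to unit length so a dot product equals cosine similarity.
function normalize(v: number[]): number[] {
  const norm = Math.sqrt(v.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? v : v.map((x) => x / norm);
}

// 1. Ingestion time: embed each chunk once and store the vector.
function ingestChunks(chunks: string[], embed: Embedder): IndexedChunk[] {
  return chunks.map((text) => ({ text, vector: normalize(embed(text)) }));
}

// 2. Query time: embed the query, score every chunk, return the top `limit`.
function searchIndex(chunkIndex: IndexedChunk[], query: string, embed: Embedder, limit: number) {
  const q = normalize(embed(query));
  return chunkIndex
    .map((chunk) => ({
      text: chunk.text,
      relevance: chunk.vector.reduce((sum, x, i) => sum + x * q[i], 0),
    }))
    .sort((a, b) => b.relevance - a.relevance)
    .slice(0, limit);
}

// Toy embedder: counts occurrences of a few vocabulary words.
// A real embedding model captures meaning; this stand-in only keeps the example runnable.
const vocabulary = ["machine", "learning", "neural", "recipe", "pasta"];
const toyEmbed: Embedder = (text) => {
  const words = text.toLowerCase().split(/\W+/);
  return vocabulary.map((term) => words.filter((w) => w === term).length);
};

const chunkIndex = ingestChunks(
  ["Neural networks for machine learning", "A quick pasta recipe", "Learning to cook pasta"],
  toyEmbed
);
const hits = searchIndex(chunkIndex, "machine learning", toyEmbed, 2);
console.log(hits);
```

The expensive embedding work for content happens once at ingestion; at query time only the query itself needs to be embedded before scoring.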
When to Use Vector Search
Good For:
Conceptual queries ("AI safety concerns")
Semantic similarity ("find similar documents")
Cross-language concepts
Synonyms and paraphrases
Question answering
Vague or exploratory queries
Not Good For:
Exact phrases ("Project Alpha v2.3")
Names and IDs ("PROJ-1234")
Codes and identifiers
Very short queries (< 3 words)
Spelling-sensitive searches
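The two lists above can be turned into a rough routing heuristic. The sketch below is one possible interpretation, not part of the Graphlit SDK: suggestSearchMode, the SearchMode type, and the regular expressions are all hypothetical, and the patterns only cover the identifier styles used as examples above.

```typescript
// Hypothetical helper; the thresholds and patterns are illustrative assumptions.
type SearchMode = "KEYWORD" | "VECTOR";

function suggestSearchMode(query: string): SearchMode {
  const trimmed = query.trim();
  const words = trimmed.split(/\s+/).filter((w) => w.length > 0);

  // Quoted text signals that the exact phrase matters.
  const isQuoted = /^".*"$/.test(trimmed);
  // Tokens such as "PROJ-1234" or "v2.3" look like IDs, codes, or versions.
  const hasIdentifier = words.some(
    (w) => /^[A-Z]+-\d+$/.test(w) || /^v?\d+(\.\d+)+$/.test(w)
  );
  // Very short queries (fewer than 3 words) carry little semantic signal.
  const isVeryShort = words.length < 3;

  return isQuoted || hasIdentifier || isVeryShort ? "KEYWORD" : "VECTOR";
}

console.log(suggestSearchMode("PROJ-1234 deployment failure notes")); // KEYWORD
console.log(suggestSearchMode("what are the main AI safety concerns")); // VECTOR
```

A heuristic like this is only a starting point; queries that mix identifiers with conceptual terms are the usual case for hybrid search.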
Examples: Vector vs Keyword
Embedding Models
Current Default: OpenAI text-embedding-3-large
Dimensions: 3072
Quality: Highest
Cost: Higher
Use: Best results, production recommended
Alternative: OpenAI text-embedding-3-small
Dimensions: 1536
Quality: Good
Cost: Lower
Use: Cost-sensitive applications
Legacy: OpenAI text-embedding-ada-002
Dimensions: 1536
Quality: Baseline
Cost: Lower
Use: Not recommended for new projects
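Dimension count also affects how much raw vector data each chunk produces. The following back-of-the-envelope sketch assumes every dimension is stored as a 4-byte float32 and uses a hypothetical corpus of 100,000 chunks; actual storage depends on the vector store and any compression or quantization it applies.

```typescript
// Assumption: each dimension is stored as a 4-byte float32.
const BYTES_PER_DIMENSION = 4;

// Dimension counts taken from the model list above.
const modelDimensions: Record<string, number> = {
  "text-embedding-3-large": 3072,
  "text-embedding-3-small": 1536,
  "text-embedding-ada-002": 1536,
};

// Raw size in bytes of the vectors for `chunkCount` chunks.
function rawVectorBytes(model: string, chunkCount: number): number {
  const dims = modelDimensions[model];
  if (dims === undefined) {
    throw new Error(`Unknown model: ${model}`);
  }
  return dims * BYTES_PER_DIMENSION * chunkCount;
}

const hypotheticalChunkCount = 100000; // made-up corpus size
for (const model of Object.keys(modelDimensions)) {
  const megabytes = rawVectorBytes(model, hypotheticalChunkCount) / 1e6;
  console.log(`${model}: ${megabytes.toFixed(1)} MB`);
}
```

Under these assumptions the 3072-dimension model needs twice the raw vector storage of the 1536-dimension models for the same content.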
Configure Embedding Model:
Similarity Scoring
Relevance Score (content.relevance):
Range: 0.0 (no match) to 1.0 (perfect match)
Based on cosine similarity
Higher = more semantically similar
Typically used with a threshold (e.g., > 0.7)
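Applying a threshold is a simple client-side filter over the returned results. The sketch below uses a minimal, hypothetical ScoredContent shape and made-up sample data; it assumes relevance may be missing on a result and treats a missing score as 0.

```typescript
// Minimal shape of a result for this sketch; real results carry more fields.
interface ScoredContent {
  name: string;
  relevance?: number | null;
}

// Keep only results whose relevance is strictly above the threshold.
// A missing relevance is treated as 0 (an assumption of this sketch).
function filterByRelevance<T extends ScoredContent>(items: T[], threshold = 0.7): T[] {
  return items.filter((item) => (item.relevance ?? 0) > threshold);
}

// Made-up sample data for illustration.
const sampleResults: ScoredContent[] = [
  { name: "Intro to ML.pdf", relevance: 0.91 },
  { name: "Quarterly report.docx", relevance: 0.42 },
  { name: "Untitled note", relevance: null },
];

console.log(filterByRelevance(sampleResults).map((item) => item.name));
```

The 0.7 default mirrors the example threshold above; a suitable cutoff depends on the embedding model and the content, so it is worth tuning against real queries.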
Python
# Vector search (snake_case)
results = await graphlit.client.query_contents(
    search="machine learning research",
    search_type=SearchTypes.Vector,
    limit=10
)

for content in results.contents.results:
    print(f"{content.name} - Relevance: {content.relevance}")

# Find similar documents
similar = await graphlit.client.query_contents(
    filter=ContentFilterInput(
        similar_contents=[
            EntityReferenceInput(id='content-id')
        ]
    )
)
Developer Hints
Vector Search is Expensive
Chunk Size Matters
Query Quality Tips
Variations
1. Basic Vector Search
2. Vector Search with Filters
3. Find Similar Documents
4. Vector Search with Relevance Threshold
5. Multi-Collection Vector Search
6. Vector Search Pagination
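For the pagination variation, a generic loop over offset and limit is often enough. The sketch below is not tied to the Graphlit SDK: PageFetcher and collectPages are hypothetical names, and the fake fetcher stands in for a real query so the example runs without a network call.

```typescript
// Hypothetical page fetcher: returns up to `limit` items starting at `offset`.
type PageFetcher<T> = (offset: number, limit: number) => Promise<T[]>;

// Fetch pages until a short (or empty) page signals the end, or maxItems is reached.
async function collectPages<T>(
  fetchPage: PageFetcher<T>,
  pageSize: number,
  maxItems: number
): Promise<T[]> {
  if (pageSize < 1) {
    throw new Error("pageSize must be at least 1");
  }
  const collected: T[] = [];
  let offset = 0;
  while (collected.length < maxItems) {
    const page = await fetchPage(offset, pageSize);
    collected.push(...page);
    if (page.length < pageSize) {
      break; // last page reached
    }
    offset += pageSize;
  }
  return collected.slice(0, maxItems);
}

// Stand-in data source (already ranked) so the sketch is runnable.
const fakeRanked = ["doc-1", "doc-2", "doc-3", "doc-4", "doc-5"];
const fakeFetch: PageFetcher<string> = async (offset, limit) =>
  fakeRanked.slice(offset, offset + limit);

collectPages(fakeFetch, 2, 10).then((all) => console.log(all));
```

In a real application the fetcher would wrap a query call, assuming the query filter accepts an offset alongside the limit parameter shown in the examples on this page. Capping maxItems matters for vector search because low-ranked pages tend to contain increasingly irrelevant results.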
Common Issues & Solutions
Issue: No results returned
Solution: The query might be too specific, or embeddings may not have been created yet
Issue: Results not semantically relevant
Solution: Try hybrid search or keyword search
Issue: Want to change embedding model for existing content
Solution: Must re-ingest content with the new specification
Issue: Slow query performance
Solution: Vector search is computationally expensive
Production Example