Vector Search Explained

User Intent

"How does semantic/vector search work in Graphlit?"

Operation

  • SDK Method: queryContents() with searchType: SearchTypes.Vector

  • GraphQL: queryContents query

  • Common Use Cases: Semantic similarity, concept search, "find similar documents"

How Vector Search Works

Vector search finds content based on semantic meaning, not just keyword matches. Content is converted to high-dimensional vectors (embeddings), and similarity is measured by distance between vectors.
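
To build intuition, here is a toy illustration of that similarity measure. Real embeddings have thousands of dimensions and Graphlit computes similarity server-side; the 3-D vectors below are made up purely to show how cosine similarity separates related and unrelated concepts.

// Toy cosine similarity over made-up 3-D "embeddings"
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const norm = (v: number[]) => Math.sqrt(v.reduce((s, x) => s + x * x, 0));
  return dot / (norm(a) * norm(b));
}

const cat = [0.9, 0.1, 0.2];
const kitten = [0.85, 0.15, 0.25];
const car = [0.1, 0.9, 0.3];

console.log(cosineSimilarity(cat, kitten));  // ≈ 0.996 (semantically close)
console.log(cosineSimilarity(cat, car));     // ≈ 0.271 (semantically distant)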

TypeScript (Canonical)

import { Graphlit } from 'graphlit-client';
import { ContentTypes, FilePreparationServiceTypes, FileTypes, ModelServiceTypes, OpenAiModels, SearchTypes, SpecificationTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Pure vector search
const results = await graphlit.queryContents({
  search: "machine learning research papers",
  searchType: SearchTypes.Vector,
  limit: 10
});

console.log(`Found ${results.contents.results.length} semantically similar results`);

results.contents.results.forEach((content, index) => {
  console.log(`\n${index + 1}. ${content.name}`);
  console.log(`   Relevance: ${content.relevance}`);
  console.log(`   Type: ${content.type}`);
  
  // Show matching text chunk
  if (content.pages && content.pages.length > 0) {
    const topChunk = content.pages[0].chunks?.[0];
    if (topChunk) {
      console.log(`   Match: "${topChunk.text.substring(0, 100)}..."`);
    }
  }
});

// Find similar documents to a specific one
const similar = await graphlit.queryContents({
  filter: {
    similarContents: [{ id: 'content-id' }]
  },
  limit: 5
});

console.log(`\nDocuments similar to content-id:`);
similar.contents.results.forEach(content => {
  console.log(`- ${content.name} (relevance: ${content.relevance})`);
});

The Vector Search Pipeline

1. Ingestion Time (Embedding Creation)

// When content is ingested:
const content = await graphlit.ingestUri(
  'https://example.com/document.pdf',
  undefined,
  undefined,
  undefined,
  undefined,
  { id: 'workflow-id' }
);

// Behind the scenes:
// 1. Document is chunked (default: 512 tokens per chunk)
// 2. Each chunk → embedding model (e.g., text-embedding-3-large)
// 3. Model returns 3072-dimensional vector per chunk
// 4. Vectors stored in Azure AI Search
// 5. Ready for similarity search

2. Query Time (Similarity Search)

// User searches:
const results = await graphlit.queryContents({
  search: "climate change impact on agriculture",
  searchType: SearchTypes.Vector
});

// Behind the scenes:
// 1. Query text → embedding model (same model as ingestion)
// 2. Get query vector (3072 dimensions)
// 3. Cosine similarity against all content vectors
// 4. Return top-K most similar chunks
// 5. Aggregate by parent content
// 6. Rank and return results
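
Steps 4-6 can be pictured with a toy aggregation: each chunk carries a similarity score, and a parent content inherits its best chunk's score before ranking. This is an illustration of the idea, not Graphlit's internal implementation.

// Toy aggregation: best chunk score per parent content, then rank
const chunkScores = [
  { contentId: 'doc-a', score: 0.91 },
  { contentId: 'doc-b', score: 0.84 },
  { contentId: 'doc-a', score: 0.62 }
];

const bestPerContent = new Map<string, number>();
for (const { contentId, score } of chunkScores) {
  bestPerContent.set(contentId, Math.max(bestPerContent.get(contentId) ?? -Infinity, score));
}

const ranked = [...bestPerContent.entries()].sort((a, b) => b[1] - a[1]);
console.log(ranked);  // [['doc-a', 0.91], ['doc-b', 0.84]]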

Good For:

  • Conceptual queries ("AI safety concerns")

  • Semantic similarity ("find similar documents")

  • Cross-language concepts

  • Synonyms and paraphrases

  • Question answering

  • Vague or exploratory queries

Not Good For:

  • Exact phrases ("Project Alpha v2.3")

  • Names and IDs ("PROJ-1234")

  • Codes and identifiers

  • Very short queries (< 3 words)

  • Spelling-sensitive searches
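
Given these strengths and weaknesses, one practical pattern is a small client-side heuristic that routes identifier-like or very short queries to keyword search and natural-language queries to vector search. This is an illustrative sketch, not part of the SDK.

// Illustrative heuristic: pick a search type from the query's shape
function chooseSearchType(query: string): SearchTypes {
  const words = query.trim().split(/\s+/);
  const looksLikeIdentifier = /\b[A-Z]+-\d+\b|\bv\d+\.\d+\b/.test(query);  // e.g. "PROJ-1234", "v2.3"
  if (looksLikeIdentifier || words.length < 3) {
    return SearchTypes.Keyword;  // exact tokens, names, IDs, short queries
  }
  return SearchTypes.Vector;     // conceptual, natural-language queries
}

const results = await graphlit.queryContents({
  search: "PROJ-1234 status",
  searchType: chooseSearchType("PROJ-1234 status")  // resolves to Keyword
});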

Examples: Vector vs Keyword

// Vector search finds semantic meaning
const vector = await graphlit.queryContents({
  search: "reducing carbon emissions",
  searchType: SearchTypes.Vector
});
// Matches: "lowering CO2 output", "decreasing greenhouse gases",
//          "climate change mitigation", "sustainability efforts"

// Keyword search finds exact tokens
const keyword = await graphlit.queryContents({
  search: "reducing carbon emissions",
  searchType: SearchTypes.Keyword
});
// Matches: Only documents with words "reducing", "carbon", "emissions"

Embedding Models

Current Default: OpenAI text-embedding-3-large

  • Dimensions: 3072

  • Quality: Highest

  • Cost: Higher

  • Use: Best results, production recommended

Alternative: OpenAI text-embedding-3-small

  • Dimensions: 1536

  • Quality: Good

  • Cost: Lower

  • Use: Cost-sensitive applications

Legacy: OpenAI text-embedding-ada-002

  • Dimensions: 1536

  • Quality: Baseline

  • Cost: Lower

  • Use: Not recommended for new projects

Configure Embedding Model:

// Create embedding specification
const embeddingSpec = await graphlit.createSpecification({
  name: "High Quality Embeddings",
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAI,
  openAI: {
    model: OpenAiModels.Embedding_3Large,
    chunkTokenLimit: 512  // Tokens per chunk
  }
});

// Use in workflow
const workflow = await graphlit.createWorkflow({
  name: "With Custom Embeddings",
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Document
      }
    }]
  },
  specification: { id: embeddingSpec.createSpecification.id }
});

Similarity Scoring

Relevance Score (content.relevance):

  • Range: 0.0 (no match) to 1.0 (perfect match)

  • Based on cosine similarity

  • Higher = more semantically similar

  • Typically use threshold (e.g., > 0.7)

const results = await graphlit.queryContents({
  search: "query",
  searchType: SearchTypes.Vector
});

// Filter by relevance
const highRelevance = results.contents.results.filter(
  content => content.relevance > 0.7
);

console.log(`High relevance matches: ${highRelevance.length}`);

Python

# Vector search (snake_case)
results = await graphlit.client.query_contents(
    search="machine learning research",
    search_type=SearchTypes.Vector,
    limit=10
)

for content in results.contents.results:
    print(f"{content.name} - Relevance: {content.relevance}")

# Find similar documents
similar = await graphlit.client.query_contents(
    filter=ContentFilterInput(
        similar_contents=[
            EntityReferenceInput(id='content-id')
        ]
    )
)

C#

using Graphlit;

var graphlit = new Graphlit();

// Vector search (PascalCase)
var results = await graphlit.QueryContents(new ContentFilter
{
    Search = "machine learning research",
    SearchType = SearchTypes.Vector,
    Limit = 10
});

foreach (var content in results.Contents.Results)
{
    Console.WriteLine($"{content.Name} - Relevance: {content.Relevance}");
}

// Find similar documents
var similar = await graphlit.QueryContents(new ContentFilter
{
    SimilarContents = new[]
    {
        new EntityReference { Id = "content-id" }
    }
});

Developer Hints

Vector Search is Expensive

// Each query:
// 1. Embeds the query text (embedding model API call)
// 2. Compares against millions of vectors
// 3. Aggregates and ranks results

// Cost factors:
// - Embedding API calls
// - Vector index size
// - Query frequency

// Optimization:
// - Cache query embeddings for common queries
// - Use hybrid search (better results, similar cost)
// - Limit result count
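
Graphlit embeds the query server-side, so from the client the practical analogue of caching query embeddings is caching whole results for repeated queries. A minimal sketch, assuming slightly stale results are acceptable (the key shape and TTL are illustrative):

// Illustrative client-side cache for repeated vector queries
const searchCache = new Map<string, { results: unknown; expires: number }>();

async function cachedVectorSearch(search: string, ttlMs = 60_000) {
  const key = `vector:${search}`;
  const hit = searchCache.get(key);
  if (hit && hit.expires > Date.now()) {
    return hit.results;  // repeated query served without another embedding call
  }
  const results = await graphlit.queryContents({
    search,
    searchType: SearchTypes.Vector,
    limit: 10
  });
  searchCache.set(key, { results, expires: Date.now() + ttlMs });
  return results;
}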

Chunk Size Matters

// Small chunks (256 tokens):
// - More precise matching
// - More chunks = more embeddings = higher cost
// - Better for specific queries

// Large chunks (1024 tokens):
// - More context per chunk
// - Fewer chunks = lower cost
// - Better for broad queries

// Default (512 tokens):
// - Balanced approach
// - Recommended for most use cases
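
To act on this, create specifications that differ only in chunkTokenLimit, reusing the createSpecification shape shown earlier (the names and values below are illustrative):

// Smaller chunks: precise matching, more embeddings, higher cost
const preciseSpec = await graphlit.createSpecification({
  name: "Precise Chunks",
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAI,
  openAI: {
    model: OpenAiModels.Embedding_3Large,
    chunkTokenLimit: 256
  }
});

// Larger chunks: more context, fewer embeddings, lower cost
const broadSpec = await graphlit.createSpecification({
  name: "Broad Chunks",
  type: SpecificationTypes.Embedding,
  serviceType: ModelServiceTypes.OpenAI,
  openAI: {
    model: OpenAiModels.Embedding_3Large,
    chunkTokenLimit: 1024
  }
});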

Query Quality Tips

// ✓ Good queries (natural language)
"How does machine learning improve healthcare?"
"Impact of remote work on productivity"
"Best practices for API security"

// ✗ Poor queries (too short/vague)
"AI"
"docs"
"help"

// ✓ Better versions
"Artificial intelligence applications"
"Documentation about features"
"Help with authentication setup"

Variations

1. Basic Vector Search

// Simple semantic query
const results = await graphlit.queryContents({
  search: "quantum computing applications",
  searchType: SearchTypes.Vector
});

2. Vector Search with Filters

// Combine semantic search with metadata filters
const filtered = await graphlit.queryContents({
  search: "machine learning",
  searchType: SearchTypes.Vector,
  filter: {
    types: [ContentTypes.File],
    fileTypes: [FileTypes.Document],
    creationDateRange: {
      from: '2024-01-01'
    }
  }
});

3. Find Similar Documents

// "More like this" functionality
const similar = await graphlit.queryContents({
  filter: {
    similarContents: [{ id: 'original-content-id' }]
  },
  limit: 10
});

4. Vector Search with Relevance Threshold

const results = await graphlit.queryContents({
  search: "query",
  searchType: SearchTypes.Vector,
  limit: 50
});

// Client-side filtering by relevance
const relevant = results.contents.results.filter(
  content => content.relevance >= 0.75
);

5. Search Within Collections

// Search across specific collections
const results = await graphlit.queryContents({
  search: "product roadmap",
  searchType: SearchTypes.Vector,
  filter: {
    collections: [
      { id: 'engineering-docs' },
      { id: 'product-docs' }
    ]
  }
});

6. Vector Search Pagination

// First page
const page1 = await graphlit.queryContents({
  search: "query",
  searchType: SearchTypes.Vector,
  limit: 20,
  offset: 0
});

// Second page
const page2 = await graphlit.queryContents({
  search: "query",
  searchType: SearchTypes.Vector,
  limit: 20,
  offset: 20
});
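
If you need every page, a small helper can loop until a short page signals the end. This sketch reuses the limit/offset parameters above; the helper name is illustrative.

// Illustrative helper: stream vector-search pages until exhausted
async function* pagedVectorSearch(search: string, pageSize = 20) {
  let offset = 0;
  while (true) {
    const page = await graphlit.queryContents({
      search,
      searchType: SearchTypes.Vector,
      limit: pageSize,
      offset
    });
    const results = page.contents?.results ?? [];
    if (results.length > 0) yield results;
    if (results.length < pageSize) break;  // short page means last page
    offset += pageSize;
  }
}

// Usage
for await (const page of pagedVectorSearch("quarterly planning")) {
  page.forEach(content => console.log(content.name));
}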

Common Issues & Solutions

Issue: No results returned
Solution: Query might be too specific, or embeddings have not been created yet

// Check if content has embeddings
const content = await graphlit.getContent('content-id');
if (!content.content.pages || content.content.pages.length === 0) {
  console.log('Content not embedded yet');
}

// Try broader query
const results = await graphlit.queryContents({
  search: "machine learning",  // Broader
  searchType: SearchTypes.Vector
});
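
If ingestion is still running, the content may simply not be embedded yet. The sketch below polls before searching; it assumes the SDK exposes an isContentDone() method returning a result flag, so verify the exact shape against your SDK version.

// Assumed SDK method: poll until content has finished processing
async function waitUntilDone(contentId: string, intervalMs = 2000) {
  while (true) {
    const status = await graphlit.isContentDone(contentId);
    if (status.isContentDone?.result) return;
    await new Promise(resolve => setTimeout(resolve, intervalMs));
  }
}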

Issue: Results not semantically relevant
Solution: Try hybrid search or keyword search

// Hybrid often works better
const results = await graphlit.queryContents({
  search: "query",
  searchType: SearchTypes.Hybrid  // Default and recommended
});

Issue: Want to change embedding model for existing content
Solution: Must re-ingest content with a new specification

// Create new specification
const newSpec = await graphlit.createSpecification({
  type: SpecificationTypes.Embedding,
  openAI: {
    model: OpenAIModels.TextEmbedding3Small  // Smaller, cheaper
  }
});

// Re-ingest content with new spec
// (embeddings are immutable once created)
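
A sketch of that re-ingest flow, reusing the createWorkflow and ingestUri calls shown earlier (the URI and names are illustrative, and the createWorkflow result is assumed to expose createWorkflow.id, parallel to createSpecification above):

// Wrap the new spec in a workflow, then re-ingest the same source
const reingestWorkflow = await graphlit.createWorkflow({
  name: "Smaller Embeddings Workflow",
  specification: { id: newSpec.createSpecification.id }
});

await graphlit.ingestUri(
  'https://example.com/document.pdf',  // same source as before
  undefined,
  undefined,
  undefined,
  undefined,
  { id: reingestWorkflow.createWorkflow.id }
);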

Issue: Slow query performance
Solution: Vector search is computationally expensive; narrow the scope or reduce the result count

// Optimizations:
// 1. Use filters to reduce search space
const scoped = await graphlit.queryContents({
  search: "query",
  searchType: SearchTypes.Vector,
  filter: {
    collections: [{ id: 'small-collection' }]  // Narrow scope
  }
});

// 2. Reduce limit
const limited = await graphlit.queryContents({
  search: "query",
  searchType: SearchTypes.Vector,
  limit: 10  // Fewer results = faster
});

// 3. Consider hybrid search (often faster)
const hybrid = await graphlit.queryContents({
  search: "query",
  searchType: SearchTypes.Hybrid
});

Production Example

async function semanticSearch(query: string) {
  console.log(`\n=== SEMANTIC SEARCH ===`);
  console.log(`Query: "${query}"`);
  
  const startTime = Date.now();
  
  const results = await graphlit.queryContents({
    search: query,
    searchType: SearchTypes.Vector,
    limit: 10
  });
  
  const elapsed = Date.now() - startTime;
  
  console.log(`\nFound ${results.contents.results.length} results in ${elapsed}ms`);
  
  results.contents.results.forEach((content, index) => {
    console.log(`\n${index + 1}. ${content.name}`);
    console.log(`   Relevance: ${(content.relevance * 100).toFixed(1)}%`);
    console.log(`   Type: ${content.type}`);
    console.log(`   Created: ${new Date(content.creationDate).toLocaleDateString()}`);
    
    // Show best matching chunk
    if (content.pages && content.pages.length > 0) {
      const topPage = content.pages.sort((a, b) => 
        (b.relevance || 0) - (a.relevance || 0)
      )[0];
      
      if (topPage.chunks && topPage.chunks.length > 0) {
        const topChunk = topPage.chunks.sort((a, b) =>
          (b.relevance || 0) - (a.relevance || 0)
        )[0];
        
        console.log(`   Matching text: "${topChunk.text.substring(0, 150)}..."`);
      }
    }
  });
  
  // Relevance distribution
  const highRelevance = results.contents.results.filter(c => c.relevance >= 0.8).length;
  const mediumRelevance = results.contents.results.filter(c => c.relevance >= 0.6 && c.relevance < 0.8).length;
  const lowRelevance = results.contents.results.filter(c => c.relevance < 0.6).length;
  
console.log(`\nRelevance Distribution:`);
  console.log(`   High (≥80%): ${highRelevance}`);
  console.log(`   Medium (60-80%): ${mediumRelevance}`);
  console.log(`   Low (<60%): ${lowRelevance}`);
}

// Usage
await semanticSearch("impact of artificial intelligence on healthcare");
