Query Similar Content

User Intent

"I want to find content similar to a specific document"

Operation

  • SDK Method: graphlit.queryContents() with content reference

  • GraphQL: queryContents query with content filter

  • Entity Type: Content

  • Common Use Cases: Find related documents, similarity search, content recommendations

TypeScript (Canonical)

import { Graphlit } from 'graphlit-client';
import { FileTypes, SearchTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Find content similar to a specific document
const sourceContentId = 'content-id-here';

const similarContent = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector,
  limit: 10
});

console.log(`Found ${similarContent.contents.results.length} similar documents:\n`);

similarContent.contents.results.forEach((content, index) => {
  console.log(`${index + 1}. ${content.name}`);
  if (content.summary) {
    console.log(`   ${content.summary.substring(0, 100)}...`);
  }
});

Python

source_content_id = "content-id-here"

# Find similar content (snake_case)
similar_content = await graphlit.queryContents(
    filter=ContentFilterInput(
        contents=[EntityReferenceFilterInput(id=source_content_id)],
        search_type=SearchTypes.Vector,
        limit=10
    )
)

print(f"Found {len(similar_content.contents.results)} similar documents")

for idx, content in enumerate(similar_content.contents.results, 1):
    print(f"{idx}. {content.name}")


**C#**:
```csharp
using System;
using System.Linq;
using Graphlit;

var client = new Graphlit();

var sourceContentId = "content-id-here";

// Find similar content (PascalCase)
var similarContent = await client.QueryContents(new ContentFilter {
    Contents = new[] { new EntityReferenceFilter { Id = sourceContentId } },
    SearchType = SearchTypes.Vector,
    Limit = 10
});

Console.WriteLine($"Found {similarContent.Contents.Results.Count} similar documents");

// Pair each result with its zero-based index for numbered output
foreach (var (content, index) in similarContent.Contents.Results.Select((c, i) => (c, i)))
{
    Console.WriteLine($"{index + 1}. {content.Name}");
}
```

Parameters

ContentFilter

  • contents (EntityReferenceFilter[]): Source content for similarity

  • searchType (SearchTypes): Must be VECTOR (SearchTypes.Vector in the TypeScript SDK) for similarity search

  • limit (int): Max results to return (default: 100)

  • collections (EntityReferenceFilter[]): Filter by collection (optional)

Response

{
  contents: {
    results: Content[];  // Similar content, ordered by similarity
  }
}
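
Generated GraphQL client types often mark fields as nullable, so it can be worth normalizing the response before iterating over it. The following is a minimal sketch, not part of the SDK: `ContentLike`, `QueryContentsResponseLike`, and `readResults` are hypothetical names, and the nullability shown is an assumption to check against the generated types in your SDK version.

```typescript
// Hypothetical structural stand-ins for the SDK's generated types.
// Assumption: any level of the response may be null or undefined.
interface ContentLike {
  id: string;
  name?: string | null;
}

interface QueryContentsResponseLike<T extends ContentLike> {
  contents?: {
    results?: Array<T | null> | null;
  } | null;
}

// Returns a plain array of non-null results, or [] when nothing came back.
function readResults<T extends ContentLike>(
  response: QueryContentsResponseLike<T> | null | undefined
): T[] {
  const results: Array<T | null> = response?.contents?.results ?? [];
  return results.filter(
    (item): item is T => item !== null && item !== undefined
  );
}
```

With a helper like this, calling code can loop over `readResults(response)` without repeating null checks at each level.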

Developer Hints

Vector Search for Similarity

Important: Set searchType to SearchTypes.Vector for semantic similarity. Keyword search matches terms rather than meaning.

// CORRECT: vector search for similarity
const similar = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector
});

// WRONG: keyword search won't find semantically similar content
const wrong = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Keyword
});

Exclude Source Document

// Find similar but exclude the source
const similar = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector,
  limit: 11  // Request one extra in case the source itself is returned
});

// Filter out source document
const filtered = similar.contents.results.filter(
  c => c.id !== sourceContentId
).slice(0, 10);

console.log(`${filtered.length} similar documents (excluding source)`);

Filter Similar Content

// Find similar PDFs only
const similarPdfs = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector,
  fileTypes: [FileTypes.Pdf],
  limit: 10
});

// Find similar in specific collection
const similarInCollection = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector,
  collections: [{ id: collectionId }],
  limit: 10
});

Variations

1. Top Similar Documents

Find the top 10 similar documents:

const similar = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector,
  limit: 10
});

2. Similar Content in Collection

Limit to specific collection:

const similar = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector,
  collections: [{ id: collectionId }],
  limit: 10
});

3. Similar by File Type

Find similar files of a given type (PDF here):

const similar = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector,
  fileTypes: [FileTypes.Pdf],
  limit: 10
});

4. Related Articles

Build a "Related Articles" feature:

async function getRelatedDocuments(contentId: string, limit: number = 5) {
  const results = await graphlit.queryContents({
    contents: [{ id: contentId }],
    searchType: SearchTypes.Vector,
    limit: limit + 1
  });
  
  // Exclude source
  return results.contents.results
    .filter(c => c.id !== contentId)
    .slice(0, limit);
}

// Usage
const related = await getRelatedDocuments('article-123', 5);
console.log('Related Articles:');
related.forEach(doc => console.log(`- ${doc.name}`));

5. Duplicate Detection

Find near-duplicates:

const similar = await graphlit.queryContents({
  contents: [{ id: sourceContentId }],
  searchType: SearchTypes.Vector,
  limit: 5
});

// Results are ordered by similarity, so the closest matches (other than the source) are duplicate candidates
similar.contents.results.forEach(content => {
  if (content.id !== sourceContentId) {
    console.log(`Potential duplicate: ${content.name}`);
  }
});
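
Excluding the source still flags every remaining neighbor, however distant. If your results carry a numeric similarity score, a threshold narrows the list to likely duplicates. The sketch below is a minimal illustration under that assumption: the `score` field, the `ScoredContent` type, the `findNearDuplicates` helper, and the 0.95 cutoff are all hypothetical. Map `score` to whatever relevance value your query actually returns, and tune the cutoff on your own data.

```typescript
// Hypothetical result shape. Assumption: `score` is a similarity value
// where higher means more similar (for example, in the range 0..1).
interface ScoredContent {
  id: string;
  name: string;
  score: number;
}

// Keep only non-source results at or above the threshold, best match first.
function findNearDuplicates(
  results: ScoredContent[],
  sourceId: string,
  threshold = 0.95
): ScoredContent[] {
  return results
    .filter(item => item.id !== sourceId && item.score >= threshold)
    .sort((a, b) => b.score - a.score); // filter() copied, so input is untouched
}
```

Treat the output as candidates for review rather than confirmed duplicates, since a high score only indicates close embeddings.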

6. Content Clustering

Group similar content:

async function findContentClusters(contentIds: string[]) {
  const clusters: Record<string, string[]> = {};
  
  for (const id of contentIds) {
    const similar = await graphlit.queryContents({
      contents: [{ id }],
      searchType: SearchTypes.Vector,
      limit: 5
    });
    
    clusters[id] = similar.contents.results
      .map(c => c.id)
      .filter(cid => cid !== id);
  }
  
  return clusters;
}
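
The function above returns a neighbor list per document rather than disjoint groups. One way to turn that map into clusters is to treat each neighbor link as an undirected edge and take the connected components. The following is a minimal sketch using union-find. `mergeIntoClusters` is a hypothetical helper, not an SDK call, and connected components is a swapped-in grouping technique that can chain loosely related documents into one large cluster.

```typescript
// Merge a neighbor map (id -> similar ids) into disjoint clusters
// by computing connected components with union-find.
function mergeIntoClusters(neighbors: Record<string, string[]>): string[][] {
  // Prototype-free objects so ids like "constructor" are safe as keys.
  const parent: Record<string, string> = Object.create(null);

  const find = (id: string): string => {
    if (parent[id] === undefined) parent[id] = id;
    let root = id;
    while (parent[root] !== root) root = parent[root];
    // Path compression: point every node on the path directly at the root.
    let current = id;
    while (current !== root) {
      const next = parent[current];
      parent[current] = root;
      current = next;
    }
    return root;
  };

  const union = (a: string, b: string): void => {
    const rootA = find(a);
    const rootB = find(b);
    if (rootA !== rootB) parent[rootB] = rootA;
  };

  for (const id of Object.keys(neighbors)) {
    find(id); // register documents that have no neighbors
    for (const similarId of neighbors[id]) union(id, similarId);
  }

  // Group every known id under its root.
  const groups: Record<string, string[]> = Object.create(null);
  for (const id of Object.keys(parent)) {
    const root = find(id);
    if (groups[root] === undefined) groups[root] = [];
    groups[root].push(id);
  }
  return Object.keys(groups).map(root => groups[root]);
}
```

For example, `mergeIntoClusters(await findContentClusters(ids))` would yield one array of ids per group, with unconnected documents appearing as single-element clusters.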

Common Issues

Issue: Source document appears in results
Solution: Filter the source document out of the results manually.

Issue: No similar content found
Solution: Ensure the content has been embedded. Check that an embedding specification was used during ingestion.

Issue: Results are not semantically similar
Solution: Verify that searchType is set to SearchTypes.Vector. Consider a stronger embedding model (for example, text-embedding-3-large).

Production Example

Related content recommendation:

async function getRecommendations(articleId: string) {
  const similar = await graphlit.queryContents({
    contents: [{ id: articleId }],
    searchType: SearchTypes.Vector,
    limit: 6
  });
  
  const recommendations = similar.contents.results
    .filter(c => c.id !== articleId)
    .slice(0, 5);
  
  return recommendations.map(doc => ({
    id: doc.id,
    title: doc.name,
    summary: doc.summary?.substring(0, 150),
    uri: doc.uri
  }));
}

const recs = await getRecommendations('article-123');
console.log('You might also like:', recs);
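
Cutting a summary with `substring` can split a word and gives no hint that text was dropped. A minimal sketch of a word-boundary truncation helper follows. `truncateSummary` is a hypothetical name, and the 150-character default simply mirrors the example above.

```typescript
// Truncate text at a word boundary and mark the cut with an ellipsis.
// Note: the result can be up to three characters longer than maxLength
// because "..." is appended after the cut.
function truncateSummary(
  text: string | null | undefined,
  maxLength = 150
): string {
  if (!text) return "";
  const trimmed = text.trim();
  if (trimmed.length <= maxLength) return trimmed;

  // Cut at the last space inside the limit so words stay whole;
  // fall back to a hard cut when the slice contains no space.
  const slice = trimmed.slice(0, maxLength);
  const lastSpace = slice.lastIndexOf(" ");
  const cut = lastSpace > 0 ? slice.slice(0, lastSpace) : slice;
  return cut.replace(/\s+$/, "") + "...";
}
```

In the mapping step, `summary: truncateSummary(doc.summary)` would replace the `substring` call and also cover the case where no summary exists.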
