Query with Filters

User Intent

"I want to search and filter my ingested content by various criteria"

Operation

  • SDK Method: graphlit.queryContents()

  • GraphQL: queryContents query

  • Entity Type: Content

  • Common Use Cases: Semantic search, filter by type/date/collection, faceted search, similarity search

TypeScript (Canonical)

import { Graphlit } from 'graphlit-client';
import { ContentTypes, EntityState, FileTypes, ObservableTypes, OrderByTypes, OrderDirectionTypes, SearchTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Basic query with no filters - returns content up to the default limit
const response = await graphlit.queryContents({});

console.log(`Found ${response.contents.results.length} content items`);

// Text search with semantic similarity
const searchResponse = await graphlit.queryContents({
  search: 'machine learning best practices',
  searchType: SearchTypes.Hybrid,  // Combines keyword + vector search
  limit: 10
});

console.log(`Search found ${searchResponse.contents.results.length} results`);

// Filter by content type
const pdfResponse = await graphlit.queryContents({
  types: [ContentTypes.File],
  fileTypes: [FileTypes.Pdf],
  limit: 20
});

console.log(`Found ${pdfResponse.contents.results.length} PDF files`);

Parameters

ContentFilter Options

Search:

  • search (string): Search query text

    • Matched by keywords, embeddings, or both, depending on searchType

    • Supports natural language queries

  • searchType (SearchTypes): Search strategy

    • KEYWORD - Traditional keyword matching

    • VECTOR - Semantic/embedding search

    • HYBRID - Combined keyword + vector (recommended)

Content Type Filters:

  • types (ContentTypes[]): Filter by content type

    • FILE, PAGE, TEXT, MEMORY, EMAIL, MESSAGE, POST, ISSUE, EVENT

  • fileTypes (FileTypes[]): Filter by file type (when type = FILE)

    • PDF, DOCX, IMAGE, AUDIO, VIDEO, MARKDOWN, etc.

  • textTypes (TextTypes[]): Filter text content

    • PLAIN, MARKDOWN

Temporal Filters:

  • creationDateRange (DateRangeFilter): Filter by creation date

    • from (Date): Start date

    • to (Date): End date

  • modifiedDateRange (DateRangeFilter): Filter by modification date

Organization Filters:

  • collections (EntityReferenceFilter[]): Filter by collection membership

  • feeds (EntityReferenceFilter[]): Filter by source feed

  • workflows (EntityReferenceFilter[]): Filter by workflow used

State Filters:

  • states (EntityState[]): Filter by processing state

    • ENABLED, DISABLED, FINISHED, ERROR, AWAITING_EXTRACTION

Advanced Filters:

  • similarContents (EntityReferenceInput[]): Find similar content

  • observations (ObservationFilter): Filter by extracted entities

Pagination & Sorting:

  • offset (number): Skip first N results (default: 0)

  • limit (number): Max results to return (default: 10, max: 100)

  • orderBy (OrderByTypes): Sort field

    • RELEVANCE - By search relevance score (when search is used)

    • CREATION_DATE - By creation date

    • NAME - Alphabetically by name

  • orderDirection (OrderDirectionTypes): Sort direction

    • ASC - Ascending

    • DESC - Descending (default)
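The offset and limit bounds listed above can be enforced before a request is sent. The following is a minimal sketch, not part of the Graphlit SDK: `normalizePage`, `DEFAULT_LIMIT`, and `MAX_LIMIT` are hypothetical names, and the values 10 and 100 are taken from the parameter list above.

```typescript
// Hypothetical helper (not an SDK function): clamps pagination input
// to the documented bounds before it is placed into a filter.
interface PageParams {
  offset: number;
  limit: number;
}

const DEFAULT_LIMIT = 10; // default stated in the parameter list
const MAX_LIMIT = 100;    // maximum stated in the parameter list

function normalizePage(offset?: number, limit?: number): PageParams {
  // Fall back to 0 when offset is missing or not a finite number.
  const safeOffset =
    typeof offset === 'number' && isFinite(offset)
      ? Math.max(0, Math.floor(offset))
      : 0;

  // Fall back to the default when limit is missing or not a finite number.
  const requested =
    typeof limit === 'number' && isFinite(limit)
      ? Math.floor(limit)
      : DEFAULT_LIMIT;

  // Keep the limit within 1..MAX_LIMIT.
  const safeLimit = Math.min(MAX_LIMIT, Math.max(1, requested));

  return { offset: safeOffset, limit: safeLimit };
}
```

The result can be spread into a filter, for example `{ search: 'meeting notes', ...normalizePage(userOffset, userLimit) }`, where `userOffset` and `userLimit` stand for untrusted input.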

Response

{
  contents: {
    results: Content[];  // Array of content items
    // Each content has:
    // - id, name, state, type, fileType, mimeType
    // - markdown (extracted text)
    // - uri (source URL)
    // - creationDate, modifiedDate
    // - collections, feed, workflow
    // - observations (extracted entities)
    // - relevance (search score, 0.0-1.0)
  }
}
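Reading the response defensively avoids runtime errors when `contents` or `results` is missing. The sketch below assumes a simplified response shape: `QueryContentsResponse`, `ContentSummary`, and `extractResults` are hypothetical names for illustration, and the real generated types carry many more fields.

```typescript
// Simplified, assumed shapes; the SDK's generated types are richer.
interface ContentSummary {
  id: string;
  name: string;
  relevance?: number | null;
}

interface QueryContentsResponse {
  contents?: {
    results?: (ContentSummary | null)[] | null;
  } | null;
}

// Hypothetical helper: always returns an array and drops null entries.
function extractResults(response: QueryContentsResponse): ContentSummary[] {
  const results = response.contents?.results ?? [];
  return results.filter(
    (item): item is ContentSummary => item !== null && item !== undefined
  );
}
```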

Developer Hints

Search Type Selection

When to use each search type:

| Search Type | Best For | How It Works |
| --- | --- | --- |
| KEYWORD | Exact term matching, technical terms | Traditional text matching |
| VECTOR | Semantic/conceptual search, natural language | Embedding similarity |
| HYBRID | Most searches (recommended) | Combines both approaches |

// Use KEYWORD for exact terms
const keywordResults = await graphlit.queryContents({
  search: 'GPT-4',
  searchType: SearchTypes.Keyword
});

// Use VECTOR for conceptual search
const vectorResults = await graphlit.queryContents({
  search: 'articles about artificial intelligence',
  searchType: SearchTypes.Vector
});

// Use HYBRID for best results (default)
const hybridResults = await graphlit.queryContents({
  search: 'machine learning tutorials',
  searchType: SearchTypes.Hybrid
});

OrderBy Behavior

// When using search, ALWAYS use orderBy: RELEVANCE
const searchResults = await graphlit.queryContents({
  search: 'product documentation',
  orderBy: OrderByTypes.Relevance  // Sorts by search score
});

// When NOT searching, use CREATION_DATE or NAME
const recentContent = await graphlit.queryContents({
  orderBy: OrderByTypes.CreationDate,
  orderDirection: OrderDirectionTypes.Desc,  // Newest first
  limit: 10
});

Understanding Relevance Scores

// Results include relevance scores when searching
const results = await graphlit.queryContents({
  search: 'project requirements',
  searchType: SearchTypes.Hybrid
});

results.contents.results.forEach(content => {
  console.log(`${content.name}: ${content.relevance?.toFixed(2)} relevance`);
  // Relevance is 0.0 to 1.0 (higher = more relevant)
});
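Because relevance scores are best compared within a single result set, one option is to keep only the results that score close to the top hit instead of applying a fixed cutoff. This is a sketch of that idea, not SDK behavior: `keepNearTop` and the `ratio` parameter are hypothetical.

```typescript
// Assumed minimal shape of a search result.
interface Scored {
  name: string;
  relevance?: number | null;
}

// Hypothetical helper: keeps results whose relevance is at least
// `ratio` times the best relevance in the same result set.
function keepNearTop<T extends Scored>(results: T[], ratio = 0.5): T[] {
  const scores = results.map(r => r.relevance ?? 0);
  const top = scores.length > 0 ? Math.max(...scores) : 0;

  // No positive score means there is nothing meaningful to compare against.
  if (top <= 0) return [];

  return results.filter(r => (r.relevance ?? 0) >= top * ratio);
}
```

For example, `keepNearTop(results.contents.results, 0.5)` would keep items scoring at least half as high as the best match.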

Pagination Best Practices

// Fetch first page
const limit = 20;

const page1 = await graphlit.queryContents({
  search: 'meeting notes',
  offset: 0,
  limit: limit
});

// Fetch second page
const page2 = await graphlit.queryContents({
  search: 'meeting notes',
  offset: limit,  // offset = 20
  limit: limit
});

// Continue until a page returns fewer than `limit` results
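The page-by-page pattern above can be wrapped in a loop. The sketch below is written against a minimal, assumed client interface so that it stays self-contained: `PagedClient`, `fetchAllPages`, and `maxItems` are hypothetical names, and only the `offset`/`limit` behavior described in this section is relied on.

```typescript
// Assumed minimal shapes; a Graphlit client instance is expected to fit
// this interface structurally, but that is an assumption of this sketch.
interface Item {
  id: string;
}

interface PageFilter {
  search?: string;
  offset?: number;
  limit?: number;
}

interface PagedClient {
  queryContents(
    filter: PageFilter
  ): Promise<{ contents?: { results?: Item[] | null } | null }>;
}

// Hypothetical helper: requests pages until one comes back short,
// or until maxItems have been collected.
async function fetchAllPages(
  client: PagedClient,
  filter: PageFilter,
  pageSize = 20,
  maxItems = 1000
): Promise<Item[]> {
  const all: Item[] = [];
  let offset = 0;

  while (all.length < maxItems) {
    const response = await client.queryContents({ ...filter, offset, limit: pageSize });
    const page = response.contents?.results ?? [];
    all.push(...page);

    // A short (or empty) page means the end of the result set.
    if (page.length < pageSize) break;
    offset += pageSize;
  }

  return all.slice(0, maxItems);
}
```

The `maxItems` cap is a safety limit so that an unexpectedly large result set cannot keep the loop running indefinitely.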

Variations

1. Filter by Date Range

Find content created in a specific time period:

const lastWeek = new Date();
lastWeek.setDate(lastWeek.getDate() - 7);

const response = await graphlit.queryContents({
  creationDateRange: {
    from: lastWeek,
    to: new Date()
  },
  orderBy: OrderByTypes.CreationDate,
  orderDirection: OrderDirectionTypes.Desc
});

console.log(`Content created in last week: ${response.contents.results.length}`);
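Relative ranges such as "last 7 days" or "last 30 days" can be produced by a small helper. This is an illustrative sketch: `lastDays` and `DateRange` are hypothetical names, and the helper assumes the filter accepts `Date` values as in the example above.

```typescript
// Assumed shape matching the from/to fields used in the example above.
interface DateRange {
  from: Date;
  to: Date;
}

// Hypothetical helper: returns the range covering the last `days` days,
// ending at `now`. The `now` parameter is injectable to make testing easy.
function lastDays(days: number, now: Date = new Date()): DateRange {
  const msPerDay = 24 * 60 * 60 * 1000;
  return {
    from: new Date(now.getTime() - days * msPerDay),
    to: new Date(now.getTime())
  };
}
```

Unlike `setDate`, which moves by local calendar days, this version subtracts fixed 24-hour periods, so the two can differ by an hour across a daylight-saving change.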

2. Filter by Collection

Search within specific collections:

// Get collection ID first
const collections = await graphlit.queryCollections({ name: 'Product Docs' });
const collectionId = collections.collections.results[0]?.id;

if (collectionId) {
  const response = await graphlit.queryContents({
    collections: [{ id: collectionId }],
    search: 'API reference',
    searchType: SearchTypes.Hybrid
  });
  
  console.log(`Found ${response.contents.results.length} results in Product Docs`);
}

3. Filter by Multiple Content Types

Search across specific content types:

const response = await graphlit.queryContents({
  types: [
    ContentTypes.Email,
    ContentTypes.Message,
    ContentTypes.Post
  ],
  search: 'project update',
  searchType: SearchTypes.Hybrid,
  limit: 50
});

console.log(`Found communications about project update: ${response.contents.results.length}`);

4. Find Similar Content

Find content similar to existing content:

// First, get a content ID
const originalContent = await graphlit.queryContents({
  search: 'machine learning tutorial',
  limit: 1
});

const contentId = originalContent.contents.results[0]?.id;

if (contentId) {
  // Find similar content
  const similarResults = await graphlit.queryContents({
    similarContents: [{ id: contentId }],
    searchType: SearchTypes.Vector,  // Must use Vector or Hybrid
    limit: 10
  });
  
  console.log(`Found ${similarResults.contents.results.length} similar documents`);
}

5. Filter by State

Find content in specific processing states:

// Find content that failed processing
const errorContent = await graphlit.queryContents({
  states: [EntityState.Error],
  orderBy: OrderByTypes.CreationDate,
  orderDirection: OrderDirectionTypes.Desc
});

console.log(`Content with errors: ${errorContent.contents.results.length}`);

// Find content still processing
const processingContent = await graphlit.queryContents({
  states: [EntityState.AwaitingExtraction]
});

console.log(`Content awaiting processing: ${processingContent.contents.results.length}`);

6. Filter by Feed Source

Find content from specific feeds:

// Get feed ID
const feeds = await graphlit.queryFeeds();
const slackFeedId = feeds.feeds.results.find(f => f.name.includes('Slack'))?.id;

if (slackFeedId) {
  const response = await graphlit.queryContents({
    feeds: [{ id: slackFeedId }],
    creationDateRange: {
      from: new Date('2025-01-01'),
      to: new Date()
    }
  });
  
  console.log(`Slack messages in 2025: ${response.contents.results.length}`);
}

7. Complex Filter Combination

Combine multiple filters:

const response = await graphlit.queryContents({
  // Text search
  search: 'quarterly results',
  searchType: SearchTypes.Hybrid,
  
  // Content type
  types: [ContentTypes.File],
  fileTypes: [FileTypes.Pdf, FileTypes.Docx],
  
  // Date range
  creationDateRange: {
    from: new Date('2024-01-01'),
    to: new Date('2024-12-31')
  },
  
  // Pagination
  offset: 0,
  limit: 50,
  
  // Sorting
  orderBy: OrderByTypes.Relevance
});

console.log(`Found ${response.contents.results.length} relevant documents`);

8. Filter by Extracted Entities

Find content mentioning specific entities:

// Find content mentioning a person
const response = await graphlit.queryContents({
  observations: {
    observable: {
      types: [ObservableTypes.Person],
      states: [EntityState.Enabled]
    },
    name: 'John Smith'
  },
  limit: 100
});

console.log(`Content mentioning John Smith: ${response.contents.results.length}`);

Common Issues

Issue: Search returns no results even though content exists
Solution: Check that content has finished processing (state: FINISHED). Embeddings must be generated before vector search can match the content.

Issue: Relevance scores seem low (< 0.5)
Solution: This is normal. Relevance is relative. Use the HYBRID search type for better results.

Issue: orderBy: RELEVANCE is set but results are not sorted by relevance
Solution: Relevance ordering only works when the search parameter is provided. Without search, use CREATION_DATE or NAME.

Issue: Getting duplicate results across pages
Solution: Use the same orderBy and orderDirection for every pagination request. Content may still shift between pages if new items are added during pagination.

Issue: Pagination returns fewer results than limit
Solution: You've reached the end of the results. This is expected behavior.
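When items shift between pages during pagination, duplicates can also be removed on the client after the pages are fetched. A minimal sketch, assuming every content item has a unique `id`; `dedupeById` is a hypothetical helper, not an SDK function.

```typescript
// Assumed minimal shape: each content item carries a unique id.
interface Identified {
  id: string;
}

// Hypothetical helper: merges pages in order and keeps only the first
// occurrence of each id.
function dedupeById<T extends Identified>(pages: T[][]): T[] {
  const seen = new Set<string>();
  const unique: T[] = [];

  for (const page of pages) {
    for (const item of page) {
      if (!seen.has(item.id)) {
        seen.add(item.id);
        unique.push(item);
      }
    }
  }

  return unique;
}
```

This removes repeats but cannot recover items that were skipped because the result set shifted, so consistent ordering is still the first line of defense.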

Production Example

Parallel query with count:

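// Example filter; substitute the filter your application builds
const filter = { search: 'quarterly results', limit: 20 };

// Query results and total count in parallel
const [response, countResponse] = await Promise.all([
  graphlit.queryContents(filter),
  graphlit.countContents({
    ...filter,
    offset: 0,
    limit: 1000000  // Count all matching
  })
]);

// Use ?? so that a legitimate count of 0 is not turned into undefined
const results = response.contents?.results ?? [];
const totalCount = countResponse.countContents?.count ?? undefined;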
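With a total count available, page numbers for a UI follow from simple arithmetic. The sketch below is illustrative only: `describePage` and `PageInfo` are hypothetical names, and `totalCount` may be `undefined` as in the example above.

```typescript
// Hypothetical shape for pagination state shown in a UI.
interface PageInfo {
  totalPages: number;
  currentPage: number;   // 1-based
  hasNextPage: boolean;
}

// Hypothetical helper: derives page numbers from a total count and the
// offset/limit used for the current request.
function describePage(
  totalCount: number | undefined,
  offset: number,
  limit: number
): PageInfo | undefined {
  // Without a count (or with an invalid limit) nothing can be derived.
  if (totalCount === undefined || limit <= 0) return undefined;

  return {
    totalPages: Math.ceil(totalCount / limit),
    currentPage: Math.floor(offset / limit) + 1,
    hasNextPage: offset + limit < totalCount
  };
}
```

For example, 45 matching items with `offset: 20` and `limit: 20` gives page 2 of 3 with a next page available.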
