Keyword Search Explained

User Intent

"How does keyword/full-text search work in Graphlit?"

Operation

  • SDK Method: queryContents() with searchType: SearchTypes.Keyword

  • GraphQL: queryContents query

  • Common Use Cases: Exact phrase matching, names, IDs, codes, fast lookups

How Keyword Search Works

Keyword search uses traditional full-text indexing with token-based matching. It's fast, precise, and ideal for exact phrases, names, and identifiers.
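To make token-based matching concrete, here is a minimal sketch of an inverted index. It is illustrative only: `tokenize`, `buildIndex`, and `search` are hypothetical names, and this document does not describe how Graphlit builds its index internally.

```typescript
// Toy inverted index: maps each token to the set of document IDs containing it.
// Illustrative sketch only; not Graphlit's implementation.
type InvertedIndex = Map<string, Set<number>>;

// Lowercase the text and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(t => t.length > 0);
}

function buildIndex(docs: string[]): InvertedIndex {
  const index: InvertedIndex = new Map();
  docs.forEach((doc, id) => {
    for (const token of tokenize(doc)) {
      if (!index.has(token)) index.set(token, new Set<number>());
      index.get(token)!.add(id);
    }
  });
  return index;
}

// Return the IDs of documents that contain every query token (implicit AND).
function search(index: InvertedIndex, query: string): number[] {
  const tokens = tokenize(query);
  if (tokens.length === 0) return [];
  let ids: number[] | null = null;
  for (const token of tokens) {
    const posting = index.get(token) ?? new Set<number>();
    ids = ids === null ? Array.from(posting) : ids.filter(id => posting.has(id));
  }
  return (ids ?? []).sort((a, b) => a - b);
}

const docs = [
  "Project Alpha status report",
  "Quarterly earnings report",
  "Alpha release notes",
];
const index = buildIndex(docs);
console.log(search(index, "alpha report")); // IDs of documents containing both tokens
```

Because a lookup only touches the posting lists for the query tokens, the cost depends on how many documents contain those tokens rather than on the size of the whole corpus.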

TypeScript (Canonical)

import { Graphlit } from 'graphlit-client';
import { ContentTypes, SearchTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Pure keyword search
const results = await graphlit.queryContents({
  search: "Project Alpha status report",
  searchType: SearchTypes.Keyword,
  limit: 10
});

console.log(`Found ${results.contents.results.length} keyword matches`);

results.contents.results.forEach((content, index) => {
  console.log(`\n${index + 1}. ${content.name}`);
  console.log(`   Relevance: ${content.relevance}`);
  console.log(`   Type: ${content.type}`);
});

// Search for exact phrase
const exactPhrase = await graphlit.queryContents({
  search: '"quarterly earnings report"',  // Quotes = exact phrase
  searchType: SearchTypes.Keyword
});

// Search for ID or code
const byId = await graphlit.queryContents({
  search: "PROJ-1234",
  searchType: SearchTypes.Keyword
});

// Search for email address
const byEmail = await graphlit.queryContents({
  search: "user@example.com",
  searchType: SearchTypes.Keyword
});

// Search for specific name
const byName = await graphlit.queryContents({
  search: "Kirk Marple",
  searchType: SearchTypes.Keyword
});

Keyword Search Features

1. Token-Based Matching

// Search tokens: "machine", "learning", "tutorial"
const results = await graphlit.queryContents({
  search: "machine learning tutorial",
  searchType: SearchTypes.Keyword
});

// Matches documents containing these tokens
// Order doesn't matter (by default)
// Stemming applied: "learning" matches "learn", "learned", "learns"

2. Exact Phrase Matching

// Use quotes for exact phrase
const exact = await graphlit.queryContents({
  search: '"machine learning tutorial"',  // Exact order required
  searchType: SearchTypes.Keyword
});

// Only matches documents with this exact phrase
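The difference between token matching and exact phrase matching can be sketched locally. This is a simplified illustration under the assumption of plain whitespace/punctuation tokenization with no stemming; `matchesAllTokens` and `matchesPhrase` are hypothetical helpers, not SDK functions.

```typescript
// Toy comparison of token matching vs. exact phrase matching.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(t => t.length > 0);
}

// True when every query token appears somewhere in the document, in any order.
function matchesAllTokens(doc: string, query: string): boolean {
  const docTokens = new Set(tokenize(doc));
  return tokenize(query).every(t => docTokens.has(t));
}

// True when the phrase tokens appear consecutively and in the same order.
function matchesPhrase(doc: string, phrase: string): boolean {
  const d = tokenize(doc);
  const p = tokenize(phrase);
  if (p.length === 0) return false;
  for (let i = 0; i + p.length <= d.length; i++) {
    if (p.every((t, j) => d[i + j] === t)) return true;
  }
  return false;
}

const doc = "A tutorial on machine learning for beginners";
console.log(matchesAllTokens(doc, "machine learning tutorial")); // true
console.log(matchesPhrase(doc, "machine learning tutorial"));    // false
console.log(matchesPhrase(doc, "machine learning"));             // true
```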

3. Boolean Operators (if supported)

// AND operator (both terms required)
const andSearch = await graphlit.queryContents({
  search: "machine AND learning",
  searchType: SearchTypes.Keyword
});

// OR operator (either term)
const orSearch = await graphlit.queryContents({
  search: "machine OR learning",
  searchType: SearchTypes.Keyword
});

// NOT operator (exclude term)
const notSearch = await graphlit.queryContents({
  search: "machine NOT learning",
  searchType: SearchTypes.Keyword
});
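Conceptually, Boolean operators correspond to set operations over the lists of documents that contain each term. The sketch below uses toy data and hypothetical helper names; whether and how the service parses `AND`, `OR`, and `NOT` is not confirmed by this document.

```typescript
// Posting lists: token -> IDs of documents containing that token (toy data).
const postings = new Map<string, Set<number>>([
  ["machine", new Set([1, 2, 3])],
  ["learning", new Set([2, 3, 4])],
]);

// AND: documents present in both sets.
function intersect(a: Set<number>, b: Set<number>): Set<number> {
  return new Set(Array.from(a).filter(x => b.has(x)));
}

// OR: documents present in either set.
function union(a: Set<number>, b: Set<number>): Set<number> {
  return new Set(Array.from(a).concat(Array.from(b)));
}

// NOT: documents in the first set that are absent from the second.
function difference(a: Set<number>, b: Set<number>): Set<number> {
  return new Set(Array.from(a).filter(x => !b.has(x)));
}

const machine = postings.get("machine")!;
const learning = postings.get("learning")!;

console.log(Array.from(intersect(machine, learning)));  // machine AND learning
console.log(Array.from(union(machine, learning)));      // machine OR learning
console.log(Array.from(difference(machine, learning))); // machine NOT learning
```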

4. Case-Insensitive

// All equivalent:
await graphlit.queryContents({ search: "graphlit", searchType: SearchTypes.Keyword });
await graphlit.queryContents({ search: "Graphlit", searchType: SearchTypes.Keyword });
await graphlit.queryContents({ search: "GRAPHLIT", searchType: SearchTypes.Keyword });

// All match "Graphlit", "graphlit", "GRAPHLIT"

5. Stemming

// Search for "running"
const results = await graphlit.queryContents({
  search: "running",
  searchType: SearchTypes.Keyword
});

// Also matches regular inflections such as "run" and "runs"
// Irregular or derived forms ("ran", "runner") depend on the analyzer
// English stemming rules applied
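A deliberately naive suffix-stripping stemmer shows the idea. Production analyzers use far more rules; `stem` is a hypothetical function written for this illustration and is not what Graphlit runs.

```typescript
// Naive suffix-stripping stemmer. Real analyzers apply many more rules;
// this only illustrates how inflected forms collapse to a shared stem.
function stem(word: string): string {
  let w = word.toLowerCase();
  for (const suffix of ["ing", "ed", "s"]) {
    // Only strip when at least three characters remain.
    if (w.endsWith(suffix) && w.length - suffix.length >= 3) {
      w = w.slice(0, -suffix.length);
      // Collapse a doubled final consonant left by stripping ("runn" -> "run").
      const last = w[w.length - 1];
      if (w.length >= 4 && last === w[w.length - 2] && !"aeiou".includes(last)) {
        w = w.slice(0, -1);
      }
      break;
    }
  }
  return w;
}

for (const word of ["running", "runs", "run", "ran", "learning", "learned"]) {
  console.log(`${word} -> ${stem(word)}`);
}
```

In this sketch "running" and "runs" reduce to the same stem as "run", so they match each other. "ran" is an irregular form that suffix stripping cannot map to "run".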

**Good For**:

  • Exact phrases ("Project Alpha v2.3")

  • Names ("Kirk Marple", "Graphlit")

  • Email addresses ("user@example.com")

  • IDs and codes ("PROJ-1234", "ORDER-5678")

  • URLs ("https://graphlit.com")

  • Specific terminology

  • Fast lookups

**Not Good For**:

  • Conceptual queries (use vector search)

  • Synonyms and paraphrases (use vector search)

  • Semantic similarity (use vector search)

  • "Find similar" queries (use vector search)

Performance Characteristics

// Keyword search is FAST:
// - Inverted index lookup
// - Token-based (no vector computation)
// - Sub-100ms queries typical
// - Scales to billions of documents

const startTime = Date.now();
const results = await graphlit.queryContents({
  search: "PROJ-1234",
  searchType: SearchTypes.Keyword
});
console.log(`Query time: ${Date.now() - startTime}ms`);
// Typically: 20-50ms
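A single timing is noisy because it includes network latency and any cold-start cost. A hedged alternative is to time several runs and report the median. `median` and `measureMedianMs` are hypothetical helpers, and the stand-in `fakeQuery` only simulates a request so the sketch runs without credentials.

```typescript
// Median of a list of numbers (does not mutate the input).
function median(samples: number[]): number {
  if (samples.length === 0) throw new Error("median of empty list");
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Run an async operation several times and return the median wall-clock time in ms.
async function measureMedianMs(run: () => Promise<unknown>, iterations = 5): Promise<number> {
  if (iterations < 1) throw new Error("iterations must be at least 1");
  const samples: number[] = [];
  for (let i = 0; i < iterations; i++) {
    const start = Date.now();
    await run();
    samples.push(Date.now() - start);
  }
  return median(samples);
}

// Stand-in for a real query; replace with a call to queryContents().
const fakeQuery = () => new Promise<void>(resolve => setTimeout(resolve, 10));

measureMedianMs(fakeQuery, 5).then(ms => console.log(`Median query time: ${ms}ms`));
```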

BM25 Ranking

Algorithm: BM25 (Best Matching 25)

  • Standard information retrieval algorithm

  • Considers term frequency and document length

  • More sophisticated than simple TF-IDF

Relevance Score:

const results = await graphlit.queryContents({
  search: "machine learning",
  searchType: SearchTypes.Keyword
});

results.contents.results.forEach(content => {
  console.log(`${content.name}: ${content.relevance}`);
  // Higher score = more relevant
  // Multiple occurrences of terms increase score
  // All else equal, shorter documents score higher (BM25 length normalization)
});
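The two effects described above (term frequency and length normalization) can be seen in a small BM25 sketch. The parameters `k1 = 1.2` and `b = 0.75` are common defaults and the IDF formula is one widely used non-negative variant; neither is confirmed here as what Graphlit uses, and `bm25Score` is a hypothetical function.

```typescript
// Minimal BM25 scorer over pre-tokenized documents (illustrative sketch).
function bm25Score(
  queryTokens: string[],
  doc: string[],
  corpus: string[][],
  k1 = 1.2,
  b = 0.75
): number {
  const N = corpus.length;
  if (N === 0) return 0;
  const avgLen = corpus.reduce((sum, d) => sum + d.length, 0) / N;
  let score = 0;
  for (const term of queryTokens) {
    const tf = doc.filter(t => t === term).length; // term frequency in this document
    if (tf === 0) continue;
    const df = corpus.filter(d => d.includes(term)).length; // documents containing the term
    const idf = Math.log(1 + (N - df + 0.5) / (df + 0.5));
    const lengthNorm = 1 - b + b * (doc.length / avgLen);
    score += (idf * tf * (k1 + 1)) / (tf + k1 * lengthNorm);
  }
  return score;
}

const corpus = [
  ["machine", "learning"],
  ["machine", "learning", "is", "a", "broad", "field", "with", "many", "practical", "applications"],
  ["cooking", "recipes"],
];
const query = ["machine", "learning"];
const scores = corpus.map(doc => bm25Score(query, doc, corpus));
console.log(scores); // the short matching document scores highest; the unrelated one scores 0
```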

**Python**:

# Keyword search (snake_case)
results = await graphlit.client.query_contents(
    search="Project Alpha",
    search_type=SearchTypes.Keyword,
    limit=10
)

for content in results.contents.results:
    print(f"{content.name} - Relevance: {content.relevance}")

# Exact phrase
exact = await graphlit.client.query_contents(
    search='"quarterly report"',
    search_type=SearchTypes.Keyword
)

# Search by ID
by_id = await graphlit.client.query_contents(
    search="PROJ-1234",
    search_type=SearchTypes.Keyword
)


**C#**:
```csharp
using Graphlit;

var graphlit = new Graphlit();

// Keyword search (PascalCase)
var results = await graphlit.QueryContents(new ContentFilter
{
    Search = "Project Alpha",
    SearchType = SearchTypes.Keyword,
    Limit = 10
});

foreach (var content in results.Contents.Results)
{
    Console.WriteLine($"{content.Name} - Relevance: {content.Relevance}");
}

// Exact phrase
var exact = await graphlit.QueryContents(new ContentFilter
{
    Search = "\"quarterly report\"",
    SearchType = SearchTypes.Keyword
});

// Search by ID
var byId = await graphlit.QueryContents(new ContentFilter
{
    Search = "PROJ-1234",
    SearchType = SearchTypes.Keyword
});
```

Developer Hints

Token Search vs. Exact Phrase

// Token search (default):
const tokens = await graphlit.queryContents({
  search: "machine learning",  // No quotes
  searchType: SearchTypes.Keyword
});
// Matches: "machine learning", "learning about machines", "machine for learning"

// Exact phrase:
const exact = await graphlit.queryContents({
  search: '"machine learning"',  // With quotes
  searchType: SearchTypes.Keyword
});
// Matches: Only "machine learning" in that exact order

Special Characters

// Email addresses work as-is
const email = await graphlit.queryContents({
  search: "user@example.com",
  searchType: SearchTypes.Keyword
});

// URLs work (may need quotes for special chars)
const url = await graphlit.queryContents({
  search: "https://graphlit.com",
  searchType: SearchTypes.Keyword
});

// IDs with hyphens work
const id = await graphlit.queryContents({
  search: "PROJ-1234",
  searchType: SearchTypes.Keyword
});

Combining with Filters

// Keyword search + metadata filters
const filtered = await graphlit.queryContents({
  search: "Project Alpha",
  searchType: SearchTypes.Keyword,
  filter: {
    types: [ContentTypes.Email],
    creationDateRange: {
      from: '2024-01-01'
    }
  }
});

// Fast and precise

Variations

1. Basic Keyword Search

const results = await graphlit.queryContents({
  search: "Graphlit platform",
  searchType: SearchTypes.Keyword
});

2. Exact Phrase Search

const exact = await graphlit.queryContents({
  search: '"semantic memory platform"',
  searchType: SearchTypes.Keyword
});

3. Search by ID

const byId = await graphlit.queryContents({
  search: "PROJ-1234",
  searchType: SearchTypes.Keyword
});

4. Search by Email

const byEmail = await graphlit.queryContents({
  search: "user@example.com",
  searchType: SearchTypes.Keyword
});

5. Search with Content Type Filter

const emailsOnly = await graphlit.queryContents({
  search: "Project Alpha",
  searchType: SearchTypes.Keyword,
  filter: {
    types: [ContentTypes.Email]
  }
});

6. Search in Specific Collection

const inCollection = await graphlit.queryContents({
  search: "meeting notes",
  searchType: SearchTypes.Keyword,
  filter: {
    collections: [{ id: 'team-docs-collection' }]
  }
});

7. Multi-Term Search

// All terms required (implicit AND)
const multiTerm = await graphlit.queryContents({
  search: "quarterly report Q4 2024",
  searchType: SearchTypes.Keyword
});

Common Issues & Solutions

Issue: No results for partial words

Solution: Keyword search doesn't do prefix matching by default

// ✗ Won't match "Graphlit"
await graphlit.queryContents({
  search: "Graph",
  searchType: SearchTypes.Keyword
});

// ✓ Use full word
await graphlit.queryContents({
  search: "Graphlit",
  searchType: SearchTypes.Keyword
});

// ✓ Or try vector search, which matches on meaning rather than exact tokens
await graphlit.queryContents({
  search: "Graph",
  searchType: SearchTypes.Vector
});
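Why a partial word finds nothing can be shown with a toy token list. In this sketch a keyword query is modeled as an exact token lookup, which is an assumption about typical full-text behavior rather than a description of Graphlit internals; `exactMatch` and `prefixMatches` are hypothetical helpers.

```typescript
// Tokens as they might be stored in a full-text index (toy data).
const indexedTokens = ["graphlit", "platform", "semantic", "memory"];

// Exact token lookup: how a plain keyword query behaves in this sketch.
function exactMatch(tokens: string[], term: string): boolean {
  return tokens.includes(term.toLowerCase());
}

// Prefix scan: what a prefix or wildcard query would have to do instead.
function prefixMatches(tokens: string[], prefix: string): string[] {
  const p = prefix.toLowerCase();
  return tokens.filter(t => t.startsWith(p));
}

console.log(exactMatch(indexedTokens, "Graph"));    // false: "graph" is not a stored token
console.log(exactMatch(indexedTokens, "Graphlit")); // true
console.log(prefixMatches(indexedTokens, "Graph")); // [ 'graphlit' ]
```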

Issue: Too many results

Solution: Use exact phrase or add filters

// ✗ Too broad
await graphlit.queryContents({
  search: "report",
  searchType: SearchTypes.Keyword
});

// ✓ More specific
await graphlit.queryContents({
  search: '"quarterly earnings report"',
  searchType: SearchTypes.Keyword
});

// ✓ Add date filter
await graphlit.queryContents({
  search: "report",
  searchType: SearchTypes.Keyword,
  filter: {
    creationDateRange: { from: '2024-01-01' }
  }
});

Issue: Want semantic + keyword

Solution: Use hybrid search (combines both)

// ✓ Best of both worlds
await graphlit.queryContents({
  search: "Project Alpha status",
  searchType: SearchTypes.Hybrid  // Recommended
});

Issue: Special characters breaking search

Solution: Use quotes for exact phrase

// ✓ Escape with quotes
await graphlit.queryContents({
  search: '"C++ programming guide"',
  searchType: SearchTypes.Keyword
});
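When queries come from user input, a small helper can apply the quoting consistently. `toPhraseQuery` is a hypothetical helper, and dropping embedded double quotes is an assumption made here because this document does not describe an escape syntax for quotes inside a phrase.

```typescript
// Hypothetical helper: wrap a raw term in double quotes so it is sent as an exact phrase.
// Assumption: embedded double quotes are dropped rather than escaped.
function toPhraseQuery(raw: string): string {
  const cleaned = raw.replace(/"/g, "").trim();
  return cleaned.length > 0 ? `"${cleaned}"` : "";
}

console.log(toPhraseQuery("C++ programming guide")); // "C++ programming guide" (with quotes)
console.log(toPhraseQuery('"already quoted"'));      // "already quoted" (quoted once, not twice)
```

The result can be passed as the `search` value with `searchType: SearchTypes.Keyword`; an empty return value signals that there is nothing to search for.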

Production Example

async function keywordSearch(query: string) {
  console.log(`\n=== KEYWORD SEARCH ===`);
  console.log(`Query: "${query}"`);
  
  const startTime = Date.now();
  
  const results = await graphlit.queryContents({
    search: query,
    searchType: SearchTypes.Keyword,
    limit: 20
  });
  
  const elapsed = Date.now() - startTime;
  
  console.log(`\nFound ${results.contents.results.length} results in ${elapsed}ms`);
  
  // Group by content type
  const byType = new Map<string, number>();
  results.contents.results.forEach(content => {
    byType.set(content.type, (byType.get(content.type) || 0) + 1);
  });
  
  console.log(`\nResults by Type:`);
  byType.forEach((count, type) => {
    console.log(`   ${type}: ${count}`);
  });
  
  // Top results
  console.log(`\n🔝 Top 5 Results:`);
  results.contents.results.slice(0, 5).forEach((content, index) => {
    console.log(`\n${index + 1}. ${content.name}`);
    console.log(`   Type: ${content.type}`);
    // Print the raw score: BM25 relevance is not guaranteed to be a 0-1 fraction
    console.log(`   Relevance: ${content.relevance ?? 'n/a'}`);
    console.log(`   Created: ${new Date(content.creationDate).toLocaleDateString()}`);
    
    // Show snippet if available
    if (content.markdown) {
      const snippet = content.markdown
        .split('\n')
        .find(line => line.toLowerCase().includes(query.toLowerCase()));
      
      if (snippet) {
        console.log(`   Snippet: "${snippet.trim().substring(0, 100)}..."`);
      }
    }
  });
  
  // Performance analysis
  console.log(`\n⚡ Performance:`);
  console.log(`   Query time: ${elapsed}ms`);
  // Guard against division by zero when the query completes in under 1ms
  console.log(`   Results per ms: ${(results.contents.results.length / Math.max(elapsed, 1)).toFixed(2)}`);
}

// Usage examples
await keywordSearch("Project Alpha");
await keywordSearch("user@example.com");
await keywordSearch("PROJ-1234");
await keywordSearch('"quarterly earnings report"');
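One detail worth handling in snippet extraction: a quoted phrase query contains literal quote characters, so a plain substring check on the raw query will not find the phrase in the text. The sketch below strips the quotes first. `findSnippet` is a hypothetical helper, and it does a simple substring match rather than reproducing the search engine's tokenization.

```typescript
// Find the first line containing the query text and return it, truncated to maxLength.
// Double quotes are removed from the query so phrase queries still match the text.
function findSnippet(markdown: string, query: string, maxLength = 100): string | undefined {
  const needle = query.replace(/"/g, "").trim().toLowerCase();
  if (needle.length === 0) return undefined;
  const line = markdown.split("\n").find(l => l.toLowerCase().includes(needle));
  if (line === undefined) return undefined;
  const trimmed = line.trim();
  return trimmed.length > maxLength ? `${trimmed.substring(0, maxLength)}...` : trimmed;
}

const sample = "# Finance\nThe quarterly earnings report is due on Friday.\nOther notes.";
console.log(findSnippet(sample, '"quarterly earnings report"'));
```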
