Mastra (TypeScript)

Build an autonomous AI research agent that performs multi-hop web research with entity extraction and intelligent filtering—like OpenAI's Deep Research

⏱️ Time: 30-40 minutes 🎯 Level: Advanced 💻 SDK: TypeScript (Mastra framework)

What You'll Learn

In this tutorial, you'll build a production-ready research agent that:

  • ✅ Extracts entities from documents using Graphlit's knowledge graph

  • ✅ Performs multi-hop web research (searches for discovered entities)

  • ✅ Filters sources before ingesting using native reranking (key innovation!)

  • ✅ Detects convergence automatically (knows when to stop)

  • ✅ Synthesizes multi-source reports with citations (scales to 100+ sources)

What makes this production-ready: Pre-ingestion filtering, autonomous stopping, and summary-based synthesis patterns used in real applications.


What You'll Build

An autonomous agent that takes a topic and:

  1. Ingests seed source - Reads initial document or search results

  2. Discovers entities - Extracts people, companies, concepts from your knowledge graph

  3. Researches each entity - Searches Exa for 10 related sources per entity

  4. Filters intelligently - Analyzes 50 sources, ingests only top 8 high-quality ones

  5. Detects convergence - Stops when novelty score drops below 30%

  6. Synthesizes report - Generates comprehensive markdown with proper citations

Example: Start with Wikipedia on "RAG" → Extracts 15 entities → Searches 50 sources → Filters to 8 → Generates 2000-word report in ~45 seconds

🔗 Full code: GitHub


Prerequisites


Why This Matters: What Graphlit Handles

Before we dive into building, understand what Graphlit provides so you don't have to build it:

Infrastructure (Weeks → Hours)

  • File parsing - PDFs, DOCX, audio, video (30+ formats)

  • Vector database - Managed Qdrant, auto-scaled

  • Multi-tenant isolation - Each user gets isolated environment

  • GraphQL API - Auto-generated, authenticated

Intelligence (Months → API Calls)

  • Automatic entity extraction - LLM-powered workflow extracts Person, Organization, Category during ingestion

  • Knowledge graph - Built on Schema.org/JSON-LD standard, relationships auto-created

  • Native reranker - Fast, accurate relevance scoring (this enables our pre-filtering!)

  • Exa search built-in - No separate API key needed, semantic web search included

  • Summary-based RAG - Scales to 100+ documents via optimized summaries

Time savings: Estimated 12-14 weeks of infrastructure development you skip.

Production proof: This pattern is used in Zine, serving thousands of users with millions of documents.


The Key Innovation: Pre-Ingestion Filtering

Most research implementations blindly ingest everything they find. This creates noise and wastes processing.

The breakthrough: Analyze sources before fully ingesting them.

Here's the pattern:

  1. Quick ingest to temporary collection (lightweight)

  2. Use Graphlit's native reranker to score relevance

  3. Filter out low-scoring sources (<0.5 relevance)

  4. Only fully ingest top 5-8 sources

  5. Delete temporary collection

Why this works: Graphlit's native reranker is fast enough (~2 seconds) to analyze 50 sources before deciding which to fully process.

Result: Process 8 sources instead of 50. Faster, higher quality, better signal-to-noise.
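The threshold-and-cap step at the heart of this pattern can be isolated as a pure function. This is a sketch: `ScoredSource` is an illustrative shape, not a Graphlit type, and the real tool (shown later) gets its scores from the native reranker.

```typescript
interface ScoredSource {
  url: string;
  score?: number; // reranker relevance in [0, 1]; may be absent
}

// Keep sources at or above the relevance threshold, capped at maxResults.
// Sources without a score are kept, matching the tool implementation below.
export function filterByRelevance(
  sources: ScoredSource[],
  minScore = 0.5,
  maxResults = 5,
): ScoredSource[] {
  return sources
    .filter(s => s.score === undefined || s.score >= minScore)
    .slice(0, maxResults);
}
```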


The 5-Phase Research Algorithm

Phase 1: Seed Acquisition

Two starting modes:

URL Mode - Start from a specific source:

pnpm start --url "https://arxiv.org/abs/2005.11401"

Best for: Research papers, documentation, whitepapers

Search Mode - Discover seed sources automatically:

pnpm start --search "retrieval augmented generation" --results 5

Best for: Open-ended research, new topics

Phase 2: Entity-Driven Discovery

Instead of keyword-based research, let the knowledge graph drive discovery:

  • Automatic extraction: Entities extracted during ingestion (no separate step!)

  • Types: Person, Organization, Category (concepts/technical terms)

  • Ranking: By occurrence count and semantic importance

  • Selection: Top 5 become research seeds

Why entity-driven works: A RAG paper mentions "vector databases" and "BERT"—those naturally become your next research directions. Mimics human researcher behavior.
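The ranking-and-selection step can be sketched as a pure function. The `ExtractedEntity` shape here is illustrative; the real tool reads observations from the Graphlit knowledge graph.

```typescript
interface ExtractedEntity {
  name: string;
  type: 'PERSON' | 'ORGANIZATION' | 'CATEGORY';
  occurrences: number; // how often the entity was observed across content
}

// Rank entities by occurrence count (descending) and take the top N
// as seeds for the next research hop.
export function selectTopEntities(
  entities: ExtractedEntity[],
  limit = 5,
): ExtractedEntity[] {
  return [...entities]
    .sort((a, b) => b.occurrences - a.occurrences)
    .slice(0, limit);
}
```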

Phase 3: Intelligent Expansion

For each entity:

  1. Search Exa for 10 related sources

  2. Pre-filter before ingesting (the key innovation!)

  3. Only ingest top 3-5 highest-quality sources

The filtering workflow:

50 sources found via Exa search
  ↓ Quick ingest to temp collection
  ↓ Rerank by relevance (native reranker)
  ↓ Filter (keep score >0.5)
  ↓ Full ingest top 5 only
  ↓ Delete temp collection
8 sources ingested total

Benefit: Analyze 50, process 8. Significantly faster with better quality.

Phase 4: Convergence Detection

The agent automatically detects when to stop:

  1. Rerank ALL content by relevance to original query

  2. Calculate novelty: What % of newest sources rank in top 10?

  3. Decision:

    • Novelty >30%: Continue, sources add value

    • Novelty <30%: Stop, diminishing returns

Why this matters: No manual intervention needed. The agent knows when research has converged.
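The decision rule reduces to a few lines of arithmetic. This sketch mirrors the computation in the detectDiminishingReturns tool shown later, with a zero-guard for the case of an empty pass:

```typescript
// Convergence check: what fraction of the most recently ingested sources
// still rank in the top N when ALL content is reranked against the query?
export function computeNovelty(
  rankedIds: string[],        // content IDs, best-first, from the reranker
  recentContentIds: string[], // IDs ingested in the latest research pass
  topSize = 10,
  threshold = 0.3,
): { noveltyScore: number; isDiminishing: boolean } {
  const topN = Math.min(topSize, rankedIds.length);
  const topRanked = new Set(rankedIds.slice(0, topN));
  const recentInTop = recentContentIds.filter(id => topRanked.has(id)).length;
  const noveltyScore =
    recentInTop / (Math.min(topN, recentContentIds.length) || 1);
  return { noveltyScore, isDiminishing: noveltyScore < threshold };
}
```

For example, if 2 of 4 newly ingested sources land in the top 10, novelty is 0.5 and research continues; if none do, novelty is 0 and the agent stops.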

Phase 5: Multi-Source Synthesis

Graphlit's summary-based approach scales beyond traditional RAG:

  1. Auto-summarize each source (25-50 key points + entities)

  2. Concatenate summaries (efficient context usage)

  3. Synthesize via LLM from summaries

  4. Citations automatically included

Traditional RAG hits limits at 10-20 docs. This approach handles 100+ sources.
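The concatenation step can be sketched as a plain string builder. The `SourceSummary` shape and prompt wording are illustrative assumptions, not a Graphlit API; the real tool pulls per-source summaries generated during ingestion.

```typescript
interface SourceSummary {
  title: string;
  uri: string;
  summary: string; // key points produced during ingestion
}

// Concatenate numbered summaries into one synthesis prompt so the LLM
// can cite sources as [1], [2], ... in the generated report.
export function buildSynthesisPrompt(
  topic: string,
  sources: SourceSummary[],
): string {
  const numbered = sources
    .map((s, i) => `[${i + 1}] ${s.title} (${s.uri})\n${s.summary}`)
    .join('\n\n');
  return [
    `Write a comprehensive research report on "${topic}".`,
    `Cite sources inline using their bracketed numbers.`,
    ``,
    `Sources:`,
    numbered,
  ].join('\n');
}
```

Because each source contributes a compact summary rather than full chunks, context usage grows slowly with source count.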


Implementation: Step-by-Step

Step 1: Project Setup (5 min)

mkdir deep-research && cd deep-research
pnpm init -y
pnpm add @mastra/core @ai-sdk/openai graphlit-client zod dotenv
pnpm add -D typescript tsx
pnpm add chalk ora boxen cli-table3 gradient-string

Create .env:

# From portal.graphlit.dev
GRAPHLIT_ENVIRONMENT_ID=your_environment_id
GRAPHLIT_ORGANIZATION_ID=your_organization_id  
GRAPHLIT_JWT_SECRET=your_jwt_secret

# From platform.openai.com
OPENAI_API_KEY=your_openai_key

Configure tsconfig.json:

{
  "compilerOptions": {
    "target": "ES2022",
    "module": "ESNext",
    "moduleResolution": "node",
    "outDir": "./dist",
    "strict": true,
    "esModuleInterop": true
  }
}
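The `pnpm start` commands used throughout assume a `start` script. A minimal `package.json` addition (with `"type": "module"` for the ESM-style imports used below) might look like:

```json
{
  "type": "module",
  "scripts": {
    "start": "tsx src/main.ts"
  }
}
```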

Step 2: Singleton Graphlit Client (2 min)

Create one shared Graphlit instance used by all tools.

File: src/graphlit.ts

import { Graphlit } from 'graphlit-client';

export const graphlit = new Graphlit();
// Auto-reads env vars, handles auth, ready to use

Why singleton: Efficient, no redundant credential passing, production pattern.

Step 3: Build the Key Tools (20 min)

We'll build 10 Mastra tools. Here are the critical ones:

Tool 1: Workflow with Entity Extraction

This sets up automatic entity extraction during ingestion.

File: src/tools/workflow-setup.ts

import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import { graphlit } from '../graphlit.js';
import {
  EntityExtractionServiceTypes,
  ObservableTypes,
} from 'graphlit-client/dist/generated/graphql-types';

export const createWorkflowTool = createTool({
  id: 'create-workflow',
  description: 'Create collection and workflow with entity extraction',
  
  inputSchema: z.object({
    name: z.string(),
  }),
  
  outputSchema: z.object({
    collectionId: z.string(),
    workflowId: z.string(),
  }),
  
  execute: async ({ context }) => {
    // Create collection for organizing content
    const collResp = await graphlit.createCollection({
      name: context.name,
    });

    // Create workflow with automatic entity extraction
    const wfResp = await graphlit.upsertWorkflow({
      name: `${context.name} Workflow`,
      ingestion: {
        collections: [{ id: collResp.createCollection.id }],
      },
      extraction: {
        jobs: [
          {
            connector: {
              type: EntityExtractionServiceTypes.ModelText,
              extractedTypes: [
                ObservableTypes.Person,
                ObservableTypes.Organization,
                ObservableTypes.Category,
              ],
              extractedCount: 10,
            },
          },
        ],
      },
    });

    return {
      collectionId: collResp.createCollection.id,
      workflowId: wfResp.upsertWorkflow.id,
    };
  },
});

Key insight: Entity extraction happens automatically during ingestion. When you query later, entities are already in your knowledge graph—no separate extraction step needed.

Tool 2: Pre-Ingestion Filtering (The Critical Innovation)

This tool analyzes sources before fully ingesting them.

File: src/tools/rerank.ts

import { createTool } from '@mastra/core/tools';
import { z } from 'zod';
import { graphlit } from '../graphlit.js';

export const filterSearchResultsTool = createTool({
  id: 'filter-search-results',
  description: 'Filter search results BEFORE full ingestion using native reranker',
  
  inputSchema: z.object({
    searchResults: z.array(z.object({ 
      url: z.string(), 
      title: z.string() 
    })),
    query: z.string(),
    maxResults: z.number().default(5),
    minRelevanceScore: z.number().default(0.5),
  }),
  
  outputSchema: z.object({
    filteredUrls: z.array(z.string()),
    skippedCount: z.number(),
    reasoning: z.string(),
  }),
  
  execute: async ({ context }) => {
    // Step 1: Create temporary resources for analysis
    const tempWorkflow = await graphlit.upsertWorkflow({
      name: `Temp Filter ${Date.now()}`,
    });
    
    const tempCollection = await graphlit.createCollection({
      name: `Temp Filter ${Date.now()}`,
    });

    // Step 2: Quick ingestion for analysis only
    const results = await Promise.all(
      context.searchResults.map(result =>
        graphlit.ingestUri(
          result.url, result.title, undefined, undefined, true,
          { id: tempWorkflow.upsertWorkflow.id },
          [{ id: tempCollection.createCollection.id }],
        ).catch(() => null) // Graceful failure
      )
    );

    const tempContentIds = results
      .filter(r => r?.ingestUri?.id)
      .map(r => r.ingestUri.id);

    if (tempContentIds.length === 0) {
      return {
        filteredUrls: [],
        skippedCount: context.searchResults.length,
        reasoning: 'Failed to ingest any results for analysis',
      };
    }

    // Step 3: Use native reranker for relevance scoring
    const reranked = await graphlit.retrieveSources(
      context.query,
      { collections: [{ id: tempCollection.createCollection.id }] },
      undefined,
      { type: 'VECTOR', limit: context.maxResults * 2 },
      { type: 'RERANK' }, // Native reranker!
    );

    // Step 4: Filter by relevance score (sources without a score are kept)
    const highQuality = (reranked.retrieveSources?.results ?? [])
      .filter(s => !s.score || s.score >= context.minRelevanceScore)
      .slice(0, context.maxResults);

    // Step 5: Map back to original URLs
    const filteredUrls: string[] = [];
    for (const source of highQuality) {
      const content = await graphlit.getContent(source.content?.id || '');
      if (content.content?.uri) {
        filteredUrls.push(content.content.uri);
      }
    }

    // Step 6: Clean up temporary resources
    await graphlit.deleteCollection(tempCollection.createCollection.id);
    await graphlit.deleteWorkflow(tempWorkflow.upsertWorkflow.id);

    return {
      filteredUrls,
      skippedCount: context.searchResults.length - filteredUrls.length,
      reasoning: `Analyzed ${tempContentIds.length} sources. Kept top ${filteredUrls.length} with relevance >=${context.minRelevanceScore}. Skipped ${context.searchResults.length - filteredUrls.length} low-quality sources.`,
    };
  },
});

Why this pattern works: The native reranker is fast (~2 seconds) and accurate. Temporary collection analysis is lightweight. This makes pre-filtering practical at scale.

Tool 3: Diminishing Returns Detection

Automatically detects when research has converged.

File: src/tools/rerank.ts (continued)

export const detectDiminishingReturnsTool = createTool({
  id: 'detect-diminishing-returns',
  description: 'Detect if new sources add value or just repeat existing knowledge',
  
  inputSchema: z.object({
    collectionId: z.string(),
    recentContentIds: z.array(z.string()),
    query: z.string(),
  }),
  
  outputSchema: z.object({
    isDiminishing: z.boolean(),
    noveltyScore: z.number(),
    recommendation: z.string(),
  }),
  
  execute: async ({ context }) => {
    // Rerank ALL content by relevance
    const allRanked = await graphlit.retrieveSources(
      context.query,
      { collections: [{ id: context.collectionId }] },
      undefined,
      { type: 'VECTOR', limit: 50 },
      { type: 'RERANK' },
    );

    const rankedIds = allRanked.retrieveSources?.results?.map(r => r.content?.id) ?? [];
    const topN = Math.min(10, rankedIds.length);
    const topRanked = rankedIds.slice(0, topN);

    // Count how many recent sources rank in top 10
    const recentInTop = context.recentContentIds.filter(id =>
      topRanked.includes(id)
    ).length;

    // Calculate novelty score (guard against division by zero)
    const noveltyScore =
      recentInTop / (Math.min(topN, context.recentContentIds.length) || 1);
    const isDiminishing = noveltyScore < 0.3;

    return {
      isDiminishing,
      noveltyScore,
      recommendation: isDiminishing
        ? `Stop - only ${recentInTop}/${topN} recent sources are highly relevant. Diminishing returns detected.`
        : `Continue - ${recentInTop}/${topN} recent sources rank in top results. Still adding value.`,
    };
  },
});

The insight: If new sources don't rank highly vs existing content, they're redundant. The agent stops automatically.

Full code for all 10 tools: See the GitHub repository for complete implementations of all tools including ingestion, entity extraction, web search, and report generation.

Step 4: Create the Mastra Agent (3 min)

Bring all tools together with intelligent orchestration.

File: src/agent.ts

import { Agent } from '@mastra/core/agent';
import { openai } from '@ai-sdk/openai';
import {
  createWorkflowTool,
  ingestDocumentTool,
  ingestBatchTool,
  extractEntitiesTool,
  selectTopEntitiesTool,
  searchWebTool,
  filterSearchResultsTool,
  detectDiminishingReturnsTool,
  generateReportTool,
} from './tools/index.js';

export const deepResearchAgent = new Agent({
  name: 'Deep Research Agent',
  
  instructions: `You are an autonomous research agent using semantic memory and knowledge graphs.

Your workflow:
1. Create workflow + collection with entity extraction
2. Ingest seed URL or search results
3. Extract entities discovered in your knowledge graph
4. Select top 5 entities (focus on PERSON, ORGANIZATION, CATEGORY)
5. Search web for each entity (10 results via Exa)
6. Filter search results BEFORE ingesting (use filterSearchResults tool)
7. Batch ingest only filtered, high-quality sources
8. Check convergence (use detectDiminishingReturns - stop if novelty <30%)
9. Generate comprehensive report with citations

Always filter before ingesting to ensure quality and efficiency.`,

  model: openai('gpt-4o'),

  tools: {
    createWorkflow: createWorkflowTool,
    ingestDocument: ingestDocumentTool,
    ingestBatch: ingestBatchTool,
    extractEntities: extractEntitiesTool,
    selectTopEntities: selectTopEntitiesTool,
    searchWeb: searchWebTool,
    filterSearchResults: filterSearchResultsTool,
    detectDiminishingReturns: detectDiminishingReturnsTool,
    generateReport: generateReportTool,
  },
});

Why agent pattern: The LLM decides when to use each tool. Adaptive, resilient, production-ready for AI applications.

Step 5: Build the CLI (5 min)

Create a polished interface for running research.

File: src/main.ts (abbreviated - see full code)

#!/usr/bin/env node
import { config } from 'dotenv';
import { deepResearchAgent } from './agent.js';
import chalk from 'chalk';
import ora from 'ora';
import boxen from 'boxen';

config();

async function main() {
  const args = process.argv.slice(2);
  
  // indexOf returns -1 for a missing flag, so guard before reading the value
  const getArg = (flag: string) => {
    const i = args.indexOf(flag);
    return i >= 0 ? args[i + 1] : undefined;
  };

  const url = getArg('--url');
  const searchQuery = getArg('--search');
  const numResults = parseInt(getArg('--results') ?? '5', 10);

  // Comprehensive validation (env vars, args, formats)
  // ... see full code for complete validation

  const spinner = ora('Starting research...').start();

  const prompt = url
    ? `Research starting from: ${url}`
    : `Research: ${searchQuery} (top ${numResults} seeds)`;

  const response = await deepResearchAgent.generate(prompt, {
    onStepFinish: (step) => {
      spinner.succeed(chalk.green(`✅ ${step.toolCalls?.[0]?.toolName}`));
      spinner.start('Next...');
    },
  });

  spinner.succeed(chalk.green('✅ Complete!'));
  console.log('\n' + response.text); // Report to stdout
}

main();

Running Your Agent

URL Mode:

pnpm start --url "https://en.wikipedia.org/wiki/RAG" > report.md

Search Mode:

pnpm start --search "knowledge graph embeddings" --results 5 > report.md

Expected output:

Terminal (progress):

╭──────────────────────────────╮
│ Deep Research Agent          │
│ Powered by Mastra + Graphlit │
╰──────────────────────────────╯

🔍 Research Query: "knowledge graph embeddings"

✅ searchWeb
✅ ingestBatch
✅ extractEntities
✅ filterSearchResults (kept 4/10)
✅ detectDiminishingReturns (novelty: 0.42 - continue)
✅ generateReport
✅ Complete!

Report (report.md):

# Research Report: Knowledge Graph Embeddings

## Executive Summary

Knowledge graph embeddings represent entities and relations in continuous vector spaces...

[2000 words synthesized from 8 sources]

## References
1. Bordes et al. - TransE: Translating Embeddings for Modeling Multi-relational Data
   https://papers.nips.cc/...
2. ...

Production Patterns

Performance Optimizations

Parallel processing:

// Search all entities concurrently
await Promise.allSettled(
  entities.map(e => searchWebTool.execute({ context: { query: e.name } }))
);

Synchronous ingestion:

// No polling - content ready when call returns
await graphlit.ingestUri(url, undefined, undefined, undefined, true);

Pre-filtering:

// Analyze 50, process 8
const { filteredUrls } = await filterSearchResultsTool.execute({
  context: { searchResults: results, query, maxResults: 5 },
});

Typical Session Metrics

Without filtering:

  • Sources processed: ~50

  • Processing time: 2-3 minutes

  • Quality: Significant noise

With filtering:

  • Sources processed: ~8

  • Processing time: 30-45 seconds

  • Quality: High signal-to-noise ratio


Alternative Frameworks

This tutorial uses Mastra (TypeScript). Graphlit works with other frameworks:

For Python developers:

  • Agno - Ultra-fast Python agents

  • LangGraph - Graph-based state machines

For TypeScript developers:

  • Vercel AI SDK Workflow - Deterministic orchestration

All use the same Graphlit SDK—choose based on language preference.


Next Steps

Try It Out

Clone and run:

git clone https://github.com/graphlit/graphlit-samples.git
cd graphlit-samples/nextjs/mastra-deep-research
pnpm install
cp .env.example .env
# Add your credentials
pnpm start --search "your query"

Extend It

Domain-specific entities:

  • Medical: ObservableTypes.MedicalCondition, Drug

  • Legal: ObservableTypes.LegalCase, Contract

  • Business: ObservableTypes.Product, Event

Multi-pass research:

  • Extract entities from Layer 2 results

  • Research 2-3 passes deep

  • Configurable depth limits

Real-time monitoring:

  • Create Exa feeds for discovered entities

  • Auto-expand knowledge base daily

Learn More

Related Tutorials:

Production Example:

Graphlit Resources:

Mastra Resources:


Summary

You've learned how to build a production-ready autonomous research agent:

Key innovations:

  1. Pre-ingestion filtering - Native reranker analyzes sources before processing

  2. Diminishing returns detection - Agent knows when to stop autonomously

  3. Summary-based synthesis - Scales to 100+ sources via optimized summaries

  4. Entity-driven discovery - Knowledge graph powers multi-hop reasoning

Architecture:

  • Mastra handles orchestration and tool-calling

  • Graphlit provides semantic memory, knowledge graph, and intelligence

  • Clean separation of concerns, production-ready patterns

Time investment: 30-40 minutes
Value delivered: Weeks of infrastructure work eliminated, battle-tested patterns

This approach works for competitive intelligence, market research, technical deep-dives, and any multi-source synthesis use case.


Complete implementation: GitHub Repository
