Agno (Python)

Build an autonomous AI research agent in Python with Agno and Graphlit—5000x faster with simpler code

⏱️ Time: 30-40 minutes 🎯 Level: Advanced 💻 SDK: Python (Agno framework)

What You'll Learn

In this tutorial, you'll build a production-ready research agent in Python that:

  • ✅ Extracts entities from documents using Graphlit's knowledge graph

  • ✅ Performs multi-hop web research (searches for discovered entities)

  • ✅ Filters sources before ingesting using native reranking

  • ✅ Detects convergence automatically (knows when to stop)

  • ✅ Synthesizes multi-source reports with citations

Why Agno: 5000x faster agent instantiation than LangGraph, 50x less memory per agent. Tools are plain Python functions (no decorators, no complex schemas).


What You'll Build

Same autonomous research agent, Python implementation:

  1. Ingests seed source - URL or search results

  2. Discovers entities - From your knowledge graph

  3. Researches each entity - Exa search, 10 sources per entity

  4. Filters intelligently - Analyzes 50, ingests only top 8

  5. Detects convergence - Stops at novelty score <30%

  6. Synthesizes report - Markdown with citations

Example: Wikipedia on "RAG" → 15 entities → 50 sources → 8 filtered → 2000-word report in ~45 seconds

🔗 Full code: GitHub


Prerequisites

  • A recent Python (3.10+ recommended) with uv or pip

  • Graphlit project credentials: organization ID, environment ID, and JWT secret (from portal.graphlit.dev)

  • An OpenAI API key (from platform.openai.com)


Why Agno + Python?

Agno's Advantages

Performance:

  • 5000x faster than LangGraph (~2-3 microseconds per agent)

  • 50x less memory (~6.5KB per agent vs 325KB)

Simplicity:

  • Tools are just Python functions (no decorators!)

  • No complex schemas (docstrings = tool descriptions)

  • Clean async/await syntax

Compare Tool Definition:

Mastra (TypeScript):

export const myTool = createTool({
  id: 'my-tool',
  description: 'Tool description',
  inputSchema: z.object({ /* Zod schema */ }),
  outputSchema: z.object({ /* Zod schema */ }),
  execute: async ({ context }) => { /* ... */ },
});

Agno (Python):

async def my_tool(param: str) -> dict:
    """Tool description."""
    # ... implementation
    return {"result": "value"}

460 lines of Python vs 750 lines of TypeScript for the same functionality.


Why This Matters: What Graphlit Handles

Before we dive into building, understand what Graphlit provides so you don't have to build it:

Infrastructure (Weeks → Hours)

  • File parsing - PDFs, DOCX, audio, video (30+ formats)

  • Vector database - Managed Qdrant, auto-scaled

  • Multi-tenant isolation - Each user gets isolated environment

  • GraphQL API - Auto-generated, authenticated

Intelligence (Months → API Calls)

  • Automatic entity extraction - LLM-powered workflow extracts Person, Organization, Category during ingestion

  • Knowledge graph - Built on Schema.org/JSON-LD standard, relationships auto-created

  • Native reranker - Fast, accurate relevance scoring (enables our pre-filtering!)

  • Exa search built-in - No separate API key needed, semantic web search included

  • Summary-based RAG - Scales to 100+ documents via optimized summaries

Time savings: Estimated 12-14 weeks of infrastructure development you skip.

Production proof: This pattern is used in Zine, serving thousands of users with millions of documents.


The Key Innovation: Pre-Ingestion Filtering

Most research implementations blindly ingest everything they find. This creates noise and wastes processing.

The breakthrough: Analyze sources before fully ingesting them.

Here's the pattern:

  1. Quick ingest to temporary collection (lightweight)

  2. Use Graphlit's native reranker to score relevance

  3. Filter out low-scoring sources (<0.5 relevance)

  4. Only fully ingest top 5-8 sources

  5. Delete temporary collection

Why this works: Graphlit's native reranker is fast enough (~2 seconds) to analyze 50 sources before deciding which to fully process.

Result: Process 8 sources instead of 50. Faster, higher quality, better signal-to-noise.
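The threshold step at the heart of this pattern fits in a few lines. This is an illustrative sketch: the `score` values here stand in for the relevance scores Graphlit's native reranker returns.

```python
def filter_by_relevance(scored_sources, min_score=0.5, max_results=8):
    """Keep only sources whose reranker score clears the threshold, best-first, capped."""
    kept = [s for s in scored_sources if s["score"] >= min_score]
    kept.sort(key=lambda s: s["score"], reverse=True)
    return kept[:max_results]

sources = [
    {"url": "a", "score": 0.91},
    {"url": "b", "score": 0.42},  # below 0.5 threshold, dropped
    {"url": "c", "score": 0.67},
]
print([s["url"] for s in filter_by_relevance(sources)])  # → ['a', 'c']
```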


The 5-Phase Research Algorithm

Phase 1: Seed Acquisition

Two starting modes:

URL Mode - Start from a specific source:

uv run deep-research --url "https://arxiv.org/abs/2005.11401"

Best for: Research papers, documentation, whitepapers

Search Mode - Discover seed sources automatically:

uv run deep-research --search "retrieval augmented generation" --results 5

Best for: Open-ended research, new topics

Phase 2: Entity-Driven Discovery

Instead of keyword-based research, let the knowledge graph drive discovery:

  • Automatic extraction: Entities extracted during ingestion (no separate step!)

  • Types: Person, Organization, Category (concepts/technical terms)

  • Ranking: By occurrence count and semantic importance

  • Selection: Top 5 become research seeds

Why entity-driven works: A RAG paper mentions "vector databases" and "BERT"—those naturally become your next research directions. Mimics human researcher behavior.
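The ranking-and-selection step can be sketched as a plain function. The shape of `observations` here is illustrative; in the real agent these come from Graphlit's knowledge graph.

```python
from collections import Counter

def select_top_entities(observations, top_n=5):
    """Rank extracted entities by how often they were observed across sources."""
    counts = Counter(obs["name"] for obs in observations)
    return [name for name, _ in counts.most_common(top_n)]

obs = [{"name": "BERT"}, {"name": "vector databases"}, {"name": "BERT"}]
print(select_top_entities(obs, top_n=2))  # → ['BERT', 'vector databases']
```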

Phase 3: Intelligent Expansion

For each entity:

  1. Search Exa for 10 related sources

  2. Pre-filter before ingesting (the key innovation!)

  3. Only ingest top 3-5 highest-quality sources

The filtering workflow:

50 sources found via Exa search
  ↓ Quick ingest to temp collection
  ↓ Rerank by relevance (native reranker)
  ↓ Filter (keep score >0.5)
  ↓ Full ingest top 5 only
  ↓ Delete temp collection
8 sources ingested total

Benefit: Analyze 50, process 8. Significantly faster with better quality.

Phase 4: Convergence Detection

Automatically detect when research has plateaued:

Novelty scoring algorithm:

  1. After ingesting new sources, rerank ALL content by relevance to query

  2. Check how many recent sources appear in top 10

  3. Calculate novelty score: recent_in_top_10 / total_recent

  4. If score <30%, diminishing returns detected → stop

Example:

  • Ingested 5 new sources

  • Reranked all 25 total sources

  • Only 1 new source in top 10

  • Novelty: 1/5 = 20% → Stop researching

Why this works: If new sources don't rank highly vs existing content, they're redundant. Agent stops automatically, no manual intervention.
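The novelty calculation above is small enough to show directly. The IDs here are illustrative; in the real agent they are Graphlit content IDs taken from the reranked results.

```python
def novelty_score(recent_ids, reranked_ids, top_k=10):
    """Fraction of newly ingested sources that rank in the global top-k after reranking."""
    if not recent_ids:
        return 0.0
    top = set(reranked_ids[:top_k])
    return sum(1 for cid in recent_ids if cid in top) / len(recent_ids)

recent = ["n1", "n2", "n3", "n4", "n5"]           # 5 new sources
ranked = ["n2"] + [f"old{i}" for i in range(20)]  # only n2 made the top 10
score = novelty_score(recent, ranked)
print(score, "stop" if score < 0.3 else "continue")  # → 0.2 stop
```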

Phase 5: Multi-Source Synthesis

Traditional RAG struggles beyond 10-20 documents. We scale to 100+:

Summary-based RAG approach:

  1. Create conversation scoped to research collection

  2. Use publish_contents() which operates on optimized summaries

  3. LLM synthesizes across all sources simultaneously

  4. Citations automatically included

Python implementation:

# Create conversation
conversation = await graphlit.client.create_conversation(
    input=ConversationInput(
        name="Research Report",
        collections=[EntityReferenceInput(id=collection_id)],
    )
)

# Generate report using publishContents (summary-based)
response = await graphlit.client.publish_contents(
    publish_type=PublishTypes.MARKDOWN,
    prompt="Synthesize comprehensive report with citations",
    conversation=EntityReferenceInput(id=conversation.create_conversation.id),
)

Why it scales: Operates on summaries, not full content. Fast, accurate, handles 100+ sources.


Implementation: Step-by-Step

Step 1: Project Setup (3 min)

With uv (recommended - faster than pip):

Install uv if you haven't already:

curl -LsSf https://astral.sh/uv/install.sh | sh

Create project:

mkdir deep-research && cd deep-research
uv init
uv add agno graphlit-client python-dotenv rich openai

Or with pip:

mkdir deep-research && cd deep-research
python -m venv venv
source venv/bin/activate
pip install agno graphlit-client python-dotenv rich openai

Create .env:

# From portal.graphlit.dev
GRAPHLIT_ENVIRONMENT_ID=your_id
GRAPHLIT_ORGANIZATION_ID=your_org
GRAPHLIT_JWT_SECRET=your_secret

# From platform.openai.com
OPENAI_API_KEY=your_key

Step 2: Singleton Graphlit Client (1 min)

File: deep_research/graphlit_client.py

"""Singleton Graphlit client."""
from graphlit import Graphlit
from dotenv import load_dotenv

# Load environment variables first
load_dotenv()

# One instance, auto-reads env vars
graphlit = Graphlit()

Why singleton: Same pattern as Mastra—one shared instance, efficient.

Note: We load dotenv here so environment variables are available when the module is imported.

Step 3: Build Tools (15 min)

Here's where Agno shines—simple Python functions!

File: deep_research/tools.py

"""Research tools as simple Python functions.

Agno advantage: No decorators, no schemas - just functions with docstrings!
"""
from .graphlit_client import graphlit
from graphlit_api import *


# Tool 1: Create Workflow
async def create_workflow(name: str) -> dict:
    """Create collection and workflow with automatic entity extraction.
    
    Args:
        name: Name for the research collection
        
    Returns:
        Dict with collection_id and workflow_id
    """
    # Create collection
    coll_response = await graphlit.client.create_collection(
        input=CollectionInput(name=name)
    )
    
    collection_id = coll_response.create_collection.id
    
    # Create workflow with entity extraction
    wf_response = await graphlit.client.upsert_workflow(
        workflow=WorkflowInput(
            name=f"{name} Workflow",
            ingestion=IngestionWorkflowStageInput(
                collections=[EntityReferenceInput(id=collection_id)]
            ),
            extraction=ExtractionWorkflowStageInput(
                jobs=[
                    ExtractionWorkflowJobInput(
                        connector=EntityExtractionConnectorInput(
                            type=EntityExtractionServiceTypes.MODELTEXT,
                            extracted_types=[
                                ObservableTypes.PERSON,
                                ObservableTypes.ORGANIZATION,
                                ObservableTypes.CATEGORY,
                            ],
                            extracted_count=10,
                        )
                    )
                ]
            ),
        )
    )
    
    return {
        "collection_id": collection_id,
        "workflow_id": wf_response.upsert_workflow.id,
    }


# Tool 2: Ingest Document
async def ingest_document(url: str, workflow_id: str, collection_id: str) -> dict:
    """Ingest single document with entity extraction.
    
    Args:
        url: URL to ingest
        workflow_id: Workflow for processing
        collection_id: Collection to add to
        
    Returns:
        Dict with content_id
    """
    response = await graphlit.client.ingest_uri(
        uri=url,
        is_synchronous=True,  # No polling!
        workflow=EntityReferenceInput(id=workflow_id),
        collections=[EntityReferenceInput(id=collection_id)],
    )
    
    return {"content_id": response.ingest_uri.id}

Compare to Mastra:

  • No createTool() wrapper

  • No Zod schemas

  • Docstring = tool description (Agno reads it!)

  • Type hints = parameter validation

  • Clean async/await

Tool 3: Pre-Ingestion Filtering (abbreviated - see full code):

import asyncio
import time


async def filter_search_results(
    search_results: list[dict],
    query: str,
    max_results: int = 5,
    min_relevance_score: float = 0.5,
) -> dict:
    """Filter search results BEFORE full ingestion.
    
    This is the key innovation - analyze before processing!
    """
    # Create throwaway workflow + collection for quick analysis
    temp_wf = await graphlit.client.upsert_workflow(
        workflow=WorkflowInput(name=f"Temp Filter {int(time.time())}")
    )
    
    temp_coll = await graphlit.client.create_collection(
        input=CollectionInput(name=f"Temp Filter {int(time.time())}")
    )
    
    # Quick parallel ingestion for analysis (one coroutine per result - see full code)
    tasks = [...]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    
    # Rerank with native reranker
    reranked = await graphlit.client.retrieve_sources(
        prompt=query,
        filter=ContentCriteriaInput(
            collections=[EntityReferenceInput(id=temp_coll.create_collection.id)]
        ),
        search=SearchStrategyInput(type=SearchTypes.VECTOR),
        rerank=RerankStrategyInput(type=RerankTypes.RERANK),
    )
    
    # Keep sources scoring >= min_relevance_score, then delete temp resources
    filtered_urls = [...]  # (see full code)
    
    return {"filtered_urls": filtered_urls, "reasoning": "..."}

Python advantages:

  • asyncio.gather() for parallel operations (cleaner than Promise.allSettled)

  • List comprehensions for filtering

  • Clean exception handling with return_exceptions=True

Full tool implementations: See GitHub - tools.py for all 9 tools (~300 lines).

Step 4: Create Agno Agent (2 min)

File: deep_research/agent.py

"""Deep Research Agent using Agno."""
from agno.agent import Agent
from agno.models.openai import OpenAIChat

from . import tools


research_agent = Agent(
    name="Deep Research Agent",
    
    model=OpenAIChat(id="gpt-4o"),
    
    instructions="""You are an autonomous research agent using semantic memory.

Your workflow:
1. Create workflow + collection with entity extraction
2. Ingest seed URL or search results
3. Extract entities from knowledge graph
4. Select top 5 entities (PERSON, ORGANIZATION, CATEGORY)
5. Search web for each entity (10 results via Exa)
6. Filter search results BEFORE ingesting (use filter_search_results!)
7. Batch ingest only filtered sources
8. Check convergence (use detect_diminishing_returns - stop if <30%)
9. Generate comprehensive report

Always filter before ingesting.""",
    
    # Agno: Just list functions - that's it!
    tools=[
        tools.create_workflow,
        tools.ingest_document,
        tools.ingest_batch,
        tools.extract_entities,
        tools.select_top_entities,
        tools.search_web,
        tools.filter_search_results,
        tools.detect_diminishing_returns,
        tools.generate_report,
    ],
    
    markdown=True,
)

Agno's simplicity: No createTool(), no tool IDs, no schemas. Just functions.

Step 5: Build CLI (3 min)

File: deep_research/main.py (abbreviated - see full code):

#!/usr/bin/env python3
import asyncio
import sys
from dotenv import load_dotenv
from rich.console import Console
from rich.panel import Panel

from .agent import research_agent

console = Console()


async def main():
    load_dotenv()
    
    # Parse args (same pattern as Mastra)
    url = None if "--url" not in sys.argv else sys.argv[sys.argv.index("--url") + 1]
    search_query = None if "--search" not in sys.argv else sys.argv[sys.argv.index("--search") + 1]
    
    # Comprehensive validation (env vars, args)
    # ... validation code ...
    
    # Polished header with rich
    console.print()
    console.print(Panel.fit(
        "[bold cyan]Deep Research Agent[/bold cyan]\n"
        "[dim]Powered by Agno + Graphlit[/dim]",
        border_style="cyan"
    ))
    console.print()
    
    console.print(f"[bold]🔍 Research Query:[/bold] '{search_query}'\n")
    console.print("[bold green]🚀 Starting research...[/bold green]\n")
    
    # Build the research prompt from the parsed args, then run with streaming
    prompt = (
        f"Research this URL in depth: {url}"
        if url
        else f"Research this topic: {search_query}"
    )
    await research_agent.aprint_response(prompt, stream=True)
    
    console.print("\n\n[bold green]✅ Research complete![/bold green]\n")


if __name__ == "__main__":
    asyncio.run(main())

Python advantages:

  • asyncio.run() handles event loop (simpler than Node.js setup)

  • rich library for beautiful CLI (like chalk + boxen + ora combined)

  • aprint_response() streams automatically with tool display

  • Clean, readable code
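If the manual sys.argv scanning in main() grows unwieldy, the same flags map cleanly onto the standard library's argparse. A sketch, matching this tutorial's CLI flags:

```python
import argparse

def parse_args(argv=None):
    """argparse equivalent of the manual sys.argv parsing in main()."""
    parser = argparse.ArgumentParser(prog="deep-research")
    group = parser.add_mutually_exclusive_group(required=True)
    group.add_argument("--url", help="seed URL to ingest")
    group.add_argument("--search", help="seed web-search query")
    parser.add_argument("--results", type=int, default=5, help="number of seed sources")
    parser.add_argument("--cleanup", action="store_true", help="delete created resources when done")
    return parser.parse_args(argv)

args = parse_args(["--search", "knowledge graphs", "--results", "3"])
print(args.search, args.results, args.cleanup)  # → knowledge graphs 3 False
```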


Running Your Agent

With uv (recommended):

uv run deep-research --search "knowledge graphs"

With pip:

pip install -e .
deep-research --url "https://en.wikipedia.org/wiki/RAG"

Save to file:

uv run deep-research --search "AI agents" > report.md

Cleanup after (deletes collection, workflow, and content):

uv run deep-research --search "test query" --cleanup

Note: Without --cleanup, content remains in your Graphlit account for exploration in the portal.

Alternative commands (all equivalent):

python -m deep_research --search "query"  # Works after install
python -m deep_research.main --search "query"  # Always works

Expected output:

Terminal (progress):

┌─────────────────────────────────┐
│ Deep Research Agent             │
│ Powered by Agno + Graphlit      │
└─────────────────────────────────┘

🔍 Research Query: 'knowledge graphs'
   (Starting with top 5 sources)

🚀 Starting research...

[Tool: create_workflow]
✓ Created collection and workflow

[Tool: search_web]  
✓ Found 5 seed sources

[Tool: ingest_batch]
✓ Ingested 5 sources

[Tool: extract_entities]
✓ Extracted 12 entities

[Tool: search_web]
✓ Searched for 5 entities

[Tool: filter_search_results]
✓ Analyzed 50 sources. Kept 8 (relevance >=0.5)

[Tool: ingest_batch]
✓ Ingested 8 filtered sources

[Tool: detect_diminishing_returns]
✓ Novelty: 0.42 - Continue

[Tool: generate_report]

# Research Report: Knowledge Graphs

## Executive Summary
...

✅ Research complete!

Production Patterns

Performance Optimizations

Parallel operations:

import asyncio

# Search all entities concurrently
tasks = [search_web(entity["name"]) for entity in entities]
results = await asyncio.gather(*tasks, return_exceptions=True)

Synchronous ingestion:

# No polling - content ready when call returns
await graphlit.client.ingest_uri(
    uri=url,
    is_synchronous=True,  # Blocks until processed
    workflow=EntityReferenceInput(id=workflow_id),
)

Graceful error handling:

# Some sources fail? Continue with successful ones
results = await asyncio.gather(*tasks, return_exceptions=True)
successful = [r for r in results if not isinstance(r, Exception)]

Pre-filtering:

# Analyze 50, ingest only 8
filtered = await filter_search_results(
    search_results=all_results,
    query=query,
    max_results=5,
    min_relevance_score=0.5
)

Typical Session Metrics

Without filtering:

  • Sources processed: ~50

  • Processing time: 2-3 minutes

  • Quality: Significant noise

With filtering:

  • Sources processed: ~8

  • Processing time: 30-45 seconds

  • Quality: High signal-to-noise ratio

Agno Performance:

  • Agent startup: <0.003ms (5000x faster than LangGraph)

  • Memory usage: ~6.5KB per agent (50x less)

  • Report generation: 5-10 seconds


Agno vs Other Frameworks

Feature          | Agno            | Mastra        | LangGraph
-----------------|-----------------|---------------|----------------
Language         | Python          | TypeScript    | Python
Speed            | 5000x faster    | Fast          | Baseline
Memory           | 50x less        | Standard      | Standard
Tool Definition  | Just functions  | createTool()  | @tool decorator
Schema Required  | No (docstrings) | Yes (Zod)     | Yes (Pydantic)
Code Size        | ~460 lines      | ~750 lines    | ~800 lines
Learning Curve   | Easy            | Medium        | Hard

Choose Agno when:

  • ✅ You prefer Python

  • ✅ You want maximum performance

  • ✅ You want simpler code

  • ✅ You're building high-throughput systems


Next Steps

Try It Out

git clone https://github.com/graphlit/graphlit-samples.git
cd graphlit-samples/python/agno-deep-research
uv sync
cp .env.example .env
# Add credentials
uv run python -m deep_research.main --search "your query"

Extend It

Domain-specific entities:

Medical research:

extracted_types=[
    ObservableTypes.MEDICALCONDITION,
    ObservableTypes.DRUG,
    ObservableTypes.PERSON,  # Researchers
]

Legal research:

extracted_types=[
    ObservableTypes.LEGALCASE,
    ObservableTypes.CONTRACT,
    ObservableTypes.ORGANIZATION,  # Law firms
]

Business intelligence:

extracted_types=[
    ObservableTypes.PRODUCT,
    ObservableTypes.EVENT,
    ObservableTypes.ORGANIZATION,  # Companies
]

Multi-pass research:

  • Extract entities from Layer 2 results

  • Research 2-3 passes deep

  • Configurable depth limits
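The depth-limited expansion described above can be sketched as a breadth-first loop. The `discover` callable is a stub standing in for one full search-filter-ingest-extract pass; the entity graph here is purely illustrative.

```python
def multi_pass(seed, discover, max_depth=2):
    """Depth-limited research expansion: process seeds, then the entities they surface."""
    visited, frontier, order = set(), [(seed, 0)], []
    while frontier:
        entity, depth = frontier.pop(0)
        if entity in visited or depth > max_depth:
            continue
        visited.add(entity)
        order.append(entity)
        # In the real agent, discover() would search, filter, ingest, and extract entities
        for child in discover(entity):
            frontier.append((child, depth + 1))
    return order

graph = {"RAG": ["BERT", "vector databases"], "BERT": ["transformers"]}
print(multi_pass("RAG", lambda e: graph.get(e, []), max_depth=1))
# → ['RAG', 'BERT', 'vector databases']  (transformers is at depth 2, skipped)
```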

Real-time monitoring:

  • Create Exa feeds for discovered entities

  • Auto-expand knowledge base daily

FastAPI server: Agno can expose an agent as an ASGI/FastAPI app. The helper below reflects one Agno release and is an assumption; check your Agno version's docs for the current serving API:

from agno.agent import Agent
from agno.app.fastapi import FastAPIApp  # module path may differ across Agno versions

agent = Agent(tools=[...])
app = FastAPIApp(agents=[agent]).get_app()  # ASGI app; serve with uvicorn



Summary

You've learned to build a production-ready autonomous research agent in Python:

Key innovations (same as Mastra):

  1. Pre-ingestion filtering with native reranker

  2. Autonomous convergence detection

  3. Summary-based RAG for scale

  4. Entity-driven discovery

Agno advantages:

  • 5000x faster execution

  • 50x less memory

  • Simpler code (460 vs 750 lines)

  • No complex schemas

  • Clean Python async/await

Time investment: 30-40 minutes Value delivered: Production-ready patterns, weeks of infrastructure eliminated

This approach works for competitive intelligence, market research, technical deep-dives, and any multi-source synthesis.


Complete implementation: GitHub Repository
