Agno (Python)

Build an autonomous AI research agent in Python with Agno and Graphlit - 5000x faster with simpler code

⏱️ Time: 30-40 minutes 🎯 Level: Advanced 💻 SDK: Python (Agno framework)

What You'll Learn

In this tutorial, you'll build a production-ready research agent in Python that:

  • ✅ Extracts entities from documents using Graphlit's knowledge graph

  • ✅ Performs multi-hop web research (searches for discovered entities)

  • ✅ Filters sources before ingesting using native reranking

  • ✅ Detects convergence automatically (knows when to stop)

  • ✅ Synthesizes multi-source reports with citations

Why Agno: 5000x faster than LangGraph, 50x less memory. Simple Python functions as tools - no decorators, no complex schemas.


What You'll Build

The same autonomous research agent as the Mastra tutorial, implemented in Python:

  1. Ingests seed source - URL or search results

  2. Discovers entities - From your knowledge graph

  3. Researches each entity - Exa search, 10 sources per entity

  4. Filters intelligently - Analyzes 50, ingests only top 8

  5. Detects convergence - Stops at novelty score <30%

  6. Synthesizes report - Markdown with citations

Example: Wikipedia on "RAG" → 15 entities → 50 sources → 8 filtered → 2000-word report in ~45 seconds

🔗 Full code: GitHub


Prerequisites

  • A Graphlit account with project credentials (organization ID, environment ID, JWT secret)

  • Python 3.10+ with uv or pip

  • An OpenAI API key for the agent's model

Why Agno + Python?

Agno's Advantages

Performance:

  • 5000x faster than LangGraph (~2-3 microseconds per agent)

  • 50x less memory (~6.5KB per agent vs 325KB)

Simplicity:

  • Tools are just Python functions (no decorators!)

  • No complex schemas (docstrings = tool descriptions)

  • Clean async/await syntax

Compare tool definitions. In Mastra (TypeScript), every tool needs a createTool() wrapper and a Zod input schema; in Agno (Python), a plain function is the tool and its docstring becomes the description. A minimal sketch (the word_count tool is illustrative, not from the repo):
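```python
from agno.agent import Agent
from agno.models.openai import OpenAIChat

def word_count(text: str) -> int:
    """Count the words in a piece of text."""
    # Agno reads the docstring as the tool description and the type
    # hints as the parameter schema -- no wrapper, no schema class.
    return len(text.split())

agent = Agent(model=OpenAIChat(id="gpt-4o"), tools=[word_count])
```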

460 lines of Python vs 750 lines of TypeScript for the same functionality.


Why This Matters: What Graphlit Handles

Before we dive into building, understand what Graphlit provides so you don't have to build it:

Infrastructure (Weeks → Hours)

  • ✅ File parsing - PDFs, DOCX, audio, video (30+ formats)

  • ✅ Vector database - Managed Qdrant, auto-scaled

  • ✅ Multi-tenant isolation - Each user gets isolated environment

  • ✅ GraphQL API - Auto-generated, authenticated

Intelligence (Months → API Calls)

  • ✅ Automatic entity extraction - LLM-powered workflow extracts Person, Organization, Category during ingestion

  • ✅ Knowledge graph - Built on Schema.org/JSON-LD standard, relationships auto-created

  • ✅ Native reranker - Fast, accurate relevance scoring (enables our pre-filtering!)

  • ✅ Exa search built-in - No separate API key needed, semantic web search included

  • ✅ Summary-based RAG - Scales to 100+ documents via optimized summaries

Time savings: Estimated 12-14 weeks of infrastructure development you skip.

Production proof: This pattern is used in Zine, serving thousands of users with millions of documents.


The Key Innovation: Pre-Ingestion Filtering

Most research implementations blindly ingest everything they find. This creates noise and wastes processing.

The breakthrough: Analyze sources before fully ingesting them.

Here's the pattern (sketched in code below):

  1. Quick ingest to temporary collection (lightweight)

  2. Use Graphlit's native reranker to score relevance

  3. Filter out low-scoring sources (<0.5 relevance)

  4. Only fully ingest top 5-8 sources

  5. Delete temporary collection

Why this works: Graphlit's native reranker is fast enough (~2 seconds) to analyze 50 sources before deciding which to fully process.

Result: Process 8 sources instead of 50. Faster, higher quality, better signal-to-noise.
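A minimal sketch of the scoring-and-selection step (steps 2-4), assuming the reranker has already returned a relevance score between 0 and 1 for each candidate; ScoredSource and select_sources are illustrative names, not the repo's API:

```python
from dataclasses import dataclass

@dataclass
class ScoredSource:
    url: str
    relevance: float  # 0-1 score from Graphlit's native reranker

def select_sources(
    candidates: list[ScoredSource],
    min_relevance: float = 0.5,
    max_sources: int = 8,
) -> list[ScoredSource]:
    """Keep only the sources worth fully ingesting: drop low scores, cap the count."""
    kept = [s for s in candidates if s.relevance >= min_relevance]
    kept.sort(key=lambda s: s.relevance, reverse=True)
    return kept[:max_sources]
```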


The 5-Phase Research Algorithm

Phase 1: Seed Acquisition

Two starting modes:

URL Mode - Start from a specific source:

Best for: Research papers, documentation, whitepapers

Search Mode - Discover seed sources automatically:

Best for: Open-ended research, new topics
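A sketch of how the two modes can be dispatched; the names are illustrative:

```python
from dataclasses import dataclass

@dataclass
class Seed:
    mode: str   # "url" or "search"
    value: str  # the URL to ingest, or the query to search with

def make_seed(arg: str) -> Seed:
    """URL mode for anything that looks like a link; search mode otherwise."""
    if arg.startswith(("http://", "https://")):
        return Seed(mode="url", value=arg)
    return Seed(mode="search", value=arg)
```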

Phase 2: Entity-Driven Discovery

Instead of keyword-based research, let the knowledge graph drive discovery:

  • Automatic extraction: Entities extracted during ingestion (no separate step!)

  • Types: Person, Organization, Category (concepts/technical terms)

  • Ranking: By occurrence count and semantic importance

  • Selection: Top 5 become research seeds

Why entity-driven works: A RAG paper mentions "vector databases" and "BERT" - those naturally become your next research directions, mimicking how a human researcher follows leads (see the sketch below).
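The ranking step reduces to counting how often each entity was observed (the semantic-importance weighting is omitted here); a minimal sketch:

```python
from collections import Counter

def top_entities(observed_names: list[str], k: int = 5) -> list[str]:
    """Rank extracted entity names by occurrence count; the top k seed the next round."""
    return [name for name, _ in Counter(observed_names).most_common(k)]

# A RAG paper's extractions might look like:
print(top_entities(["BERT", "vector databases", "BERT", "OpenAI", "BERT"]))
# -> ['BERT', 'vector databases', 'OpenAI']
```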

Phase 3: Intelligent Expansion

For each entity:

  1. Search Exa for 10 related sources

  2. Pre-filter before ingesting (the key innovation!)

  3. Only ingest top 3-5 highest-quality sources

The filtering workflow is the pre-ingestion pattern from above: quick-ingest each candidate into a temporary collection, rerank against the query, and keep only the top scorers.

Benefit: Analyze 50, process 8. Significantly faster with better quality.

Phase 4: Convergence Detection

Automatically detect when research has plateaued:

Novelty scoring algorithm:

  1. After ingesting new sources, rerank ALL content by relevance to query

  2. Check how many recent sources appear in top 10

  3. Calculate novelty score: recent_in_top_10 / total_recent

  4. If score <30%, diminishing returns detected → stop

Example:

  • Ingested 5 new sources

  • Reranked all 25 total sources

  • Only 1 new source in top 10

  • Novelty: 1/5 = 20% → Stop researching

Why this works: If new sources don't rank highly vs existing content, they're redundant. Agent stops automatically, no manual intervention.
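The check itself is simple arithmetic; a runnable sketch using the example above (names are illustrative):

```python
CONVERGENCE_THRESHOLD = 0.30  # stop once novelty drops below 30%

def novelty_score(recent_ids: set[str], top_ten_ids: list[str]) -> float:
    """Share of recently ingested sources that still rank in the reranked top 10."""
    hits = sum(1 for cid in top_ten_ids if cid in recent_ids)
    return hits / max(len(recent_ids), 1)

# 5 new sources, only "s1" made the top 10 after reranking all 25 sources:
score = novelty_score(
    {"s1", "s2", "s3", "s4", "s5"},
    ["s1", "o2", "o3", "o4", "o5", "o6", "o7", "o8", "o9", "o10"],
)
print(score, score < CONVERGENCE_THRESHOLD)  # 0.2 True -> stop researching
```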

Phase 5: Multi-Source Synthesis

Traditional RAG struggles beyond 10-20 documents. We scale to 100+:

Summary-based RAG approach:

  1. Create conversation scoped to research collection

  2. Use publish_contents() which operates on optimized summaries

  3. LLM synthesizes across all sources simultaneously

  4. Citations automatically included

Python implementation:
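A sketch of the synthesis tool. The publish_contents() call below follows the shape of the GraphQL publishContents mutation, but the Python parameter names, enum values, and response shape are assumptions - check tools.py in the repo for the real call:

```python
from graphlit_api import enums, input_types  # generated SDK modules; layout may differ

from deep_research.graphlit_client import get_client

async def synthesize_report(collection_id: str, query: str):
    """Synthesize one cited report across every source in the research collection."""
    client = get_client()
    # publish_contents() works from per-document summaries, so the LLM can
    # draw on 100+ sources without blowing the context window.
    return await client.client.publish_contents(
        summary_prompt=f"Summarize the findings relevant to: {query}",
        publish_prompt=f"Write a markdown research report, with citations, on: {query}",
        connector=input_types.ContentPublishingConnectorInput(
            type=enums.ContentPublishingServiceTypes.TEXT,
            format=enums.ContentPublishingFormats.MARKDOWN,
        ),
        filter=input_types.ContentFilter(
            collections=[input_types.EntityReferenceFilter(id=collection_id)]
        ),
    )
```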

Why it scales: Operates on summaries, not full content. Fast, accurate, handles 100+ sources.


Implementation: Step-by-Step

Step 1: Project Setup (3 min)

With uv (recommended - faster than pip):

Install uv if you haven't already:
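uv's documented install script (macOS/Linux; Windows uses the PowerShell installer):

```bash
curl -LsSf https://astral.sh/uv/install.sh | sh
```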

Create project:
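A setup sketch; the dependency names are the PyPI packages this tutorial's imports imply:

```bash
uv init deep-research
cd deep-research
uv add agno graphlit-client openai python-dotenv rich
```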

Or with pip:
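The same dependencies with pip:

```bash
pip install agno graphlit-client openai python-dotenv rich
```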

Create .env:
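The variable names follow the Graphlit SDK's standard credentials (copied from your project settings in the portal), plus the key Agno's OpenAI model reads:

```bash
GRAPHLIT_ORGANIZATION_ID=your-organization-id
GRAPHLIT_ENVIRONMENT_ID=your-environment-id
GRAPHLIT_JWT_SECRET=your-jwt-secret
OPENAI_API_KEY=sk-...
```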

Step 2: Singleton Graphlit Client (1 min)

File: deep_research/graphlit_client.py
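A minimal sketch of the module; the constructor arguments follow the Graphlit Python SDK:

```python
import os

from dotenv import load_dotenv
from graphlit import Graphlit

# Load .env at import time so credentials exist before the client is built.
load_dotenv()

_client: Graphlit | None = None

def get_client() -> Graphlit:
    """Return the one shared Graphlit instance, creating it on first use."""
    global _client
    if _client is None:
        _client = Graphlit(
            organization_id=os.environ["GRAPHLIT_ORGANIZATION_ID"],
            environment_id=os.environ["GRAPHLIT_ENVIRONMENT_ID"],
            jwt_secret=os.environ["GRAPHLIT_JWT_SECRET"],
        )
    return _client
```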

Why singleton: Same pattern as Mastra - one shared instance, efficient.

Note: We load dotenv here so environment variables are available when the module is imported.

Step 3: Build Tools (15 min)

Here's where Agno shines - simple Python functions!

File: deep_research/tools.py
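One representative tool as a sketch; the function name and response handling are illustrative (the repo has nine tools):

```python
from deep_research.graphlit_client import get_client

async def ingest_url(url: str) -> str:
    """Ingest a web page into Graphlit and return the new content ID.

    Agno uses this docstring as the tool description and the
    type hints as the parameter schema.
    """
    client = get_client()
    # is_synchronous=True waits for parsing and entity extraction to finish.
    response = await client.client.ingest_uri(uri=url, is_synchronous=True)
    return response.ingest_uri.id
```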

Compare to Mastra:

  • No createTool() wrapper

  • No Zod schemas

  • Docstring = tool description (Agno reads it!)

  • Type hints = parameter validation

  • Clean async/await

Tool 3: Pre-Ingestion Filtering (abbreviated - see full code):
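An abbreviated sketch. The reranking step is reduced to a hypothetical rerank_by_relevance() helper here, because the repo's version also creates the temporary collection and deletes it afterwards:

```python
import asyncio

from deep_research.graphlit_client import get_client

async def rerank_by_relevance(query: str, ids: list[str]) -> list[tuple[str, float]]:
    """Hypothetical stand-in for the native-reranker call, returning
    (content_id, relevance) pairs sorted best-first."""
    raise NotImplementedError  # see tools.py in the repo

async def filter_sources(query: str, urls: list[str], top_k: int = 8) -> list[str]:
    """Quick-ingest candidates in parallel, rerank, and keep only the best."""
    client = get_client()
    # Parallel quick-ingest; a failed URL becomes an exception in `results`
    # instead of aborting the whole batch.
    results = await asyncio.gather(
        *(client.client.ingest_uri(uri=u, is_synchronous=True) for u in urls),
        return_exceptions=True,
    )
    ids = [r.ingest_uri.id for r in results if not isinstance(r, BaseException)]
    scored = await rerank_by_relevance(query, ids)
    return [cid for cid, score in scored[:top_k] if score >= 0.5]
```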

Python advantages:

  • asyncio.gather() for parallel operations (cleaner than Promise.allSettled)

  • List comprehensions for filtering

  • Clean exception handling with return_exceptions=True

Full tool implementations: See GitHub - tools.py for all 9 tools (~300 lines).

Step 4: Create Agno Agent (2 min)

File: deep_research/agent.py
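A sketch; the tool imports mirror the examples above rather than the repo's exact names:

```python
from agno.agent import Agent
from agno.models.openai import OpenAIChat

from deep_research.tools import filter_sources, ingest_url  # ...plus the other tools

research_agent = Agent(
    model=OpenAIChat(id="gpt-4o"),
    tools=[ingest_url, filter_sources],  # plain functions -- no wrappers, no IDs
    instructions=[
        "Research the topic in phases: ingest a seed, discover entities,",
        "expand each entity with filtered web research, stop when novelty",
        "drops below 30%, then synthesize a cited markdown report.",
    ],
    markdown=True,
)
```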

Agno's simplicity: No createTool(), no tool IDs, no schemas. Just functions.

Step 5: Build CLI (3 min)

File: deep_research/main.py (abbreviated - see full code):
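An abbreviated sketch of the entry point (the repo's version adds rich progress panels and CLI flags):

```python
import asyncio
import sys

from deep_research.agent import research_agent

async def main() -> None:
    topic = " ".join(sys.argv[1:]) or "retrieval-augmented generation"
    # aprint_response() streams tokens and displays tool calls as they run.
    await research_agent.aprint_response(f"Research this topic: {topic}", stream=True)

if __name__ == "__main__":
    asyncio.run(main())
```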

Python advantages:

  • asyncio.run() handles event loop (simpler than Node.js setup)

  • rich library for beautiful CLI (like chalk + boxen + ora combined)

  • aprint_response() streams automatically with tool display

  • Clean, readable code


Running Your Agent

With uv (recommended):
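Assuming the module layout sketched above (check the repo README for the exact command):

```bash
uv run python -m deep_research.main "retrieval-augmented generation"
```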

With pip:
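The same entry point without uv:

```bash
python -m deep_research.main "retrieval-augmented generation"
```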

Save to file:
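Shell redirection works regardless of the CLI's own flags:

```bash
python -m deep_research.main "retrieval-augmented generation" > report.md
```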

Cleanup after (deletes collection, workflow, and content):
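Using the --cleanup flag with the same sketched entry point:

```bash
python -m deep_research.main "retrieval-augmented generation" --cleanup
```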

Note: Without --cleanup, content remains in your Graphlit account for exploration in the portal.

Alternative commands (all equivalent): the agent can be launched with uv run, with plain python, or via python -m, whichever fits your environment.

Expected output: phase-by-phase progress in the terminal (seed ingestion, entity discovery, filtering, convergence checks), followed by the synthesized markdown report.


Production Patterns

Performance Optimizations

  • Parallel operations - fan out searches and quick-ingests with asyncio.gather()

  • Synchronous ingestion - wait for parsing and entity extraction to complete (e.g. is_synchronous=True) so the next phase can query the graph immediately

  • Graceful error handling - return_exceptions=True keeps one failed source from aborting a batch

  • Pre-filtering - rerank candidates before full ingestion (the key innovation above)

Typical Session Metrics

Without filtering:

  • Sources processed: ~50

  • Processing time: 2-3 minutes

  • Quality: Significant noise

With filtering:

  • Sources processed: ~8

  • Processing time: 30-45 seconds

  • Quality: High signal-to-noise ratio

Agno Performance:

  • Agent startup: <0.003ms (5000x faster than LangGraph)

  • Memory usage: ~6.5KB per agent (50x less)

  • Report generation: 5-10 seconds


Agno vs Other Frameworks

| Feature | Agno | Mastra | LangGraph |
| --- | --- | --- | --- |
| Language | Python | TypeScript | Python |
| Speed | 5000x faster | Fast | Baseline |
| Memory | 50x less | Standard | Standard |
| Tool Definition | Just functions | createTool() | @tool decorator |
| Schema Required | No (docstrings) | Yes (Zod) | Yes (Pydantic) |
| Code Size | ~460 lines | ~750 lines | ~800 lines |
| Learning Curve | Easy | Medium | Hard |

Choose Agno when:

  • ✅ You prefer Python

  • ✅ You want maximum performance

  • ✅ You want simpler code

  • ✅ You're building high-throughput systems


Next Steps

Try It Out

Extend It

Domain-specific entities - steer extraction and expansion toward your domain, for example:

  • Medical research - drugs, diseases, clinical trials

  • Legal research - cases, statutes, jurisdictions

  • Business intelligence - competitors, products, markets

Multi-pass research:

  • Extract entities from Layer 2 results

  • Research 2-3 passes deep

  • Configurable depth limits

Real-time monitoring:

  • Create Exa feeds for discovered entities

  • Auto-expand knowledge base daily

FastAPI server (Agno built-in!):
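Agno ships FastAPI integration; as a version-agnostic sketch, a minimal hand-rolled endpoint around the agent looks like this (arun() is Agno's async run method; the file path in the comment is an assumption):

```python
from fastapi import FastAPI

from deep_research.agent import research_agent

app = FastAPI()

@app.post("/research")
async def research(topic: str) -> dict:
    # arun() executes the agent once and returns the full response object.
    response = await research_agent.arun(f"Research this topic: {topic}")
    return {"report": response.content}

# Run with: uvicorn deep_research.server:app --reload
# (assuming this file lives at deep_research/server.py)
```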

Learn More

Related Tutorials:

Production Example:

Resources:


Summary

You've learned to build a production-ready autonomous research agent in Python:

Key innovations (same as Mastra):

  1. Pre-ingestion filtering with native reranker

  2. Autonomous convergence detection

  3. Summary-based RAG for scale

  4. Entity-driven discovery

Agno advantages:

  • 5000x faster execution

  • 50x less memory

  • Simpler code (460 vs 750 lines)

  • No complex schemas

  • Clean Python async/await

Time investment: 30-40 minutes. Value delivered: production-ready patterns, weeks of infrastructure eliminated.

This approach works for competitive intelligence, market research, technical deep-dives, and any multi-source synthesis.


Complete implementation: GitHub Repository
