Agno (Python)
Build an autonomous AI research agent in Python with Agno and Graphlit: 5000x faster with simpler code
⏱️ Time: 30-40 minutes 🎯 Level: Advanced 💻 SDK: Python (Agno framework)
What You'll Learn
In this tutorial, you'll build a production-ready research agent in Python that:
✅ Extracts entities from documents using Graphlit's knowledge graph
✅ Performs multi-hop web research (searches for discovered entities)
✅ Filters sources before ingesting using native reranking
✅ Detects convergence automatically (knows when to stop)
✅ Synthesizes multi-source reports with citations
Why Agno: 5000x faster than LangGraph, 50x less memory. Simple Python functions as tools, with no decorators and no complex schemas.
What You'll Build
Same autonomous research agent, Python implementation:
Ingests seed source - URL or search results
Discovers entities - From your knowledge graph
Researches each entity - Exa search, 10 sources per entity
Filters intelligently - Analyzes 50, ingests only top 8
Detects convergence - Stops at novelty score <30%
Synthesizes report - Markdown with citations
Example: Wikipedia on "RAG" → 15 entities → 50 sources → 8 filtered → 2000-word report in ~45 seconds
Full code: GitHub
Prerequisites
Required:
Python 3.11+
Graphlit account + credentials
Package manager: uv (recommended, `curl -LsSf https://astral.sh/uv/install.sh | sh`) or pip
Recommended:
Complete Quickstart (7 minutes)
Complete Knowledge Graph tutorial (20 minutes)
Why Agno + Python?
Agno's Advantages
Performance:
5000x faster than LangGraph (~2-3 microseconds per agent)
50x less memory (~6.5KB per agent vs 325KB)
Simplicity:
Tools are just Python functions (no decorators!)
No complex schemas (docstrings = tool descriptions)
Clean async/await syntax
Compare Tool Definition:
Mastra (TypeScript):
Agno (Python):
460 lines of Python vs 750 lines of TypeScript for the same functionality.
Why This Matters: What Graphlit Handles
Before we dive into building, understand what Graphlit provides so you don't have to build it:
Infrastructure (Weeks → Hours)
✅ File parsing - PDFs, DOCX, audio, video (30+ formats)
✅ Vector database - Managed Qdrant, auto-scaled
✅ Multi-tenant isolation - Each user gets an isolated environment
✅ GraphQL API - Auto-generated, authenticated
Intelligence (Months → API Calls)
✅ Automatic entity extraction - LLM-powered workflow extracts Person, Organization, Category during ingestion
✅ Knowledge graph - Built on the Schema.org/JSON-LD standard, relationships auto-created
✅ Native reranker - Fast, accurate relevance scoring (enables our pre-filtering!)
✅ Exa search built-in - No separate API key needed, semantic web search included
✅ Summary-based RAG - Scales to 100+ documents via optimized summaries
Time savings: Estimated 12-14 weeks of infrastructure development you skip.
Production proof: This pattern is used in Zine, serving thousands of users with millions of documents.
The Key Innovation: Pre-Ingestion Filtering
Most research implementations blindly ingest everything they find. This creates noise and wastes processing.
The breakthrough: Analyze sources before fully ingesting them.
Here's the pattern:
Quick ingest to temporary collection (lightweight)
Use Graphlit's native reranker to score relevance
Filter out low-scoring sources (<0.5 relevance)
Only fully ingest top 5-8 sources
Delete temporary collection
Why this works: Graphlit's native reranker is fast enough (~2 seconds) to analyze 50 sources before deciding which to fully process.
Result: Process 8 sources instead of 50. Faster, higher quality, better signal-to-noise.
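The score-and-filter step of this pattern can be sketched in a few lines of pure Python. The relevance scores here are mocked; in the real agent they come from Graphlit's native reranker, and the function name and thresholds are illustrative:

```python
def filter_sources(scored_sources: list[tuple[str, float]],
                   min_score: float = 0.5,
                   max_keep: int = 8) -> list[str]:
    """Keep only sources above the relevance threshold, best first."""
    # Drop low scorers, sort by score descending, cap at max_keep.
    keepers = [(url, s) for url, s in scored_sources if s >= min_score]
    keepers.sort(key=lambda pair: pair[1], reverse=True)
    return [url for url, _ in keepers[:max_keep]]
```

With 50 scored candidates, this typically leaves the 5-8 highest-relevance sources to fully ingest; everything below the 0.5 cutoff is discarded along with the temporary collection.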
The 5-Phase Research Algorithm
Phase 1: Seed Acquisition
Two starting modes:
URL Mode - Start from a specific source:
Best for: Research papers, documentation, whitepapers
Search Mode - Discover seed sources automatically:
Best for: Open-ended research, new topics
Phase 2: Entity-Driven Discovery
Instead of keyword-based research, let the knowledge graph drive discovery:
Automatic extraction: Entities extracted during ingestion (no separate step!)
Types: Person, Organization, Category (concepts/technical terms)
Ranking: By occurrence count and semantic importance
Selection: Top 5 become research seeds
Why entity-driven works: A RAG paper mentions "vector databases" and "BERT"; those naturally become your next research directions. This mimics how a human researcher works.
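The ranking-and-selection step can be sketched with a simple occurrence count. In the real agent, entities and their counts come from Graphlit's knowledge graph; plain mention strings stand in here for illustration:

```python
from collections import Counter

def top_entities(mentions: list[str], k: int = 5) -> list[str]:
    """Rank extracted entities by occurrence count; the top k become research seeds."""
    # Counter.most_common sorts by frequency, highest first.
    return [name for name, _ in Counter(mentions).most_common(k)]
```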
Phase 3: Intelligent Expansion
For each entity:
Search Exa for 10 related sources
Pre-filter before ingesting (the key innovation!)
Only ingest top 3-5 highest-quality sources
The filtering workflow:
Benefit: Analyze 50, process 8. Significantly faster with better quality.
Phase 4: Convergence Detection
Automatically detect when research has plateaued:
Novelty scoring algorithm:
After ingesting new sources, rerank ALL content by relevance to query
Check how many recent sources appear in top 10
Calculate novelty score:
novelty = recent_in_top_10 / total_recent
If the score is below 30%, diminishing returns are detected → stop
Example:
Ingested 5 new sources
Reranked all 25 total sources
Only 1 new source in top 10
Novelty: 1/5 = 20% → Stop researching
Why this works: If new sources don't rank highly vs existing content, they're redundant. Agent stops automatically, no manual intervention.
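The convergence check above is small enough to sketch in full. The function names are illustrative; the logic matches the novelty formula and 30% threshold described in this phase:

```python
def novelty_score(reranked: list[str], recent: set[str], top_k: int = 10) -> float:
    """Fraction of recently ingested sources that rank in the top k after reranking."""
    recent_in_top = sum(1 for source_id in reranked[:top_k] if source_id in recent)
    return recent_in_top / len(recent) if recent else 0.0

def should_stop(reranked: list[str], recent: set[str], threshold: float = 0.3) -> bool:
    # Below-threshold novelty means new sources are redundant -> converged.
    return novelty_score(reranked, recent) < threshold
```

Replaying the worked example: 25 sources reranked, 5 recently ingested, only one of them in the top 10, gives a novelty of 0.2 and triggers the stop.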
Phase 5: Multi-Source Synthesis
Traditional RAG struggles beyond 10-20 documents. We scale to 100+:
Summary-based RAG approach:
Create conversation scoped to research collection
Use `publish_contents()`, which operates on optimized summaries
LLM synthesizes across all sources simultaneously
Citations automatically included
Python implementation: see the full code on GitHub.
Why it scales: Operates on summaries, not full content. Fast, accurate, handles 100+ sources.
Implementation: Step-by-Step
Step 1: Project Setup (3 min)
With uv (recommended - faster than pip):
Install uv if you haven't already:
Create project:
Or with pip:
Create .env:
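A minimal `.env` sketch is below. The variable names follow the convention used in Graphlit's samples; treat them as assumptions and copy the actual values (and names, if your SDK version differs) from your Graphlit project settings:

```
GRAPHLIT_ORGANIZATION_ID=your-organization-id
GRAPHLIT_ENVIRONMENT_ID=your-environment-id
GRAPHLIT_JWT_SECRET=your-jwt-secret
```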
Step 2: Singleton Graphlit Client (1 min)
File: deep_research/graphlit_client.py
Why singleton: Same pattern as Mastra; one shared instance, efficient.
Note: We load dotenv here so environment variables are available when the module is imported.
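The singleton pattern itself can be sketched as follows. The `Graphlit` class here is a stand-in stub; the real module would import it from the `graphlit` SDK after calling `load_dotenv()`:

```python
from functools import lru_cache

class Graphlit:
    """Stand-in stub for the real SDK client (hypothetical)."""

@lru_cache(maxsize=1)
def get_client() -> Graphlit:
    # lru_cache guarantees the constructor runs exactly once;
    # every tool then shares the same client instance.
    return Graphlit()
```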
Step 3: Build Tools (15 min)
Here's where Agno shines: simple Python functions!
File: deep_research/tools.py
Compare to Mastra:
No `createTool()` wrapper
No Zod schemas
Docstring = tool description (Agno reads it!)
Type hints = parameter validation
Clean async/await
Tool 3: Pre-Ingestion Filtering (abbreviated - see full code):
Python advantages:
`asyncio.gather()` for parallel operations (cleaner than `Promise.allSettled`)
List comprehensions for filtering
Clean exception handling with `return_exceptions=True`
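A minimal sketch of that parallel pattern: gather with `return_exceptions=True` so one failed fetch doesn't abort the whole batch. The URLs and the `fetch` function are illustrative stand-ins for the real ingest/search calls:

```python
import asyncio

async def fetch(url: str) -> str:
    # Placeholder for a real ingest or search call.
    if "bad" in url:
        raise ValueError(f"failed: {url}")
    return f"content of {url}"

async def fetch_all(urls: list[str]) -> list[str]:
    # return_exceptions=True turns failures into returned values
    # instead of cancelling the gather; filter them out afterwards.
    results = await asyncio.gather(*(fetch(u) for u in urls),
                                   return_exceptions=True)
    return [r for r in results if not isinstance(r, Exception)]
```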
Step 4: Create Agno Agent (2 min)
File: deep_research/agent.py
Agno's simplicity: No createTool(), no tool IDs, no schemas. Just functions.
Step 5: Build CLI (3 min)
File: deep_research/main.py (abbreviated - see full code):
Python advantages:
`asyncio.run()` handles the event loop (simpler than Node.js setup)
`rich` library for beautiful CLI (like chalk + boxen + ora combined)
`aprint_response()` streams automatically with tool display
Clean, readable code
Running Your Agent
With uv (recommended):
With pip:
Save to file:
Cleanup after (deletes collection, workflow, and content):
Note: Without --cleanup, content remains in your Graphlit account for exploration in the portal.
Alternative commands (all equivalent):
Expected output:
Terminal (progress):
Production Patterns
Performance Optimizations
Parallel operations:
Synchronous ingestion:
Graceful error handling:
Pre-filtering:
Typical Session Metrics
Without filtering:
Sources processed: ~50
Processing time: 2-3 minutes
Quality: Significant noise
With filtering:
Sources processed: ~8
Processing time: 30-45 seconds
Quality: High signal-to-noise ratio
Agno Performance:
Agent startup: <0.003ms (5000x faster than LangGraph)
Memory usage: ~6.5KB per agent (50x less)
Report generation: 5-10 seconds
Agno vs Other Frameworks
| Feature | Agno | Mastra | LangGraph |
| --- | --- | --- | --- |
| Language | Python | TypeScript | Python |
| Speed | 5000x faster | Fast | Baseline |
| Memory | 50x less | Standard | Standard |
| Tool Definition | Just functions | `createTool()` | `@tool` decorator |
| Schema Required | No (docstrings) | Yes (Zod) | Yes (Pydantic) |
| Code Size | ~460 lines | ~750 lines | ~800 lines |
| Learning Curve | Easy | Medium | Hard |
Choose Agno when:
✅ You prefer Python
✅ You want maximum performance
✅ You want simpler code
✅ You're building high-throughput systems
Next Steps
Try It Out
Extend It
Domain-specific entities:
Medical research:
Legal research:
Business intelligence:
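One illustrative way to express that domain focus is a mapping from domain to the entity types worth following. The type names below are hypothetical examples, not a Graphlit schema:

```python
# Hypothetical mapping: which entity types to prioritize per research domain.
DOMAIN_ENTITY_TYPES: dict[str, list[str]] = {
    "medical": ["Person", "Organization", "MedicalCondition", "Drug"],
    "legal": ["Person", "Organization", "Legislation", "Event"],
    "business": ["Organization", "Person", "Product", "Place"],
}

def entity_types_for(domain: str) -> list[str]:
    """Fall back to the tutorial's default types for unknown domains."""
    return DOMAIN_ENTITY_TYPES.get(domain, ["Person", "Organization", "Category"])
```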
Multi-pass research:
Extract entities from Layer 2 results
Research 2-3 passes deep
Configurable depth limits
Real-time monitoring:
Create Exa feeds for discovered entities
Auto-expand knowledge base daily
FastAPI server (Agno built-in!):
Learn More
Related Tutorials:
Mastra (TypeScript) - Same algorithm, TypeScript
Knowledge Graph - Entity extraction deep-dive
Production Deployment - Multi-tenant patterns
Production Example:
Zine Case Study - Real app serving thousands
Resources:
Summary
You've learned to build a production-ready autonomous research agent in Python:
Key innovations (same as Mastra):
Pre-ingestion filtering with native reranker
Autonomous convergence detection
Summary-based RAG for scale
Entity-driven discovery
Agno advantages:
5000x faster execution
50x less memory
Simpler code (460 vs 750 lines)
No complex schemas
Clean Python async/await
Time investment: 30-40 minutes
Value delivered: Production-ready patterns, weeks of infrastructure eliminated
This approach works for competitive intelligence, market research, technical deep-dives, and any multi-source synthesis.
Complete implementation: GitHub Repository