Workflows

Complete reference for Graphlit workflows - memory formation pipeline configuration

Workflows define how content is processed as it flows through Graphlit's memory formation pipeline. This is the authoritative reference for all workflow configuration options, defaults, and decision guidance.


Overview & Core Concepts

What Workflows Do

Workflows control the memory formation pipeline - how raw content transforms into structured, searchable semantic memory:

The Six Pipeline Stages (in execution order):

  1. Ingestion - Filter which content to accept (file types, paths)

  2. Indexing - Configure embedding model and vector storage

  3. Preparation - Extract text/markdown from files (PDFs, audio, images)

  4. Extraction - Identify entities (people, organizations, topics)

  5. Enrichment - Add external data (links, FHIR, Diffbot)

  6. Classification - Categorize content using LLMs

Additional Configuration:

  • Storage - Where files are stored (defaults to managed storage)

  • Actions - Post-processing webhooks and integrations

The Workflow Object

Key insight: All stages are optional. Graphlit has intelligent defaults.
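For orientation, here is a minimal sketch of creating a workflow with the graphlit-client TypeScript SDK. Only name is supplied, so every stage falls back to the defaults described below; the commented-out fields show where each optional stage would go. The SDK method and response shape are assumptions based on the SDK's standard create operations, so verify them against the current reference.

```typescript
import { Graphlit } from "graphlit-client";

// Credentials are taken from the GRAPHLIT_* environment variables (or passed to the constructor).
const client = new Graphlit();

// A name-only workflow behaves exactly like having no workflow at all.
const response = await client.createWorkflow({
  name: "Defaults Only",
  // preparation: { ... },    // override document/audio preparation
  // extraction: { ... },     // enable entity extraction
  // enrichment: { ... },     // add external enrichment
  // ingestion: { ... },      // filter which content is accepted
  // indexing: { ... },       // override the embedding model
  // classification: { ... }, // categorize content with LLMs
  // storage: { ... },        // bring your own storage
  // actions: [],             // webhooks and integrations
});

console.log(response.createWorkflow?.id);
```

Reference the returned workflow id when ingesting content or creating a feed to apply the workflow.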


Default Behavior (No Workflow)

What Happens Without a Workflow

Graphlit's Default Pipeline:

| Stage | Default Behavior | Speed |
| --- | --- | --- |
| Preparation | Intelligent preparation (see below) | ⚡ Fast |
| Extraction | ❌ No entity extraction | ⚡ Instant |
| Enrichment | ❌ No external enrichment | ⚡ Instant |
| Indexing | Project default embedding (text-embedding-ada-002) | ⚡ Fast |

Default Preparation: Intelligent Per-Format Processing

Graphlit automatically selects the best preparation method based on content type:

PDFs & Office Documents:

  • Azure AI Document Intelligence (Layout model)

  • ✅ Extracts text from PDFs, Word docs, PowerPoint

  • ✅ OCR for scanned documents

  • ✅ Basic table recognition

  • ✅ Layout analysis

  • ❌ Advanced table parsing

  • ❌ Image understanding (diagrams, charts)

  • ❌ Complex multi-column layouts

Audio & Video Files:

  • Deepgram Nova 2 Transcription

  • ✅ Automatic transcription

  • ✅ High accuracy

  • ✅ Multiple language support

  • ❌ No speaker diarization (unless you add a workflow)

Web Pages:

  • Built-in HTML Parser

  • ✅ Smart HTML extraction

  • ✅ JavaScript rendering (by default)

  • ✅ Markdown conversion

Email, Text, Markdown:

  • Built-in Parsers

  • ✅ Native format support

  • ✅ Metadata extraction

When default preparation is sufficient:

  • Simple text-heavy PDFs (80%+ of documents)

  • Audio/video transcription without speaker identification

  • Most web pages

  • Office documents

  • Standard email/text content

Default Indexing: Project Embedding Model

What it does:

  • ✅ Creates vector embeddings for semantic search

  • ✅ Chunks content intelligently

  • ✅ Stores in vector database

Default model: OpenAI text-embedding-ada-002 (if not configured otherwise)
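To rely entirely on these defaults, ingest content without referencing any workflow. A minimal sketch using the graphlit-client TypeScript SDK with a placeholder URL; the ingestUri parameter order is an assumption, so check the SDK reference.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

// No workflow referenced: Azure AI Document Intelligence prepares the PDF,
// the project default embedding model indexes it, and no entities are extracted.
const response = await client.ingestUri("https://example.com/whitepaper.pdf");

console.log(response.ingestUri?.id);
```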


When Do You Need a Workflow?

Decision Matrix

| Goal | Need Workflow? | Stage | Why |
| --- | --- | --- | --- |
| Extract text from simple PDF | ❌ No | - | Default Azure AI is fine |
| Extract text from complex PDF | ✅ Yes | Preparation | Use Reducto, Mistral OCR, or vision models |
| Handle images/diagrams in PDF | ✅ Yes | Preparation | Vision models understand images |
| Transcribe audio/video | ❌ No | - | Deepgram Nova 2 transcribes automatically |
| Speaker diarization or PII redaction | ✅ Yes | Preparation | Configure Deepgram options or use Assembly.AI |
| Extract entities (people, orgs) | ✅ Yes | Extraction | No extraction by default |
| Build knowledge graph | ✅ Yes | Extraction | Entity extraction required |
| Enrich with external data | ✅ Yes | Enrichment | Add Diffbot, FHIR, etc. |
| Use custom embedding model | ✅ Yes | Indexing | Override default embeddings |
| Filter content during ingestion | ✅ Yes | Ingestion | Path/type filtering |

Common Scenarios

Scenario 1: Simple Document Q&A

Scenario 2: Complex PDFs with Tables
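One way to handle this scenario, sketched with the graphlit-client TypeScript SDK: swap the default preparation for REDUCTO (or MISTRAL_DOCUMENT) and leave every other stage at its default. Enum values are written as their GraphQL names and the input is cast loosely; real code should use the SDK's generated enum types and verify field names against the schema.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Complex PDF Preparation",
  preparation: {
    jobs: [
      // Reducto handles complex tables and multi-column layouts better than the default.
      { connector: { type: "REDUCTO" } },
    ],
  },
} as any);

console.log(response.createWorkflow?.id);
```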

Scenario 3: Knowledge Graph from Documents


Workflow Stages

Preparation Stage

Purpose: Extract text, markdown, and metadata from raw files.

When you don't need it: Default Azure AI Document Intelligence handles most documents.

When you need it: Complex PDFs, audio transcription, high-quality markdown extraction.

Complete Configuration

| Type | Use Case | Speed | Quality | When to Use |
| --- | --- | --- | --- | --- |
| AZURE_DOCUMENT_INTELLIGENCE | Default for PDFs - most PDFs, Office docs | ⚡ Fast | ⭐⭐⭐ Good | Try this first (automatic default) |
| REDUCTO | Specialized PDF extraction - better than default | ⚡ Fast | ⭐⭐⭐⭐ Excellent | Try if default isn't good enough |
| MISTRAL_DOCUMENT | Mistral OCR for documents | ⚡⚡ Very Fast | ⭐⭐⭐⭐ Excellent | Alternative to Reducto |
| MODEL_DOCUMENT | Vision LLMs - general-purpose, not tuned for docs | ⚠️ Slower | ⭐⭐⭐⭐ Very Good | Advanced: after trying defaults & Reducto |
| DEEPGRAM | Default for audio/video - transcription | ⚡ Fast | ⭐⭐⭐⭐ Excellent | Automatic default |
| ASSEMBLY_AI | Audio/video with speaker diarization | ⚡ Fast | ⭐⭐⭐⭐ Excellent | Alternative to Deepgram |
| DOCUMENT | Explicit Azure AI configuration | ⚡ Fast | ⭐⭐⭐ Good | Rarely needed (use default) |
| EMAIL | Email message parsing | ⚡ Instant | ⭐⭐⭐⭐⭐ Perfect | Automatic default |
| PAGE | Web page extraction | ⚡ Fast | ⭐⭐⭐⭐ Excellent | Automatic default |

MODEL_DOCUMENT: Vision LLMs for Documents

Understanding the Options

Document preparation offers a spectrum of tools, each optimized for different needs and cost profiles:

Azure AI Document Intelligence (Default - $0)

  • Automatic OCR and layout analysis

  • Fast, handles most PDFs and Office documents

  • Included in your Graphlit subscription

Reducto / Mistral Document ($)

  • Specialized PDF extraction engines

  • Better at complex tables and multi-column layouts

  • Higher quality than default, but adds per-page cost

Vision LLMs (MODEL_DOCUMENT - $$$)

  • General-purpose vision models (GPT-4o, Claude, Gemini)

  • Understand content semantically, not just structurally

  • Can interpret diagrams, charts, and visual relationships

  • Highest cost (10x more than default), best flexibility

When to use vision LLMs:

  • Specialized tools don't capture the visual meaning you need

  • Documents require semantic understanding of images/diagrams

  • Need custom prompting or model-specific behavior

  • Complex visual documents where structure alone isn't enough

Properties:

Model Selection:

Cost vs. Capability:

  • Vision LLMs cost ~10x more per page than specialized tools

  • Trade higher cost for semantic understanding and flexibility

  • Best for documents where visual meaning matters, not just text extraction

Example:
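A sketch of a MODEL_DOCUMENT workflow using the graphlit-client TypeScript SDK. It assumes the vision model is referenced through a previously created specification; "VISION_SPEC_ID" is a placeholder, the modelDocument property name is an assumption, and enum values are written as GraphQL names with a loose cast.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Vision PDF Preparation",
  preparation: {
    jobs: [
      {
        connector: {
          type: "MODEL_DOCUMENT", // vision LLM preparation (~10x the cost of the default)
          modelDocument: {
            specification: { id: "VISION_SPEC_ID" }, // placeholder for a GPT-4o/Claude/Gemini specification
          },
        },
      },
    ],
  },
} as any);

console.log(response.createWorkflow?.id);
```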

DEEPGRAM: Audio Transcription (Enhanced)

Default: Deepgram Nova 2 is used automatically for audio/video files.

When to add a workflow (to enhance the default):

  • Enable speaker diarization (identify who's speaking)

  • Enable PII redaction

  • Use different Deepgram model

  • Configure language settings

Properties:

Models:

  • NOVA_2 - Best quality (recommended)

  • NOVA_2_MEDICAL - Medical terminology

  • NOVA_2_FINANCE - Financial terminology

  • NOVA_2_CONVERSATIONAL_AI - Real-time conversations

  • NOVA_2_VOICEMAIL - Voicemail transcription

  • NOVA_2_VIDEO - Video content

  • NOVA_2_PHONE_CALL - Phone calls

Example:
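A sketch that keeps Deepgram but turns on diarization and redaction, using the graphlit-client TypeScript SDK. The deepgram property names are assumptions based on the options listed above; verify them against the current schema.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Audio with Speakers",
  preparation: {
    jobs: [
      {
        connector: {
          type: "DEEPGRAM",
          deepgram: {
            model: "NOVA_2",
            enableSpeakerDiarization: true, // label who is speaking
            enableRedaction: true,          // redact PII from the transcript
          },
        },
      },
    ],
  },
} as any);

console.log(response.createWorkflow?.id);
```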

ASSEMBLY_AI: Alternative Audio Transcription

When to use (as an alternative to the default Deepgram):

  • Prefer Assembly.AI over Deepgram

  • Need their specific features

  • Already have Assembly.AI account/credits

Properties:

Models:

  • BEST - Highest accuracy (default)

  • NANO - Fastest, lower cost

Example:

Multi-Job Preparation

Use case: Different file types need different preparation methods.

Example:
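A sketch with two preparation jobs, one per content family, using the graphlit-client TypeScript SDK. It assumes each connector can be scoped with a fileTypes property; how Graphlit routes content types to jobs should be confirmed against the current schema.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Mixed Content Preparation",
  preparation: {
    jobs: [
      // Assumed: fileTypes scopes each job to matching content.
      { connector: { type: "REDUCTO", fileTypes: ["DOCUMENT"] } },
      { connector: { type: "ASSEMBLY_AI", fileTypes: ["AUDIO", "VIDEO"] } },
    ],
  },
} as any);

console.log(response.createWorkflow?.id);
```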

Web Page Capture Options

Properties (on PreparationWorkflowStageInput):

Default: disableSmartCapture: false, enableUnblockedCapture: false

When to adjust:
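For example, a sketch using the graphlit-client TypeScript SDK; the effect of each flag is inferred from its name (unblocked capture for pages that block automated crawlers, smart capture for readability-style extraction), so treat the comments as assumptions.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Web Capture Tuning",
  preparation: {
    disableSmartCapture: false,   // keep smart HTML extraction (the default)
    enableUnblockedCapture: true, // assumed: capture pages that block automated crawlers
  },
});

console.log(response.createWorkflow?.id);
```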

Auto-Summarization During Preparation

Properties:

SummarizationTypes:

  • CHAPTER - Summarize by chapter

  • PAGE - Summarize by page

  • SEGMENT - Summarize by segment

  • QUESTIONS - Generate questions

  • HEADLINES - Extract headlines

  • POSTS - Summarize posts (social media)

Example:
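A sketch assuming the preparation stage accepts a summarizations list using the types above; the summarizations and items property names are assumptions to verify against the schema. Uses the graphlit-client TypeScript SDK with enum values written as GraphQL names.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Prepare and Summarize",
  preparation: {
    summarizations: [
      { type: "QUESTIONS", items: 5 }, // generate questions during preparation
      { type: "HEADLINES", items: 3 }, // extract headlines during preparation
    ],
  },
} as any);

console.log(response.createWorkflow?.id);
```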


Extraction Stage

Purpose: Identify and extract entities (people, organizations, places, topics) to build knowledge graph.

Default: No extraction - entities are NOT extracted unless you add this stage.

When you need it:

  • Building knowledge graph

  • Entity-based search/filtering

  • Relationship discovery

  • Semantic understanding beyond keywords

Complete Configuration

EntityExtractionServiceTypes

| Type | Use Case | Quality |
| --- | --- | --- |
| MODEL_TEXT | Recommended - LLM-based extraction | ⭐⭐⭐⭐⭐ Excellent |
| AZURE_COGNITIVE_SERVICES_TEXT | Azure Text Analytics | ⭐⭐⭐ Good |
| MODEL_IMAGE | Extract from images | ⭐⭐⭐⭐ Excellent |
| AZURE_COGNITIVE_SERVICES_IMAGE | Azure Vision | ⭐⭐⭐ Good |

Properties:

Model Selection:

Observable Types (Standard Entities)

Complete list of built-in entity types:

Custom Entity Types

Use case: Domain-specific entities not covered by standard types.

Example:

Complete Example: Knowledge Graph Extraction
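A sketch of a knowledge-graph workflow using the graphlit-client TypeScript SDK. extractedTypes values follow the ObservableTypes enum, the optional specification id is a placeholder, and enum values are written as GraphQL names with a loose cast; verify property names against the schema.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Knowledge Graph",
  extraction: {
    jobs: [
      {
        connector: {
          type: "MODEL_TEXT", // LLM-based entity extraction
          extractedTypes: ["PERSON", "ORGANIZATION", "PLACE", "EVENT", "PRODUCT"],
          // modelText: { specification: { id: "EXTRACTION_SPEC_ID" } }, // optional: pin a specific LLM
        },
      },
    ],
  },
} as any);

console.log(response.createWorkflow?.id);
```

Content ingested with this workflow contributes entities and relationships to the knowledge graph in addition to being indexed for semantic search.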


Enrichment Stage

Purpose: Add external data to entities (links, FHIR medical data, Diffbot enrichment).

Default: No enrichment - external data is NOT added unless you configure this.

When you need it:

  • Medical applications (FHIR integration)

  • Web entity enrichment (Diffbot)

  • Link extraction from content

Complete Configuration

EntityEnrichmentServiceTypes

| Type | Use Case | External API |
| --- | --- | --- |
| DIFFBOT | Web entity enrichment | Diffbot API required |
| FHIR | Medical entity enrichment | FHIR endpoint required |

Properties:

Example:
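A sketch combining link handling with Diffbot entity enrichment, using the graphlit-client TypeScript SDK. The link strategy and connector property names are assumptions based on this section, the domain is a placeholder, and Diffbot API access must be configured separately.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Enriched Entities",
  enrichment: {
    link: {
      enableCrawling: true,            // assumed: follow links found in the content
      allowedDomains: ["example.com"], // placeholder domain filter
    },
    jobs: [
      { connector: { type: "DIFFBOT" } }, // requires Diffbot API access
    ],
  },
} as any);

console.log(response.createWorkflow?.id);
```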

FHIR Medical Enrichment

Use case: Enrich medical entities with FHIR (Fast Healthcare Interoperability Resources) data.

Properties:

Example:

Diffbot Enrichment

Use case: Enrich entities with data from Diffbot Knowledge Graph.

Properties:

Example:


Indexing Stage

Purpose: Create vector embeddings and store content in vector database for semantic search.

Default: Automatic indexing with the project's default embedding model (see Default Behavior above).

When you need this stage: Custom embedding model, specialized indexing requirements.

Complete Configuration

Most users don't need this stage - the default indexing works well. Use this only if you need:

  • Different embedding model than project default

  • Specialized indexing configuration

Example:


Ingestion Stage

Purpose: Filter which content gets ingested based on file type, path, or URL patterns.

Default: All content ingested - no filtering.

When you need it:

  • Web crawling (filter by URL path)

  • Selective file type ingestion

  • Exclude certain paths

Complete Configuration

Example: Web Crawling with Path Filters

Example: Selective File Types
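A sketch covering both examples, using the graphlit-client TypeScript SDK. It assumes the ingestion stage takes an if filter with file-type and path properties; the property names and path patterns are assumptions and placeholders to verify against the schema.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Filtered Ingestion",
  ingestion: {
    if: {
      fileTypes: ["DOCUMENT"],              // only ingest documents
      allowedPaths: ["^/docs/.*"],          // e.g. only crawl /docs/ pages
      excludedPaths: ["^/docs/archive/.*"], // skip archived pages
    },
  },
} as any);

console.log(response.createWorkflow?.id);
```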


Storage Stage

Purpose: Configure where and how content is stored.

Default: Managed storage - Graphlit handles all storage.

When you need it: Bring your own storage (Azure Blob, S3, Google Cloud Storage).

Complete Configuration

Most users don't need this - Graphlit's managed storage is recommended.


Classification Stage

Purpose: Classify content into categories using LLMs.

Default: No classification - content is not categorized unless you add this.

When you need it:

  • Content routing/filtering

  • Auto-categorization

  • Custom taxonomy

Complete Configuration

Example:


Workflow Actions

Purpose: Execute actions after content processing (webhooks, integrations).

Default: No actions - nothing happens after processing unless configured.

When you need it:

  • Webhook notifications

  • Integration triggers

  • Post-processing automation

Complete Configuration

Example: Webhook on Completion
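A sketch using the graphlit-client TypeScript SDK, assuming a webhook integration connector that takes a target uri; the enum spelling (WEB_HOOK vs. WEBHOOK) and the endpoint are assumptions and placeholders.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

const response = await client.createWorkflow({
  name: "Notify on Completion",
  actions: [
    {
      connector: {
        type: "WEB_HOOK",                            // webhook integration (confirm enum spelling)
        uri: "https://example.com/graphlit-webhook", // placeholder endpoint
      },
    },
  ],
} as any);

console.log(response.createWorkflow?.id);
```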


Complete API Reference

WorkflowInput (Top-Level)

All fields except name are optional. Graphlit provides intelligent defaults for missing stages.
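As orientation, here is the top-level shape in TypeScript notation. This mirrors the stages described in this document rather than the literal generated types; real code should use the generated WorkflowInput type from graphlit-client.

```typescript
// Illustrative shape only - not the generated SDK type.
interface WorkflowInputSketch {
  name: string;            // required
  ingestion?: object;      // filter which content is accepted
  indexing?: object;       // override the embedding model
  preparation?: object;    // text/markdown extraction, transcription
  extraction?: object;     // entity extraction for the knowledge graph
  enrichment?: object;     // links, Diffbot, FHIR
  classification?: object; // LLM-based categorization
  storage?: object;        // bring-your-own storage
  actions?: object[];      // webhooks and integrations
}
```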


Production Patterns

Pattern 1: Multi-Tenant Workflows

Pattern 2: Conditional Workflows (File Type Based)

Pattern 3: Zine Production Pattern

What Zine uses:

Key lessons from Zine (see the sketch after this list):

  • Single workflow for simplicity

  • Vision model preparation (complex Slack attachments)

  • Entity extraction for knowledge graph

  • Link extraction for context
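An illustrative combination of those four elements (not Zine's actual configuration), using the graphlit-client TypeScript SDK; the specification id is a placeholder, property names are the assumptions described earlier on this page, and enum values are written as GraphQL names with a loose cast.

```typescript
import { Graphlit } from "graphlit-client";

const client = new Graphlit();

// Single workflow: vision preparation + entity extraction + link extraction.
const response = await client.createWorkflow({
  name: "Production Pipeline",
  preparation: {
    jobs: [
      {
        connector: {
          type: "MODEL_DOCUMENT",
          modelDocument: { specification: { id: "VISION_SPEC_ID" } }, // placeholder vision spec
        },
      },
    ],
  },
  extraction: {
    jobs: [
      {
        connector: {
          type: "MODEL_TEXT",
          extractedTypes: ["PERSON", "ORGANIZATION", "PLACE", "LABEL"],
        },
      },
    ],
  },
  enrichment: {
    link: { enableCrawling: false }, // assumed: extract links without crawling them
  },
} as any);

console.log(response.createWorkflow?.id);
```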

Pattern 4: Cost Optimization


Summary

Key Takeaways:

  1. Most workflows are optional - Graphlit has intelligent defaults

  2. Default preparation is Azure AI Document Intelligence - works for 80%+ of documents

  3. No extraction by default - add extraction stage to build knowledge graph

  4. Vision models (MODEL_DOCUMENT) are 10x more expensive - only use for complex PDFs

  5. Audio is transcribed by default (Deepgram Nova 2) - add a workflow for speaker diarization or Assembly.AI

  6. All stages are composable - mix and match as needed

When in doubt: Start without a workflow and add stages only when you hit limitations.

