Use Case: Configure Workflow for Entity Extraction
User Intent
Operation
Prerequisites
Complete Code Example (TypeScript)
import { Graphlit } from 'graphlit-client';
import {
  FilePreparationServiceTypes,
  EntityExtractionServiceTypes,
  ObservableTypes,
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Create workflow with entity extraction
const workflow = await graphlit.createWorkflow({
  name: "Entity Extraction Workflow",
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Document
      }
    }]
  },
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        extractedTypes: [
          ObservableTypes.Person,
          ObservableTypes.Organization,
          ObservableTypes.Place,
          ObservableTypes.Event
        ]
      }
    }]
  }
});

console.log(`Created workflow: ${workflow.createWorkflow.id}`);
console.log(`Extracting: Person, Organization, Place, Event`);

// Use workflow with content ingestion
const content = await graphlit.ingestUri(
  'https://example.com/document.pdf',
  'Entity Extraction Doc',
  undefined,
  undefined,
  true,
  { id: workflow.createWorkflow.id }
);
console.log(`Ingesting content with entity extraction...`);
Key differences: snake_case methods, enum values
Step-by-Step Explanation
Step 1: Choose Extraction Type
Step 2: Select Entity Types
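As a minimal sketch of this step, the helper below builds the same workflow input shape used in the complete example above from a caller-supplied list of entity types. The `EntityTypeSketch` union, the `WorkflowInputSketch` interface, and `buildExtractionWorkflow` are hypothetical names defined locally so the snippet runs without the SDK. The string values are placeholders for the members of `ObservableTypes`, `EntityExtractionServiceTypes`, and `FilePreparationServiceTypes`, not the SDK's actual enum values.

```typescript
// Hypothetical local stand-ins (assumption): real code would import the
// enums from 'graphlit-client/dist/generated/graphql-types' instead.
type EntityTypeSketch = "Person" | "Organization" | "Place" | "Event";

interface WorkflowInputSketch {
  name: string;
  preparation: { jobs: { connector: { type: string } }[] };
  extraction: {
    jobs: { connector: { type: string; extractedTypes: EntityTypeSketch[] } }[];
  };
}

// Build a workflow input that mirrors the shape of the complete example,
// with the entity types chosen by the caller.
function buildExtractionWorkflow(
  name: string,
  types: EntityTypeSketch[]
): WorkflowInputSketch {
  // Drop repeated types while keeping the caller's order.
  const extractedTypes = types.filter((t, i) => types.indexOf(t) === i);
  return {
    name,
    preparation: { jobs: [{ connector: { type: "Document" } }] },
    extraction: { jobs: [{ connector: { type: "ModelText", extractedTypes } }] },
  };
}

const peopleAndOrgs = buildExtractionWorkflow("People and Orgs", [
  "Person",
  "Organization",
  "Person",
]);
console.log(JSON.stringify(peopleAndOrgs.extraction.jobs[0].connector));
```

With the real SDK, an object of this shape would be passed to `graphlit.createWorkflow`, as in the complete example.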
Step 3: Add Preparation Stage
Step 4: Configure Model (Optional)
Configuration Options
Changing Extraction Models
Vision Model Extraction (for PDFs with Images)
Multiple Extraction Jobs
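Because `extraction.jobs` is an array in the complete example, one plausible way to configure several jobs is to add more entries to it, each with its own connector and entity types. The sketch below assumes that reading. `ExtractionJobSketch` and `makeExtractionJobs` are hypothetical local names, and the string values stand in for the SDK enum members.

```typescript
// Hypothetical local type (assumption): mirrors one entry of
// `extraction.jobs` from the complete example.
interface ExtractionJobSketch {
  connector: { type: string; extractedTypes: string[] };
}

// Turn (connector type, entity types) pairs into extraction job entries.
function makeExtractionJobs(
  specs: Array<[string, string[]]>
): ExtractionJobSketch[] {
  return specs.map(([type, extractedTypes]) => ({
    connector: { type, extractedTypes },
  }));
}

// Two jobs in one extraction stage, each targeting different entity types.
const extractionStage = {
  jobs: makeExtractionJobs([
    ["ModelText", ["Person", "Organization"]],
    ["ModelText", ["Place", "Event"]],
  ]),
};

console.log(`Configured ${extractionStage.jobs.length} extraction jobs`);
```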
Variations
Variation 1: Minimal Extraction (Fast)
Variation 2: Comprehensive Extraction
Variation 3: Medical Content Extraction
Variation 4: Audio/Video Transcription + Extraction
Variation 5: GitHub Repository Analysis
Common Issues & Solutions
Issue: No Entities Extracted
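One possible cause is a configuration gap: no workflow was passed at ingestion, or the workflow has no extraction jobs. The following is a hypothetical pre-flight check, not an SDK feature. `WorkflowCheckSketch` and `findExtractionGaps` are illustrative names, and the check inspects only a local object with the shape used in this guide.

```typescript
// Hypothetical shape (assumption): a loose mirror of the workflow input
// used in this guide, with every field optional so gaps can be detected.
interface WorkflowCheckSketch {
  extraction?: {
    jobs?: { connector?: { type?: string; extractedTypes?: string[] } }[];
  };
}

// Return human-readable reasons why a workflow might extract nothing.
function findExtractionGaps(
  workflow: WorkflowCheckSketch | undefined
): string[] {
  if (!workflow) {
    return ["no workflow was passed to ingestion"];
  }
  const gaps: string[] = [];
  const jobs = workflow.extraction?.jobs ?? [];
  if (jobs.length === 0) {
    gaps.push("workflow has no extraction jobs");
  }
  jobs.forEach((job, i) => {
    if (!job.connector?.type) {
      gaps.push(`extraction job ${i} has no connector type`);
    }
  });
  return gaps;
}

console.log(findExtractionGaps({ extraction: { jobs: [] } }));
```

An empty result only means these particular gaps are absent; it does not confirm that extraction will succeed.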
Issue: Too Many Low-Quality Entities
Issue: Extraction Too Slow
Issue: Wrong Entity Types Extracted
Developer Hints
Model Selection Guidelines
Entity Type Selection Strategy
Performance Considerations
Confidence Threshold Recommendations
Cost Optimization