Use Case: Build Knowledge Graph from Emails
User Intent
Operation
Prerequisites
Complete Code Example (TypeScript)
import { Graphlit } from 'graphlit-client';
import {
  FeedTypes,
  FeedServiceTypes,
  EntityExtractionServiceTypes,
  ObservableTypes,
  ContentTypes,
  EntityState
} from 'graphlit-client/dist/generated/graphql-types';

// Initialize the client
const graphlit = new Graphlit();
console.log('=== Building Knowledge Graph from Emails ===\n');
// Step 1: Create extraction workflow
console.log('Step 1: Creating entity extraction workflow...');
const workflow = await graphlit.createWorkflow({
  name: "Email Entity Extraction",
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        extractedTypes: [
          ObservableTypes.Person,       // Senders, recipients, mentions
          ObservableTypes.Organization, // Companies from domains/signatures
          ObservableTypes.Event,        // Meeting mentions, deadlines
          ObservableTypes.Product,      // Products/services discussed
          ObservableTypes.Place         // Locations mentioned
        ]
      }
    }]
  }
});
console.log(`✓ Workflow: ${workflow.createWorkflow.id}\n`);
// Step 2: Create Gmail feed with OAuth
console.log('Step 2: Creating Gmail feed...');
const feed = await graphlit.createFeed({
  name: "My Gmail",
  type: FeedTypes.Email,
  email: {
    // Gmail service type; confirm the enum member name against your SDK version
    type: FeedServiceTypes.GoogleEmail,
    token: process.env.GOOGLE_OAUTH_TOKEN!, // From Developer Portal
    readLimit: 100,                         // Number of emails to sync
    includeAttachments: true                // Sync attachments too
  },
  workflow: { id: workflow.createWorkflow.id }
});
console.log(`✓ Feed: ${feed.createFeed.id}\n`);
// Step 3: Wait for email sync
console.log('Step 3: Syncing emails...');
let isDone = false;
while (!isDone) {
  const status = await graphlit.isFeedDone(feed.createFeed.id);
  isDone = status.isFeedDone.result;
  if (!isDone) {
    console.log(' Syncing... (checking again in 5s)');
    await new Promise(resolve => setTimeout(resolve, 5000));
  }
}
console.log('✓ Sync complete\n');
// Step 4: Query synced emails
console.log('Step 4: Querying synced emails...');
const emails = await graphlit.queryContents({
  types: [ContentTypes.Email],
  feeds: [{ id: feed.createFeed.id }]
});
console.log(`✓ Synced ${emails.contents.results.length} emails\n`);
// Step 5: Analyze email metadata
console.log('Step 5: Analyzing email senders...\n');
// Count how many emails each sender address appears on
const senders = new Map<string, number>();
emails.contents.results.forEach(email => {
  if (email.email?.from) {
    email.email.from.forEach(sender => {
      const address = sender.email || 'unknown';
      senders.set(address, (senders.get(address) || 0) + 1);
    });
  }
});
console.log('Top email senders:');
Array.from(senders.entries())
  .sort((a, b) => b[1] - a[1]) // Most frequent first
  .slice(0, 5)
  .forEach(([address, count]) => {
    console.log(` ${address}: ${count} emails`);
  });
console.log();
// Step 6: Query extracted entities
console.log('Step 6: Querying knowledge graph...\n');
// Get all people from emails
const people = await graphlit.queryObservables({
  filter: {
    types: [ObservableTypes.Person],
    states: [EntityState.Enabled]
  }
});
console.log(`People extracted: ${people.observables.results.length}`);
// Get all organizations
const orgs = await graphlit.queryObservables({
  filter: {
    types: [ObservableTypes.Organization],
    states: [EntityState.Enabled]
  }
});
console.log(`Organizations extracted: ${orgs.observables.results.length}\n`);
// Step 7: Build contact network
console.log('Step 7: Building contact network...\n');
// Email threads create person-to-person relationships
const contactNetwork = new Map<string, Set<string>>();
emails.contents.results.forEach(email => {
  const from = email.email?.from?.[0]?.email;
  const toList = email.email?.to?.map(t => t?.email) || [];
  const ccList = email.email?.cc?.map(c => c?.email) || [];
  // Type guard drops null/undefined addresses so the Set stays string-only
  const recipients = [...toList, ...ccList].filter((e): e is string => Boolean(e));
  if (from && recipients.length > 0) {
    if (!contactNetwork.has(from)) {
      contactNetwork.set(from, new Set());
    }
    recipients.forEach(recipient => {
      contactNetwork.get(from)!.add(recipient);
    });
  }
});
console.log('Top email relationships:');
Array.from(contactNetwork.entries())
  .map(([from, to]) => ({ from, count: to.size })) // Distinct recipients per sender
  .sort((a, b) => b.count - a.count)
  .slice(0, 5)
  .forEach(({ from, count }) => {
    console.log(` ${from} → ${count} contacts`);
  });
console.log('\n✓ Knowledge graph complete!');
Step-by-Step Explanation
Step 1: Create Entity Extraction Workflow
Step 2: Configure Email Feed
Step 3: Sync and Wait for Processing
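The polling loop in the complete example runs until the feed reports done, with no upper bound. A minimal sketch of a bounded variant follows. `waitUntilDone` and its parameters are hypothetical names introduced here for illustration, not part of the Graphlit SDK; the check function is injected so the helper makes no assumptions about the client.

```typescript
// Hypothetical helper (not an SDK function): polls an async check until it
// returns true or the timeout elapses. Resolves to false on timeout.
async function waitUntilDone(
  check: () => Promise<boolean>,
  intervalMs = 5000,
  timeoutMs = 10 * 60 * 1000
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (Date.now() < deadline) {
    if (await check()) {
      return true;
    }
    // Sleep before the next poll
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

With the client from the complete example, the check could wrap the same call used there, for instance `() => graphlit.isFeedDone(feedId).then(s => Boolean(s.isFeedDone?.result))`, and the caller decides what to do when the helper resolves to `false`.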
Step 4: Query Email Content
Step 5: Extract Entity Observations
Step 6: Build Contact Network
Step 7: Query Knowledge Graph
Configuration Options
Limiting Email Sync Scope
Handling Attachments
Variations
Variation 1: Organization Email Domain Mapping
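One way to approach this variation is to group the addresses seen in email metadata by their domain. The sketch below is an assumption about how such a mapping could be built, not a Graphlit feature. `EmailAddress` and `groupByDomain` are hypothetical names, and the input shape mirrors the `email` field read from `from`, `to`, and `cc` in the complete example.

```typescript
// Hypothetical shape, mirroring the `email` field used in the complete example
interface EmailAddress {
  email?: string | null;
}

// Groups distinct addresses by lowercased domain; malformed addresses are skipped.
function groupByDomain(addresses: EmailAddress[]): Map<string, Set<string>> {
  const byDomain = new Map<string, Set<string>>();
  for (const { email } of addresses) {
    if (!email) continue;
    const at = email.lastIndexOf('@');
    if (at < 1 || at === email.length - 1) continue; // No local part or no domain
    const domain = email.slice(at + 1).toLowerCase();
    const members = byDomain.get(domain) ?? new Set<string>();
    members.add(email.toLowerCase());
    byDomain.set(domain, members);
  }
  return byDomain;
}
```

The resulting domains can then be compared by hand against the names of extracted `Organization` entities. Free-mail domains would typically need to be excluded before treating a domain as an organization.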
Variation 2: Email Thread Analysis
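A rough heuristic for thread analysis is to group emails by subject after stripping reply and forward prefixes. The sketch below assumes each email exposes a `subject` and a sender address; both field names and the `EmailSummary` and `ThreadSummary` types are hypothetical and are not taken from the example above. Subject matching is only an approximation: header-based threading (`Message-ID` / `In-Reply-To`) is more reliable when those headers are available.

```typescript
// Hypothetical input shape; field names are assumptions for illustration
interface EmailSummary {
  subject?: string | null;
  from?: string | null;
}

interface ThreadSummary {
  count: number;
  participants: Set<string>;
}

// Strips leading "Re:", "Fw:" and "Fwd:" prefixes (repeated, any case)
function normalizeSubject(subject: string): string {
  return subject.replace(/^(\s*(re|fwd?)\s*:\s*)+/i, '').trim().toLowerCase();
}

// Groups emails into threads keyed by normalized subject.
// Emails without a subject are skipped rather than lumped together.
function groupIntoThreads(emails: EmailSummary[]): Map<string, ThreadSummary> {
  const threads = new Map<string, ThreadSummary>();
  for (const { subject, from } of emails) {
    if (!subject) continue;
    const key = normalizeSubject(subject);
    const thread = threads.get(key) ?? { count: 0, participants: new Set<string>() };
    thread.count += 1;
    if (from) thread.participants.add(from.toLowerCase());
    threads.set(key, thread);
  }
  return threads;
}
```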
Variation 3: Contact Frequency Ranking
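The complete example counts only senders. A possible extension, sketched below, ranks contacts by how often they appear as sender or recipient. `EmailMeta`, `ContactRank`, and `rankContacts` are hypothetical names; the `from`, `to`, and `cc` arrays with an `email` field follow the shape read in the complete example.

```typescript
// Hypothetical shapes mirroring the metadata fields used in the complete example
interface Address {
  email?: string | null;
}

interface EmailMeta {
  from?: Address[] | null;
  to?: Address[] | null;
  cc?: Address[] | null;
}

interface ContactRank {
  address: string;
  sent: number;
  received: number;
  total: number;
}

// Ranks addresses by total appearances (sent + received), ties broken alphabetically.
function rankContacts(emails: EmailMeta[], topN = 5): ContactRank[] {
  const stats = new Map<string, { sent: number; received: number }>();
  const bump = (addr: string | null | undefined, key: 'sent' | 'received'): void => {
    if (!addr) return;
    const k = addr.toLowerCase();
    const entry = stats.get(k) ?? { sent: 0, received: 0 };
    entry[key] += 1;
    stats.set(k, entry);
  };
  for (const m of emails) {
    (m.from ?? []).forEach(a => bump(a.email, 'sent'));
    [...(m.to ?? []), ...(m.cc ?? [])].forEach(a => bump(a.email, 'received'));
  }
  return Array.from(stats.entries())
    .map(([address, s]) => ({ address, sent: s.sent, received: s.received, total: s.sent + s.received }))
    .sort((a, b) => b.total - a.total || a.address.localeCompare(b.address))
    .slice(0, topN);
}
```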
Variation 4: Entity-Enhanced Email Search
Variation 5: Cross-Source Entity Linking
Common Issues & Solutions
Issue: OAuth Token Expired
Issue: Duplicate Entities from Sender/Recipient and Body
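One possible client-side mitigation is to detect entities whose names differ only in case or whitespace and review them as merge candidates. The sketch below is an assumption about how such a check could look, not a description of how Graphlit resolves entities. `EntityRef` is a hypothetical `{ id, name }` shape and `groupDuplicates` is a name introduced here.

```typescript
// Hypothetical minimal entity shape for illustration
interface EntityRef {
  id: string;
  name: string;
}

// Lowercases, trims, and collapses internal whitespace
function normalizeName(name: string): string {
  return name.trim().replace(/\s+/g, ' ').toLowerCase();
}

// Returns only the groups that contain more than one entity with the same normalized name.
function groupDuplicates(entities: EntityRef[]): EntityRef[][] {
  const groups = new Map<string, EntityRef[]>();
  for (const entity of entities) {
    const key = normalizeName(entity.name);
    const group = groups.get(key) ?? [];
    group.push(entity);
    groups.set(key, group);
  }
  return Array.from(groups.values()).filter(group => group.length > 1);
}
```

Exact matching on a normalized name will not catch variants such as "J. Doe" and "Jane Doe"; those would need fuzzier matching or a shared key such as the email address.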
Issue: Too Many Low-Confidence Entities
Issue: Missing Email Body Entities
Developer Hints
OAuth Token Management
Email Sync Best Practices
Entity Quality from Emails
Performance Considerations
Privacy and Security