Use Case: Build Knowledge Graph from Slack Messages
User Intent
"How do I extract entities from Slack messages to build a knowledge graph? Show me how to analyze team interactions, mentions, and build organizational networks from Slack data."
Operation
SDK Methods: createWorkflow(), createFeed(), isFeedDone(), queryContents(), queryObservables()
GraphQL: Slack feed creation + entity extraction + team graph queries
Entity: Slack Feed → Message Content → Observations → Observables (Team Graph)
Prerequisites
Graphlit project with API credentials
Slack workspace access
Slack OAuth token (via Graphlit Developer Portal)
Understanding of feed and workflow concepts
Complete Code Example (TypeScript)
import { Graphlit } from 'graphlit-client';
import {
  FeedTypes,
  FeedServiceTypes,
  EntityExtractionServiceTypes,
  ObservableTypes,
  ContentTypes
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

console.log('=== Building Knowledge Graph from Slack ===\n');

// Step 1: Create extraction workflow
console.log('Step 1: Creating entity extraction workflow...');
const workflow = await graphlit.createWorkflow({
  name: "Slack Entity Extraction",
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        extractedTypes: [
          ObservableTypes.Person,        // Team members, mentions
          ObservableTypes.Organization,  // Companies, clients mentioned
          ObservableTypes.Product,       // Tools, products discussed
          ObservableTypes.Software,      // Software/services mentioned
          ObservableTypes.Category,      // Projects, topics, teams
          ObservableTypes.Event          // Meetings, deadlines mentioned
        ]
      }
    }]
  }
});
console.log(`✓ Workflow: ${workflow.createWorkflow.id}\n`);

// Step 2: Create Slack feed
console.log('Step 2: Creating Slack feed...');
const feed = await graphlit.createFeed({
  name: "Engineering Slack",
  type: FeedTypes.Slack,
  slack: {
    type: FeedServiceTypes.Slack,
    token: process.env.SLACK_OAUTH_TOKEN!,  // From Developer Portal
    channels: [
      { id: 'C01234567', name: 'engineering' },
      { id: 'C98765432', name: 'product' },
      { id: 'C55555555', name: 'general' }
    ],
    readLimit: 1000  // Messages per channel
  },
  workflow: { id: workflow.createWorkflow.id }
});
console.log(`✓ Feed: ${feed.createFeed.id}\n`);

// Step 3: Wait for sync (poll every 5 seconds until the feed reports done)
console.log('Step 3: Syncing Slack messages...');
let isDone = false;
while (!isDone) {
  const status = await graphlit.isFeedDone(feed.createFeed.id);
  isDone = status.isFeedDone.result;
  if (!isDone) {
    console.log('  Syncing... (checking again in 5s)');
    await new Promise(resolve => setTimeout(resolve, 5000));
  }
}
console.log('✓ Sync complete\n');

// Step 4: Query messages
console.log('Step 4: Querying synced messages...');
const messages = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Message],
    feeds: [{ id: feed.createFeed.id }]
  }
});
console.log(`✓ Synced ${messages.contents.results.length} messages\n`);

// Step 5: Analyze message metadata
console.log('Step 5: Analyzing Slack activity...\n');

// Messages by channel
const byChannel = new Map<string, number>();
messages.contents.results.forEach(msg => {
  const channel = msg.message?.channelName || 'unknown';
  byChannel.set(channel, (byChannel.get(channel) || 0) + 1);
});
console.log('Messages by channel:');
Array.from(byChannel.entries())
  .sort((a, b) => b[1] - a[1])
  .forEach(([channel, count]) => {
    console.log(`  #${channel}: ${count} messages`);
  });
console.log();

// Most active authors
const byAuthor = new Map<string, number>();
messages.contents.results.forEach(msg => {
  const author = msg.message?.author?.email || 'unknown';
  byAuthor.set(author, (byAuthor.get(author) || 0) + 1);
});
console.log('Most active authors:');
Array.from(byAuthor.entries())
  .sort((a, b) => b[1] - a[1])
  .slice(0, 5)
  .forEach(([author, count]) => {
    console.log(`  ${author}: ${count} messages`);
  });
console.log();

// Step 6: Query extracted entities
console.log('Step 6: Querying knowledge graph...\n');
const people = await graphlit.queryObservables({
  filter: { types: [ObservableTypes.Person] }
});
console.log(`People extracted: ${people.observables.results.length}`);
const products = await graphlit.queryObservables({
  filter: { types: [ObservableTypes.Product, ObservableTypes.Software] }
});
console.log(`Products/Software: ${products.observables.results.length}\n`);

// Step 7: Analyze @mentions network (author → mentioned person → count)
console.log('Step 7: Building mention network...\n');
const mentionNetwork = new Map<string, Map<string, number>>();
messages.contents.results.forEach(msg => {
  const author = msg.message?.author?.email;
  const mentions = msg.message?.mentions?.map(m => m.email) || [];
  if (author && mentions.length > 0) {
    if (!mentionNetwork.has(author)) {
      mentionNetwork.set(author, new Map());
    }
    mentions.forEach(mentioned => {
      if (mentioned) {
        const mentionMap = mentionNetwork.get(author)!;
        mentionMap.set(mentioned, (mentionMap.get(mentioned) || 0) + 1);
      }
    });
  }
});

// Flatten the nested map into edges so the strongest relationships can be ranked
console.log('Top mention relationships:');
const topMentions: Array<{ from: string; to: string; count: number }> = [];
mentionNetwork.forEach((mentions, from) => {
  mentions.forEach((count, to) => {
    topMentions.push({ from, to, count });
  });
});
topMentions
  .sort((a, b) => b.count - a.count)
  .slice(0, 5)
  .forEach(({ from, to, count }) => {
    console.log(`  ${from} → ${to}: ${count} mentions`);
  });
console.log('\n✓ Team graph analysis complete!');
Step-by-Step Explanation
Step 1: Create Entity Extraction Workflow
Slack-Specific Entity Types:
Person: Team members (from @mentions, authors, user references)
Organization: Companies, clients, partners mentioned in messages
Product: Products discussed, tools mentioned
Software: Software services, APIs, platforms referenced
Category: Project names, topics, team names, initiatives
Event: Meetings mentioned, deadlines, launches
Why These Types:
Slack conversations are rich in team, product, and project context
@mentions create explicit Person relationships
Channel topics hint at Categories
Tool discussions identify Software entities
Step 2: Configure Slack Feed
Slack Feed Options:
Channel IDs:
Find in Slack: Right-click channel → View channel details → Copy channel ID
Or leave empty to sync all accessible channels
OAuth Setup:
Go to Graphlit Developer Portal
Navigate to Connectors → Messaging
Authorize Slack workspace
Copy OAuth token
Step 3: Sync Timeline
Sync Duration:
1,000 messages: 2-3 minutes
10,000 messages: 15-20 minutes
50,000 messages: 1-2 hours
What's Synced:
Message text
Author (PersonReference)
@mentions (PersonReference[])
Channel name/ID
Conversation/thread IDs
Timestamps
Attachments (if includeAttachments: true)
Reactions (emoji reactions)
Links (URLs in messages)
Step 4: Slack Message Metadata Structure
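A minimal sketch of the message metadata, inferred from the fields the complete example reads (`channelName`, `author.email`, `mentions`) and from the "What's Synced" list. The interface names and the fields `conversationId`, `timestamp`, and `links` are assumptions for illustration; the SDK's generated types may name or nest them differently.

```typescript
// Hypothetical local types; the SDK's generated types may differ.
interface PersonRef {
  name?: string;
  email?: string;
}

interface SlackMessageMetadata {
  channelName?: string;     // read in the complete example
  author?: PersonRef;       // read in the complete example
  mentions?: PersonRef[];   // read in the complete example
  conversationId?: string;  // assumed name for the thread/conversation ID
  timestamp?: string;       // assumed ISO 8601 string
  links?: string[];         // assumed name for URLs found in the message
}

// Render one message as a single log line, tolerating missing fields.
function describeMessage(m: SlackMessageMetadata): string {
  const author = m.author?.email ?? 'unknown';
  const channel = m.channelName ?? 'unknown';
  const mentioned = (m.mentions ?? [])
    .map(p => p.email)
    .filter((e): e is string => Boolean(e));
  const suffix = mentioned.length > 0 ? ` (mentions: ${mentioned.join(', ')})` : '';
  return `#${channel} ${author}${suffix}`;
}
```

Every field is optional in this sketch because the complete example also guards each access with `?.`.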
Step 5: Extract Entities from Messages
Explicit Entities:
@mentions: Automatically captured as Person entities
Channel names: Can hint at Category entities
Links: Organizations (from domains), Software (GitHub, tool links)
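To illustrate how links can hint at organizations or software, the sketch below pulls hostnames out of raw message text. It is a client-side helper written for this page, not something the SDK provides, and the function name `extractLinkDomains` is hypothetical. It also handles Slack's `<url|label>` link markup by stopping at `|` and `>`.

```typescript
// Collect distinct hostnames from URLs found in a message body.
function extractLinkDomains(text: string): string[] {
  const hostPattern = /https?:\/\/([^\/\s<>|:?#]+)/gi;
  const domains = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = hostPattern.exec(text)) !== null) {
    const host = match[1];
    if (host) {
      // Normalize case and drop a leading "www." so variants collapse together.
      domains.add(host.toLowerCase().replace(/^www\./, ''));
    }
  }
  return Array.from(domains);
}
```

A domain such as `github.com` suggests a Software entity, while a customer's domain suggests an Organization; deciding which is which is left to the extraction workflow.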
Extracted Entities:
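One way to inspect what the workflow extracted is to group entity names by type. The sketch below works on a flattened, hypothetical `ExtractedEntity` record; mapping the results of `queryObservables()` into that shape is left out because the exact response fields are not shown on this page.

```typescript
// Hypothetical flattened view of one extracted entity.
interface ExtractedEntity {
  type: string;  // e.g. 'PERSON' or 'SOFTWARE' (string values are assumed)
  name: string;
}

// Group distinct entity names under their type.
function groupEntityNamesByType(entities: ExtractedEntity[]): Map<string, string[]> {
  const grouped = new Map<string, string[]>();
  entities.forEach(({ type, name }) => {
    const names = grouped.get(type) ?? [];
    if (names.indexOf(name) === -1) names.push(name);
    grouped.set(type, names);
  });
  return grouped;
}
```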
Step 6: Build Team Interaction Graph
@Mention Network:
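The complete example builds a directed network (author → mentioned person). As a possible next step, the sketch below keeps only reciprocal pairs, where each person has mentioned the other, which can be a stronger sign of collaboration than a one-way mention. The `MentionEdge` shape matches the `{ from, to, count }` objects from the example; `mutualMentions` and `MutualPair` are hypothetical names.

```typescript
interface MentionEdge { from: string; to: string; count: number; }
interface MutualPair { first: string; second: string; total: number; }

// Keep pairs where mentions flow in both directions, ranked by combined count.
function mutualMentions(edges: MentionEdge[]): MutualPair[] {
  const key = (a: string, b: string): string => `${a}\t${b}`;
  const counts = new Map<string, number>();
  edges.forEach(e => {
    if (e.from === e.to) return; // ignore self-mentions
    counts.set(key(e.from, e.to), (counts.get(key(e.from, e.to)) ?? 0) + e.count);
  });

  const pairs: MutualPair[] = [];
  edges.forEach(e => {
    const forward = counts.get(key(e.from, e.to));
    const backward = counts.get(key(e.to, e.from));
    // Emit each reciprocal pair once, from the alphabetically smaller side.
    if (forward !== undefined && backward !== undefined && e.from < e.to) {
      const seen = pairs.some(p => p.first === e.from && p.second === e.to);
      if (!seen) pairs.push({ first: e.from, second: e.to, total: forward + backward });
    }
  });
  return pairs.sort((x, y) => y.total - x.total);
}
```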
Thread Participation:
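Thread participation can be approximated by grouping messages on their conversation/thread ID and collecting the distinct authors. This is a sketch over a hypothetical `ThreadMessage` shape; the field names `conversationId` and `authorEmail` are assumptions and would need to be mapped from the real message metadata.

```typescript
// Hypothetical flattened message; field names are assumptions.
interface ThreadMessage {
  conversationId?: string;
  authorEmail?: string;
}

// Map each thread ID to the distinct authors who posted in it.
function threadParticipants(messages: ThreadMessage[]): Map<string, string[]> {
  const threads = new Map<string, string[]>();
  messages.forEach(m => {
    if (!m.conversationId || !m.authorEmail) return; // skip incomplete records
    const people = threads.get(m.conversationId) ?? [];
    if (people.indexOf(m.authorEmail) === -1) people.push(m.authorEmail);
    threads.set(m.conversationId, people);
  });
  return threads;
}
```

Threads with many distinct participants are likely cross-team discussions; threads with two participants are closer to a direct exchange.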
Step 7: Cross-Channel Entity Analysis
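A rough sketch of cross-channel analysis: given (entity, channel) pairs, find the entities discussed in more than one channel. The `EntityMention` shape and the function name are hypothetical; producing the pairs requires joining each message's channel with its extracted entities, which is not shown here.

```typescript
// Hypothetical record: one entity observed in one channel.
interface EntityMention { entity: string; channel: string; }
interface CrossChannelEntity { entity: string; channels: string[]; }

// Return entities seen in at least `minChannels` distinct channels.
function crossChannelEntities(mentions: EntityMention[], minChannels = 2): CrossChannelEntity[] {
  const byEntity = new Map<string, string[]>();
  mentions.forEach(m => {
    const channels = byEntity.get(m.entity) ?? [];
    if (channels.indexOf(m.channel) === -1) channels.push(m.channel);
    byEntity.set(m.entity, channels);
  });

  const result: CrossChannelEntity[] = [];
  byEntity.forEach((channels, entity) => {
    if (channels.length >= minChannels) {
      result.push({ entity, channels: channels.slice().sort() });
    }
  });
  // Entities spanning the most channels come first.
  return result.sort((a, b) => b.channels.length - a.channels.length);
}
```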
Configuration Options
Selective Channel Sync
Specific Channels:
All Channels:
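The two cases above differ only in the channel list. The helper below is a sketch that builds the `slack` options in the same shape the complete example passes to `createFeed()`; an empty list corresponds to syncing all accessible channels, as described under Step 2. The helper and type names are hypothetical, and the shape is copied from this page rather than checked against the SDK's input types.

```typescript
// Shapes mirror the `slack` block in the complete example (not verified against the SDK).
interface SlackChannelRef { id: string; name: string; }
interface SlackFeedOptions {
  token: string;
  channels: SlackChannelRef[];
  readLimit: number;
}

// Build feed options; omit `channels` to request all accessible channels.
function slackFeedOptions(
  token: string,
  channels: SlackChannelRef[] = [],
  readLimit = 1000
): SlackFeedOptions {
  if (!token) throw new Error('A Slack OAuth token is required');
  if (readLimit <= 0) throw new Error('readLimit must be positive');
  return { token, channels: channels.slice(), readLimit };
}
```

For example, `slackFeedOptions(token, [{ id: 'C01234567', name: 'engineering' }], 500)` limits the sync to one channel and 500 messages.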
Channel Discovery:
Message Limits and Filtering
By Count:
By Date (handled automatically):
Graphlit syncs most recent messages first
Incremental sync on subsequent runs
Thread Handling:
Variations
Variation 1: Team Activity Dashboard
Analyze team engagement metrics:
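A minimal sketch of per-author engagement metrics: message count, number of distinct channels, and number of active days. It works on a hypothetical flattened `ActivityMessage`; the field names and the assumption that `timestamp` is an ISO 8601 string are illustrative.

```typescript
// Hypothetical flattened message; `timestamp` is assumed to be ISO 8601.
interface ActivityMessage { authorEmail: string; channelName: string; timestamp: string; }
interface AuthorActivity { author: string; messages: number; channels: number; activeDays: number; }

// Summarize activity per author, most active first.
function teamActivity(messages: ActivityMessage[]): AuthorActivity[] {
  const stats = new Map<string, { messages: number; channels: string[]; days: string[] }>();
  messages.forEach(m => {
    const s = stats.get(m.authorEmail) ?? { messages: 0, channels: [], days: [] };
    s.messages += 1;
    if (s.channels.indexOf(m.channelName) === -1) s.channels.push(m.channelName);
    const day = m.timestamp.slice(0, 10); // YYYY-MM-DD prefix of the timestamp
    if (s.days.indexOf(day) === -1) s.days.push(day);
    stats.set(m.authorEmail, s);
  });

  const rows: AuthorActivity[] = [];
  stats.forEach((s, author) => {
    rows.push({ author, messages: s.messages, channels: s.channels.length, activeDays: s.days.length });
  });
  return rows.sort((a, b) => b.messages - a.messages);
}
```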
Variation 2: Product/Tool Mentions Tracking
Track which tools your team discusses:
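One possible approach, sketched below, counts how many messages mention each Product or Software entity. The `MessageEntities` shape is hypothetical, and the type strings `'PRODUCT'` and `'SOFTWARE'` are assumed values standing in for `ObservableTypes.Product` and `ObservableTypes.Software`.

```typescript
// Hypothetical view of one message with its extracted entities.
interface MessageEntities {
  entities: Array<{ type: string; name: string }>;
}

// Count messages per tool; a tool is counted at most once per message.
function toolMentionCounts(
  messages: MessageEntities[],
  toolTypes: string[] = ['PRODUCT', 'SOFTWARE'] // assumed enum string values
): Array<[string, number]> {
  const counts = new Map<string, number>();
  messages.forEach(m => {
    const seen: string[] = [];
    m.entities.forEach(e => {
      if (toolTypes.indexOf(e.type) === -1) return;
      const tool = e.name.trim().toLowerCase(); // merge "GitHub" and "github"
      if (tool && seen.indexOf(tool) === -1) seen.push(tool);
    });
    seen.forEach(tool => counts.set(tool, (counts.get(tool) ?? 0) + 1));
  });

  const rows: Array<[string, number]> = [];
  counts.forEach((count, tool) => { rows.push([tool, count]); });
  return rows.sort((a, b) => b[1] - a[1]);
}
```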
Variation 3: Project/Topic Clustering
Group messages by extracted Category entities:
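A sketch of the grouping step, assuming each message has already been paired with the names of its extracted Category entities. The `CategorizedMessage` shape and the `'uncategorized'` bucket are assumptions made for illustration.

```typescript
// Hypothetical record: a message ID with its extracted Category names.
interface CategorizedMessage { id: string; categories: string[]; }

// Map each category to the IDs of the messages that mention it.
function clusterByCategory(messages: CategorizedMessage[]): Map<string, string[]> {
  const clusters = new Map<string, string[]>();
  messages.forEach(m => {
    // Messages without any category land in a catch-all bucket.
    const labels = m.categories.length > 0 ? m.categories : ['uncategorized'];
    labels.forEach(label => {
      const ids = clusters.get(label) ?? [];
      if (ids.indexOf(m.id) === -1) ids.push(m.id);
      clusters.set(label, ids);
    });
  });
  return clusters;
}
```

A message with several categories appears in each matching cluster, so cluster sizes can add up to more than the total message count.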
Variation 4: Influence Network Analysis
Identify influential team members:
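As a simple proxy for influence, the sketch below ranks people by how many distinct colleagues mention them, using total mention count as a tie-breaker. This is plain in-degree ranking over the `{ from, to, count }` edges from the complete example, not a full centrality algorithm such as PageRank; `influenceRanking` is a hypothetical name.

```typescript
interface MentionEdge { from: string; to: string; count: number; }
interface Influence { person: string; mentionedBy: number; timesMentioned: number; }

// Rank people by distinct mentioners, then by total mentions received.
function influenceRanking(edges: MentionEdge[]): Influence[] {
  const inbound = new Map<string, { sources: string[]; total: number }>();
  edges.forEach(e => {
    if (e.from === e.to) return; // self-mentions carry no signal
    const entry = inbound.get(e.to) ?? { sources: [], total: 0 };
    if (entry.sources.indexOf(e.from) === -1) entry.sources.push(e.from);
    entry.total += e.count;
    inbound.set(e.to, entry);
  });

  const ranking: Influence[] = [];
  inbound.forEach((entry, person) => {
    ranking.push({ person, mentionedBy: entry.sources.length, timesMentioned: entry.total });
  });
  return ranking.sort(
    (a, b) => b.mentionedBy - a.mentionedBy || b.timesMentioned - a.timesMentioned
  );
}
```

Ranking by distinct mentioners first keeps one very chatty pair from dominating the result.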
Variation 5: Real-Time Slack Sync with Webhooks
Set up continuous sync with webhook notifications:
Common Issues & Solutions
Issue: OAuth Token Expired
Problem: Feed sync fails after token expiration.
Solution: Refresh token in Developer Portal:
Go to Developer Portal → Connectors → Messaging
Re-authorize Slack workspace
Get new OAuth token
Create new feed with fresh token
Issue: Private Channels Not Syncing
Problem: Private channels don't appear in sync.
Solution: Slack OAuth app needs to be added to private channels:
In Slack, go to private channel
Click channel name → Integrations → Add apps
Add Graphlit app
Re-sync feed
Issue: Too Many Messages, Slow Sync
Problem: Large Slack workspace with 100K+ messages takes hours.
Solutions:
Selective channels: Only sync relevant channels
Lower readLimit: Start with recent messages (readLimit: 1000)
Multiple feeds: Create separate feeds per channel group
Incremental sync: The first sync takes the longest; subsequent syncs only fetch new messages and are fast
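For the "multiple feeds" option, channels first need to be split into groups. The generic helper below is one way to do that; `chunkChannels` is a hypothetical name, and creating one feed per group with `createFeed()` is left to the caller.

```typescript
// Split a channel list into groups of at most `groupSize` items.
function chunkChannels<T>(channels: T[], groupSize: number): T[][] {
  if (!Number.isInteger(groupSize) || groupSize < 1) {
    throw new Error('groupSize must be a positive integer');
  }
  const groups: T[][] = [];
  for (let i = 0; i < channels.length; i += groupSize) {
    groups.push(channels.slice(i, i + groupSize));
  }
  return groups;
}
```

Each group can then become its own feed, so a slow or failing channel group does not hold up the others.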
Issue: Missing Entities from Short Messages
Problem: Short Slack messages don't extract many entities.
Explanation: This is expected. Short messages such as "Yes", "Agreed", or "👍" contain no entities to extract.
Not a Problem: Longer messages with more context will yield entities.
Developer Hints
Slack vs Email Entity Differences
Slack: Shorter messages, more informal, lots of @mentions
Email: Longer messages, more formal, signatures with rich Person/Org data
Slack entities: Focus on Product/Software/Category
Email entities: Focus on Person/Organization relationships
Best Practices
Start with key channels: Test with 2-3 channels first
Monitor OAuth tokens: Slack tokens can expire
Thread importance: Include threads for full context
Attachment handling: Attachments significantly increase processing time
Incremental sync: After initial sync, updates are fast
Performance Optimization
Parallel channel sync: Channels sync in parallel
Incremental updates: Only new messages synced after initial load
Entity caching: Query observables once, cache results
Batch queries: Query multiple entities in one call
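The entity-caching hint can be sketched as a small promise cache keyed by entity type. The class name `ObservableCache` and the `load` callback are hypothetical; in practice `load` might wrap a call such as `graphlit.queryObservables({ filter: { types: [type] } })`.

```typescript
// Cache one in-flight or completed load per key so repeated lookups share it.
class ObservableCache<T> {
  private readonly entries = new Map<string, Promise<T>>();
  private readonly load: (key: string) => Promise<T>;

  constructor(load: (key: string) => Promise<T>) {
    this.load = load;
  }

  get(key: string): Promise<T> {
    const cached = this.entries.get(key);
    if (cached) return cached;
    const pending = this.load(key);
    this.entries.set(key, pending);
    // Drop failed loads so a later call can retry.
    pending.catch(() => { this.entries.delete(key); });
    return pending;
  }
}
```

Caching the promise rather than the resolved value means two callers asking for the same type at the same time trigger only one query.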
Privacy and Compliance
Respect Slack workspace privacy settings
Private channels require explicit app addition
DMs not synced (privacy protection)
Deleted messages not synced