Build Knowledge Graph from Slack Messages

User Intent

"How do I extract entities from Slack messages to build a knowledge graph? Show me how to analyze team interactions, mentions, and build organizational networks from Slack data."

Operation

SDK Methods: createWorkflow(), createFeed(), isFeedDone(), queryContents(), queryObservables()
GraphQL: Slack feed creation + entity extraction + team graph queries
Entity: Slack Feed → Message Content → Observations → Observables (Team Graph)

Prerequisites

Graphlit project with API credentials
Slack workspace access
Slack OAuth token (via Graphlit Developer Portal)
Understanding of feed and workflow concepts

Complete Code Example (TypeScript)

import { Graphlit } from 'graphlit-client';
import {
  FeedTypes,
  FeedServiceTypes,
  EntityExtractionServiceTypes,
  ObservableTypes,
  ContentTypes
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

console.log('=== Building Knowledge Graph from Slack ===\n');

// Step 1: Create extraction workflow
console.log('Step 1: Creating entity extraction workflow...');
const workflow = await graphlit.createWorkflow({
  name: "Slack Entity Extraction",
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        extractedTypes: [
          ObservableTypes.Person,          // Team members, mentions
          ObservableTypes.Organization,    // Companies, clients mentioned
          ObservableTypes.Product,         // Tools, products discussed
          ObservableTypes.Software,        // Software/services mentioned
          ObservableTypes.Category,        // Projects, topics, teams
          ObservableTypes.Event            // Meetings, deadlines mentioned
        ]
      }
    }]
  }
});

console.log(`✓ Workflow: ${workflow.createWorkflow.id}\n`);

// Step 2: Create Slack feed
console.log('Step 2: Creating Slack feed...');
const feed = await graphlit.createFeed({
  name: "Engineering Slack",
  type: FeedTypes.Slack,
  slack: {
    type: FeedServiceTypes.Slack,
    token: process.env.SLACK_OAUTH_TOKEN!,  // From Developer Portal
    channels: [
      { id: 'C01234567', name: 'engineering' },
      { id: 'C98765432', name: 'product' },
      { id: 'C55555555', name: 'general' }
    ],
    readLimit: 1000  // Messages per channel
  },
  workflow: { id: workflow.createWorkflow.id }
});

console.log(`✓ Feed: ${feed.createFeed.id}\n`);

// Step 3: Wait for sync
console.log('Step 3: Syncing Slack messages...');
let isDone = false;
while (!isDone) {
  const status = await graphlit.isFeedDone(feed.createFeed.id);
  isDone = status.isFeedDone.result;
  
  if (!isDone) {
    console.log('  Syncing... (checking again in 5s)');
    await new Promise(resolve => setTimeout(resolve, 5000));
  }
}
console.log('✓ Sync complete\n');

// Step 4: Query messages
console.log('Step 4: Querying synced messages...');
const messages = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Message],
    feeds: [{ id: feed.createFeed.id }]
  }
});

console.log(`✓ Synced ${messages.contents.results.length} messages\n`);

// Step 5: Analyze message metadata
console.log('Step 5: Analyzing Slack activity...\n');

// Messages by channel
const byChannel = new Map<string, number>();
messages.contents.results.forEach(msg => {
  const channel = msg.message?.channelName || 'unknown';
  byChannel.set(channel, (byChannel.get(channel) || 0) + 1);
});

console.log('Messages by channel:');
Array.from(byChannel.entries())
  .sort((a, b) => b[1] - a[1])
  .forEach(([channel, count]) => {
    console.log(`  #${channel}: ${count} messages`);
  });
console.log();

// Most active authors
const byAuthor = new Map<string, number>();
messages.contents.results.forEach(msg => {
  const author = msg.message?.author?.email || 'unknown';
  byAuthor.set(author, (byAuthor.get(author) || 0) + 1);
});

console.log('Most active authors:');
Array.from(byAuthor.entries())
  .sort((a, b) => b[1] - a[1])
  .slice(0, 5)
  .forEach(([author, count]) => {
    console.log(`  ${author}: ${count} messages`);
  });
console.log();

// Step 6: Query extracted entities
console.log('Step 6: Querying knowledge graph...\n');

const people = await graphlit.queryObservables({
  filter: { types: [ObservableTypes.Person] }
});

console.log(`People extracted: ${people.observables.results.length}`);

const products = await graphlit.queryObservables({
  filter: { types: [ObservableTypes.Product, ObservableTypes.Software] }
});

console.log(`Products/Software: ${products.observables.results.length}\n`);

// Step 7: Analyze @mentions network
console.log('Step 7: Building mention network...\n');

const mentionNetwork = new Map<string, Map<string, number>>();

messages.contents.results.forEach(msg => {
  const author = msg.message?.author?.email;
  const mentions = msg.message?.mentions?.map(m => m.email) || [];
  
  if (author && mentions.length > 0) {
    if (!mentionNetwork.has(author)) {
      mentionNetwork.set(author, new Map());
    }
    
    mentions.forEach(mentioned => {
      if (mentioned) {
        const mentionMap = mentionNetwork.get(author)!;
        mentionMap.set(mentioned, (mentionMap.get(mentioned) || 0) + 1);
      }
    });
  }
});

console.log('Top mention relationships:');
const topMentions: Array<{ from: string; to: string; count: number }> = [];

mentionNetwork.forEach((mentions, from) => {
  mentions.forEach((count, to) => {
    topMentions.push({ from, to, count });
  });
});

topMentions
  .sort((a, b) => b.count - a.count)
  .slice(0, 5)
  .forEach(({ from, to, count }) => {
    console.log(`  ${from} → ${to}: ${count} mentions`);
  });

console.log('\n✓ Team graph analysis complete!');


### C#
```csharp
using Graphlit;
using Graphlit.Api.Input;

var graphlit = new Graphlit();

Console.WriteLine("=== Building Knowledge Graph from Slack ===\n");

// Step 1: Create workflow
Console.WriteLine("Step 1: Creating entity extraction workflow...");
var workflow = await graphlit.CreateWorkflow(
    name: "Slack Entity Extraction",
    extraction: new WorkflowExtractionInput
    {
        Jobs = new[]
        {
            new WorkflowExtractionJobInput
            {
                Connector = new ExtractionConnectorInput
                {
                    Type = EntityExtractionServiceTypes.ModelText,
                    ExtractedTypes = new[]
                    {
                        ObservableTypes.Person,
                        ObservableTypes.Organization,
                        ObservableTypes.Product,
                        ObservableTypes.Software,
                        ObservableTypes.Category
                    }
                }
            }
        }
    }
);

Console.WriteLine($"✓ Workflow: {workflow.CreateWorkflow.Id}\n");

// Step 2: Create Slack feed
Console.WriteLine("Step 2: Creating Slack feed...");
var feed = await graphlit.CreateFeed(
    name: "Engineering Slack",
    type: FeedTypes.Slack,
    slack: new SlackFeedInput
    {
        Type = FeedServiceTypes.Slack,
        Token = Environment.GetEnvironmentVariable("SLACK_OAUTH_TOKEN"),
        Channels = new[]
        {
            new SlackChannelInput { Id = "C01234567", Name = "engineering" },
            new SlackChannelInput { Id = "C98765432", Name = "product" }
        },
        ReadLimit = 1000
    },
    workflow: new EntityReferenceInput { Id = workflow.CreateWorkflow.Id }
);

Console.WriteLine($"✓ Feed: {feed.CreateFeed.Id}\n");

// (Continue with remaining steps...)
```

Step-by-Step Explanation

Step 1: Create Entity Extraction Workflow

Slack-Specific Entity Types:

Person: Team members (from @mentions, authors, user references)
Organization: Companies, clients, partners mentioned in messages
Product: Products discussed, tools mentioned
Software: Software services, APIs, platforms referenced
Category: Project names, topics, team names, initiatives
Event: Meetings mentioned, deadlines, launches

Why These Types:

Slack conversations are rich in team, product, and project context
@mentions create explicit Person relationships
Channel topics hint at Categories
Tool discussions identify Software entities

Step 2: Configure Slack Feed

Slack Feed Options:

slack: {
  type: FeedServiceTypes.Slack,
  token: slackOAuthToken,              // From Developer Portal
  
  channels: [                          // Specific channels to sync
    { id: 'C01234567', name: 'engineering' },
    { id: 'C98765432', name: 'product' }
  ],
  // OR sync all channels:
  // channels: []  // Empty = all channels user has access to
  
  readLimit: 1000,                     // Messages per channel
  includeAttachments: true,            // Sync file attachments
  includeThreads: true                 // Sync threaded replies
}

Channel IDs:

Find in Slack: Right-click channel → View channel details → Copy channel ID
Or leave empty to sync all accessible channels

OAuth Setup:

Go to Graphlit Developer Portal
Navigate to Connectors → Messaging
Authorize Slack workspace
Copy OAuth token

Step 3: Sync Timeline

Sync Duration:

1,000 messages: 2-3 minutes
10,000 messages: 15-20 minutes
50,000 messages: 1-2 hours
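
Because sync times range from minutes to hours, the fixed 5-second poll in the complete example can issue many redundant status calls on large workspaces. The following is a minimal sketch of a capped exponential backoff schedule. `backoffDelays` and its parameters are hypothetical helpers, not part of the Graphlit SDK; each delay would replace the constant passed to `setTimeout` in the polling loop.

```typescript
// Hypothetical helper (not part of the Graphlit SDK): returns the wait time
// before each poll, doubling from `baseMs` and capping at `maxMs`.
function backoffDelays(baseMs: number, maxMs: number, attempts: number): number[] {
  const delays: number[] = [];
  let delay = baseMs;
  for (let i = 0; i < attempts; i++) {
    delays.push(Math.min(delay, maxMs));
    delay = Math.min(delay * 2, maxMs);
  }
  return delays;
}

const schedule = backoffDelays(5000, 60000, 6);
console.log(schedule); // [5000, 10000, 20000, 40000, 60000, 60000]
```

Capping the delay keeps the worst-case lag between sync completion and detection bounded, here at one minute.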

What's Synced:

Message text
Author (PersonReference)
@mentions (PersonReference[])
Channel name/ID
Conversation/thread IDs
Timestamps
Attachments (if includeAttachments: true)
Reactions (emoji reactions)
Links (URLs in messages)

Step 4: Slack Message Metadata Structure

message: {
  identifier: "1234567890.123456",     // Slack message ID
  conversationIdentifier: "p9876543",   // Thread ID (if in thread)
  channelIdentifier: "C01234567",       // Channel ID
  channelName: "engineering",           // Channel name
  author: {                             // Message author
    name: "Kirk Marple",
    email: "[email protected]",
    givenName: "Kirk",
    familyName: "Marple"
  },
  mentions: [                           // @mentioned users
    { name: "Jane Doe", email: "[email protected]" }
  ],
  attachmentCount: 2,                   // Number of attachments
  links: [                              // URLs in message
    "https://graphlit.com",
    "https://github.com/graphlit"
  ]
}
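
The analyses in this guide key people by `author.email`. As an assumption, that field may be missing for some authors (bots or guests, for example), which would lump them together under 'unknown'. A small sketch of a fallback key follows; `PersonRef` and `personKey` are hypothetical names that mirror the `author` and `mentions` fields shown above.

```typescript
// Hypothetical shape mirroring the author/mention fields shown above.
interface PersonRef {
  name?: string | null;
  email?: string | null;
  givenName?: string | null;
  familyName?: string | null;
}

// Prefer the email (lowercased so casing differences do not split one person),
// then fall back to the display name, then to a given/family name pair.
function personKey(person: PersonRef | null | undefined): string {
  if (!person) return 'unknown';
  if (person.email) return person.email.trim().toLowerCase();
  if (person.name) return person.name.trim();
  const parts = [person.givenName, person.familyName].filter((p): p is string => !!p);
  return parts.length > 0 ? parts.join(' ') : 'unknown';
}

console.log(personKey({ name: 'Jane Doe', email: 'Jane.Doe@Example.com' })); // jane.doe@example.com
console.log(personKey({ name: 'Build Bot' }));                               // Build Bot
console.log(personKey(undefined));                                           // unknown
```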

Step 5: Extract Entities from Messages

Explicit Entities:

@mentions: Automatically captured as Person entities
Channel names: Can hint at Category entities
Links: Organizations (from domains), Software (GitHub, tool links)
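
As a rough illustration of the link hint, the sketch below tallies link domains client-side. It is not how Graphlit performs extraction; `countLinkDomains` is a hypothetical helper, and the input is assumed to be the `links` array from the message metadata in Step 4.

```typescript
// Count how often each domain appears across message links.
// Input mirrors the `links` array in the message metadata shown in Step 4.
function countLinkDomains(links: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const link of links) {
    let host: string;
    try {
      host = new URL(link).hostname.replace(/^www\./, '');
    } catch {
      continue; // Skip strings that are not valid absolute URLs
    }
    counts.set(host, (counts.get(host) || 0) + 1);
  }
  return counts;
}

const domains = countLinkDomains([
  'https://graphlit.com',
  'https://github.com/graphlit',
  'https://www.github.com/graphlit',
  'not a url'
]);
console.log(Array.from(domains.entries()));
// [['graphlit.com', 1], ['github.com', 2]]
```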

Extracted Entities:

const message = await graphlit.getContent(messageId);

message.content.observations?.forEach(obs => {
  console.log(`${obs.type}: ${obs.observable.name}`);
  // Person: "Kirk Marple" (from @mention or text)
  // Product: "Graphlit" (mentioned in message)
  // Software: "GitHub" (from github.com link)
});
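
To summarize what a message yielded, observations can be grouped by type. This is a sketch under assumptions: `ObservationLike` is a hypothetical minimal shape based on the fields used above, and the type strings in the sample data are illustrative placeholders rather than confirmed enum values.

```typescript
// Hypothetical minimal shape of an observation, based on the fields used above.
interface ObservationLike {
  type: string;
  observable: { id: string; name?: string | null };
}

// Group entity names by observable type, de-duplicating by observable id.
function groupByType(observations: ObservationLike[]): Map<string, string[]> {
  const seen = new Set<string>();
  const grouped = new Map<string, string[]>();
  for (const obs of observations) {
    if (seen.has(obs.observable.id)) continue;
    seen.add(obs.observable.id);
    const names = grouped.get(obs.type) || [];
    names.push(obs.observable.name || obs.observable.id);
    grouped.set(obs.type, names);
  }
  return grouped;
}

// Sample data; the type strings are illustrative placeholders.
const groupedEntities = groupByType([
  { type: 'PERSON', observable: { id: 'p1', name: 'Kirk Marple' } },
  { type: 'PRODUCT', observable: { id: 'pr1', name: 'Graphlit' } },
  { type: 'SOFTWARE', observable: { id: 's1', name: 'GitHub' } },
  { type: 'PERSON', observable: { id: 'p1', name: 'Kirk Marple' } } // duplicate id
]);
groupedEntities.forEach((names, type) => console.log(`${type}: ${names.join(', ')}`));
```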

Step 6: Build Team Interaction Graph

@Mention Network:

// Who mentions whom most frequently?
const mentionGraph = new Map<string, Map<string, number>>();

messages.contents.results.forEach(msg => {
  const author = msg.message?.author?.email;
  const mentions = msg.message?.mentions || [];
  
  if (author && mentions.length > 0) {
    if (!mentionGraph.has(author)) {
      mentionGraph.set(author, new Map());
    }
    
    mentions.forEach(mentioned => {
      if (mentioned.email && mentioned.email !== author) {
        // Named `counts` to avoid shadowing the outer `mentions` array
        const counts = mentionGraph.get(author)!;
        counts.set(mentioned.email, (counts.get(mentioned.email) || 0) + 1);
      }
    });
  }
});

Thread Participation:

// Who participates in same threads?
const threadMap = new Map<string, Set<string>>();

messages.contents.results.forEach(msg => {
  const threadId = msg.message?.conversationIdentifier || msg.id;
  const author = msg.message?.author?.email;
  
  if (author) {
    if (!threadMap.has(threadId)) {
      threadMap.set(threadId, new Set());
    }
    threadMap.get(threadId)!.add(author);
  }
});

// Co-participation network
const coparticipation = new Map<string, Set<string>>();

threadMap.forEach(participants => {
  const people = Array.from(participants);
  for (let i = 0; i < people.length; i++) {
    for (let j = i + 1; j < people.length; j++) {
      if (!coparticipation.has(people[i])) {
        coparticipation.set(people[i], new Set());
      }
      coparticipation.get(people[i])!.add(people[j]);
      
      if (!coparticipation.has(people[j])) {
        coparticipation.set(people[j], new Set());
      }
      coparticipation.get(people[j])!.add(people[i]);
    }
  }
});
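
The co-participation sets above record whether two people ever shared a thread, but not how often. A possible extension, sketched here with the hypothetical helper `sharedThreadCounts`, counts shared threads per pair so that edges can be weighted. It assumes the same thread-to-participants map shape as `threadMap`.

```typescript
// Thread id -> set of participant keys, same shape as `threadMap` above.
type ThreadMap = Map<string, Set<string>>;

// Count, for each unordered pair, how many threads both people took part in.
function sharedThreadCounts(threads: ThreadMap): Map<string, number> {
  const pairCounts = new Map<string, number>();
  threads.forEach(participants => {
    const people = Array.from(participants).sort();
    for (let i = 0; i < people.length; i++) {
      for (let j = i + 1; j < people.length; j++) {
        const key = `${people[i]}|${people[j]}`; // sorted, so a|b and b|a share one key
        pairCounts.set(key, (pairCounts.get(key) || 0) + 1);
      }
    }
  });
  return pairCounts;
}

const sampleThreads: ThreadMap = new Map([
  ['t1', new Set(['alice', 'bob', 'carol'])],
  ['t2', new Set(['bob', 'alice'])],
  ['t3', new Set(['carol'])]
]);
const pairs = sharedThreadCounts(sampleThreads);
console.log(pairs.get('alice|bob'));   // 2
console.log(pairs.get('alice|carol')); // 1
console.log(pairs.get('bob|carol'));   // 1
```

Single-participant threads contribute no pairs, so they drop out of the weighted graph automatically.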

Step 7: Cross-Channel Entity Analysis

// Which entities span multiple channels?
const entityChannels = new Map<string, Set<string>>();

messages.contents.results.forEach(msg => {
  const channel = msg.message?.channelName;
  msg.observations?.forEach(obs => {
    const entityId = obs.observable.id;
    if (!entityChannels.has(entityId)) {
      entityChannels.set(entityId, new Set());
    }
    if (channel) {
      entityChannels.get(entityId)!.add(channel);
    }
  });
});

// Find cross-channel topics
const crossChannel = Array.from(entityChannels.entries())
  .filter(([_, channels]) => channels.size > 1)
  .map(([entityId, channels]) => ({
    entity: entityId,
    channelCount: channels.size
  }))
  .sort((a, b) => b.channelCount - a.channelCount);

console.log('Most cross-channel entities:');
crossChannel.slice(0, 5).forEach(item => {
  console.log(`  Entity: ${item.entity}, Channels: ${item.channelCount}`);
});

Configuration Options

Selective Channel Sync

Specific Channels:

channels: [
  { id: 'C01234567', name: 'engineering' },
  { id: 'C98765432', name: 'product' }
]

All Channels:

channels: []  // Empty array = sync all accessible channels

Channel Discovery:

// First, sync without specific channels to discover
const exploreFeed = await graphlit.createFeed({
  name: "Slack Explore",
  type: FeedTypes.Slack,
  slack: {
    type: FeedServiceTypes.Slack,
    token: slackToken,
    channels: [],     // All channels
    readLimit: 10     // Just a few messages per channel
  }
});

// Query to see what channels were found
const messages = await graphlit.queryContents({
  filter: { feeds: [{ id: exploreFeed.createFeed.id }] }
});

const channels = new Set(
  messages.contents.results.map(m => m.message?.channelName)
);

console.log('Available channels:', Array.from(channels));

Message Limits and Filtering

By Count:

readLimit: 5000  // Most recent 5000 messages per channel

By Date (handled automatically):

Graphlit syncs most recent messages first
Incremental sync on subsequent runs
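
Since no date-window option is shown for the feed, narrowing results to a period can be done client-side after the sync. The sketch below assumes each content item exposes an ISO 8601 `creationDate` string, as used in the Team Activity Dashboard variation; `inDateRange` is a hypothetical helper.

```typescript
interface Dated {
  creationDate?: string | null; // ISO 8601 timestamp, e.g. "2024-05-01T12:30:00Z"
}

// Keep items whose creationDate falls in [from, to). Items without a
// parseable date are dropped.
function inDateRange<T extends Dated>(items: T[], from: Date, to: Date): T[] {
  return items.filter(item => {
    if (!item.creationDate) return false;
    const t = Date.parse(item.creationDate);
    return !isNaN(t) && t >= from.getTime() && t < to.getTime();
  });
}

const sampleItems = [
  { id: 'a', creationDate: '2024-05-01T12:30:00Z' },
  { id: 'b', creationDate: '2024-06-15T08:00:00Z' },
  { id: 'c', creationDate: null }
];
const may = inDateRange(
  sampleItems,
  new Date('2024-05-01T00:00:00Z'),
  new Date('2024-06-01T00:00:00Z')
);
console.log(may.map(m => m.id)); // ['a']
```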

Thread Handling:

includeThreads: true   // Sync threaded replies
// Or
includeThreads: false  // Main channel messages only

Variations

Variation 1: Team Activity Dashboard

Analyze team engagement metrics:

interface TeamMetrics {
  totalMessages: number;
  activeUsers: number;
  topChannels: Array<{ channel: string; messages: number }>;
  topPosters: Array<{ user: string; messages: number }>;
  averageMessagesPerDay: number;
}

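// Alias declared outside the function: inside a parameter annotation,
// `typeof messages` would refer to the parameter itself and fail to compile.
type TeamMetricsInput = typeof messages.contents.results;

function calculateTeamMetrics(messages: TeamMetricsInput): TeamMetrics {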
  const users = new Set<string>();
  const channelCounts = new Map<string, number>();
  const userCounts = new Map<string, number>();
  const dates = new Set<string>();
  
  messages.forEach(msg => {
    const author = msg.message?.author?.email;
    const channel = msg.message?.channelName;
    const date = msg.creationDate?.split('T')[0];
    
    if (author) {
      users.add(author);
      userCounts.set(author, (userCounts.get(author) || 0) + 1);
    }
    
    if (channel) {
      channelCounts.set(channel, (channelCounts.get(channel) || 0) + 1);
    }
    
    if (date) {
      dates.add(date);
    }
  });
  
  return {
    totalMessages: messages.length,
    activeUsers: users.size,
    topChannels: Array.from(channelCounts.entries())
      .map(([channel, messages]) => ({ channel, messages }))
      .sort((a, b) => b.messages - a.messages)
      .slice(0, 5),
    topPosters: Array.from(userCounts.entries())
      .map(([user, messages]) => ({ user, messages }))
      .sort((a, b) => b.messages - a.messages)
      .slice(0, 5),
    // Guard against division by zero when no message has a creation date
    averageMessagesPerDay: dates.size > 0 ? messages.length / dates.size : 0
  };
}

const metrics = calculateTeamMetrics(messages.contents.results);
console.log('Team Metrics:', metrics);

Variation 2: Product/Tool Mentions Tracking

Track which tools your team discusses:

// Extract Software/Product entities
const tools = await graphlit.queryObservables({
  filter: {
    types: [ObservableTypes.Software, ObservableTypes.Product]
  }
});

// Count mentions per tool
const toolMentions = new Map<string, number>();

for (const tool of tools.observables.results) {
  const mentionCount = await graphlit.queryContents({
    filter: {
      types: [ContentTypes.Message],
      observations: [{
        type: tool.observable.type,
        observable: { id: tool.observable.id }
      }]
    }
  });
  
  toolMentions.set(tool.observable.name, mentionCount.contents.results.length);
}

console.log('Most discussed tools:');
Array.from(toolMentions.entries())
  .sort((a, b) => b[1] - a[1])
  .slice(0, 10)
  .forEach(([tool, count]) => {
    console.log(`  ${tool}: ${count} mentions`);
  });
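
The loop above issues one `queryContents` call per tool. When the messages are already in memory and carry their observations (as assumed in Step 7), the tally can instead be computed locally. This is a sketch with hypothetical shapes `ObservationRef` and `MessageLike`; the type strings in the sample are illustrative placeholders.

```typescript
// Hypothetical minimal shapes, based on the fields used in this guide.
interface ObservationRef { type: string; observable: { id: string; name?: string | null } }
interface MessageLike { id: string; observations?: ObservationRef[] | null }

// Count how many distinct messages mention each entity of the wanted types.
function countEntityMentions(messages: MessageLike[], wantedTypes: string[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const msg of messages) {
    const seenInMessage = new Set<string>();
    for (const obs of msg.observations || []) {
      if (wantedTypes.indexOf(obs.type) === -1) continue;
      if (seenInMessage.has(obs.observable.id)) continue; // count once per message
      seenInMessage.add(obs.observable.id);
      const name = obs.observable.name || obs.observable.id;
      counts.set(name, (counts.get(name) || 0) + 1);
    }
  }
  return counts;
}

// Sample data; the type strings are illustrative placeholders.
const localCounts = countEntityMentions(
  [
    { id: 'm1', observations: [
      { type: 'SOFTWARE', observable: { id: 's1', name: 'GitHub' } },
      { type: 'SOFTWARE', observable: { id: 's1', name: 'GitHub' } },
      { type: 'PRODUCT', observable: { id: 'pr1', name: 'Graphlit' } }
    ] },
    { id: 'm2', observations: [
      { type: 'SOFTWARE', observable: { id: 's1', name: 'GitHub' } },
      { type: 'PERSON', observable: { id: 'p1', name: 'Kirk Marple' } }
    ] },
    { id: 'm3', observations: null }
  ],
  ['SOFTWARE', 'PRODUCT']
);
console.log(Array.from(localCounts.entries()));
// [['GitHub', 2], ['Graphlit', 1]]
```

The trade-off is that a local tally only covers the messages already fetched, whereas the per-tool query covers everything indexed.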

Variation 3: Project/Topic Clustering

Group messages by extracted Category entities:

// Extract Category entities (projects, topics)
const categories = await graphlit.queryObservables({
  filter: { types: [ObservableTypes.Category] }
});

// For each category, find related messages
const projectMessages = new Map<string, typeof messages.contents.results>();

for (const category of categories.observables.results) {
  const related = await graphlit.queryContents({
    filter: {
      types: [ContentTypes.Message],
      observations: [{
        type: ObservableTypes.Category,
        observable: { id: category.observable.id }
      }]
    }
  });
  
  projectMessages.set(category.observable.name, related.contents.results);
}

console.log('Messages by project:');
projectMessages.forEach((msgs, project) => {
  console.log(`  ${project}: ${msgs.length} messages`);
});

Variation 4: Influence Network Analysis

Identify influential team members:

interface InfluenceMetrics {
  user: string;
  messageCount: number;
  mentionCount: number;  // Times mentioned by others
  reachCount: number;    // Unique people who mention them
  influenceScore: number;
}

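// Alias declared outside the function: inside a parameter annotation,
// `typeof messages` would refer to the parameter itself and fail to compile.
type InfluenceInput = typeof messages.contents.results;

function calculateInfluence(messages: InfluenceInput): InfluenceMetrics[] {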
  const userMetrics = new Map<string, InfluenceMetrics>();
  
  messages.forEach(msg => {
    const author = msg.message?.author?.email;
    const mentions = msg.message?.mentions || [];
    
    // Track author's activity
    if (author) {
      if (!userMetrics.has(author)) {
        userMetrics.set(author, {
          user: author,
          messageCount: 0,
          mentionCount: 0,
          reachCount: 0,
          influenceScore: 0
        });
      }
      userMetrics.get(author)!.messageCount++;
    }
    
    // Track who gets mentioned
    mentions.forEach(mentioned => {
      if (mentioned.email) {
        if (!userMetrics.has(mentioned.email)) {
          userMetrics.set(mentioned.email, {
            user: mentioned.email,
            messageCount: 0,
            mentionCount: 0,
            reachCount: 0,
            influenceScore: 0
          });
        }
        userMetrics.get(mentioned.email)!.mentionCount++;
      }
    });
  });
  
  // Calculate reach (unique people who mention each user)
  const mentioners = new Map<string, Set<string>>();
  messages.forEach(msg => {
    const author = msg.message?.author?.email;
    msg.message?.mentions?.forEach(mentioned => {
      if (author && mentioned.email) {
        if (!mentioners.has(mentioned.email)) {
          mentioners.set(mentioned.email, new Set());
        }
        mentioners.get(mentioned.email)!.add(author);
      }
    });
  });
  
  mentioners.forEach((mentionersSet, user) => {
    if (userMetrics.has(user)) {
      userMetrics.get(user)!.reachCount = mentionersSet.size;
    }
  });
  
  // Calculate influence score
  userMetrics.forEach((metrics) => {
    metrics.influenceScore = 
      (metrics.messageCount * 1) +
      (metrics.mentionCount * 2) +
      (metrics.reachCount * 3);
  });
  
  return Array.from(userMetrics.values())
    .sort((a, b) => b.influenceScore - a.influenceScore);
}

const influence = calculateInfluence(messages.contents.results);
console.log('Most influential team members:');
influence.slice(0, 5).forEach((metrics, i) => {
  console.log(`${i + 1}. ${metrics.user}`);
  console.log(`   Messages: ${metrics.messageCount}, Mentions: ${metrics.mentionCount}, Reach: ${metrics.reachCount}`);
  console.log(`   Influence Score: ${metrics.influenceScore}`);
});

Variation 5: Real-Time Slack Sync with Webhooks

Set up continuous sync with webhook notifications:

// Create feed with webhook
const feed = await graphlit.createFeed({
  name: "Slack Live Sync",
  type: FeedTypes.Slack,
  slack: {
    type: FeedServiceTypes.Slack,
    token: slackToken,
    channels: [],  // All channels
    readLimit: 100
  },
  workflow: { id: workflowId },
  schedulePolicy: {
    repeatInterval: 'PT5M'  // Sync every 5 minutes
  }
});

// Set up webhook to get notified of new content
// (Webhook configuration in Developer Portal)
// When new messages arrive, extract entities immediately
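
The `repeatInterval` value above is written as an ISO 8601 duration, where `PT5M` means five minutes. For readers who want to reason about intervals in code, here is a minimal sketch that converts time-only durations to milliseconds. `isoDurationToMs` is a hypothetical helper, not an SDK function, and it deliberately ignores date components such as days or months.

```typescript
// Convert a time-only ISO 8601 duration (PTnHnMnS) to milliseconds.
// Date components (years, months, days) are intentionally not handled.
function isoDurationToMs(duration: string): number {
  const match = /^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$/.exec(duration);
  if (!match || duration === 'PT') {
    throw new Error(`Unsupported duration: ${duration}`);
  }
  const hours = Number(match[1] || 0);
  const minutes = Number(match[2] || 0);
  const seconds = Number(match[3] || 0);
  return ((hours * 60 + minutes) * 60 + seconds) * 1000;
}

console.log(isoDurationToMs('PT5M'));    // 300000
console.log(isoDurationToMs('PT1H30M')); // 5400000
```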

Common Issues & Solutions

Issue: OAuth Token Expired

Problem: Feed sync fails after token expiration.

Solution: Refresh token in Developer Portal:

Go to Developer Portal → Connectors → Messaging
Re-authorize Slack workspace
Get new OAuth token
Create new feed with fresh token

Issue: Private Channels Not Syncing

Problem: Private channels don't appear in sync.

Solution: Slack OAuth app needs to be added to private channels:

In Slack, go to private channel
Click channel name → Integrations → Add apps
Add Graphlit app
Re-sync feed

Issue: Too Many Messages, Slow Sync

Problem: Large Slack workspace with 100K+ messages takes hours.

Solutions:

Selective channels: Only sync relevant channels
Lower readLimit: Start with recent messages (readLimit: 1000)
Multiple feeds: Create separate feeds per channel group
Incremental sync: First sync takes long, subsequent syncs fast

// Optimize: Sync critical channels first
const criticalFeed = await graphlit.createFeed({
  name: "Slack Critical Channels",
  type: FeedTypes.Slack,
  slack: {
    type: FeedServiceTypes.Slack,
    token: slackToken,
    channels: [
      { id: 'C_ENGINEERING', name: 'engineering' },
      { id: 'C_PRODUCT', name: 'product' }
    ],
    readLimit: 5000  // More messages for critical channels
  }
});
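
For the "multiple feeds" option, the channel list first has to be split into groups, one per feed. A minimal sketch follows, assuming channels are `{ id, name }` objects as in the feed configuration; `chunkChannels` is a hypothetical helper, and each resulting group would be passed as `channels` to its own `createFeed` call.

```typescript
interface SlackChannel { id: string; name: string }

// Split a channel list into groups of at most `size`, one group per feed.
function chunkChannels(channels: SlackChannel[], size: number): SlackChannel[][] {
  if (size < 1) throw new Error('size must be at least 1');
  const groups: SlackChannel[][] = [];
  for (let i = 0; i < channels.length; i += size) {
    groups.push(channels.slice(i, i + size));
  }
  return groups;
}

const channelGroups = chunkChannels(
  [
    { id: 'C01234567', name: 'engineering' },
    { id: 'C98765432', name: 'product' },
    { id: 'C55555555', name: 'general' }
  ],
  2
);
console.log(channelGroups.map(g => g.map(c => c.name)));
// [['engineering', 'product'], ['general']]
```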

Issue: Missing Entities from Short Messages

Problem: Short Slack messages don't extract many entities.

Explanation: This is expected. Short messages such as "Yes", "Agreed", or "👍" contain no entities to extract.

Not a problem: Longer messages with more context will yield entities.

Developer Hints

Slack vs Email Entity Differences

Slack: Shorter messages, more informal, lots of @mentions
Email: Longer messages, more formal, signatures with rich Person/Org data
Slack entities: Focus on Product/Software/Category
Email entities: Focus on Person/Organization relationships

Best Practices

Start with key channels: Test with 2-3 channels first
Monitor OAuth tokens: Slack tokens can expire
Thread importance: Include threads for full context
Attachment handling: Attachments significantly increase processing time
Incremental sync: After initial sync, updates are fast

Performance Optimization

Parallel channel sync: Channels sync in parallel
Incremental updates: Only new messages synced after initial load
Entity caching: Query observables once, cache results
Batch queries: Query multiple entities in one call

Privacy and Compliance

Respect Slack workspace privacy settings
Private channels require explicit app addition
DMs not synced (privacy protection)
Deleted messages not synced
