Build Knowledge Graph from Meeting Recordings
User Intent
"How do I extract entities from meeting recordings (audio/video)? Show me how to transcribe meetings and analyze participants, topics, action items, and decisions."
Operation
SDK Methods: createWorkflow(), ingestUri(), isContentDone(), getContent(), queryObservables()
GraphQL: Audio/video ingestion + transcription + entity extraction
Entity: Audio/Video → Transcription → Text → Observations → Observables (Meeting Graph)
Prerequisites
Graphlit project with API credentials
Meeting recordings (MP3, MP4, WAV, or other audio/video formats)
Understanding of workflow configuration
Transcription service access (Deepgram, AssemblyAI, or Whisper)
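Before running the example below, make sure the client can authenticate. A minimal setup sketch, assuming the standard graphlit-client environment variables (the names below are conventional; adjust to your deployment, or pass credentials to the constructor):
// Hedged sketch: the client typically reads credentials from the environment.
// GRAPHLIT_ORGANIZATION_ID=...
// GRAPHLIT_ENVIRONMENT_ID=...
// GRAPHLIT_JWT_SECRET=...
import { Graphlit } from 'graphlit-client';
const graphlit = new Graphlit(); // credentials may also be passed explicitly to the constructor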
Complete Code Example (TypeScript)
import { Graphlit } from 'graphlit-client';
import {
FilePreparationServiceTypes,
DeepgramModels,
EntityExtractionServiceTypes,
ObservableTypes
} from 'graphlit-client/dist/generated/graphql-types';
const graphlit = new Graphlit();
console.log('=== Building Knowledge Graph from Meeting ===\n');
// Step 1: Create transcription + extraction workflow
console.log('Step 1: Creating workflow...');
const workflow = await graphlit.createWorkflow({
name: "Meeting Entity Extraction",
preparation: {
jobs: [{
connector: {
type: FilePreparationServiceTypes.Deepgram,
deepgram: {
model: DeepgramModels.Nova2 // Fast, accurate
}
}
}]
},
extraction: {
jobs: [{
connector: {
type: EntityExtractionServiceTypes.ModelText,
extractedTypes: [
ObservableTypes.Person, // Participants, mentioned people
ObservableTypes.Organization, // Companies discussed
ObservableTypes.Product, // Products, services mentioned
ObservableTypes.Event, // Action items, deadlines, follow-ups
ObservableTypes.Place, // Locations mentioned
ObservableTypes.Category // Topics, projects, themes
]
}
}]
}
});
console.log(`✓ Workflow: ${workflow.createWorkflow.id}\n`);
// Step 2: Ingest meeting recording
console.log('Step 2: Ingesting meeting recording...');
const meeting = await graphlit.ingestUri(
'https://example.com/meetings/q4-planning.mp4',
"Q4 Planning Meeting",
undefined,
undefined,
undefined,
{ id: workflow.createWorkflow.id }
);
console.log(`✓ Ingested: ${meeting.ingestUri.id}\n`);
// Step 3: Wait for transcription + extraction
console.log('Step 3: Transcribing and extracting entities...');
console.log('(This may take several minutes for long recordings)\n');
let isDone = false;
while (!isDone) {
const status = await graphlit.isContentDone(meeting.ingestUri.id);
isDone = status.isContentDone.result;
if (!isDone) {
// Poll every 5 seconds until transcription and extraction finish
await new Promise(resolve => setTimeout(resolve, 5000));
}
}
console.log('✓ Processing complete\n');
// Step 4: Get meeting details
console.log('Step 4: Retrieving meeting transcript and entities...');
const meetingDetails = await graphlit.getContent(meeting.ingestUri.id);
const content = meetingDetails.content;
console.log(`✓ Meeting: ${content.name}`);
console.log(` Duration: ${content.audio?.duration || 0} seconds`);
console.log(` Entities: ${content.observations?.length || 0}\n`);
// Step 5: Display transcript excerpt
console.log('Step 5: Transcript excerpt...\n');
const transcript = content.markdown || content.text || '';
const excerpt = transcript.substring(0, 500);
console.log(excerpt);
console.log(transcript.length > 500 ? '...\n' : '\n');
// Step 6: Analyze extracted entities
console.log('Step 6: Analyzing entities...\n');
// Group by type
const byType = new Map<string, Set<string>>();
content.observations?.forEach(obs => {
if (!byType.has(obs.type)) {
byType.set(obs.type, new Set());
}
byType.get(obs.type)!.add(obs.observable.name);
});
byType.forEach((entities, type) => {
console.log(`${type} (${entities.size}):`);
Array.from(entities).slice(0, 5).forEach(name => {
console.log(` - ${name}`);
});
if (entities.size > 5) {
console.log(` ... and ${entities.size - 5} more`);
}
console.log();
});
// Step 7: Analyze entity timestamps
console.log('Step 7: Entity mentions with timestamps...\n');
const people = content.observations?.filter(obs =>
obs.type === ObservableTypes.Person
) || [];
people.slice(0, 3).forEach(person => {
console.log(`${person.observable.name}:`);
person.occurrences?.slice(0, 3).forEach(occ => {
if (occ.startTime !== undefined && occ.endTime !== undefined) {
const minutes = Math.floor(occ.startTime / 60);
const seconds = Math.floor(occ.startTime % 60);
console.log(` ${minutes}:${seconds.toString().padStart(2, '0')} - Confidence: ${occ.confidence.toFixed(2)}`);
}
});
console.log();
});
// Step 8: Extract action items (Events)
console.log('Step 8: Action items and deadlines...\n');
const events = content.observations?.filter(obs =>
obs.type === ObservableTypes.Event
) || [];
if (events.length > 0) {
console.log('Identified action items:');
events.forEach(event => {
console.log(` - ${event.observable.name}`);
// Show when mentioned
const firstMention = event.occurrences?.[0];
if (firstMention?.startTime !== undefined) {
const min = Math.floor(firstMention.startTime / 60);
const sec = Math.floor(firstMention.startTime % 60);
console.log(` Mentioned at: ${min}:${sec.toString().padStart(2, '0')}`);
}
});
} else {
console.log('No action items identified');
}
console.log('\n✓ Meeting analysis complete!');
Step-by-Step Explanation
Step 1: Create Transcription Workflow
Audio Preparation:
preparation: {
jobs: [{
connector: {
type: FilePreparationServiceTypes.Deepgram,
deepgram: { model: DeepgramModels.Nova2 }
}
}]
}
Transcription Service Options:
Deepgram: Fast, accurate, cost-effective (recommended)
AssemblyAI: Good quality, speaker diarization support
Whisper: OpenAI's model, very accurate but slower
What Transcription Produces:
Full text transcript
Timestamps per word/segment
Speaker diarization (who said what)
Confidence scores per segment
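If you need per-segment timing outside of entity occurrences, it can be recovered from the transcript itself. A small sketch, assuming the "## Segment N (MM:SS - MM:SS)" markdown layout shown in Step 4 below; adjust the regex if your transcripts are formatted differently:
// Hedged sketch: parse segment headers like "## Segment 2 (00:15 - 00:45)"
// from the markdown transcript (layout assumed; see Step 4).
interface TranscriptSegment { index: number; start: string; end: string; }
function parseSegments(markdown: string): TranscriptSegment[] {
  const pattern = /^## Segment (\d+) \((\d{2}:\d{2}) - (\d{2}:\d{2})\)/gm;
  const segments: TranscriptSegment[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(markdown)) !== null) {
    segments.push({ index: Number(match[1]), start: match[2], end: match[3] });
  }
  return segments;
}
const segments = parseSegments(transcript); // `transcript` from the main example
console.log(`Transcript segments: ${segments.length}`);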
Step 2: Supported Audio/Video Formats
Audio Formats:
MP3, WAV, M4A, AAC, FLAC, OGG
Supported bitrates: 8kbps - 320kbps
Sample rates: 8kHz - 48kHz
Video Formats:
MP4, MOV, AVI, MKV, WEBM
Audio track extracted automatically
Video analysis not performed (audio only)
Ingestion Sources:
// From URL
ingestUri('https://example.com/meeting.mp3')
// From local file
const audio = fs.readFileSync('./meeting.mp3');
const base64 = audio.toString('base64');
ingestEncodedFile({
name: 'meeting.mp3',
data: base64,
mimeType: 'audio/mpeg'
})
// From cloud storage (via feed)
createFeed({
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.AzureFile,
// Azure File share config (use FeedServiceTypes.AzureBlob for Blob Storage)
}
})
Step 3: Processing Timeline
Transcription Time (approximate):
10-minute meeting: 1-2 minutes
30-minute meeting: 3-5 minutes
1-hour meeting: 5-10 minutes
2-hour meeting: 10-20 minutes
Factors Affecting Speed:
Audio quality (clean audio faster)
Number of speakers (more speakers slower)
Background noise (noisy audio slower)
File size and bitrate
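Given the timelines above, it helps to scale your polling timeout to the recording length instead of using a fixed limit. A rough sketch; the 0.3 multiplier and 2-minute floor are assumptions based on the estimates above, not guarantees:
// Hedged sketch: poll for completion with a timeout scaled to audio duration.
async function waitForTranscription(contentId: string, audioSeconds: number): Promise<boolean> {
  const timeoutMs = Math.max(audioSeconds * 0.3, 120) * 1000; // assumed upper bound
  const start = Date.now();
  while (Date.now() - start < timeoutMs) {
    const status = await graphlit.isContentDone(contentId);
    if (status.isContentDone?.result) return true;
    await new Promise(resolve => setTimeout(resolve, 10000)); // poll every 10 seconds
  }
  return false; // still processing after the estimated window
}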
Step 4: Transcript Structure
Markdown Format:
# Meeting Transcript
## Segment 1 (00:00 - 00:15)
Speaker 1: Welcome everyone to the Q4 planning meeting...
## Segment 2 (00:15 - 00:45)
Speaker 2: Thanks Kirk. I wanted to discuss the product roadmap...
## Segment 3 (00:45 - 01:20)
Speaker 1: Great points. Let's talk about the Graphlit launch timeline...
Accessing Transcript:
const content = await graphlit.getContent(meetingId);
// Full transcript
const transcript = content.content.markdown || content.content.text;
// Audio metadata
const duration = content.content.audio?.duration; // seconds
const channels = content.content.audio?.channels;
const bitrate = content.content.audio?.bitrate;
Step 5: Entity Extraction from Transcript
Person Entities:
Participants (from speaker labels)
People mentioned in discussion
Names in action items
Organization Entities:
Companies discussed
Partners, clients, competitors
Departments, teams
Event Entities:
Action items ("Send proposal by Friday")
Deadlines ("Launch date: October 15")
Follow-up meetings ("Schedule review call")
Product/Software Entities:
Tools discussed
Products mentioned
Features planned
Category Entities:
Topics, themes
Projects, initiatives
Meeting subjects
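Pulling these together, a short sketch (assuming the observation shape used in the main example) that buckets a meeting's observations into participants, organizations, action items, and topics:
// Hedged sketch: group observations into meeting-oriented buckets.
const buckets = {
  participants: [] as string[],
  organizations: [] as string[],
  actionItems: [] as string[],
  topics: [] as string[]
};
content.observations?.forEach(obs => {
  const name = obs.observable.name;
  switch (obs.type) {
    case ObservableTypes.Person: buckets.participants.push(name); break;
    case ObservableTypes.Organization: buckets.organizations.push(name); break;
    case ObservableTypes.Event: buckets.actionItems.push(name); break;
    case ObservableTypes.Category: buckets.topics.push(name); break;
  }
});
console.log(`Participants: ${buckets.participants.join(', ')}`);
console.log(`Action items: ${buckets.actionItems.length}`);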
Step 6: Timestamp Analysis
Occurrence Timestamps:
occurrence: {
startTime: 125.3, // Seconds from recording start
endTime: 127.8, // Seconds from recording start
confidence: 0.92 // Extraction confidence
}
Use Cases:
Jump to specific entity mentions in playback
Create entity timeline visualization
Find when action items were assigned
Track discussion flow by entity
Format Timestamps for Display:
function formatTime(seconds: number): string {
const mins = Math.floor(seconds / 60);
const secs = Math.floor(seconds % 60);
return `${mins}:${secs.toString().padStart(2, '0')}`;
}
obs.occurrences?.forEach(occ => {
if (occ.startTime !== undefined) {
console.log(`${formatTime(occ.startTime)} - ${obs.observable.name}`);
}
});
Step 7: Speaker Diarization
Identifying Speakers:
// Deepgram and AssemblyAI provide speaker labels
const transcript = content.content.markdown;
// Parse speaker segments
const speakerPattern = /Speaker (\d+):/g;
const speakers = new Set<string>();
let match;
while ((match = speakerPattern.exec(transcript)) !== null) {
speakers.add(match[1]);
}
console.log(`Number of speakers: ${speakers.size}`);
Linking Speakers to Person Entities:
// Cross-reference speaker labels with extracted Person entities
const people = content.content.observations
?.filter(obs => obs.type === ObservableTypes.Person) || [];
console.log('Participants:');
people.forEach(person => {
console.log(` ${person.observable.name}`);
// Match to speaker label if possible
});
Configuration Options
Choosing Transcription Service
Deepgram (Recommended):
deepgram: { model: DeepgramModels.Nova2 }
Pros: Fast, accurate, cost-effective, good speaker diarization
Cons: Requires internet connection
Best for: Most use cases, production
AssemblyAI:
type: FilePreparationServiceTypes.AssemblyAi,
assemblyAi: {
model: AssemblyAiModels.Default
}
Pros: Very accurate, excellent speaker diarization
Cons: Slower, more expensive
Best for: High-quality transcription needs
Whisper (via Deepgram):
type: FilePreparationServiceTypes.Deepgram,
deepgram: {
model: DeepgramModels.WhisperLarge
}
Pros: Very accurate, multilingual support
Cons: Slower than Nova models
Best for: Non-English meetings, maximum accuracy
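To switch between these services without rewriting the workflow each time, the choice can be wrapped in a small helper. A sketch, assuming the connector fields shown above (deepgram / assemblyAi) and the same extraction setup as the main example; verify field names against the current schema:
// Hedged sketch: build a meeting workflow for a chosen transcription service.
// Enums are assumed to come from 'graphlit-client/dist/generated/graphql-types'.
type TranscriptionChoice = 'deepgram' | 'assemblyai' | 'whisper';
async function createMeetingWorkflow(choice: TranscriptionChoice) {
  const connector =
    choice === 'assemblyai'
      ? { type: FilePreparationServiceTypes.AssemblyAi, assemblyAi: { model: AssemblyAiModels.Default } }
      : {
          type: FilePreparationServiceTypes.Deepgram,
          deepgram: { model: choice === 'whisper' ? DeepgramModels.WhisperLarge : DeepgramModels.Nova2 }
        };
  return graphlit.createWorkflow({
    name: `Meeting Entity Extraction (${choice})`,
    preparation: { jobs: [{ connector }] },
    extraction: {
      jobs: [{
        connector: {
          type: EntityExtractionServiceTypes.ModelText,
          extractedTypes: [ObservableTypes.Person, ObservableTypes.Event, ObservableTypes.Category]
        }
      }]
    }
  });
}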
Audio Quality Preprocessing
For Noisy Audio:
preparation: {
jobs: [{
connector: {
type: FilePreparationServiceTypes.Deepgram,
deepgram: { model: DeepgramModels.Nova2 },
// Preprocessing options (if supported)
}
}]
}
Tips for Better Transcription:
Use high-quality recording equipment
Minimize background noise
Single speaker per microphone when possible
16kHz+ sample rate recommended
Avoid heavy audio compression
Variations
Variation 1: Multi-Meeting Analysis
Analyze a series of recurring meetings:
const meetingUrls = [
'https://example.com/meetings/week1.mp4',
'https://example.com/meetings/week2.mp4',
'https://example.com/meetings/week3.mp4'
];
// Ingest all meetings
const meetings = await Promise.all(
meetingUrls.map(uri =>
graphlit.ingestUri(uri, undefined, undefined, undefined, undefined, { id: workflowId })
)
);
// Wait for all to process
const waitForAll = async () => {
const ids = meetings.map(m => m.ingestUri.id);
let allDone = false;
while (!allDone) {
const statuses = await Promise.all(
ids.map(id => graphlit.isContentDone(id))
);
allDone = statuses.every(s => s.isContentDone.result);
if (!allDone) {
await new Promise(resolve => setTimeout(resolve, 5000));
}
}
};
await waitForAll();
// Analyze trends over time
const allEntities = await graphlit.queryObservables({
filter: { types: [ObservableTypes.Person, ObservableTypes.Event] }
});
console.log(`Total entities across ${meetings.length} meetings: ${allEntities.observables.results.length}`);
Variation 2: Action Item Tracker
Extract and track action items:
interface ActionItem {
description: string;
assignee?: string;
deadline?: string;
meetingDate: Date;
timestamp: number;
}
function extractActionItems(content: Content): ActionItem[] {
const events = content.observations
?.filter(obs => obs.type === ObservableTypes.Event) || [];
const actionItems: ActionItem[] = [];
events.forEach(event => {
// Look for action-like events
const desc = event.observable.name.toLowerCase();
const isAction =
desc.includes('send') ||
desc.includes('schedule') ||
desc.includes('prepare') ||
desc.includes('follow up') ||
desc.includes('review');
if (isAction) {
actionItems.push({
description: event.observable.name,
meetingDate: new Date(content.creationDate),
timestamp: event.occurrences?.[0]?.startTime || 0
});
}
});
return actionItems;
}
const actions = extractActionItems(meetingDetails.content);
console.log(`Action items: ${actions.length}`);
actions.forEach(action => {
console.log(` - ${action.description}`);
console.log(` At: ${formatTime(action.timestamp)}`);
});
Variation 3: Meeting Sentiment & Topic Analysis
Analyze discussion topics and participant contributions:
interface MeetingInsights {
duration: number;
participantCount: number;
topicsDiscussed: string[];
mostMentionedEntity: string;
actionItemCount: number;
}
function analyzeMeeting(content: Content): MeetingInsights {
const people = new Set(
content.observations
?.filter(obs => obs.type === ObservableTypes.Person)
.map(obs => obs.observable.name) || []
);
const categories = content.observations
?.filter(obs => obs.type === ObservableTypes.Category)
.map(obs => obs.observable.name) || [];
const events = content.observations
?.filter(obs => obs.type === ObservableTypes.Event) || [];
// Find most mentioned entity
const entityCounts = new Map<string, number>();
content.observations?.forEach(obs => {
const count = obs.occurrences?.length || 0;
entityCounts.set(obs.observable.name, count);
});
const mostMentioned = Array.from(entityCounts.entries())
.sort((a, b) => b[1] - a[1])[0];
return {
duration: content.audio?.duration || 0,
participantCount: people.size,
topicsDiscussed: categories,
mostMentionedEntity: mostMentioned?.[0] || 'None',
actionItemCount: events.length
};
}
const insights = analyzeMeeting(meetingDetails.content);
console.log('Meeting Insights:');
console.log(` Duration: ${Math.floor(insights.duration / 60)} minutes`);
console.log(` Participants: ${insights.participantCount}`);
console.log(` Topics: ${insights.topicsDiscussed.join(', ')}`);
console.log(` Most discussed: ${insights.mostMentionedEntity}`);
console.log(` Action items: ${insights.actionItemCount}`);
Variation 4: Searchable Meeting Archive
Build searchable meeting repository:
// Ingest entire meeting archive
const archive = await graphlit.createFeed({
name: "Meeting Archive",
type: FeedTypes.Site,
site: {
type: FeedServiceTypes.AzureFile,
// Point to the Azure File share (or an AzureBlob feed) containing the recordings
},
workflow: { id: workflowId }
});
// Poll until the feed has finished processing all recordings
let feedDone = false;
while (!feedDone) {
const status = await graphlit.isFeedDone(archive.createFeed.id);
feedDone = status.isFeedDone.result;
if (!feedDone) {
await new Promise(resolve => setTimeout(resolve, 10000));
}
}
// Search meetings by entity
const searchForPerson = async (personName: string) => {
const personEntity = await graphlit.queryObservables({
search: personName,
filter: { types: [ObservableTypes.Person] }
});
if (personEntity.observables.results.length > 0) {
const meetings = await graphlit.queryContents({
filter: {
observations: [{
type: ObservableTypes.Person,
observable: { id: personEntity.observables.results[0].observable.id }
}]
}
});
return meetings.contents.results;
}
return [];
};
// Find all meetings Kirk participated in
const kirkMeetings = await searchForPerson("Kirk Marple");
console.log(`Kirk mentioned in ${kirkMeetings.length} meetings`);
Variation 5: Meeting Summary Generation
Generate AI summaries with entity context:
// After transcription + extraction, generate summary
const conversation = await graphlit.createConversation({
name: "Meeting Summary"
});
// Provide meeting content as context
const summary = await graphlit.promptConversation({
prompt: "Summarize this meeting, highlighting key decisions, action items, and participants.",
id: conversation.createConversation.id,
filter: {
contents: [{ id: meeting.ingestUri.id }]
}
});
console.log('Meeting Summary:');
console.log(summary.promptConversation?.message?.message);
// Extract structured data from summary
const structuredPrompt = await graphlit.promptConversation({
prompt: `Extract from this meeting:
1. Key decisions made
2. Action items with assignees
3. Follow-up topics
4. Next steps
Format as JSON.`,
id: conversation.createConversation.id,
filter: {
contents: [{ id: meeting.ingestUri.id }]
}
});
console.log('\nStructured Summary:');
console.log(structuredPrompt.promptConversation?.message?.message);
Common Issues & Solutions
Issue: Poor Transcription Quality
Problem: Transcript has many errors, missing words.
Causes & Solutions:
Low audio quality: Use higher bitrate recordings (128kbps+)
Background noise: Record in quiet environment, use noise cancellation
Multiple speakers: Use individual microphones when possible
Heavy accents: Try Whisper model (better multilingual support)
Poor microphone: Invest in quality recording equipment
// Try Whisper for difficult audio
type: FilePreparationServiceTypes.Deepgram,
deepgram: {
model: DeepgramModels.WhisperLarge // Better multilingual support
}
Issue: Processing Takes Too Long
Problem: 1-hour meeting takes 30+ minutes to process.
Explanation: These times can be normal, depending on the transcription service and audio length.
Timeline Expectations:
Deepgram: ~10% of audio duration (6 min for 1-hour)
AssemblyAI: ~15% of audio duration (9 min for 1-hour)
Whisper: ~20-30% of audio duration (12-18 min for 1-hour)
Optimization:
Use Deepgram for speed
Process shorter segments
Upload during off-peak hours
Issue: No Speaker Diarization
Problem: All speakers labeled as "Speaker 1".
Causes:
Single audio channel (mono)
Poor speaker separation
Overlapping speech
Solution: Use stereo or multi-channel recordings with a separate channel per speaker, or accept a single speaker label.
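To detect this situation programmatically, count the distinct speaker labels in the transcript, reusing the "Speaker N:" convention from Step 7 (a hedged sketch; adjust the pattern if your transcripts label speakers differently):
// Hedged sketch: warn when diarization produced only one speaker label.
const speakerLabels = new Set(
  Array.from((content.markdown || '').matchAll(/Speaker (\d+):/g), m => m[1])
);
if (speakerLabels.size <= 1) {
  console.warn('Only one speaker label found; consider stereo/multi-channel recording.');
}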
Issue: Missing Action Items
Problem: No Event entities extracted for obvious action items.
Explanation: Action items are often implied rather than stated as discrete events, so the extractor may not emit Event entities for them.
Solution: Use LLM to extract action items from transcript:
// After transcription, use RAG to extract actions
const conversation = await graphlit.createConversation({
name: "Extract Actions"
});
const actions = await graphlit.promptConversation({
prompt: "List all action items, deadlines, and follow-ups mentioned in this meeting. Format as a bullet list with assignees if mentioned.",
id: conversation.createConversation.id,
filter: {
contents: [{ id: meetingId }]
}
});
console.log(actions.promptConversation?.message?.message);
Developer Hints
Transcription Service Selection
Deepgram: Best default choice (speed + accuracy + cost)
AssemblyAI: When speaker diarization critical
Whisper (Deepgram): Non-English meetings, multilingual support
Audio Format Best Practices
Bitrate: 128kbps minimum, 256kbps recommended
Sample rate: 16kHz minimum, 44.1kHz recommended
Channels: Stereo preferred for multi-speaker
Format: WAV/FLAC for quality, MP3 for size
Cost Optimization
Deepgram cheapest per minute
Compress large video files (audio track only needed)
Batch process during off-peak hours
Cache transcripts (don't re-transcribe)
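One way to avoid re-transcribing is to check whether a recording was already ingested before calling ingestUri again. A sketch, assuming you name content after the recording and that the contents query accepts a name filter (verify against the current filter schema):
// Hedged sketch: skip ingestion when a content item with this name already exists.
async function ingestMeetingOnce(uri: string, name: string, workflowId: string) {
  const existing = await graphlit.queryContents({ filter: { name } }); // name filter assumed
  if (existing.contents?.results?.length) {
    return existing.contents.results[0]; // already transcribed; reuse it
  }
  const ingested = await graphlit.ingestUri(uri, name, undefined, undefined, undefined, { id: workflowId });
  return ingested.ingestUri;
}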
Meeting Entity Quality
High confidence: Participant names, company names
Medium confidence: Action items, deadlines
Low confidence: Implicit mentions, pronouns
Filter threshold: >=0.6 for meetings (lower than documents)
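A short sketch of applying that threshold client-side, assuming occurrence-level confidence as in the main example:
// Hedged sketch: keep observations with at least one occurrence at or above 0.6 confidence.
const MIN_CONFIDENCE = 0.6;
const confidentObservations = content.observations?.filter(obs =>
  obs.occurrences?.some(occ => (occ.confidence ?? 0) >= MIN_CONFIDENCE)
) || [];
console.log(`Kept ${confidentObservations.length} of ${content.observations?.length || 0} observations`);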
Performance Tips
Process in background (don't block UI)
Show progress estimates (based on duration)
Cache transcripts for quick re-query
Parallel process multiple meetings
Use webhooks for completion notification
Production Patterns
Pattern from Meeting Intelligence Apps
Zoom/Meet recordings → automatic transcription
Entity extraction: participants, action items, topics
Searchable archive by person, topic, or date
Action item tracking dashboard
Meeting summary emails
Enterprise Use Cases
Sales calls: Extract prospects, products, objections
Customer support: Track issues, customers, solutions
Board meetings: Decisions, financial mentions, strategic initiatives
Team standups: Tasks, blockers, sprint planning
Training sessions: Topics covered, questions, feedback