# Build Knowledge Graph from Meeting Recordings

## User Intent

"How do I extract entities from meeting recordings (audio/video)? Show me how to transcribe meetings and analyze participants, topics, action items, and decisions."

## Operation

**SDK Methods**: `createWorkflow()`, `ingestUri()`, `isContentDone()`, `getContent()`, `queryObservables()`\
**GraphQL**: Audio/video ingestion + transcription + entity extraction\
**Entity**: Audio/Video → Transcription → Text → Observations → Observables (Meeting Graph)

## Prerequisites

* Graphlit project with API credentials
* Meeting recordings (MP3, MP4, WAV, or other audio/video formats)
* Understanding of workflow configuration
* Transcription service access (Deepgram, AssemblyAI, or Whisper)

***

## Complete Code Example (TypeScript)

```typescript
import { Graphlit } from 'graphlit-client';
import {
  FilePreparationServiceTypes,
  DeepgramModels,
  EntityExtractionServiceTypes,
  ObservableTypes
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

console.log('=== Building Knowledge Graph from Meeting ===\n');

// Step 1: Create transcription + extraction workflow
console.log('Step 1: Creating workflow...');
const workflow = await graphlit.createWorkflow({
  name: "Meeting Entity Extraction",
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Deepgram,
        deepgram: {
          model: DeepgramModels.Nova2  // Fast, accurate
        }
      }
    }]
  },
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        extractedTypes: [
          ObservableTypes.Person,          // Participants, mentioned people
          ObservableTypes.Organization,    // Companies discussed
          ObservableTypes.Product,         // Products, services mentioned
          ObservableTypes.Event,           // Action items, deadlines, follow-ups
          ObservableTypes.Place,           // Locations mentioned
          ObservableTypes.Category         // Topics, projects, themes
        ]
      }
    }]
  }
});

console.log(`✓ Workflow: ${workflow.createWorkflow.id}\n`);

// Step 2: Ingest meeting recording
console.log('Step 2: Ingesting meeting recording...');
const meeting = await graphlit.ingestUri(
  'https://example.com/meetings/q4-planning.mp4',
  "Q4 Planning Meeting",
  undefined,
  undefined,
  undefined,
  { id: workflow.createWorkflow.id }
);

console.log(`✓ Ingested: ${meeting.ingestUri.id}\n`);

// Step 3: Wait for transcription + extraction
console.log('Step 3: Transcribing and extracting entities...');
console.log('(This may take several minutes for long recordings)\n');

let isDone = false;
let lastStatus = '';
while (!isDone) {
  const status = await graphlit.isContentDone(meeting.ingestUri.id);
  isDone = status.isContentDone.result;
  
  if (!isDone) {
    const newStatus = '  Processing...';
    if (newStatus !== lastStatus) {
      console.log(newStatus);
      lastStatus = newStatus;
    }
    await new Promise(resolve => setTimeout(resolve, 5000));
  }
}
console.log('✓ Processing complete\n');

// Step 4: Get meeting details
console.log('Step 4: Retrieving meeting transcript and entities...');
const meetingDetails = await graphlit.getContent(meeting.ingestUri.id);
const content = meetingDetails.content;

console.log(`✓ Meeting: ${content.name}`);
console.log(`  Duration: ${content.audio?.duration || 0} seconds`);
console.log(`  Entities: ${content.observations?.length || 0}\n`);

// Step 5: Display transcript excerpt
console.log('Step 5: Transcript excerpt...\n');
const transcript = content.markdown || content.text || '';
const excerpt = transcript.substring(0, 500);
console.log(excerpt);
console.log(transcript.length > 500 ? '...\n' : '\n');

// Step 6: Analyze extracted entities
console.log('Step 6: Analyzing entities...\n');

// Group by type
const byType = new Map<string, Set<string>>();
content.observations?.forEach(obs => {
  if (!byType.has(obs.type)) {
    byType.set(obs.type, new Set());
  }
  byType.get(obs.type)!.add(obs.observable.name);
});

byType.forEach((entities, type) => {
  console.log(`${type} (${entities.size}):`);
  Array.from(entities).slice(0, 5).forEach(name => {
    console.log(`  - ${name}`);
  });
  if (entities.size > 5) {
    console.log(`  ... and ${entities.size - 5} more`);
  }
  console.log();
});

// Step 7: Analyze entity timestamps
console.log('Step 7: Entity mentions with timestamps...\n');

const people = content.observations?.filter(obs => 
  obs.type === ObservableTypes.Person
) || [];

people.slice(0, 3).forEach(person => {
  console.log(`${person.observable.name}:`);
  
  person.occurrences?.slice(0, 3).forEach(occ => {
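    // GraphQL may return null for missing values, so check for both null and undefined
    if (occ.startTime != null && occ.endTime != null) {
      const minutes = Math.floor(occ.startTime / 60);
      const seconds = Math.floor(occ.startTime % 60);
      const confidence = occ.confidence != null ? occ.confidence.toFixed(2) : 'n/a';
      console.log(`  ${minutes}:${seconds.toString().padStart(2, '0')} - Confidence: ${confidence}`);
    }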
  });
  
  console.log();
});

// Step 8: Extract action items (Events)
console.log('Step 8: Action items and deadlines...\n');

const events = content.observations?.filter(obs => 
  obs.type === ObservableTypes.Event
) || [];

if (events.length > 0) {
  console.log('Identified action items:');
  events.forEach(event => {
    console.log(`  - ${event.observable.name}`);
    
    // Show when mentioned
    const firstMention = event.occurrences?.[0];
    if (firstMention?.startTime != null) {
      const min = Math.floor(firstMention.startTime / 60);
      const sec = Math.floor(firstMention.startTime % 60);
      console.log(`    Mentioned at: ${min}:${sec.toString().padStart(2, '0')}`);
    }
  });
} else {
  console.log('No action items identified');
}

console.log('\n✓ Meeting analysis complete!');
```

***

## Step-by-Step Explanation

### Step 1: Create Transcription Workflow

**Audio Preparation**:

```typescript
preparation: {
  jobs: [{
    connector: {
      type: FilePreparationServiceTypes.Deepgram,
      deepgram: { model: DeepgramModels.Nova2 }
    }
  }]
}
```

**Transcription Service Options**:

* **Deepgram**: Fast, accurate, cost-effective (recommended)
* **AssemblyAI**: Good quality, speaker diarization support
* **Whisper**: OpenAI's model, very accurate but slower

**What Transcription Produces**:

* Full text transcript
* Timestamps per word/segment
* Speaker diarization (who said what)
* Confidence scores per segment

### Step 2: Supported Audio/Video Formats

**Audio Formats**:

* MP3, WAV, M4A, AAC, FLAC, OGG
* Supported bitrates: 8kbps - 320kbps
* Sample rates: 8kHz - 48kHz

**Video Formats**:

* MP4, MOV, AVI, MKV, WEBM
* Audio track extracted automatically
* Video analysis not performed (audio only)
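
A small pre-flight check can reject unsupported files before they are uploaded. The sketch below encodes only the extensions listed above; `classifyRecording` is a hypothetical helper for illustration and is not part of the Graphlit SDK.

```typescript
// Extensions taken from the format lists above; not an exhaustive list from the SDK.
const AUDIO_EXTENSIONS = new Set(['mp3', 'wav', 'm4a', 'aac', 'flac', 'ogg']);
const VIDEO_EXTENSIONS = new Set(['mp4', 'mov', 'avi', 'mkv', 'webm']);

type RecordingKind = 'audio' | 'video' | 'unsupported';

// Hypothetical helper: classify a file name or URL by its extension.
function classifyRecording(nameOrUrl: string): RecordingKind {
  // Drop any query string or fragment before inspecting the extension
  const path = nameOrUrl.replace(/[?#].*$/, '');
  const dot = path.lastIndexOf('.');
  if (dot < 0) return 'unsupported';

  const ext = path.slice(dot + 1).toLowerCase();
  if (AUDIO_EXTENSIONS.has(ext)) return 'audio';
  if (VIDEO_EXTENSIONS.has(ext)) return 'video';
  return 'unsupported';
}

console.log(classifyRecording('https://example.com/meetings/q4-planning.mp4'));
```

Checking the extension is only a heuristic; the service still decides what it can actually decode.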

**Ingestion Sources**:

```typescript
// From URL
ingestUri('https://example.com/meeting.mp3')

// From local file
const audio = fs.readFileSync('./meeting.mp3');
const base64 = audio.toString('base64');
ingestEncodedFile({
  name: 'meeting.mp3',
  data: base64,
  mimeType: 'audio/mpeg'
})

// From cloud storage (via feed)
createFeed({
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.AzureBlob,
    // Azure Blob Storage config
  }
})
```

### Step 3: Processing Timeline

**Transcription Time** (approximate):

* 10-minute meeting: 1-2 minutes
* 30-minute meeting: 3-5 minutes
* 1-hour meeting: 5-10 minutes
* 2-hour meeting: 10-20 minutes

**Factors Affecting Speed**:

* Audio quality (clean audio faster)
* Number of speakers (more speakers slower)
* Background noise (noisy audio slower)
* File size and bitrate
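
Because processing time varies with these factors, a polling loop benefits from an upper bound so that a stuck job does not block forever. The following is a minimal sketch; `pollUntilDone` and its options are hypothetical names, and the commented usage assumes the `isContentDone` response shape used elsewhere in this guide.

```typescript
interface PollOptions {
  intervalMs: number; // delay between checks
  timeoutMs: number;  // give up once this much time has passed
}

// Hypothetical helper: resolves true when `check` reports completion, false on timeout.
async function pollUntilDone(
  check: () => Promise<boolean>,
  { intervalMs, timeoutMs }: PollOptions
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;

  while (true) {
    if (await check()) return true;
    // Stop if the next wait would run past the deadline
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>(resolve => setTimeout(resolve, intervalMs));
  }
}

// Possible usage with the SDK client from the main example (assumed shape):
// const finished = await pollUntilDone(
//   async () => (await graphlit.isContentDone(contentId)).isContentDone.result,
//   { intervalMs: 5000, timeoutMs: 30 * 60 * 1000 }
// );
```

The check always runs at least once, so a very short timeout still reports content that is already finished.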

### Step 4: Transcript Structure

**Markdown Format**:

```markdown
# Meeting Transcript

## Segment 1 (00:00 - 00:15)
Speaker 1: Welcome everyone to the Q4 planning meeting...

## Segment 2 (00:15 - 00:45)
Speaker 2: Thanks Kirk. I wanted to discuss the product roadmap...

## Segment 3 (00:45 - 01:20)
Speaker 1: Great points. Let's talk about the Graphlit launch timeline...
```

**Accessing Transcript**:

```typescript
const content = await graphlit.getContent(meetingId);

// Full transcript
const transcript = content.content.markdown || content.content.text;

// Audio metadata
const duration = content.content.audio?.duration;  // seconds
const channels = content.content.audio?.channels;
const bitrate = content.content.audio?.bitrate;
```

### Step 5: Entity Extraction from Transcript

**Person Entities**:

* Participants (from speaker labels)
* People mentioned in discussion
* Names in action items

**Organization Entities**:

* Companies discussed
* Partners, clients, competitors
* Departments, teams

**Event Entities**:

* Action items ("Send proposal by Friday")
* Deadlines ("Launch date: October 15")
* Follow-up meetings ("Schedule review call")

**Product/Software Entities**:

* Tools discussed
* Products mentioned
* Features planned

**Category Entities**:

* Topics, themes
* Projects, initiatives
* Meeting subjects

### Step 6: Timestamp Analysis

**Occurrence Timestamps**:

```typescript
occurrence: {
  startTime: 125.3,    // Seconds from recording start
  endTime: 127.8,      // Seconds from recording start
  confidence: 0.92     // Extraction confidence
}
```

**Use Cases**:

* Jump to specific entity mentions in playback
* Create entity timeline visualization
* Find when action items were assigned
* Track discussion flow by entity
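
As a sketch of the timeline use case, the occurrences of all entities can be flattened into one chronological list. The `Observation` and `Occurrence` interfaces below are minimal local stand-ins for the fields used in this guide, and `buildTimeline` is a hypothetical helper rather than an SDK function.

```typescript
// Minimal local shapes mirroring the fields used in this guide (assumed, not imported from the SDK).
interface Occurrence {
  startTime?: number | null;
  endTime?: number | null;
  confidence?: number | null;
}
interface Observation {
  type: string;
  observable: { name: string };
  occurrences?: Occurrence[] | null;
}

interface TimelineEntry {
  time: number; // seconds from recording start
  type: string;
  name: string;
}

// Hypothetical helper: collect every timestamped occurrence and sort by start time.
function buildTimeline(observations: Observation[]): TimelineEntry[] {
  const entries: TimelineEntry[] = [];
  for (const obs of observations) {
    for (const occ of obs.occurrences ?? []) {
      if (occ.startTime != null) {
        entries.push({ time: occ.startTime, type: obs.type, name: obs.observable.name });
      }
    }
  }
  return entries.sort((a, b) => a.time - b.time);
}

// Illustrative data only
const sampleObservations: Observation[] = [
  { type: 'EVENT', observable: { name: 'Send proposal' }, occurrences: [{ startTime: 125.3 }] },
  { type: 'PERSON', observable: { name: 'Kirk' }, occurrences: [{ startTime: 15 }, { startTime: 200 }] },
];
buildTimeline(sampleObservations).forEach(e => console.log(`${e.time}s ${e.type} ${e.name}`));
```

Occurrences without a start time are skipped, since they cannot be placed on the timeline.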

**Format Timestamps for Display**:

```typescript
function formatTime(seconds: number): string {
  const mins = Math.floor(seconds / 60);
  const secs = Math.floor(seconds % 60);
  return `${mins}:${secs.toString().padStart(2, '0')}`;
}

obs.occurrences?.forEach(occ => {
  if (occ.startTime != null) {
    console.log(`${formatTime(occ.startTime)} - ${obs.observable.name}`);
  }
});
```

### Step 7: Speaker Diarization

**Identifying Speakers**:

```typescript
// Deepgram and AssemblyAI provide speaker labels
const transcript = content.content.markdown || '';

// Parse speaker segments
const speakerPattern = /Speaker (\d+):/g;
const speakers = new Set<string>();
let match;

while ((match = speakerPattern.exec(transcript)) !== null) {
  speakers.add(match[1]);
}

console.log(`Number of speakers: ${speakers.size}`);
```

**Linking Speakers to Person Entities**:

```typescript
// Cross-reference speaker labels with extracted Person entities
const people = content.content.observations
  ?.filter(obs => obs.type === ObservableTypes.Person) || [];

console.log('Participants:');
people.forEach(person => {
  console.log(`  ${person.observable.name}`);
  // Match to speaker label if possible
});
```
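
One way to approach the matching step is to use timestamps: find which speaker segment was active when an entity was mentioned. The sketch below assumes the transcript follows the segment layout shown in Step 4 (`## Segment N (mm:ss - mm:ss)` followed by a `Speaker N:` line); real transcripts may be laid out differently, and `parseSegments` and `speakerAt` are hypothetical helpers.

```typescript
interface Segment {
  speaker: string; // numeric label from "Speaker N"
  start: number;   // seconds
  end: number;     // seconds
  text: string;
}

// Parse "## Segment N (mm:ss - mm:ss)" headings followed by a "Speaker N: ..." line.
function parseSegments(markdown: string): Segment[] {
  const pattern = /^## Segment \d+ \((\d+):(\d{2}) - (\d+):(\d{2})\)\s*\nSpeaker (\d+): (.*)$/gm;
  const segments: Segment[] = [];
  let m: RegExpExecArray | null;

  while ((m = pattern.exec(markdown)) !== null) {
    segments.push({
      start: Number(m[1]) * 60 + Number(m[2]),
      end: Number(m[3]) * 60 + Number(m[4]),
      speaker: m[5],
      text: m[6],
    });
  }
  return segments;
}

// Return the speaker label active at `seconds`, or undefined if no segment covers it.
function speakerAt(segments: Segment[], seconds: number): string | undefined {
  const segment = segments.find(s => seconds >= s.start && seconds < s.end);
  return segment ? segment.speaker : undefined;
}
```

With the Step 4 transcript, an occurrence at 20 seconds falls in Segment 2 and is attributed to Speaker 2. This only tells you who was talking when a name came up; mapping a speaker label to a participant's name still requires a heuristic, such as self-introductions or being addressed by name.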

***

## Configuration Options

### Choosing Transcription Service

**Deepgram (Recommended)**:

```typescript
deepgram: { model: DeepgramModels.Nova2 }
```

* **Pros**: Fast, accurate, cost-effective, good speaker diarization
* **Cons**: Requires internet connection
* **Best for**: Most use cases, production

**AssemblyAI**:

```typescript
type: FilePreparationServiceTypes.AssemblyAi,
assemblyAi: {
  model: AssemblyAiModels.Best
}
```

* **Pros**: Very accurate, excellent speaker diarization
* **Cons**: Slower, more expensive
* **Best for**: High-quality transcription needs

**Whisper (via Deepgram)**:

```typescript
type: FilePreparationServiceTypes.Deepgram,
deepgram: {
  model: DeepgramModels.WhisperLarge
}
```

* **Pros**: Very accurate, multilingual support
* **Cons**: Slower than Nova models
* **Best for**: Non-English meetings, maximum accuracy

### Audio Quality Preprocessing

**For Noisy Audio**:

```typescript
preparation: {
  jobs: [{
    connector: {
      type: FilePreparationServiceTypes.Deepgram,
      deepgram: { model: DeepgramModels.Nova2 },
      // Preprocessing options (if supported)
    }
  }]
}
```

**Tips for Better Transcription**:

1. Use high-quality recording equipment
2. Minimize background noise
3. Single speaker per microphone when possible
4. 16kHz+ sample rate recommended
5. Avoid heavy audio compression

***

## Variations

### Variation 1: Multi-Meeting Analysis

Analyze a series of recurring meetings:

```typescript
const meetingUrls = [
  'https://example.com/meetings/week1.mp4',
  'https://example.com/meetings/week2.mp4',
  'https://example.com/meetings/week3.mp4'
];

// Ingest all meetings
const meetings = await Promise.all(
  meetingUrls.map(uri =>
    graphlit.ingestUri(uri, undefined, undefined, undefined, undefined, { id: workflowId })
  )
);

// Wait for all to process
const waitForAll = async () => {
  const ids = meetings.map(m => m.ingestUri.id);
  let allDone = false;
  
  while (!allDone) {
    const statuses = await Promise.all(
      ids.map(id => graphlit.isContentDone(id))
    );
    allDone = statuses.every(s => s.isContentDone.result);
    
    if (!allDone) {
      await new Promise(resolve => setTimeout(resolve, 5000));
    }
  }
};

await waitForAll();

// Analyze trends over time
const allEntities = await graphlit.queryObservables({
  filter: { types: [ObservableTypes.Person, ObservableTypes.Event] }
});

console.log(`Total entities across ${meetings.length} meetings: ${allEntities.observables.results.length}`);
```
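
The total count above does not yet show a trend. As a minimal sketch, assuming you have fetched each meeting's observations separately (for example with `getContent`), you can count how many meetings mention each entity. `meetingsPerEntity` and the local `Observation` shape are hypothetical and used only for illustration.

```typescript
// Minimal local shape (assumed); only the name is needed here.
interface Observation {
  type: string;
  observable: { name: string };
}

// Hypothetical helper: for each entity name, count the meetings that mention it at least once.
function meetingsPerEntity(meetings: Observation[][]): Map<string, number> {
  const counts = new Map<string, number>();

  for (const observations of meetings) {
    // De-duplicate within a meeting so repeated mentions count once
    const seen = new Set(observations.map(o => o.observable.name));
    seen.forEach(name => {
      counts.set(name, (counts.get(name) ?? 0) + 1);
    });
  }
  return counts;
}

// Illustrative data only: three weekly meetings
const weekly: Observation[][] = [
  [{ type: 'PERSON', observable: { name: 'Kirk' } }],
  [{ type: 'PERSON', observable: { name: 'Kirk' } }, { type: 'EVENT', observable: { name: 'Launch' } }],
  [{ type: 'EVENT', observable: { name: 'Launch' } }],
];
meetingsPerEntity(weekly).forEach((count, name) => console.log(`${name}: ${count} meetings`));
```

Entities that appear in most meetings are likely recurring topics or regular participants, while one-off entities may be incidental.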

### Variation 2: Action Item Tracker

Extract and track action items:

```typescript
interface ActionItem {
  description: string;
  assignee?: string;
  deadline?: string;
  meetingDate: Date;
  timestamp: number;
}

function extractActionItems(content: Content): ActionItem[] {
  const events = content.observations
    ?.filter(obs => obs.type === ObservableTypes.Event) || [];
  
  const actionItems: ActionItem[] = [];
  
  events.forEach(event => {
    // Look for action-like events
    const desc = event.observable.name.toLowerCase();
    const isAction = 
      desc.includes('send') ||
      desc.includes('schedule') ||
      desc.includes('prepare') ||
      desc.includes('follow up') ||
      desc.includes('review');
    
    if (isAction) {
      actionItems.push({
        description: event.observable.name,
        meetingDate: new Date(content.creationDate),
        timestamp: event.occurrences?.[0]?.startTime || 0
      });
    }
  });
  
  return actionItems;
}

const actions = extractActionItems(meetingDetails.content);
console.log(`Action items: ${actions.length}`);
actions.forEach(action => {
  console.log(`  - ${action.description}`);
  console.log(`    At: ${formatTime(action.timestamp)}`);
});
```

### Variation 3: Meeting Sentiment & Topic Analysis

Analyze discussion topics and participant contributions:

```typescript
interface MeetingInsights {
  duration: number;
  participantCount: number;
  topicsDiscussed: string[];
  mostMentionedEntity: string;
  actionItemCount: number;
}

function analyzeMeeting(content: Content): MeetingInsights {
  const people = new Set(
    content.observations
      ?.filter(obs => obs.type === ObservableTypes.Person)
      .map(obs => obs.observable.name) || []
  );
  
  const categories = content.observations
    ?.filter(obs => obs.type === ObservableTypes.Category)
    .map(obs => obs.observable.name) || [];
  
  const events = content.observations
    ?.filter(obs => obs.type === ObservableTypes.Event) || [];
  
  // Find most mentioned entity
  const entityCounts = new Map<string, number>();
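  content.observations?.forEach(obs => {
    const name = obs.observable.name;
    const count = obs.occurrences?.length || 0;
    // Accumulate, since the same name can appear in more than one observation
    entityCounts.set(name, (entityCounts.get(name) || 0) + count);
  });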
  
  const mostMentioned = Array.from(entityCounts.entries())
    .sort((a, b) => b[1] - a[1])[0];
  
  return {
    duration: content.audio?.duration || 0,
    participantCount: people.size,
    topicsDiscussed: categories,
    mostMentionedEntity: mostMentioned?.[0] || 'None',
    actionItemCount: events.length
  };
}

const insights = analyzeMeeting(meetingDetails.content);
console.log('Meeting Insights:');
console.log(`  Duration: ${Math.floor(insights.duration / 60)} minutes`);
console.log(`  Participants: ${insights.participantCount}`);
console.log(`  Topics: ${insights.topicsDiscussed.join(', ')}`);
console.log(`  Most discussed: ${insights.mostMentionedEntity}`);
console.log(`  Action items: ${insights.actionItemCount}`);
```

### Variation 4: Searchable Meeting Archive

Build a searchable meeting repository:

```typescript
// Ingest entire meeting archive
const archive = await graphlit.createFeed({
  name: "Meeting Archive",
  type: FeedTypes.Site,
  site: {
    type: FeedServiceTypes.AzureBlob,
    // Point to Azure Blob container with recordings
  },
  workflow: { id: workflowId }
});

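// Wait for all meetings to process: isFeedDone reports status once, so poll until it is true
let feedDone = false;
while (!feedDone) {
  const feedStatus = await graphlit.isFeedDone(archive.createFeed.id);
  feedDone = feedStatus.isFeedDone.result;
  if (!feedDone) {
    await new Promise(resolve => setTimeout(resolve, 5000));
  }
}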

// Search meetings by entity
const searchForPerson = async (personName: string) => {
  const personEntity = await graphlit.queryObservables({
    search: personName,
    filter: { types: [ObservableTypes.Person] }
  });
  
  if (personEntity.observables.results.length > 0) {
    const meetings = await graphlit.queryContents({
      observations: [{
        type: ObservableTypes.Person,
        observable: { id: personEntity.observables.results[0].observable.id }
      }]
    });
    
    return meetings.contents.results;
  }
  
  return [];
};

// Find all meetings in which Kirk is mentioned
const kirkMeetings = await searchForPerson("Kirk Marple");
console.log(`Kirk mentioned in ${kirkMeetings.length} meetings`);
```

### Variation 5: Meeting Summary Generation

Generate AI summaries with entity context:

```typescript
// After transcription + extraction, generate summary
const conversation = await graphlit.createConversation({
  name: "Meeting Summary"
});

// Provide meeting content as context
const summary = await graphlit.promptConversation({
  prompt: "Summarize this meeting, highlighting key decisions, action items, and participants.",
  id: conversation.createConversation.id,
  filter: {
    contents: [{ id: meeting.ingestUri.id }]
  }
});

console.log('Meeting Summary:');
console.log(summary.message.message);

// Extract structured data from summary
const structuredPrompt = await graphlit.promptConversation({
  prompt: `Extract from this meeting:
  1. Key decisions made
  2. Action items with assignees
  3. Follow-up topics
  4. Next steps
  
  Format as JSON.`,
  id: conversation.createConversation.id,
  filter: {
    contents: [{ id: meeting.ingestUri.id }]
  }
});

console.log('\nStructured Summary:');
console.log(structuredPrompt.message.message);
```

***

## Common Issues & Solutions

### Issue: Poor Transcription Quality

**Problem**: Transcript has many errors, missing words.

**Causes & Solutions**:

1. **Low audio quality**: Use higher bitrate recordings (128kbps+)
2. **Background noise**: Record in quiet environment, use noise cancellation
3. **Multiple speakers**: Use individual microphones when possible
4. **Heavy accents**: Try Whisper model (better multilingual support)
5. **Poor microphone**: Invest in quality recording equipment

```typescript
// Try Whisper for difficult audio
type: FilePreparationServiceTypes.Deepgram,
deepgram: {
  model: DeepgramModels.WhisperLarge  // Better multilingual support
}
```

### Issue: Processing Takes Too Long

**Problem**: 1-hour meeting takes 30+ minutes to process.

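**Explanation**: Processing time scales with audio duration and varies by service. Noisy audio, many speakers, and large files can push it well beyond the typical ranges below.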

**Timeline Expectations**:

* Deepgram: \~10% of audio duration (6 min for 1-hour)
* AssemblyAI: \~15% of audio duration (9 min for 1-hour)
* Whisper: \~20-30% of audio duration (12-18 min for 1-hour)
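
These ratios can be turned into a rough progress estimate. The sketch below simply applies the percentages quoted above; they are approximate figures, not measured benchmarks, and `estimateProcessingMinutes` is a hypothetical helper.

```typescript
type TranscriptionService = 'deepgram' | 'assemblyai' | 'whisper';

// Processing time as a percentage of audio duration: [low, high].
// Values are the rough figures quoted above, not measured benchmarks.
const PROCESSING_PERCENT: Record<TranscriptionService, [number, number]> = {
  deepgram: [10, 10],
  assemblyai: [15, 15],
  whisper: [20, 30],
};

// Hypothetical helper: estimated processing range in minutes.
function estimateProcessingMinutes(
  audioMinutes: number,
  service: TranscriptionService
): { low: number; high: number } {
  const [low, high] = PROCESSING_PERCENT[service];
  return {
    low: (audioMinutes * low) / 100,
    high: (audioMinutes * high) / 100,
  };
}

const estimate = estimateProcessingMinutes(60, 'whisper');
console.log(`Expect roughly ${estimate.low}-${estimate.high} minutes`); // Expect roughly 12-18 minutes
```

If actual processing runs far beyond the upper bound, that is a reasonable point to surface a warning or check the content's state.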

**Optimization**:

* Use Deepgram for speed
* Process shorter segments
* Upload during off-peak hours

### Issue: No Speaker Diarization

**Problem**: All speakers labeled as "Speaker 1".

**Causes**:

1. Single audio channel (mono)
2. Poor speaker separation
3. Overlapping speech

**Solution**: Use stereo recording with separate channels per speaker, or accept single speaker label.

### Issue: Missing Action Items

**Problem**: No Event entities extracted for obvious action items.

**Explanation**: Action items are often implied rather than stated explicitly, so entity extraction can miss them.

**Solution**: Use an LLM to extract action items from the transcript:

```typescript
// After transcription, use RAG to extract actions
const conversation = await graphlit.createConversation({
  name: "Extract Actions"
});

const actions = await graphlit.promptConversation({
  prompt: "List all action items, deadlines, and follow-ups mentioned in this meeting. Format as a bullet list with assignees if mentioned.",
  id: conversation.createConversation.id,
  filter: {
    contents: [{ id: meetingId }]
  }
});

console.log(actions.message.message);
```

***

## Developer Hints

### Transcription Service Selection

* **Deepgram**: Best default choice (speed + accuracy + cost)
* **AssemblyAI**: When speaker diarization is critical
* **Whisper (Deepgram)**: Non-English meetings, multilingual support

### Audio Format Best Practices

* **Bitrate**: 128kbps minimum, 256kbps recommended
* **Sample rate**: 16kHz minimum, 44.1kHz recommended
* **Channels**: Stereo preferred for multi-speaker
* **Format**: WAV/FLAC for quality, MP3 for size
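
These guidelines can be checked before ingestion. The sketch below is a minimal example: `AudioInfo` and `checkAudioQuality` are hypothetical, and the units (kbps, Hz) are assumptions that you should align with whatever metadata source you use.

```typescript
// Hypothetical description of a recording; units are assumptions (kbps and Hz).
interface AudioInfo {
  bitrateKbps?: number;
  sampleRateHz?: number;
  channels?: number;
}

// Compare a recording against the minimums listed above and return any warnings.
function checkAudioQuality(audio: AudioInfo): string[] {
  const warnings: string[] = [];

  if (audio.bitrateKbps !== undefined && audio.bitrateKbps < 128) {
    warnings.push(`Bitrate ${audio.bitrateKbps}kbps is below the 128kbps minimum`);
  }
  if (audio.sampleRateHz !== undefined && audio.sampleRateHz < 16000) {
    warnings.push(`Sample rate ${audio.sampleRateHz}Hz is below the 16kHz minimum`);
  }
  if (audio.channels === 1) {
    warnings.push('Mono recording: stereo is preferred for multi-speaker meetings');
  }
  return warnings;
}

checkAudioQuality({ bitrateKbps: 64, sampleRateHz: 8000, channels: 1 })
  .forEach(w => console.log(`Warning: ${w}`));
```

Missing fields are ignored rather than flagged, so partial metadata does not produce false warnings.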

### Cost Optimization

* Deepgram cheapest per minute
* Compress large video files (audio track only needed)
* Batch process during off-peak hours
* Cache transcripts (don't re-transcribe)

### Meeting Entity Quality

* **High confidence**: Participant names, company names
* **Medium confidence**: Action items, deadlines
* **Low confidence**: Implicit mentions, pronouns
* **Filter threshold**: >=0.6 for meetings (lower than documents)
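
A threshold like this can be applied after retrieval. The following is a minimal sketch using local stand-in types; `filterByConfidence` is a hypothetical helper, and it treats a missing confidence value as 0, which is an assumption you may want to change.

```typescript
// Minimal local shapes mirroring the fields used in this guide (assumed, not imported from the SDK).
interface Occurrence {
  confidence?: number | null;
}
interface Observation {
  type: string;
  observable: { name: string };
  occurrences?: Occurrence[] | null;
}

// Hypothetical helper: keep observations with at least one occurrence at or above the threshold.
function filterByConfidence<T extends Observation>(observations: T[], threshold = 0.6): T[] {
  return observations.filter(obs =>
    (obs.occurrences ?? []).some(occ => (occ.confidence ?? 0) >= threshold)
  );
}

// Illustrative data only
const extracted: Observation[] = [
  { type: 'PERSON', observable: { name: 'Kirk Marple' }, occurrences: [{ confidence: 0.92 }] },
  { type: 'EVENT', observable: { name: 'Schedule review call' }, occurrences: [{ confidence: 0.45 }] },
];
filterByConfidence(extracted).forEach(obs => console.log(obs.observable.name));
```

Keeping an observation when any single occurrence clears the threshold is lenient; requiring the average to clear it would be a stricter alternative.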

### Performance Tips

* Process in background (don't block UI)
* Show progress estimates (based on duration)
* Cache transcripts for quick re-query
* Parallel process multiple meetings
* Use webhooks for completion notification

***

## Production Patterns

### Pattern from Meeting Intelligence Apps

* Zoom/Meet recordings → automatic transcription
* Entity extraction: participants, action items, topics
* Searchable archive by person, topic, or date
* Action item tracking dashboard
* Meeting summary emails

### Enterprise Use Cases

* **Sales calls**: Extract prospects, products, objections
* **Customer support**: Track issues, customers, solutions
* **Board meetings**: Decisions, financial mentions, strategic initiatives
* **Team standups**: Tasks, blockers, sprint planning
* **Training sessions**: Topics covered, questions, feedback

***


