Understanding ContentType vs FileType

User Intent

"What's the difference between contentType and fileType? When do I use each?"

Operation

  • Concept: Content classification hierarchy

  • GraphQL Fields: content.type (ContentType), content.fileType (FileType)

  • Entity Type: Content

  • Common Use Cases: Filtering content, UI display logic, workflow routing, understanding content structure

Key Concept: The Hierarchy

ContentType is PRIMARY - Semantic classification of what the content represents:

  • EMAIL - Email messages

  • MESSAGE - Chat messages (Slack, Teams, Discord)

  • PAGE - Web pages

  • FILE - Files (documents, images, audio, video, code)

  • POST - Social posts (Reddit, RSS)

  • EVENT - Calendar events

  • ISSUE - Issue tracker items (Jira, Linear, GitHub)

  • TEXT - Plain text, markdown, HTML

  • MEMORY - Agent or user memory

FileType is SECONDARY - Physical format (ONLY when contentType = FILE):

  • DOCUMENT - PDF, Word, Excel, PowerPoint

  • IMAGE - JPEG, PNG, GIF, TIFF

  • AUDIO - MP3, WAV, podcast files

  • VIDEO - MP4, MOV, AVI

  • CODE - Source code files

  • DATA - JSON, XML, CSV

  • PACKAGE - ZIP, TAR, archives

  • ANIMATION, DRAWING, GEOMETRY, POINT_CLOUD, SHAPE, etc.
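
The hierarchy above can be sketched as a two-level label: ContentType first, FileType only underneath FILE. This is a minimal illustration, not the SDK's API. The `ContentTypes` and `FileTypes` objects below are local stand-ins that list only a few values, and `ClassifiedContent` and `classificationLabel` are hypothetical names introduced for the example.

```typescript
// Local stand-ins for the SDK enums (assumption: only a subset of values is listed).
const ContentTypes = {
  Email: "EMAIL",
  Message: "MESSAGE",
  Page: "PAGE",
  File: "FILE",
} as const;

const FileTypes = {
  Document: "DOCUMENT",
  Image: "IMAGE",
  Audio: "AUDIO",
} as const;

type ContentType = (typeof ContentTypes)[keyof typeof ContentTypes];
type FileType = (typeof FileTypes)[keyof typeof FileTypes];

// Hypothetical shape holding only the two classification fields.
interface ClassifiedContent {
  type: ContentType;
  fileType?: FileType | null;
}

// ContentType is always present; FileType is appended only for FILE content.
function classificationLabel(c: ClassifiedContent): string {
  if (c.type === ContentTypes.File && c.fileType) {
    return `${c.type}/${c.fileType}`;
  }
  return c.type;
}

console.log(classificationLabel({ type: "FILE", fileType: "DOCUMENT" })); // FILE/DOCUMENT
console.log(classificationLabel({ type: "EMAIL", fileType: null }));      // EMAIL
```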

TypeScript (Canonical)

import { Graphlit } from 'graphlit-client';
import { ContentTypes, FileTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Query by ContentType
const emails = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Email]
  }
});

emails.contents.results.forEach(content => {
  console.log(`Type: ${content.type}`);  // EMAIL
  console.log(`FileType: ${content.fileType}`);  // null (emails don't have fileType)
  
  // Access email-specific metadata
  if (content.type === ContentTypes.Email && content.email) {
    console.log(`From: ${content.email.from?.[0]?.email}`);  // from may be missing or empty
    console.log(`Subject: ${content.email.subject}`);
  }
});

// Query by FileType (for files)
const pdfs = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.File],
    fileTypes: [FileTypes.Document]
  }
});

pdfs.contents.results.forEach(content => {
  console.log(`Type: ${content.type}`);  // FILE
  console.log(`FileType: ${content.fileType}`);  // DOCUMENT
  
  // Access document-specific metadata
  if (content.document) {
    console.log(`Pages: ${content.document.pageCount}`);
    console.log(`Author: ${content.document.author}`);
  }
});

// Query all images (FILE is implied when only fileTypes is set)
const images = await graphlit.queryContents({
  filter: {
    fileTypes: [FileTypes.Image]
  }
});

images.contents.results.forEach(content => {
  console.log(`Type: ${content.type}`);  // FILE (fileType is only set on FILE content)
  console.log(`FileType: ${content.fileType}`);  // IMAGE
  
  if (content.image) {
    console.log(`Dimensions: ${content.image.width}x${content.image.height}`);
  }
});

ContentType → Metadata Field Mapping

CRITICAL: Each ContentType has a corresponding metadata field on the content object:

ContentType     | Metadata Field                 | Type             | When Set
----------------|--------------------------------|------------------|------------------------
EMAIL           | content.email                  | EmailMetadata    | Always for emails
MESSAGE         | content.message                | MessageMetadata  | Always for messages
EVENT           | content.event                  | EventMetadata    | Always for events
ISSUE           | content.issue                  | IssueMetadata    | Always for issues
POST            | content.post                   | PostMetadata     | Always for posts
FILE (Document) | content.document               | DocumentMetadata | When fileType=DOCUMENT
FILE (Image)    | content.image                  | ImageMetadata    | When fileType=IMAGE
FILE (Audio)    | content.audio                  | AudioMetadata    | When fileType=AUDIO
FILE (Video)    | content.video                  | VideoMetadata    | When fileType=VIDEO
FILE (Package)  | content.package                | PackageMetadata  | When fileType=PACKAGE
PAGE            | content.html, content.markdown | Strings          | Always for pages
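
The mapping in this table can be expressed as a small lookup. The sketch below is self-contained and does not call the SDK: the string unions stand in for the real enums, and `metadataFieldFor` is a hypothetical helper that returns the name of the metadata property to read, or `undefined` when the table lists no metadata object (for example PAGE, which exposes plain strings instead).

```typescript
// Stand-in unions covering only the rows of the table above.
type ContentType = "EMAIL" | "MESSAGE" | "EVENT" | "ISSUE" | "POST" | "FILE" | "PAGE";
type FileType = "DOCUMENT" | "IMAGE" | "AUDIO" | "VIDEO" | "PACKAGE";

// Non-file content: the metadata field follows the ContentType.
const fieldByContentType: Partial<Record<ContentType, string>> = {
  EMAIL: "email",
  MESSAGE: "message",
  EVENT: "event",
  ISSUE: "issue",
  POST: "post",
};

// FILE content: the metadata field follows the FileType.
const fieldByFileType: Record<FileType, string> = {
  DOCUMENT: "document",
  IMAGE: "image",
  AUDIO: "audio",
  VIDEO: "video",
  PACKAGE: "package",
};

// Hypothetical helper: which metadata property should be read for this content?
function metadataFieldFor(type: ContentType, fileType?: FileType | null): string | undefined {
  if (type === "FILE") {
    return fileType ? fieldByFileType[fileType] : undefined;
  }
  return fieldByContentType[type]; // undefined for PAGE (html/markdown are strings)
}

console.log(metadataFieldFor("EMAIL"));         // email
console.log(metadataFieldFor("FILE", "IMAGE")); // image
console.log(metadataFieldFor("PAGE"));          // undefined
```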

Access Pattern:

const content = await graphlit.getContent('content-id');

// Always check type first; metadata fields are nullable, so use optional chaining
switch (content.content.type) {
  case ContentTypes.Email:
    // Email content carries content.email
    console.log(`Subject: ${content.content.email?.subject}`);
    break;
    
  case ContentTypes.Message:
    // Message content carries content.message
    console.log(`Channel: ${content.content.message?.channelName}`);
    break;
    
  case ContentTypes.File:
    // Check fileType for specific metadata
    if (content.content.fileType === FileTypes.Document) {
      console.log(`Pages: ${content.content.document?.pageCount}`);
    } else if (content.content.fileType === FileTypes.Image) {
      console.log(`Size: ${content.content.image?.width}x${content.content.image?.height}`);
    }
    break;
    
  case ContentTypes.Event:
    // Event content carries content.event
    console.log(`Start: ${content.content.event?.startDateTime}`);
    break;
}

Developer Hints

FileType Only Exists for Files

// WRONG - Emails don't have fileType
const emails = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Email],
    fileTypes: [FileTypes.Document]  // This makes no sense!
  }
});

// CORRECT - FileType only for FILE content
const pdfs = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.File],
    fileTypes: [FileTypes.Document]
  }
});

// ALSO CORRECT - FileType alone (implicit FILE contentType)
const images = await graphlit.queryContents({
  filter: {
    fileTypes: [FileTypes.Image]
  }
});
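
The rule above can be checked before a query is sent. The following is a hedged sketch rather than part of the SDK: `TypeFilter` is a trimmed-down, hypothetical filter shape with plain strings in place of the enums, and `findFilterProblem` is an illustrative helper name.

```typescript
// Hypothetical, trimmed-down filter shape: only the two fields discussed here.
interface TypeFilter {
  types?: string[];
  fileTypes?: string[];
}

// Returns a warning when fileTypes is combined with types that exclude FILE,
// or null when the combination is fine.
function findFilterProblem(filter: TypeFilter): string | null {
  const types = filter.types ?? [];
  const fileTypes = filter.fileTypes ?? [];
  if (fileTypes.length > 0 && types.length > 0 && types.indexOf("FILE") === -1) {
    return `fileTypes [${fileTypes.join(", ")}] cannot match types [${types.join(", ")}]`;
  }
  return null;
}

console.log(findFilterProblem({ types: ["EMAIL"], fileTypes: ["DOCUMENT"] })); // warning text
console.log(findFilterProblem({ types: ["FILE"], fileTypes: ["DOCUMENT"] }));  // null
console.log(findFilterProblem({ fileTypes: ["IMAGE"] }));                      // null
```

Running a check like this in development surfaces the mismatch early instead of leaving an empty result set to explain.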

Automatic Classification

// ContentType is automatically determined at ingestion:
// - Email from Gmail feed → ContentTypes.Email
// - Slack message → ContentTypes.Message
// - PDF file → ContentTypes.File with FileTypes.Document
// - Web page → ContentTypes.Page
// - GitHub issue → ContentTypes.Issue

// You CANNOT override contentType
// It's determined by the source

Query Flexibility

// Query by ContentType only
const allEmails = await graphlit.queryContents({
  filter: { types: [ContentTypes.Email] }
});

// Query by FileType only (searches all FILE content)
const allImages = await graphlit.queryContents({
  filter: { fileTypes: [FileTypes.Image] }
});

// Query by both (most specific)
const pdfFiles = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.File],
    fileTypes: [FileTypes.Document]
  }
});

// Query multiple types (OR logic)
const communications = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Email, ContentTypes.Message]
  }
});
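
To make the combination rules concrete, here is an in-memory sketch of how such a filter might be evaluated. It assumes that values within one list are ORed and that `types` and `fileTypes` are ANDed together, which is what the examples above suggest but is not confirmed here. `Item`, `TypeFilter`, `matchesFilter`, and `matchingIds` are hypothetical names, and nothing below calls the service.

```typescript
// Hypothetical minimal shapes for illustration only.
interface Item {
  id: string;
  type: string;
  fileType?: string | null;
}

interface TypeFilter {
  types?: string[];
  fileTypes?: string[];
}

// OR within each list, AND across the two lists; an empty or missing list matches everything.
function matchesFilter(item: Item, filter: TypeFilter): boolean {
  const types = filter.types ?? [];
  const fileTypes = filter.fileTypes ?? [];
  const typeOk = types.length === 0 || types.indexOf(item.type) !== -1;
  const fileTypeOk =
    fileTypes.length === 0 ||
    (item.fileType != null && fileTypes.indexOf(item.fileType) !== -1);
  return typeOk && fileTypeOk;
}

const sample: Item[] = [
  { id: "e1", type: "EMAIL" },
  { id: "m1", type: "MESSAGE" },
  { id: "f1", type: "FILE", fileType: "DOCUMENT" },
  { id: "f2", type: "FILE", fileType: "IMAGE" },
];

function matchingIds(filter: TypeFilter): string[] {
  return sample.filter((item) => matchesFilter(item, filter)).map((item) => item.id);
}

console.log(matchingIds({ types: ["EMAIL", "MESSAGE"] }));               // [ 'e1', 'm1' ]
console.log(matchingIds({ fileTypes: ["IMAGE"] }));                      // [ 'f2' ]
console.log(matchingIds({ types: ["FILE"], fileTypes: ["DOCUMENT"] }));  // [ 'f1' ]
console.log(matchingIds({ types: ["EMAIL"], fileTypes: ["DOCUMENT"] })); // []
```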

Variations

1. Query All Communication Content

// Emails + Messages + Posts
const communications = await graphlit.queryContents({
  filter: {
    types: [
      ContentTypes.Email,
      ContentTypes.Message,
      ContentTypes.Post
    ]
  }
});

2. Query All Media Files

// Images + Audio + Video
const media = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.File],
    fileTypes: [
      FileTypes.Image,
      FileTypes.Audio,
      FileTypes.Video
    ]
  }
});

3. Query Documents Only (No Other Files)

const documents = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.File],
    fileTypes: [FileTypes.Document]
  }
});

// This excludes images, audio, video, code, etc.

4. Query Calendar Events

const events = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Event]
  }
});

events.contents.results.forEach(content => {
  if (content.event) {
    console.log(`Event: ${content.event.subject}`);
    console.log(`When: ${content.event.startDateTime}`);
    console.log(`Attendees: ${content.event.attendees?.length || 0}`);
  }
});

5. Query Issues from Project Management Tools

const issues = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Issue]
  }
});

issues.contents.results.forEach(content => {
  if (content.issue) {
    console.log(`Issue: ${content.issue.title}`);
    console.log(`Status: ${content.issue.status}`);
    console.log(`Priority: ${content.issue.priority}`);
  }
});

6. UI Display Logic by Type

const content = await graphlit.getContent('content-id');

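// Render UI based on content type (TSX; the Content type is assumed to be imported
// from 'graphlit-client/dist/generated/graphql-types', like ContentTypes and FileTypes)
function renderContent(content: Content) {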
  switch (content.type) {
    case ContentTypes.Email:
      return <EmailViewer email={content.email} />;
    
    case ContentTypes.Message:
      return <MessageViewer message={content.message} />;
    
    case ContentTypes.File:
      if (content.fileType === FileTypes.Document) {
        return <DocumentViewer document={content.document} />;
      } else if (content.fileType === FileTypes.Image) {
        return <ImageViewer image={content.image} />;
      }
      break;
    
    case ContentTypes.Page:
      return <WebPageViewer html={content.html} />;
    
    case ContentTypes.Event:
      return <EventViewer event={content.event} />;
  }

  // Other file types and unhandled content types render nothing
  return null;
}

7. Workflow Routing by Type

// Create different workflows for different content types
const documentWorkflow = await graphlit.createWorkflow({
  name: "Document Extraction",
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Document
      }
    }]
  },
  extraction: { /* ... */ }
});

const audioWorkflow = await graphlit.createWorkflow({
  name: "Audio Transcription",
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Audio,
        audioTranscription: {
          model: AudioTranscriptionServiceTypes.Deepgram
        }
      }
    }]
  }
});

// Route content to appropriate workflow
const content = await graphlit.getContent('content-id');

if (content.content.fileType === FileTypes.Document) {
  // Re-process with document workflow
} else if (content.content.fileType === FileTypes.Audio) {
  // Re-process with audio workflow
}
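
The routing step can also be written as a lookup table, so that adding a new file type does not grow the if/else chain. This is a sketch under stated assumptions: the workflow IDs are placeholders, `workflowIdByFileType` and `pickWorkflowId` are hypothetical names, and the re-processing call itself is left out because it is not shown above.

```typescript
// Placeholder workflow IDs; in practice these would come from the createWorkflow responses.
const workflowIdByFileType: { [fileType: string]: string } = {
  DOCUMENT: "document-workflow-id",
  AUDIO: "audio-workflow-id",
};

// Returns the workflow to use for a file type, or undefined when no route is configured.
function pickWorkflowId(fileType?: string | null): string | undefined {
  if (!fileType) return undefined;
  return Object.prototype.hasOwnProperty.call(workflowIdByFileType, fileType)
    ? workflowIdByFileType[fileType]
    : undefined;
}

console.log(pickWorkflowId("DOCUMENT")); // document-workflow-id
console.log(pickWorkflowId("AUDIO"));    // audio-workflow-id
console.log(pickWorkflowId("IMAGE"));    // undefined (no route configured)
console.log(pickWorkflowId(null));       // undefined (non-file content)
```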

Common Issues & Solutions

Issue: Querying for emails by fileType returns no results

// WRONG
const emails = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Email],
    fileTypes: [FileTypes.Document]  // Emails don't have fileType!
  }
});

Solution: Only use fileType with FILE content

// CORRECT
const emails = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.Email]
  }
});

Issue: Metadata field is null or undefined even though content exists

const content = await graphlit.getContent('content-id');
console.log(content.content.email);  // undefined

Solution: Check content type first

if (content.content.type === ContentTypes.Email) {
  // Email metadata is expected here; the field is still nullable, so keep the ?.
  console.log(content.content.email?.subject);
} else {
  console.log(`Content is ${content.content.type}, not EMAIL`);
}

Issue: Want to query "all files" but fileType filter is too specific

// TOO SPECIFIC - Only gets documents
const files = await graphlit.queryContents({
  filter: {
    fileTypes: [FileTypes.Document]
  }
});

Solution: Use contentType=FILE without fileType filter

// CORRECT - Gets all files
const allFiles = await graphlit.queryContents({
  filter: {
    types: [ContentTypes.File]
  }
});

Issue: TypeScript type errors when accessing metadata

// TypeScript error (strict mode): 'content.content.email' is possibly 'null' or 'undefined'
console.log(content.content.email.subject);

Solution: Use type guards

if (content.content.type === ContentTypes.Email && content.content.email) {
  // TypeScript knows email exists
  console.log(content.content.email.subject);
}
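
When the same check is repeated in many places, it can be wrapped in a user-defined type guard. The sketch below uses simplified local interfaces, not the generated SDK types: `ContentLike`, `EmailMetadata`, `EmailContent`, `isEmailContent`, and `subjectOf` are hypothetical stand-ins that only mirror the nullable fields discussed here.

```typescript
// Simplified stand-ins for the generated types (assumption: fields are nullable and optional).
interface EmailMetadata {
  subject?: string | null;
}

interface ContentLike {
  type?: string | null;
  email?: EmailMetadata | null;
}

// Narrowed shape: an EMAIL whose email metadata is definitely present.
type EmailContent = ContentLike & { type: "EMAIL"; email: EmailMetadata };

// User-defined type guard: checks the discriminant and the metadata field together.
function isEmailContent(c: ContentLike | null | undefined): c is EmailContent {
  return c != null && c.type === "EMAIL" && c.email != null;
}

function subjectOf(c: ContentLike | null | undefined): string {
  if (isEmailContent(c)) {
    // c.email is non-null here, so no optional chaining is needed.
    return c.email.subject ?? "(no subject)";
  }
  return `not an email (${c?.type ?? "unknown"})`;
}

console.log(subjectOf({ type: "EMAIL", email: { subject: "Hello" } })); // Hello
console.log(subjectOf({ type: "MESSAGE" }));                            // not an email (MESSAGE)
console.log(subjectOf(null));                                           // not an email (unknown)
```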

Production Example

Real-world pattern: routing on content type when displaying content

async function displayContent(contentId: string) {
  const content = await graphlit.getContent(contentId);
  
  console.log(`=== CONTENT DISPLAY ===`);
  console.log(`Type: ${content.content.type}`);
  console.log(`Name: ${content.content.name}`);
  
  switch (content.content.type) {
    case ContentTypes.Email:
      console.log(`\n EMAIL`);
      console.log(`From: ${content.content.email?.from?.[0]?.email}`);
      console.log(`Subject: ${content.content.email?.subject}`);
      console.log(`Date: ${content.content.creationDate}`);
      console.log(`Attachments: ${content.content.email?.attachmentCount || 0}`);
      break;
      
    case ContentTypes.Message:
      console.log(`\n💬 MESSAGE`);
      console.log(`Channel: ${content.content.message?.channelName}`);
      console.log(`Author: ${content.content.message?.author?.name}`);
      console.log(`Mentions: ${content.content.message?.mentions?.length || 0}`);
      break;
      
    case ContentTypes.File:
      console.log(`\n FILE`);
      console.log(`FileType: ${content.content.fileType}`);
      console.log(`Size: ${content.content.fileSize} bytes`);
      
      if (content.content.fileType === FileTypes.Document) {
        console.log(`Pages: ${content.content.document?.pageCount}`);
        console.log(`Author: ${content.content.document?.author}`);
      } else if (content.content.fileType === FileTypes.Image) {
        console.log(`Dimensions: ${content.content.image?.width}x${content.content.image?.height}`);
      }
      break;
      
    case ContentTypes.Page:
      console.log(`\n🌐 WEB PAGE`);
      console.log(`URL: ${content.content.uri}`);
      break;
      
    case ContentTypes.Event:
      console.log(`\n📅 EVENT`);
      console.log(`Subject: ${content.content.event?.subject}`);
      console.log(`Start: ${content.content.event?.startDateTime}`);
      console.log(`Attendees: ${content.content.event?.attendees?.length || 0}`);
      break;
      
    case ContentTypes.Issue:
      console.log(`\n🎫 ISSUE`);
      console.log(`Title: ${content.content.issue?.title}`);
      console.log(`Status: ${content.content.issue?.status}`);
      console.log(`Priority: ${content.content.issue?.priority}`);
      break;
  }
  
  // Show extracted entities regardless of type
  // Default to an empty list so the length check is type-safe
  const observations = content.content.observations ?? [];
  if (observations.length > 0) {
    console.log(`\n🏷 ENTITIES (${observations.length}):`);
    observations.forEach(obs => {
      console.log(`  ${obs?.type}: ${obs?.observable?.name}`);
    });
  }
}

// Usage
await displayContent('email-content-id');
await displayContent('slack-message-id');
await displayContent('pdf-document-id');
