Create RSS/Podcast Feed

User Intent

"I want to sync RSS feeds or podcast episodes into Graphlit for search and AI interactions"

Operation

  • SDK Method: graphlit.createFeed() with RSS configuration

  • GraphQL: createFeed mutation

  • Entity Type: Feed

  • Common Use Cases: RSS feed monitoring, podcast ingestion, news aggregation, blog sync

TypeScript (Canonical)

import { Graphlit } from 'graphlit-client';
import { ContentTypes, FeedTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

const response = await graphlit.createFeed({
  name: 'Tech News RSS',
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://news.example.com/rss.xml',
    readLimit: 100,
  },
});

const feedId = response.createFeed.id;
console.log(`RSS feed created: ${feedId}`);

while (true) {
  const status = await graphlit.isFeedDone(feedId);
  if (status.isFeedDone.result) {
    break;
  }

  console.log('Still syncing RSS items...');
  await new Promise((resolve) => setTimeout(resolve, 10_000));
}

console.log('RSS feed sync complete!');

const items = await graphlit.queryContents({
  feeds: [{ id: feedId }],
  types: [ContentTypes.Page],
});

console.log(`Synced ${items.contents.results.length} RSS items`);

Parameters

FeedInput (Required)

  • name (string): Feed name

  • type (FeedTypes): Must be FeedTypes.Rss

  • rss (RSSFeedPropertiesInput): RSS configuration

RSSFeedPropertiesInput (Required)

  • uri (string): RSS feed URL

    • Must be valid RSS/Atom XML feed

    • Works with podcast feeds (audio files auto-downloaded)

  • readLimit (int): Max items to sync

    • Caps the number of items read during the initial sync

    • The feed keeps monitoring for new items after the initial sync

Optional

  • correlationId (string): For tracking

  • collections (EntityReferenceInput[]): Auto-add items to collections

  • workflow (EntityReferenceInput): Apply workflow to items
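
The parameters above can be assembled and sanity-checked before calling createFeed(). The following is a minimal sketch, not part of the SDK: the interfaces are local stand-ins that mirror the field names listed above, buildRssFeedInput is a hypothetical helper, and the validation rules (non-empty name, http/https URL, positive integer readLimit) are assumptions rather than documented constraints.

```typescript
// Local stand-ins for the SDK input types; field names follow the parameter list above.
interface EntityReference {
  id: string;
}

interface RssFeedProperties {
  uri: string;
  readLimit: number;
}

interface RssFeedInput {
  name: string;
  type: 'RSS'; // stand-in for FeedTypes.Rss (assumed string value)
  rss: RssFeedProperties;
  correlationId?: string;
  collections?: EntityReference[];
  workflow?: EntityReference;
}

type OptionalFeedFields = Pick<RssFeedInput, 'correlationId' | 'collections' | 'workflow'>;

// Hypothetical helper: checks the required fields, then merges in the optional ones.
function buildRssFeedInput(
  name: string,
  uri: string,
  readLimit: number,
  optional: OptionalFeedFields = {},
): RssFeedInput {
  if (name.trim() === '') {
    throw new Error('Feed name must not be empty');
  }
  const parsed = new URL(uri); // throws a TypeError on a malformed URL
  if (parsed.protocol !== 'http:' && parsed.protocol !== 'https:') {
    throw new Error(`Unsupported protocol: ${parsed.protocol}`);
  }
  if (!Number.isInteger(readLimit) || readLimit <= 0) {
    throw new Error('readLimit must be a positive integer');
  }
  return { name, type: 'RSS', rss: { uri, readLimit }, ...optional };
}
```

With the real SDK, the type field would be FeedTypes.Rss and the returned object would be passed to graphlit.createFeed().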

Response

{
  createFeed: {
    id: string;              // Feed ID
    name: string;            // Feed name
    state: EntityState;      // ENABLED
    type: FeedTypes.Rss;     // RSS
    rss: {
      uri: string;           // Feed URL
      readLimit: number;     // Item limit
    }
  }
}
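
Code that consumes this response may want to guard against a missing createFeed payload. The sketch below assumes the payload can be null or undefined, which is common in generated GraphQL types but is not stated on this page. CreateFeedResponse is a local stand-in and requireFeedId is a hypothetical helper.

```typescript
// Local stand-in for the response shape shown above; nullability is an assumption.
interface CreateFeedResponse {
  createFeed?: {
    id: string;
    name: string;
    state?: string | null; // expected to be 'ENABLED' for a new feed
  } | null;
}

// Hypothetical helper: returns the feed ID or throws with a clear message.
function requireFeedId(response: CreateFeedResponse): string {
  const feed = response.createFeed;
  if (!feed) {
    throw new Error('createFeed returned no feed');
  }
  if (feed.state && feed.state !== 'ENABLED') {
    // Not fatal, but worth surfacing before polling for completion.
    console.warn(`Feed ${feed.id} was created in state ${feed.state}`);
  }
  return feed.id;
}
```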

Developer Hints

RSS vs Podcast Feeds

Both use the same API:

// News RSS feed
const newsFeed = {
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://news.site.com/rss.xml',
    readLimit: 50
  }
};

// Podcast feed (same structure)
const podcastFeed = {
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://podcast.example.com/feed.xml',
    readLimit: 20  // Episodes
  }
};

Important: Podcast audio files are downloaded automatically, and they are transcribed when the feed has a preparation workflow attached.

Read Limit Strategy

// Initial backfill (large limit)
const backfillFeed = {
  rss: {
    uri: rssUrl,
    readLimit: 1000  // Get historical items
  }
};

// Ongoing monitoring (small limit)
const monitoringFeed = {
  rss: {
    uri: rssUrl,
    readLimit: 10  // Just recent items
  }
};

Important: After the initial sync, the feed continuously monitors for new items regardless of readLimit.

Content Type for RSS Items

import { ContentTypes, FileTypes } from 'graphlit-client/dist/generated/graphql-types';

// RSS items typically come in as Page type
const rssItems = await graphlit.queryContents({
  feeds: [{ id: feedId }],
  types: [ContentTypes.Page]
});

// Podcast audio files come as File type
const podcastEpisodes = await graphlit.queryContents({
  feeds: [{ id: podcastFeedId }],
  types: [ContentTypes.File],
  fileTypes: [FileTypes.Audio]
});
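
When a single query returns both article pages and podcast audio, the results can be split client-side. This is a minimal sketch under stated assumptions: ContentSummary is a local stand-in for a query result item, and the string values 'PAGE', 'FILE' and 'AUDIO' are assumed to match ContentTypes.Page, ContentTypes.File and FileTypes.Audio.

```typescript
// Local stand-in for a queryContents result item (assumed fields).
interface ContentSummary {
  id: string;
  name: string;
  type?: string | null;     // assumed 'PAGE' for articles, 'FILE' for downloads
  fileType?: string | null; // assumed 'AUDIO' for podcast episodes
}

interface PartitionedContents {
  articles: ContentSummary[];
  episodes: ContentSummary[];
  other: ContentSummary[];
}

// Hypothetical helper: groups feed items by the content types described above.
function partitionFeedContents(results: ContentSummary[]): PartitionedContents {
  const partitioned: PartitionedContents = { articles: [], episodes: [], other: [] };
  for (const item of results) {
    if (item.type === 'PAGE') {
      partitioned.articles.push(item);
    } else if (item.type === 'FILE' && item.fileType === 'AUDIO') {
      partitioned.episodes.push(item);
    } else {
      partitioned.other.push(item);
    }
  }
  return partitioned;
}
```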

Podcast Transcription

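import {
  DeepgramModels,
  FeedTypes,
  FilePreparationServiceTypes,
} from 'graphlit-client/dist/generated/graphql-types';

// Create workflow for podcast transcription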
const workflow = await graphlit.createWorkflow({
  name: 'Podcast Transcription',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Deepgram,
        deepgram: {
          model: DeepgramModels.Nova2
        }
      }
    }]
  }
});

// Create podcast feed with workflow
const podcastFeed = await graphlit.createFeed({
  name: 'Tech Podcast',
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://podcast.example.com/feed.xml',
    readLimit: 50
  },
  workflow: { id: workflow.createWorkflow.id }
});

// Episodes automatically transcribed

Variations

1. Basic RSS Feed

Simplest RSS sync:

const feed = await graphlit.createFeed({
  name: 'News Feed',
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://news.example.com/rss',
    readLimit: 100
  }
});

2. Podcast Feed with Transcription

Podcast with audio transcription:

// Create transcription workflow
const workflow = await graphlit.createWorkflow({
  name: 'Transcribe Podcasts',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Deepgram,
        deepgram: {
          model: DeepgramModels.Nova2
        }
      }
    }]
  }
});

// Create podcast feed
const feed = await graphlit.createFeed({
  name: 'My Podcast',
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://podcast.example.com/feed.xml',
    readLimit: 20
  },
  workflow: { id: workflow.createWorkflow.id }
});

3. RSS with Auto-Collection

Add items to a collection:

// Create collection
const collection = await graphlit.createCollection({
  name: 'RSS Articles'
});

// Create feed
const feed = await graphlit.createFeed({
  name: 'Tech News',
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://technews.com/rss',
    readLimit: 100
  },
  collections: [{ id: collection.createCollection.id }]
});

4. Multiple RSS Feeds

Aggregate multiple sources:

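import {
  FeedTypes,
  OrderByTypes,
  OrderDirectionTypes,
} from 'graphlit-client/dist/generated/graphql-types';

const rssSources = [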
  { name: 'TechCrunch', url: 'https://techcrunch.com/feed/' },
  { name: 'Hacker News', url: 'https://news.ycombinator.com/rss' },
  { name: 'The Verge', url: 'https://theverge.com/rss' }
];

// Create collection for all tech news
const collection = await graphlit.createCollection({
  name: 'Tech News Aggregator'
});

// Create feed for each source
for (const source of rssSources) {
  await graphlit.createFeed({
    name: source.name,
    type: FeedTypes.Rss,
    rss: {
      uri: source.url,
      readLimit: 50
    },
    collections: [{ id: collection.createCollection.id }]
  });
}

// Now query all tech news from one collection
const allNews = await graphlit.queryContents({
  collections: [{ id: collection.createCollection.id }],
  orderBy: OrderByTypes.CreationDate,
  orderDirection: OrderDirectionTypes.Desc
});
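
In the loop above, one failing source stops the remaining feeds from being created. A hedged alternative is to record failures and continue. In this sketch the feed-creation call is injected as a function so the logic stays independent of the SDK; createFeedsForSources and CreateFeedFn are hypothetical names, not part of graphlit-client.

```typescript
interface RssSource {
  name: string;
  url: string;
}

// Injected creator: resolves to the new feed's ID (would wrap graphlit.createFeed in practice).
type CreateFeedFn = (source: RssSource) => Promise<string>;

interface FeedCreationReport {
  created: { name: string; feedId: string }[];
  failed: { name: string; reason: string }[];
}

// Creates feeds one at a time and collects errors instead of aborting on the first one.
async function createFeedsForSources(
  sources: RssSource[],
  createFeed: CreateFeedFn,
): Promise<FeedCreationReport> {
  const report: FeedCreationReport = { created: [], failed: [] };
  for (const source of sources) {
    try {
      const feedId = await createFeed(source);
      report.created.push({ name: source.name, feedId });
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      report.failed.push({ name: source.name, reason });
    }
  }
  return report;
}
```

The report makes it straightforward to retry only the failed sources later.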

5. RSS with Entity Extraction

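Extract people and companies from articles:

import {
  EntityExtractionServiceTypes,
  FeedTypes,
  ObservableTypes,
} from 'graphlit-client/dist/generated/graphql-types';

// Create extraction workflow
// extractionSpecId is assumed to hold the ID of an existing specification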
const workflow = await graphlit.createWorkflow({
  name: 'Extract Entities',
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        modelText: {
          specification: { id: extractionSpecId },
          observables: [
            ObservableTypes.Person,
            ObservableTypes.Organization,
            ObservableTypes.Label
          ]
        }
      }
    }]
  }
});

// Create RSS feed with extraction
const feed = await graphlit.createFeed({
  name: 'Business News',
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://businessnews.com/rss',
    readLimit: 100
  },
  workflow: { id: workflow.createWorkflow.id }
});

// Query extracted entities from news
const entities = await graphlit.queryObservables({
  contents: [{ feedId: feed.createFeed.id }],
  observableTypes: [ObservableTypes.Organization]
});

6. Limited Backfill Feed

Only recent items:

const feed = await graphlit.createFeed({
  name: 'Recent News Only',
  type: FeedTypes.Rss,
  rss: {
    uri: 'https://news.example.com/rss',
    readLimit: 10  // Just 10 most recent
  }
});

// Still monitors for new items going forward

Common Issues

Issue: Invalid RSS feed error
Solution: Verify that the RSS URL returns valid XML, for example by opening it in a browser. Some sites require a User-Agent header.

Issue: No items syncing
Solution: Check that readLimit is set and that the RSS feed actually contains items; some feeds are empty.

Issue: Podcast audio not transcribed
Solution: Ensure a preparation workflow is attached to the feed and that the workflow has audio transcription configured.

Issue: Feed syncs old items only
Solution: This is the initial backfill. After the initial sync, the feed monitors for new items automatically.

Issue: Duplicate items appearing
Solution: Graphlit deduplicates by item GUID. If the feed doesn't provide unique GUIDs, duplicates may occur.
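
For the invalid-feed issue, a quick local check of the response body can show whether a URL serves a feed or an HTML page. The sketch below is a heuristic, not a validator: detectFeedFormat is a hypothetical helper that only inspects the root element name (rss for RSS 2.0, feed for Atom, rdf:RDF for RSS 1.0) and does not parse the XML.

```typescript
type FeedFormat = 'rss' | 'atom' | 'rdf';

// Heuristic: strip the prolog, then look at the name of the first element.
function detectFeedFormat(body: string): FeedFormat | null {
  const stripped = body
    .replace(/^\uFEFF/, '')              // byte order mark
    .replace(/<\?xml[\s\S]*?\?>/, '')    // XML declaration
    .replace(/<!--[\s\S]*?-->/g, '')     // comments
    .replace(/<!DOCTYPE[\s\S]*?>/i, '')  // doctype, if any
    .trimStart();

  // Optional namespace prefix (e.g. "rdf:") followed by the local element name.
  const match = /^<(?:[A-Za-z_][\w.-]*:)?([A-Za-z_][\w.-]*)/.exec(stripped);
  if (!match) {
    return null;
  }
  const root = (match[1] ?? '').toLowerCase();
  if (root === 'rss') return 'rss';
  if (root === 'feed') return 'atom';
  if (root === 'rdf') return 'rdf';
  return null; // e.g. <html>, which usually means an error or landing page
}
```

A typical use is to fetch the feed URL with an explicit User-Agent header and pass the response text to this function; a null result suggests the URL is not serving a feed.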

Production Example

News aggregation pipeline:

// 1. Create collection for all news
const newsCollection = await graphlit.createCollection({
  name: 'Tech News'
});

// 2. Create extraction workflow
const workflow = await graphlit.createWorkflow({
  name: 'News Extraction',
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        modelText: {
          specification: { id: extractionSpecId }
        }
      }
    }]
  }
});

// 3. Create RSS feeds
const feeds = [
  'https://techcrunch.com/feed/',
  'https://theverge.com/rss',
  'https://arstechnica.com/feed/'
];

for (const feedUrl of feeds) {
  const feed = await graphlit.createFeed({
    name: `RSS: ${feedUrl}`,
    type: FeedTypes.Rss,
    rss: {
      uri: feedUrl,
      readLimit: 50
    },
    collections: [{ id: newsCollection.createCollection.id }],
    workflow: { id: workflow.createWorkflow.id }
  });
  
  console.log(`Created feed: ${feed.createFeed.id}`);
}

// 4. Query aggregated news
const latestNews = await graphlit.queryContents({
  collections: [{ id: newsCollection.createCollection.id }],
  orderBy: OrderByTypes.CreationDate,
  orderDirection: OrderDirectionTypes.Desc,
  limit: 20
});

console.log(`\nLatest ${latestNews.contents.results.length} articles:`);
latestNews.contents.results.forEach(article => {
  console.log(`- ${article.name}`);
});

Podcast monitoring system:

// 1. Create transcription workflow
const workflow = await graphlit.createWorkflow({
  name: 'Podcast Transcription',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Deepgram,
        deepgram: {
          model: DeepgramModels.Nova2,
          enableSpeakerDiarization: true
        }
      }
    }]
  }
});

// 2. Create podcast feeds
const podcasts = [
  { name: 'Tech Podcast A', url: 'https://podcast-a.com/feed.xml' },
  { name: 'Business Podcast B', url: 'https://podcast-b.com/feed.xml' }
];

for (const podcast of podcasts) {
  const feed = await graphlit.createFeed({
    name: podcast.name,
    type: FeedTypes.Rss,
    rss: {
      uri: podcast.url,
      readLimit: 20  // Last 20 episodes
    },
    workflow: { id: workflow.createWorkflow.id }
  });
  
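  // Wait for initial sync. waitForFeedCompletion is a helper this page does not define;
  // it can poll graphlit.isFeedDone() as in the canonical example above.
  await waitForFeedCompletion(feed.createFeed.id);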
  
  console.log(`${podcast.name} synced and transcribed`);
}

// 3. Search transcripts
const results = await graphlit.queryContents({
  search: 'artificial intelligence',
  searchType: SearchTypes.Hybrid,
  types: [ContentTypes.File],
  fileTypes: [FileTypes.Audio]
});

console.log(`\nFound ${results.contents.results.length} episodes mentioning AI`);
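
The podcast example calls waitForFeedCompletion, which this page does not define. One possible shape for it is a polling loop with a timeout, so that a stalled feed cannot block forever. In this sketch, pollUntilDone, FeedDoneCheck, and the default interval and timeout are all assumptions; the completion check is injected so the loop can be exercised without the SDK.

```typescript
// Injected check: resolves to true once the feed has finished syncing.
type FeedDoneCheck = (feedId: string) => Promise<boolean>;

interface PollOptions {
  intervalMs?: number; // delay between checks (assumed default: 10 s)
  timeoutMs?: number;  // overall budget (assumed default: 30 min)
}

// Polls until the check passes, or throws when the next wait would exceed the budget.
async function pollUntilDone(
  feedId: string,
  isDone: FeedDoneCheck,
  { intervalMs = 10_000, timeoutMs = 30 * 60_000 }: PollOptions = {},
): Promise<void> {
  const deadline = Date.now() + timeoutMs;
  while (!(await isDone(feedId))) {
    if (Date.now() + intervalMs > deadline) {
      throw new Error(`Feed ${feedId} did not finish within ${timeoutMs} ms`);
    }
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}

// With the SDK client from this page, the one-argument helper could be bound like this:
// const waitForFeedCompletion = (feedId: string) =>
//   pollUntilDone(feedId, async (id) => (await graphlit.isFeedDone(id)).isFeedDone?.result ?? false);
```

Note that isFeedDone reports sync completion, as in the canonical example; whether transcription has also finished by then is not stated on this page.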
