# Create Web Crawl Feed
## User Intent
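Crawl a website starting from a given URL and ingest its pages (and, optionally, linked files) into Graphlit so they can be queried as content.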
## Operation
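Creates a feed via `graphlit.createFeed` with `type: FeedTypes.Web` and the web-specific properties in `WebFeedPropertiesInput`. Graphlit crawls in the background, so the client polls `graphlit.isFeedDone` until the crawl completes, then retrieves the crawled pages with `graphlit.queryContents`.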
### TypeScript (Canonical)

```typescript
import { Graphlit } from 'graphlit-client';
import { ContentTypes, FeedTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

const response = await graphlit.createFeed({
  name: 'Company Documentation',
  type: FeedTypes.Web,
  web: {
    uri: 'https://docs.example.com',
    readLimit: 500,
    includeFiles: true,
    allowedDomains: ['docs.example.com'],
    excludedPaths: ['/api/', '/archive/'],
  },
});

const feedId = response.createFeed.id;
console.log(`Web crawl feed created: ${feedId}`);

// Poll for crawl completion
while (true) {
  const status = await graphlit.isFeedDone(feedId);
  if (status.isFeedDone.result) {
    break;
  }
  console.log('Still crawling website...');
  await new Promise((resolve) => setTimeout(resolve, 15_000));
}

console.log('Web crawl complete!');

const pages = await graphlit.queryContents({
  feeds: [{ id: feedId }],
  types: [ContentTypes.Page],
});
console.log(`Crawled ${pages.contents.results.length} pages`);
```

## Parameters
### FeedInput (Required)
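As used in the canonical example above:

- `name`: display name for the feed (e.g. `'Company Documentation'`)
- `type`: `FeedTypes.Web` for a web crawl
- `web`: the `WebFeedPropertiesInput` object described next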
### WebFeedPropertiesInput (Required)
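- `uri`: the URL where the crawl starts (e.g. `'https://docs.example.com'`)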
### Optional (Highly Recommended)
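- `readLimit`: maximum number of pages to crawl (`500` in the example)
- `allowedDomains`: domains the crawler may visit; keeps the crawl on-site
- `excludedPaths`: URL paths to skip (`'/api/'` and `'/archive/'` in the example)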
### Other Optional
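- `includeFiles`: when `true`, files linked from crawled pages are downloaded and ingested alongside the HTML

For options beyond the fields used on this page, consult the generated `WebFeedPropertiesInput` type in `graphlit-client/dist/generated/graphql-types`.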
## Response
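`createFeed` returns the created feed. The example reads `response.createFeed.id` and uses that ID both to poll `isFeedDone` and to filter `queryContents` to content from this feed.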
## Developer Hints
### Always Set Domain Restrictions
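Without `allowedDomains`, the crawler is free to follow external links (footers, social icons, cross-references) off the target site and spend the read limit on pages you never wanted. Pin it to the host being crawled, as the canonical example does with `['docs.example.com']`, and treat subdomains as separate entries to be safe.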
### Read Limit Strategy
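Validate before you commit: run a first crawl with a small `readLimit` (10-25 pages), confirm the right pages come back from `queryContents`, then re-create the feed with the production limit (500 in the canonical example). On a large site a high limit mostly costs time and ingestion volume, so size it to what you will actually query.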
### Path Filtering Patterns
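The canonical example skips `'/api/'` and `'/archive/'`. The exact matching semantics (prefix vs. pattern) aren't spelled out here, so test candidate patterns with a small `readLimit` first; keeping the surrounding slashes, as the example does, avoids a bare `api` accidentally matching unrelated paths such as `/rapid/`.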
### File Ingestion
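`includeFiles: true` asks the crawler to ingest files linked from crawled pages (PDFs, Office documents, and similar), not just the HTML. Keep in mind that the `queryContents` call in the canonical example filters to `ContentTypes.Page`, so ingested files will not appear in that result set; query for them separately.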
## Variations
### 1. Basic Documentation Crawl
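A minimal sketch, reusing the `graphlit` client and `FeedTypes` import from the canonical example: a name, the Web feed type, and a starting URI are all that is needed.

```typescript
// Reuses `graphlit` and `FeedTypes` from the canonical example above.
const response = await graphlit.createFeed({
  name: 'Product Docs',
  type: FeedTypes.Web,
  web: {
    uri: 'https://docs.example.com',
    readLimit: 100, // keep a first crawl small
  },
});
console.log(`Feed created: ${response.createFeed.id}`);
```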
### 2. Crawl with File Downloads
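The same call with `includeFiles` enabled, so documents linked from crawled pages are ingested alongside the HTML; the name and limit are illustrative.

```typescript
// Reuses `graphlit` and `FeedTypes` from the canonical example above.
const response = await graphlit.createFeed({
  name: 'Docs with Attachments',
  type: FeedTypes.Web,
  web: {
    uri: 'https://docs.example.com',
    readLimit: 250,
    includeFiles: true, // also ingest linked PDFs and documents
    allowedDomains: ['docs.example.com'],
  },
});
```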
### 3. Targeted Path Crawl
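One way to target a single section is to start the `uri` inside that section, as sketched below, so the crawler only discovers links from there. If your schema version exposes an `allowedPaths` counterpart to `excludedPaths`, that is the more direct tool; treat that field name as an assumption and check the generated types.

```typescript
// Reuses `graphlit` and `FeedTypes` from the canonical example above.
const response = await graphlit.createFeed({
  name: 'Guides Only',
  type: FeedTypes.Web,
  web: {
    // Start inside the target section rather than at the site root.
    uri: 'https://docs.example.com/guides/',
    readLimit: 100,
    allowedDomains: ['docs.example.com'],
  },
});
```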
### 4. Multi-Domain Crawl
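`allowedDomains` is an array, so a crawl spanning a docs site and its blog can look like the sketch below. The crawl still starts from a single `uri`; the second domain is only reached through links.

```typescript
// Reuses `graphlit` and `FeedTypes` from the canonical example above.
const response = await graphlit.createFeed({
  name: 'Docs and Blog',
  type: FeedTypes.Web,
  web: {
    uri: 'https://docs.example.com',
    readLimit: 500,
    allowedDomains: ['docs.example.com', 'blog.example.com'],
  },
});
```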
### 5. Crawl with Workflow
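A hedged sketch that attaches a content workflow at feed creation, assuming `FeedInput` accepts a `workflow` entity reference as feed inputs do elsewhere in the Graphlit SDK; the workflow ID is a placeholder for one you have already created (e.g. for entity extraction).

```typescript
// Reuses `graphlit` and `FeedTypes` from the canonical example above.
const workflowId = 'YOUR_WORKFLOW_ID'; // placeholder for an existing workflow

const response = await graphlit.createFeed({
  name: 'Docs with Extraction Workflow',
  type: FeedTypes.Web,
  web: {
    uri: 'https://docs.example.com',
    readLimit: 500,
    allowedDomains: ['docs.example.com'],
  },
  // Assumed field: an entity reference to the workflow that should
  // process each crawled page on ingest.
  workflow: { id: workflowId },
});
```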
### 6. Exclude Unwanted Sections
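The canonical example already skips `'/api/'` and `'/archive/'`; this sketch extends `excludedPaths` with other commonly noisy sections. The extra paths are hypothetical; adjust them to the site being crawled.

```typescript
// Reuses `graphlit` and `FeedTypes` from the canonical example above.
const response = await graphlit.createFeed({
  name: 'Docs without Noise',
  type: FeedTypes.Web,
  web: {
    uri: 'https://docs.example.com',
    readLimit: 500,
    allowedDomains: ['docs.example.com'],
    // Hypothetical noisy sections; tailor to the actual site.
    excludedPaths: ['/api/', '/archive/', '/changelog/', '/search/'],
  },
});
```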
## Common Issues
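Typical failure modes, and the first things to check against the example's parameters:

- **Zero pages crawled.** Verify `allowedDomains` matches the host in `uri`, and that the site is reachable by a crawler at all (robots.txt, authentication, bot blocking).
- **Fewer pages than expected.** An `excludedPaths` entry may match more than intended, or `readLimit` was reached before the crawl found the missing section; retest with adjusted values.
- **The `isFeedDone` loop seems stuck.** Large sites with a high `readLimit` simply take time; the 15-second poll in the example is fine, but add a deadline in production (see the sketch below).
- **Expected files are missing.** Confirm `includeFiles: true` is set, and remember that the `ContentTypes.Page` filter in the example will not return file content.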
## Production Example
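A self-contained sketch of a production-shaped version of the canonical example: the unbounded polling loop is replaced with a deadline, and a missing feed ID fails loudly. Function and variable names are illustrative.

```typescript
import { Graphlit } from 'graphlit-client';
import { ContentTypes, FeedTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

async function crawlSite(uri: string, domain: string): Promise<string> {
  const response = await graphlit.createFeed({
    name: `Crawl: ${domain}`,
    type: FeedTypes.Web,
    web: {
      uri,
      readLimit: 500,
      includeFiles: true,
      allowedDomains: [domain],
      excludedPaths: ['/api/', '/archive/'],
    },
  });

  const feedId = response.createFeed?.id;
  if (!feedId) {
    throw new Error('createFeed returned no feed ID');
  }

  // Poll with a deadline instead of an unbounded loop.
  const deadline = Date.now() + 30 * 60 * 1000; // 30 minutes
  while (Date.now() < deadline) {
    const status = await graphlit.isFeedDone(feedId);
    if (status.isFeedDone?.result) {
      const pages = await graphlit.queryContents({
        feeds: [{ id: feedId }],
        types: [ContentTypes.Page],
      });
      console.log(`Crawled ${pages.contents?.results?.length ?? 0} pages`);
      return feedId;
    }
    await new Promise((resolve) => setTimeout(resolve, 15_000));
  }
  throw new Error(`Crawl did not finish in time (feed ${feedId})`);
}

await crawlSite('https://docs.example.com', 'docs.example.com');
```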