Create Feed With Workflow

Create a Web feed with a workflow that crawls links.

Last updated 1 year ago


When ingesting content into Graphlit, you will often want to configure how the content is processed. Via the Workflow entity, you can specify the stages of the content workflow, which gives fine-grained control over operations like text summarization, entity extraction, and link crawling.

In this example, we will create a workflow to crawl the links found in the Web pages of a Web feed.

First, we call the createWorkflow mutation, with the enrichment stage configured to crawl a maximum of 10 Web links per content item and to ignore the content's own domain.

Then, we call the createFeed mutation and pass the ID of the workflow to be used.

Finally, we call the contents query to view one of the crawled Web pages.

The content shown has the URI https://arxiv.org/abs/2303.10130, and that page is not within the sitemap of the feed URI https://openai.com/blog, so it was crawled from the links found on one of the feed's Web pages.

If no workflow is specified with the createFeed mutation, Graphlit checks whether the project has a default workflow assigned. If one is assigned, Graphlit uses it; if not, it processes the content with the built-in workflow stages (which simply index metadata and prepare content for semantic search and conversations).
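
The fallback order described above can be sketched as a small function. This only illustrates the stated rule and is not Graphlit's implementation; `resolve_workflow` and the `BUILT_IN` marker are hypothetical names introduced for this example.

```python
from typing import Optional

# Hypothetical marker for the built-in workflow stages; not a Graphlit identifier.
BUILT_IN = "built-in"


def resolve_workflow(feed_workflow_id: Optional[str],
                     project_default_id: Optional[str]) -> str:
    """Pick the workflow that applies, following the documented fallback order."""
    if feed_workflow_id is not None:
        # A workflow passed with createFeed takes precedence.
        return feed_workflow_id
    if project_default_id is not None:
        # Otherwise the project's default workflow is used, if one is assigned.
        return project_default_id
    # With neither, content goes through the built-in stages.
    return BUILT_IN
```

In this page's example the feed passes its own workflow ID, so the first branch would apply.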

Create Enrichment Workflow

Mutation:

mutation CreateWorkflow($workflow: WorkflowInput!) {
  createWorkflow(workflow: $workflow) {
    id
    name
    state
    enrichment {
      link {
        enableCrawling
        allowedDomains
        excludedDomains
        allowedLinks
        excludedLinks
        allowedFiles
        excludedFiles
        allowContentDomain
      }
    }
  }
}

Variables:

{
  "workflow": {
    "enrichment": {
      "link": {
        "enableCrawling": true,
        "allowedLinks": [
          "WEB"
        ],
        "allowContentDomain": false,
        "maximumLinks": 10
      }
    },
    "name": "Enrichment Workflow"
  }
}
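
If you assemble these variables in code, a small helper keeps the link-crawling options in one place. The sketch below mirrors the JSON above; `build_workflow_variables` is a hypothetical helper, not part of any Graphlit SDK, and it only builds the payload without sending it.

```python
import json


def build_workflow_variables(name: str, maximum_links: int = 10,
                             allow_content_domain: bool = False) -> dict:
    """Build the variables object for the CreateWorkflow mutation shown above."""
    return {
        "workflow": {
            "enrichment": {
                "link": {
                    "enableCrawling": True,
                    "allowedLinks": ["WEB"],  # crawl Web links only
                    "allowContentDomain": allow_content_domain,
                    "maximumLinks": maximum_links,
                }
            },
            "name": name,
        }
    }


variables = build_workflow_variables("Enrichment Workflow")
print(json.dumps(variables, indent=2))
```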

Response:

{
  "enrichment": {
    "link": {
      "enableCrawling": true,
      "allowedLinks": [
        "WEB"
      ],
      "allowContentDomain": false
    }
  },
  "id": "d8875dd0-7a3b-45f9-b8bb-cc45fb04d5c3",
  "name": "Enrichment Workflow",
  "state": "ENABLED"
}

Create Web Feed

Mutation:

mutation CreateFeed($feed: FeedInput!) {
  createFeed(feed: $feed) {
    id
    name
    state
    type
  }
}

Variables:

{
  "feed": {
    "type": "WEB",
    "web": {
      "uri": "https://openai.com/blog",
      "readLimit": 10
    },
    "workflow": {
      "id": "d8875dd0-7a3b-45f9-b8bb-cc45fb04d5c3"
    },
    "schedulePolicy": {
      "recurrenceType": "ONCE"
    },
    "name": "Feed With Workflow"
  }
}
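
The workflow ID in these variables comes from the createWorkflow response. A minimal sketch of wiring the two together, assuming the response has already been parsed into a dict shaped like the one shown earlier; `build_feed_variables` is a hypothetical helper name.

```python
def build_feed_variables(workflow_response: dict, uri: str,
                         read_limit: int = 10,
                         name: str = "Feed With Workflow") -> dict:
    """Build the variables object for the CreateFeed mutation shown above."""
    return {
        "feed": {
            "type": "WEB",
            "web": {"uri": uri, "readLimit": read_limit},
            # Reference the workflow created in the previous step by its ID.
            "workflow": {"id": workflow_response["id"]},
            "schedulePolicy": {"recurrenceType": "ONCE"},
            "name": name,
        }
    }


# Sample values copied from the createWorkflow response on this page.
workflow_response = {
    "id": "d8875dd0-7a3b-45f9-b8bb-cc45fb04d5c3",
    "name": "Enrichment Workflow",
    "state": "ENABLED",
}
feed_variables = build_feed_variables(workflow_response, "https://openai.com/blog")
```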

Response:

{
  "type": "WEB",
  "id": "1b6c0901-81c4-457e-bd35-c31367bc2799",
  "name": "Feed With Workflow",
  "state": "ENABLED"
}

Query Contents

Query:

query QueryContents($filter: ContentFilter!) {
  contents(filter: $filter) {
    results {
      id
      name
      creationDate
      owner {
        id
      }
      state
      originalDate
      finishedDate
      workflowDuration
      uri
      text
      type
      fileType
      mimeType
      fileName
      fileSize
      masterUri
      mezzanineUri
      transcriptUri
      links {
        uri
        linkType
      }
      document {
        title
        subject
        summary
        author
        publisher
        description
        keywords
        pageCount
      }
    }
  }
}

Variables:

{
  "filter": {
    "queryType": "SIMPLE",
    "searchType": "VECTOR",
    "offset": 0,
    "limit": 1
  }
}
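
GraphQL requests sent over HTTP conventionally carry the query text and its variables in one JSON body. The sketch below builds that body for a shortened version of the query above. The endpoint URL and authentication headers are deliberately left out because this page does not state them, and `build_request_body` is a hypothetical helper name.

```python
import json

# Shortened selection set for illustration; the full query is shown above.
QUERY = """
query QueryContents($filter: ContentFilter!) {
  contents(filter: $filter) {
    results {
      id
      name
      uri
      type
    }
  }
}
"""


def build_request_body(query: str, variables: dict) -> str:
    """Serialize a GraphQL query and its variables into a JSON request body."""
    return json.dumps({"query": query, "variables": variables})


body = build_request_body(
    QUERY,
    {"filter": {"queryType": "SIMPLE", "searchType": "VECTOR",
                "offset": 0, "limit": 1}},
)
```

The resulting string would be posted to the API endpoint with a JSON content type.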

Response:

{
  "results": [
    {
      "type": "PAGE",
      "links": [
        {
          "uri": "https://www.cornell.edu/",
          "linkType": "WEB"
        },
        {
          "uri": "https://info.arxiv.org/about/ourmembers.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://info.arxiv.org/about/donate.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://info.arxiv.org/help/index.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://arxiv.org/search/advanced",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/login",
          "linkType": "WEB"
        },
        {
          "uri": "https://info.arxiv.org/about/index.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://arxiv.org/abs/2303.10130v1",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/search/econ?searchtype=author&query=Eloundou%2C+T",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/search/econ?searchtype=author&query=Manning%2C+S",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/search/econ?searchtype=author&query=Mishkin%2C+P",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/search/econ?searchtype=author&query=Rock%2C+D",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/abs/2303.10130",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/abs/2303.10130v5",
          "linkType": "WEB"
        },
        {
          "uri": "https://doi.org/10.48550/arXiv.2303.10130",
          "linkType": "WEB"
        },
        {
          "uri": "http://creativecommons.org/licenses/by-sa/4.0/",
          "linkType": "WEB"
        },
        {
          "uri": "https://ui.adsabs.harvard.edu/abs/arXiv:2303.10130",
          "linkType": "WEB"
        },
        {
          "uri": "https://scholar.google.com/scholar_lookup?arxiv_id=2303.10130",
          "linkType": "WEB"
        },
        {
          "uri": "https://api.semanticscholar.org/arXiv:2303.10130",
          "linkType": "WEB"
        },
        {
          "uri": "https://info.arxiv.org/help/trackback.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://static.arxiv.org/static/browse/0.3.4/css/cite.css",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/ct?url=http%3A%2F%2Fwww.bibsonomy.org%2FBibtexHandler%3FrequTask%3Dupload%26url%3Dhttps%3A%2F%2Farxiv.org%2Fabs%2F2303.10130%26description%3DGPTs+are+GPTs%3A+An+Early+Look+at+the+Labor+Market+Impact+Potential+of+Large+Language+Models&v=51542aa8",
          "linkType": "WEB"
        },
        {
          "uri": "https://arxiv.org/ct?url=https%3A%2F%2Freddit.com%2Fsubmit%3Furl%3Dhttps%3A%2F%2Farxiv.org%2Fabs%2F2303.10130%26title%3DGPTs+are+GPTs%3A+An+Early+Look+at+the+Labor+Market+Impact+Potential+of+Large+Language+Models&v=43ad3eb4",
          "linkType": "WEB"
        },
        {
          "uri": "https://info.arxiv.org/labs/showcase.html#arxiv-bibliographic-explorer",
          "linkType": "FILE"
        },
        {
          "uri": "https://www.litmaps.co/",
          "linkType": "WEB"
        },
        {
          "uri": "https://www.scite.ai/",
          "linkType": "WEB"
        },
        {
          "uri": "https://www.catalyzex.com/",
          "linkType": "WEB"
        },
        {
          "uri": "https://dagshub.com/",
          "linkType": "WEB"
        },
        {
          "uri": "https://paperswithcode.com/",
          "linkType": "WEB"
        },
        {
          "uri": "https://sciencecast.org/welcome",
          "linkType": "WEB"
        },
        {
          "uri": "https://replicate.com/docs/arxiv/about",
          "linkType": "WEB"
        },
        {
          "uri": "https://huggingface.co/docs/hub/spaces",
          "linkType": "WEB"
        },
        {
          "uri": "https://influencemap.cmlab.dev/",
          "linkType": "WEB"
        },
        {
          "uri": "https://www.connectedpapers.com/about",
          "linkType": "WEB"
        },
        {
          "uri": "https://core.ac.uk/services/recommender",
          "linkType": "WEB"
        },
        {
          "uri": "https://info.arxiv.org/labs/index.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://info.arxiv.org/help/mathjax.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://info.arxiv.org/help/contact.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://info.arxiv.org/help/subscribe.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://info.arxiv.org/help/license/index.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://info.arxiv.org/help/policies/privacy_policy.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://info.arxiv.org/help/web_accessibility.html",
          "linkType": "FILE"
        },
        {
          "uri": "https://status.arxiv.org/",
          "linkType": "WEB"
        },
        {
          "uri": "https://subscribe.sorryapp.com/24846f03/email/new",
          "linkType": "WEB"
        },
        {
          "uri": "https://subscribe.sorryapp.com/24846f03/slack/new",
          "linkType": "WEB"
        }
      ],
      "mimeType": "text/html",
      "fileType": "DOCUMENT",
      "fileName": "2303.10130.htm",
      "fileSize": 47924,
      "masterUri": "https://graphlit202309044a4fa477.blob.core.windows.net/files/e4899d7c-407f-4532-89ad-cb18a00feb87/2303.10130.htm?sv=2023-01-03&se=2023-09-07T02%3A06%3A22Z&sr=c&sp=rl&sig=yGJcvB%2FkBvuszIiXbPRwDlzXugzU97eiXEQDJT3xQFY%3D",
      "mezzanineUri": "https://graphlit202309044a4fa477.blob.core.windows.net/files/e4899d7c-407f-4532-89ad-cb18a00feb87/Mezzanine/2303.10130.json?sv=2023-01-03&se=2023-09-07T02%3A06%3A22Z&sr=c&sp=rl&sig=yGJcvB%2FkBvuszIiXbPRwDlzXugzU97eiXEQDJT3xQFY%3D",
      "document": {
        "title": "[2303.10130] GPTs are GPTs: An Early Look at the Labor Market Impact Potential of Large Language Models",
        "description": "We investigate the potential implications of large language models (LLMs),\nsuch as Generative Pre-trained Transformers (GPTs), on the U.S. labor market,\nfocusing on the increased capabilities arising from LLM-powered software\ncompared to LLMs on their own. Using a new rubric, we assess occupations based\non their alignment with LLM capabilities, integrating both human expertise and\nGPT-4 classifications. Our findings reveal that around 80% of the U.S.\nworkforce could have at least 10% of their work tasks affected by the\nintroduction of LLMs, while approximately 19% of workers may see at least 50%\nof their tasks impacted. We do not make predictions about the development or\nadoption timeline of such LLMs. The projected effects span all wage levels,\nwith higher-income jobs potentially facing greater exposure to LLM capabilities\nand LLM-powered software. Significantly, these impacts are not restricted to\nindustries with higher recent productivity growth. Our analysis suggests that,\nwith access to an LLM, about 15% of all worker tasks in the US could be\ncompleted significantly faster at the same level of quality. When incorporating\nsoftware and tooling built on top of LLMs, this share increases to between 47\nand 56% of all tasks. This finding implies that LLM-powered software will have\na substantial effect on scaling the economic impacts of the underlying models.\nWe conclude that LLMs such as GPTs exhibit traits of general-purpose\ntechnologies, indicating that they could have considerable economic, social,\nand policy implications."
      },
      "uri": "https://arxiv.org/abs/2303.10130",
      "id": "e4899d7c-407f-4532-89ad-cb18a00feb87",
      "name": "https://arxiv.org/abs/2303.10130",
      "state": "FINISHED",
      "creationDate": "2023-09-06T20:05:30Z",
      "finishedDate": "2023-09-06T20:05:46Z",
      "workflowDuration": "PT16.3506442S",
      "owner": {
        "id": "530a3721-3273-44b4-bff4-e87218143164"
      }
    }
  ]
}
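
To check a result like this programmatically, you can compare the content's host with the feed's host and tally the extracted links by type. This is a rough sketch: comparing hosts is only a proxy for "not within the feed's sitemap", and both function names are hypothetical. The sample data is a small subset of the response above.

```python
from collections import Counter
from urllib.parse import urlparse


def is_outside_feed_domain(content_uri: str, feed_uri: str) -> bool:
    """Return True when the content's host differs from the feed's host."""
    return urlparse(content_uri).hostname != urlparse(feed_uri).hostname


def count_link_types(content: dict) -> Counter:
    """Tally the links extracted from a content item by their linkType."""
    return Counter(link["linkType"] for link in content.get("links", []))


# Subset of the response shown above.
content = {
    "uri": "https://arxiv.org/abs/2303.10130",
    "links": [
        {"uri": "https://www.cornell.edu/", "linkType": "WEB"},
        {"uri": "https://info.arxiv.org/about/donate.html", "linkType": "FILE"},
        {"uri": "https://arxiv.org/", "linkType": "WEB"},
    ],
}

print(is_outside_feed_domain(content["uri"], "https://openai.com/blog"))  # True
print(count_link_types(content))
```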