Ingest With Workflow

Summarize a podcast MP3 with a preparation workflow.

Last updated 1 year ago


When ingesting content into Graphlit, you will often want to configure how the content is processed. Through the Workflow entity, you can specify the stages of the content workflow, which gives you fine-grained control over operations such as text summarization, entity extraction, and link crawling.

In this example, we will create a workflow to summarize the audio transcript from an ingested MP3 file.

First, we call the createWorkflow mutation, with the preparation stage configured to summarize the transcript into 5 bullet points, using a maximum of 400 tokens.

Then, we call the ingestUri mutation and pass the ID of the workflow to be used.

Finally, we call the content query to view the summarized bullet points.

If no workflow is specified with the ingestUri mutation, Graphlit checks whether the project has a default workflow assigned. If one is assigned, Graphlit uses it. If not, Graphlit processes the content with the built-in workflow stages, which simply index metadata and prepare the content for semantic search and conversations.
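
The fallback order described above can be sketched as a small function. This is only a model of the documented behavior, not Graphlit's implementation; the function name and the BUILT_IN label are hypothetical.

```python
from typing import Optional

# Hypothetical label standing in for the built-in workflow stages.
BUILT_IN = "built-in"


def resolve_workflow(explicit_id: Optional[str],
                     project_default_id: Optional[str]) -> str:
    """Return the workflow that would apply to an ingestion request.

    Order: workflow passed to the mutation, then the project default,
    then the built-in stages.
    """
    if explicit_id:
        return explicit_id
    if project_default_id:
        return project_default_id
    return BUILT_IN
```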

The workflow reference is an optional parameter on the ingestUri and ingestText mutations.
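
Because the workflow reference is optional, a client can simply leave it out of the variables. The helper below is a minimal sketch of that idea; build_ingest_variables is a hypothetical name, and the variable shape follows the ingestUri example on this page.

```python
from typing import Any, Dict, Optional


def build_ingest_variables(uri: str,
                           name: Optional[str] = None,
                           workflow_id: Optional[str] = None) -> Dict[str, Any]:
    """Build the variables for ingestUri, omitting optional fields that are unset."""
    variables: Dict[str, Any] = {"uri": uri}
    if name is not None:
        variables["name"] = name
    if workflow_id is not None:
        # The workflow is passed as an entity reference containing only its ID.
        variables["workflow"] = {"id": workflow_id}
    return variables
```

Calling it without workflow_id produces variables that leave the choice of workflow to the project default or the built-in stages.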

Create Preparation Workflow

Mutation:

mutation CreateWorkflow($workflow: WorkflowInput!) {
  createWorkflow(workflow: $workflow) {
    id
    name
    state
    preparation {
      summarizations {
        type
        tokens
        items
      }
    }
  }
}

Variables:

{
  "workflow": {
    "preparation": {
      "summarizations": [
        {
          "type": "BULLET_POINTS",
          "tokens": 400,
          "items": 5
        }
      ]
    },
    "name": "Preparation Workflow"
  }
}

Response:

{
  "preparation": {
    "summarizations": [
      {
        "type": "BULLET_POINTS",
        "tokens": 400,
        "items": 5
      }
    ]
  },
  "id": "19a16472-2820-4b5b-870e-a0e543767482",
  "name": "Preparation Workflow",
  "state": "ENABLED"
}
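
For readers calling the API from code, the mutation and variables above can be packaged as a standard GraphQL-over-HTTP POST request. The sketch below only builds the request and does not send it. The endpoint URL, the token, and the use of a Bearer authorization header are assumptions here; see the API Endpoints and API Authentication pages for the actual values. The helper names are hypothetical.

```python
import json
import urllib.request
from typing import Any, Dict

CREATE_WORKFLOW = """
mutation CreateWorkflow($workflow: WorkflowInput!) {
  createWorkflow(workflow: $workflow) {
    id
    name
    state
  }
}
"""


def summarization_workflow(name: str, items: int, tokens: int) -> Dict[str, Any]:
    """Variables for a workflow that summarizes content into bullet points."""
    return {
        "workflow": {
            "name": name,
            "preparation": {
                "summarizations": [
                    {"type": "BULLET_POINTS", "tokens": tokens, "items": items}
                ]
            },
        }
    }


def build_graphql_request(endpoint: str, token: str, query: str,
                          variables: Dict[str, Any]) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying a GraphQL operation."""
    body = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: the API accepts a bearer token.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

The request can then be sent with urllib.request.urlopen, and the returned workflow ID passed on to the ingestUri mutation.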

Ingest MP3 File

Mutation:

mutation IngestUri($name: String, $uri: URL!, $workflow: EntityReferenceInput) {
  ingestUri(name: $name, uri: $uri, workflow: $workflow) {
    id
    name
    state
    type
    fileType
    mimeType
    uri
    text
  }
}

Variables:

{
  "uri": "https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
  "workflow": {
    "id": "19a16472-2820-4b5b-870e-a0e543767482"
  }
}

Response:

{
  "type": "FILE",
  "mimeType": "audio/mp3",
  "fileType": "AUDIO",
  "uri": "https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
  "id": "7138775d-7aee-41bb-a17f-ce9c348b3a3d",
  "name": "Unstructured Data is Dark Data Podcast.mp3",
  "state": "CREATED"
}
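
Note that the response above reports the state CREATED, while the content query later on this page reports FINISHED. A client that needs the summary may therefore have to wait for the workflow to complete. The loop below is a minimal polling sketch: fetch_content is a hypothetical callable that runs the content query and returns the content as a dict, and only the two states shown on this page are considered.

```python
import time
from typing import Any, Callable, Dict


def wait_until_finished(fetch_content: Callable[[str], Dict[str, Any]],
                        content_id: str,
                        interval_seconds: float = 2.0,
                        max_attempts: int = 30,
                        sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll the content until its state is FINISHED, or raise TimeoutError."""
    for attempt in range(max_attempts):
        content = fetch_content(content_id)
        if content.get("state") == "FINISHED":
            return content
        # Do not sleep after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(
        f"content {content_id} not finished after {max_attempts} attempts"
    )
```

Injecting the fetch and sleep functions keeps the loop independent of any particular HTTP client and easy to test.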

Get Content

Query:

query GetContent($id: ID!) {
  content(id: $id) {
    id
    name
    creationDate
    owner {
      id
    }
    state
    originalDate
    finishedDate
    workflowDuration
    uri
    text
    type
    fileType
    mimeType
    fileName
    fileSize
    masterUri
    mezzanineUri
    transcriptUri
    summary
    headline
    bullets
    audio {
      title
      bitrate
      channels
      sampleRate
      bitsPerSample
      duration
    }
    workflow {
      id
      name
    }
  }
}

Variables:

{
  "id": "7138775d-7aee-41bb-a17f-ce9c348b3a3d"
}

Response:

{
  "type": "FILE",
  "bullets": [
    "Unstructured data refers to a broad set of file-based data, including imagery, audio, 3D, and documents.",
    "First-order metadata refers to the basic metadata found in the header of a file, such as XF or XMP metadata.",
    "Second-order metadata involves reading the data in the file, such as object detection in images or extracting terms from documents.",
    "Third-order metadata involves making inferences and creating connections between data, such as linking a conveyor belt in an image to an SAP database.",
    "Edge computing involves pushing compute closer to the source of data and taking a derivative version of the data back to the cloud for further analysis."
  ],
  "mimeType": "audio/mpeg",
  "fileType": "AUDIO",
  "fileName": "Unstructured Data is Dark Data Podcast.mp3",
  "fileSize": 33008244,
  "masterUri": "https://graphlit202309044a4fa477.blob.core.windows.net/files/7138775d-7aee-41bb-a17f-ce9c348b3a3d/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3?sv=2023-01-03&se=2023-09-07T01%3A03%3A48Z&sr=c&sp=rl&sig=rmmXlUUBq4gfkhSnOBO4oH%2FjufYUuIE0dLUUd872XMI%3D",
  "mezzanineUri": "https://graphlit202309044a4fa477.blob.core.windows.net/files/7138775d-7aee-41bb-a17f-ce9c348b3a3d/Mezzanine/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3?sv=2023-01-03&se=2023-09-07T01%3A03%3A48Z&sr=c&sp=rl&sig=rmmXlUUBq4gfkhSnOBO4oH%2FjufYUuIE0dLUUd872XMI%3D",
  "transcriptUri": "https://graphlit202309044a4fa477.blob.core.windows.net/files/7138775d-7aee-41bb-a17f-ce9c348b3a3d/Transcript/Unstructured%20Data%20is%20Dark%20Data%20Podcast.json?sv=2023-01-03&se=2023-09-07T01%3A03%3A48Z&sr=c&sp=rl&sig=rmmXlUUBq4gfkhSnOBO4oH%2FjufYUuIE0dLUUd872XMI%3D",
  "audio": {
    "bitrate": 106000,
    "channels": 1,
    "sampleRate": 48000,
    "duration": "00:41:26.0640000"
  },
  "workflow": {
    "id": "19a16472-2820-4b5b-870e-a0e543767482",
    "name": "Preparation Workflow"
  },
  "uri": "https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
  "id": "7138775d-7aee-41bb-a17f-ce9c348b3a3d",
  "name": "Unstructured Data is Dark Data Podcast.mp3",
  "state": "FINISHED",
  "creationDate": "2023-09-06T19:02:14Z",
  "finishedDate": "2023-09-06T19:02:46Z",
  "workflowDuration": "PT31.9959878S",
  "owner": {
    "id": "530a3721-3273-44b4-bff4-e87218143164"
  }
}
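
The response above uses three different time formats: an ISO 8601 duration for workflowDuration, a clock-style value for the audio duration, and UTC timestamps for creationDate and finishedDate. The helpers below are a small sketch for converting each to seconds. They assume only the formats visible in this response, and the function names are hypothetical.

```python
import re
from datetime import datetime


def parse_iso_seconds(value: str) -> float:
    """Parse a simple ISO 8601 duration of the form PT[nH][nM][n.nS] into seconds."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?", value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def parse_clock_seconds(value: str) -> float:
    """Parse an hh:mm:ss.fffffff value into seconds."""
    hours, minutes, seconds = value.split(":")
    return int(hours) * 3600 + int(minutes) * 60 + float(seconds)


def elapsed_seconds(start: str, end: str) -> float:
    """Seconds between two UTC timestamps such as 2023-09-06T19:02:14Z."""
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    return (datetime.strptime(end, fmt) - datetime.strptime(start, fmt)).total_seconds()
```

For the values shown here, the time between creationDate and finishedDate is 32 seconds, which is consistent with the reported workflowDuration of roughly 32 seconds for a 41-minute recording.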