When ingesting content into Graphlit, you will often want to configure how the content is processed. The Workflow entity lets you specify the stages of the content workflow, which gives you fine-grained control over operations such as text summarization, entity extraction, and link crawling.
In this example, we will create a workflow to summarize the audio transcript from an ingested MP3 file.
First, we call the createWorkflow mutation, with the preparation stage configured to summarize the content into 5 bullet points, with a maximum of 400 tokens.
Then, we call the ingestUri mutation and pass the ID of the workflow to use.
Finally, we call the content query to view the summarized bullet points.
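The sample responses in this walkthrough show the content in the CREATED state right after ingestion and in the FINISHED state once the bullet points are available, so the final step may need to wait for processing. The following is a minimal polling sketch, not part of the Graphlit API: `fetch_content` is a hypothetical caller-supplied function that runs the content query and returns the content object as a dict, and the interval and attempt limit are arbitrary assumptions.

```python
import time
from typing import Callable


def wait_until_finished(
    fetch_content: Callable[[], dict],
    interval_seconds: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_content until the returned content reports state FINISHED.

    fetch_content is a hypothetical callable supplied by the caller; it is
    expected to run the content query and return the content as a dict.
    """
    for _ in range(max_attempts):
        content = fetch_content()
        if content.get("state") == "FINISHED":
            return content
        # Not finished yet: wait before asking again.
        sleep(interval_seconds)
    raise TimeoutError(f"content not finished after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the helper easy to exercise without real delays.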
If no workflow is specified in the ingestUri mutation, Graphlit checks whether the project has a default workflow assigned. If so, it uses that workflow; if not, it processes the content with the built-in workflow stages, which simply index metadata and prepare the content for semantic search and conversations.
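The fallback order described above can be sketched as a small function. This only models the decision for illustration; it is not Graphlit's implementation, and `BUILT_IN` is a hypothetical placeholder label rather than a real identifier.

```python
from typing import Optional

# Hypothetical label for the built-in processing path; not a Graphlit identifier.
BUILT_IN = "built-in"


def resolve_workflow(
    explicit_id: Optional[str],
    project_default_id: Optional[str],
) -> str:
    """Return the workflow that would apply, following the documented fallback order."""
    if explicit_id:
        # 1. A workflow passed with the ingest mutation wins.
        return explicit_id
    if project_default_id:
        # 2. Otherwise the project's default workflow is used, if one is assigned.
        return project_default_id
    # 3. Otherwise the built-in stages process the content.
    return BUILT_IN
```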
The workflow reference is an optional parameter on the ingestUri and ingestText mutations.
Create Preparation Workflow
Mutation:
mutation CreateWorkflow($workflow: WorkflowInput!) {
  createWorkflow(workflow: $workflow) {
    id
    name
    state
    preparation {
      summarizations {
        type
        tokens
        items
      }
    }
  }
}
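The variables for the createWorkflow mutation are not shown here. As a hedged sketch, they could be built as follows. The field names `type`, `tokens`, and `items` come from the selection set of the mutation, and the name "Preparation Workflow" appears in the sample content response. The exact shape of `WorkflowInput`, the use of a list for `summarizations`, and the `BULLETS` enum value are assumptions that should be checked against the Graphlit schema.

```python
import json


def build_workflow_variables(name: str, items: int = 5, tokens: int = 400) -> dict:
    """Build the $workflow variable for the createWorkflow mutation (assumed input shape)."""
    if items <= 0 or tokens <= 0:
        raise ValueError("items and tokens must be positive")
    return {
        "workflow": {
            "name": name,
            "preparation": {
                # Assumption: summarizations is a list and BULLETS is the enum value
                # for bullet-point summaries; verify both against the schema.
                "summarizations": [
                    {"type": "BULLETS", "items": items, "tokens": tokens}
                ]
            },
        }
    }


# 5 bullet points with a maximum of 400 tokens, as described in this example.
variables = build_workflow_variables("Preparation Workflow")
print(json.dumps(variables, indent=2))
```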
mutation IngestUri($name: String, $uri: URL!, $workflow: EntityReferenceInput) {
  ingestUri(name: $name, uri: $uri, workflow: $workflow) {
    id
    name
    state
    type
    fileType
    mimeType
    uri
    text
  }
}
Response:
{
  "type": "FILE",
  "mimeType": "audio/mp3",
  "fileType": "AUDIO",
  "uri": "https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
  "id": "7138775d-7aee-41bb-a17f-ce9c348b3a3d",
  "name": "Unstructured Data is Dark Data Podcast.mp3",
  "state": "CREATED"
}
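The variables for the ingestUri mutation could be assembled as in the sketch below. The URI and workflow ID are the ones that appear in the sample responses of this walkthrough. That `EntityReferenceInput` carries an `id` field is an assumption based on the `workflow { id name }` selection in the content query, and the helper name is hypothetical.

```python
from typing import Optional


def build_ingest_variables(
    uri: str,
    workflow_id: Optional[str] = None,
    name: Optional[str] = None,
) -> dict:
    """Build variables for the ingestUri mutation; the workflow reference is optional."""
    variables: dict = {"uri": uri}
    if name is not None:
        variables["name"] = name
    if workflow_id is not None:
        # Assumption: EntityReferenceInput identifies the workflow by its id.
        variables["workflow"] = {"id": workflow_id}
    return variables


variables = build_ingest_variables(
    "https://graphlitplatform.blob.core.windows.net/samples/"
    "Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
    workflow_id="19a16472-2820-4b5b-870e-a0e543767482",
)
```

Leaving out `workflow_id` omits the reference entirely, in which case the project default or built-in workflow would apply.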
Get Content
Query:
query GetContent($id: ID!) {
  content(id: $id) {
    id
    name
    creationDate
    owner {
      id
    }
    state
    originalDate
    finishedDate
    workflowDuration
    uri
    text
    type
    fileType
    mimeType
    fileName
    fileSize
    masterUri
    mezzanineUri
    transcriptUri
    summary
    headline
    bullets
    audio {
      title
      bitrate
      channels
      sampleRate
      bitsPerSample
      duration
    }
    workflow {
      id
      name
    }
  }
}
Variables:
{"id":"7138775d-7aee-41bb-a17f-ce9c348b3a3d"}
Response:
{
  "type": "FILE",
  "bullets": [
    "Unstructured data refers to a broad set of file-based data, including imagery, audio, 3D, and documents.",
    "First-order metadata refers to the basic metadata found in the header of a file, such as XF or XMP metadata.",
    "Second-order metadata involves reading the data in the file, such as object detection in images or extracting terms from documents.",
    "Third-order metadata involves making inferences and creating connections between data, such as linking a conveyor belt in an image to an SAP database.",
    "Edge computing involves pushing compute closer to the source of data and taking a derivative version of the data back to the cloud for further analysis."
  ],
  "mimeType": "audio/mpeg",
  "fileType": "AUDIO",
  "fileName": "Unstructured Data is Dark Data Podcast.mp3",
  "fileSize": 33008244,
  "masterUri": "https://graphlit202309044a4fa477.blob.core.windows.net/files/7138775d-7aee-41bb-a17f-ce9c348b3a3d/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3?sv=2023-01-03&se=2023-09-07T01%3A03%3A48Z&sr=c&sp=rl&sig=rmmXlUUBq4gfkhSnOBO4oH%2FjufYUuIE0dLUUd872XMI%3D",
  "mezzanineUri": "https://graphlit202309044a4fa477.blob.core.windows.net/files/7138775d-7aee-41bb-a17f-ce9c348b3a3d/Mezzanine/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3?sv=2023-01-03&se=2023-09-07T01%3A03%3A48Z&sr=c&sp=rl&sig=rmmXlUUBq4gfkhSnOBO4oH%2FjufYUuIE0dLUUd872XMI%3D",
  "transcriptUri": "https://graphlit202309044a4fa477.blob.core.windows.net/files/7138775d-7aee-41bb-a17f-ce9c348b3a3d/Transcript/Unstructured%20Data%20is%20Dark%20Data%20Podcast.json?sv=2023-01-03&se=2023-09-07T01%3A03%3A48Z&sr=c&sp=rl&sig=rmmXlUUBq4gfkhSnOBO4oH%2FjufYUuIE0dLUUd872XMI%3D",
  "audio": {
    "bitrate": 106000,
    "channels": 1,
    "sampleRate": 48000,
    "duration": "00:41:26.0640000"
  },
  "workflow": {
    "id": "19a16472-2820-4b5b-870e-a0e543767482",
    "name": "Preparation Workflow"
  },
  "uri": "https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
  "id": "7138775d-7aee-41bb-a17f-ce9c348b3a3d",
  "name": "Unstructured Data is Dark Data Podcast.mp3",
  "state": "FINISHED",
  "creationDate": "2023-09-06T19:02:14Z",
  "finishedDate": "2023-09-06T19:02:46Z",
  "workflowDuration": "PT31.9959878S",
  "owner": {
    "id": "530a3721-3273-44b4-bff4-e87218143164"
  }
}