Graphlit refers to any unstructured or complex data, such as PDFs, Word documents, MP4 videos, podcasts or even CAD drawings, as Content.
Content management systems (CMS) can typically manage documents and images, but Graphlit takes content management several steps further by supporting any content type, including 3D files, RSS posts and Slack messages.
To delete multiple pieces of content at once, use the deleteContents mutation and pass an array of the IDs of the content you wish to delete.
Mutation:
mutation DeleteContents($ids: [ID!]!) { deleteContents(ids: $ids) { id state } }
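As a rough illustration, the mutation above could be sent from Python as an ordinary GraphQL-over-HTTP POST. This is a minimal sketch, not an official client: the helper names are hypothetical, the endpoint and token are supplied by the caller, and passing the JWT as a Bearer token is an assumption rather than something this page states.

```python
import json
import urllib.request

DELETE_CONTENTS = """
mutation DeleteContents($ids: [ID!]!) {
  deleteContents(ids: $ids) { id state }
}
"""


def build_delete_contents_payload(ids):
    """Build the GraphQL request body for deleting several contents (hypothetical helper)."""
    if not ids:
        raise ValueError("ids must contain at least one content ID")
    return {"query": DELETE_CONTENTS, "variables": {"ids": list(ids)}}


def post_graphql(endpoint, token, payload):
    """POST a GraphQL payload as JSON.

    Assumption: the JWT is accepted as a Bearer token in the Authorization header.
    """
    request = urllib.request.Request(
        endpoint,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


# Build (but do not send) a request for a single content ID.
payload = build_delete_contents_payload(["cc4f2a1f-b103-4cab-8a98-2b8cd84b691c"])
print(json.dumps(payload["variables"]))
```

Keeping the payload builder separate from the HTTP call makes it possible to inspect or log exactly which IDs will be deleted before anything is sent.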
While developing and testing your application, you may want to delete all ingested content in your project.
To do this, use the deleteAllContents mutation. It takes no variables and deletes all content in the project or tenant (depending on the JWT). The mutation returns an array of the deleted content.
NOTE: This is a hard-deletion of the content, and all linked Graphlit metadata and/or files will be deleted when the content is deleted.
Mutation:
mutation DeleteAllContents { deleteAllContents { id state } }
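Because this is a hard deletion, a test script may want a guard in front of the mutation. The sketch below is one possible approach, not part of Graphlit itself: the function name and the confirmation phrase are hypothetical, and the function only builds the request body without sending it.

```python
DELETE_ALL_CONTENTS = "mutation DeleteAllContents { deleteAllContents { id state } }"

# Hypothetical confirmation phrase; choose whatever suits your tooling.
CONFIRM_PHRASE = "DELETE ALL CONTENTS"


def build_delete_all_payload(confirm):
    """Return the request body only when the caller passes the exact confirmation phrase."""
    if confirm != CONFIRM_PHRASE:
        raise PermissionError("refusing to build deleteAllContents without explicit confirmation")
    # The mutation takes no variables, so only the query string is included.
    return {"query": DELETE_ALL_CONTENTS}


payload = build_delete_all_payload("DELETE ALL CONTENTS")
print(payload["query"])
```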
When you want to get more details on a piece of content which has been ingested, you can use the content query to request any appropriate fields, and pass the ID of the content you wish to get.
For more details on what content fields are available for query, see the Content object schema.
Query:
query GetContent($id: ID!) { content(id: $id) { id name creationDate state owner { id } originalDate finishedDate workflowDuration uri text type fileType mimeType fileName fileSize masterUri textUri transcriptUri } }
Variables:
{"id":"cc4f2a1f-b103-4cab-8a98-2b8cd84b691c"}
Response:
{"type":"FILE","mimeType":"audio/mpeg","fileType":"AUDIO","fileName":"Unstructured Data is Dark Data Podcast.mp3","fileSize":33008244,"masterUri":"https://graphlit20230701d31d9453.blob.core.windows.net/files/cc4f2a1f-b103-4cab-8a98-2b8cd84b691c/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3","textUri":"https://graphlit20230701d31d9453.blob.core.windows.net/files/cc4f2a1f-b103-4cab-8a98-2b8cd84b691c/Mezzanine/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3","transcriptUri":"https://graphlit20230701d31d9453.blob.core.windows.net/files/cc4f2a1f-b103-4cab-8a98-2b8cd84b691c/Transcript/Unstructured%20Data%20is%20Dark%20Data%20Podcast.json","uri":"https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3","id":"cc4f2a1f-b103-4cab-8a98-2b8cd84b691c","name":"Unstructured Data is Dark Data Podcast.mp3","state":"FINISHED","creationDate":"2023-07-02T23:10:56Z","finishedDate":"2023-07-02T23:11:52Z","workflowDuration":"PT55.8371387S","owner": {"id":"9422b73d-f8d6-4faf-b7a9-152250c862a4" }}
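The timing fields in the response above can be interpreted with the Python standard library. This is a minimal sketch that assumes timestamps always use the UTC "Z" form shown here and that workflowDuration is an ISO 8601 duration containing only hour, minute and second parts; the helper names are hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone


def parse_utc(timestamp):
    """Parse a timestamp such as 2023-07-02T23:10:56Z into an aware datetime."""
    return datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def parse_time_duration(value):
    """Parse ISO 8601 durations limited to H, M and S parts, e.g. PT55.8371387S."""
    match = re.fullmatch(r"PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?", value)
    if match is None or value == "PT":
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )


# Values copied from the sample response.
content = {
    "creationDate": "2023-07-02T23:10:56Z",
    "finishedDate": "2023-07-02T23:11:52Z",
    "workflowDuration": "PT55.8371387S",
}

elapsed = parse_utc(content["finishedDate"]) - parse_utc(content["creationDate"])
duration = parse_time_duration(content["workflowDuration"])
print(elapsed.total_seconds())  # 56.0
print(duration.total_seconds())
```

In the sample, the whole-second gap between creationDate and finishedDate (56 seconds) is consistent with the reported workflowDuration of roughly 55.8 seconds.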
Queries
The query, search and filter examples shown can be combined within a single content filter object.
Metadata filters, such as a date range, are applied first; similarity search by text then runs over the filtered result set only.
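The ordering described above (filter on metadata first, then rank by similarity) can be illustrated with a small in-memory sketch. It does not reflect how Graphlit is implemented: the item fields are hypothetical, and simple word overlap stands in for real vector similarity.

```python
from datetime import date


def token_overlap(query, text):
    """Jaccard overlap of lowercase word sets; a stand-in for embedding similarity."""
    a, b = set(query.lower().split()), set(text.lower().split())
    return len(a & b) / len(a | b) if a | b else 0.0


def filter_then_rank(items, start, end, query):
    """Apply the date-range filter first, then rank only the surviving items."""
    candidates = [item for item in items if start <= item["created"] <= end]
    return sorted(candidates, key=lambda item: token_overlap(query, item["text"]), reverse=True)


# Hypothetical items; only the ordering of the two steps matters here.
items = [
    {"name": "old.pdf", "created": date(2022, 1, 5), "text": "unstructured data podcast"},
    {"name": "notes.docx", "created": date(2023, 7, 1), "text": "quarterly budget notes"},
    {"name": "episode.mp3", "created": date(2023, 7, 2), "text": "unstructured data is dark data"},
]

results = filter_then_rank(items, date(2023, 7, 1), date(2023, 7, 31), "unstructured data")
print([item["name"] for item in results])  # ['episode.mp3', 'notes.docx']
```

Note that old.pdf never appears in the results even though its text matches the query well, because it was excluded by the date filter before any similarity ranking took place.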
As content is processed by Graphlit, it will proceed through multiple states of the content workflow.
Content will always start in the CREATED state, and will end in either the FINISHED or ERRORED state.
When querying the content state, you may see any of these states:
| State | Description |
| --- | --- |
| CREATED | Initial state after the create mutation. |
| INGESTED | Once content has been retrieved by source URI and cached for processing. |
| INDEXED | Once content has had technical metadata indexed, such as creation date, title, page count or podcast episode number. |
| PREPARED | Once content has been prepared for further workflow states, which includes audio transcript creation, text extraction, and image thumbnail generation. |
| EXTRACTED | Once content has had entities (e.g. persons, organizations) extracted via ML and stored in the knowledge graph. |
| ENRICHED | Once text extracted from content (e.g. audio transcripts, document text) has had vector embeddings generated via LLM and stored in a vector database for retrieval. |
| FINISHED | Content has completed all workflow stages successfully, and will appear in search results. |
| ERRORED | If the content workflow failed at any stage, look at the error field for more information. If content failed unexpectedly, you can use the restartContent mutation to reingest the file and restart the content workflow. |
For more information, see the workflow section of the documentation.
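Since content always ends in either FINISHED or ERRORED, a client can poll the state until one of those two values appears. The sketch below shows one way to do that; fetch_state is a hypothetical callable that would wrap the content query in a real application, and the polling interval and attempt limit are arbitrary defaults.

```python
import time

TERMINAL_STATES = {"FINISHED", "ERRORED"}


def wait_for_terminal_state(fetch_state, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Poll fetch_state() until it returns FINISHED or ERRORED, or attempts run out."""
    for attempt in range(max_attempts):
        state = fetch_state()
        if state in TERMINAL_STATES:
            return state
        # Skip the pause after the final attempt.
        if attempt < max_attempts - 1:
            sleep(interval_seconds)
    raise TimeoutError(f"content did not reach a terminal state after {max_attempts} polls")


# Simulated workflow progression used in place of a real query.
simulated = iter(["CREATED", "INGESTED", "INDEXED", "PREPARED", "EXTRACTED", "ENRICHED", "FINISHED"])
final_state = wait_for_terminal_state(lambda: next(simulated), sleep=lambda seconds: None)
print(final_state)  # FINISHED
```

Passing the sleep function as a parameter keeps the helper easy to exercise in tests without real delays.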