Content
Ingest, manage and query Content.
Overview
When talking about unstructured or complex data, like PDFs, Word documents, MP4 videos, podcasts or even CAD drawings, Graphlit refers to all of those as Content.
Content management systems (CMS) typically manage documents and images, but Graphlit takes content management several steps further to support any content type, even 3D files, RSS posts and Slack messages.
In Graphlit, Content is categorized as:
Ingestion
Getting content into Graphlit is called Ingestion, and can start with files, web pages or plain text messages.
See these pages for more details on the content ingestion options:
For bulk ingestion from cloud storage folders, RSS feeds, or messaging services, see Feeds.
Operations
Queries
The query, search and filter examples shown can be combined within a single content filter object.
Metadata filters, such as a date range, are applied first. Similarity search by text then runs over the filtered result set only.
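The filter-then-search ordering described above can be illustrated with a small in-memory sketch. This is not Graphlit's implementation or API: the `Item` class, the `filter_then_search` function, and the two-dimensional toy vectors are all hypothetical, chosen only to show that similarity ranking sees nothing that the metadata filter has already excluded.

```python
from dataclasses import dataclass
from datetime import date
from math import sqrt


@dataclass
class Item:
    # Hypothetical stand-in for a piece of content; not a Graphlit type.
    name: str
    created: date
    embedding: list[float]  # toy vector; real embeddings come from a model


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filter_then_search(items, start, end, query_vec, limit=10):
    # Stage 1: the metadata filter (here, a date range) narrows the candidates.
    candidates = [i for i in items if start <= i.created <= end]
    # Stage 2: similarity ranking runs only over the filtered candidates.
    ranked = sorted(
        candidates,
        key=lambda i: cosine(i.embedding, query_vec),
        reverse=True,
    )
    return ranked[:limit]


items = [
    Item("old-report.pdf", date(2022, 1, 5), [1.0, 0.0]),
    Item("new-report.pdf", date(2024, 3, 1), [0.9, 0.1]),
    Item("podcast.mp3", date(2024, 4, 2), [0.0, 1.0]),
]
results = filter_then_search(
    items, date(2024, 1, 1), date(2024, 12, 31), [1.0, 0.0]
)
print([i.name for i in results])  # ['new-report.pdf', 'podcast.mp3']
```

Note that `old-report.pdf` matches the query vector exactly, yet it never appears in the results because the date filter removed it before ranking took place.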
Content States
As content is processed by Graphlit, it will proceed through multiple states of the content workflow.
Content will always start in the CREATED state, and will end in either the FINISHED or ERRORED state.
When querying the content state, you may see any of these states:
CREATED
Initial state after the create mutation.
INGESTED
Once content has been retrieved by source URI and cached for processing.
INDEXED
Once content has had technical metadata indexed, such as creation date, title, page count or podcast episode number.
PREPARED
Once content has been prepared for further workflow states, which includes audio transcript creation, text extraction, and image thumbnail generation.
EXTRACTED
Once content has had entities (e.g. persons, organizations) extracted via ML and stored in the knowledge graph.
ENRICHED
Once extracted text from content (e.g. audio transcripts, document text) has had vector embeddings generated via LLM and stored in a vector database for retrieval.
FINISHED
Content has completed all workflow stages successfully, and will appear in search results.
ERRORED
If the content workflow failed at any stage, look at the error field for more information. If content failed unexpectedly, you can use the restartContent mutation to reingest the file and restart the content workflow.
For more information, see the workflow section of the documentation.
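As a rough sketch of how a client might wait for content to finish processing, the states listed above can be modeled as an enum with FINISHED and ERRORED as the terminal states. The `wait_until_done` helper and the `fetch_state` callable are assumptions made for illustration; in practice `fetch_state` would wrap whatever query returns the content's current state, which is not shown here.

```python
from enum import Enum
import time


class ContentState(Enum):
    # State names taken from the list above.
    CREATED = "CREATED"
    INGESTED = "INGESTED"
    INDEXED = "INDEXED"
    PREPARED = "PREPARED"
    EXTRACTED = "EXTRACTED"
    ENRICHED = "ENRICHED"
    FINISHED = "FINISHED"
    ERRORED = "ERRORED"


# Content always ends in one of these two states.
TERMINAL_STATES = {ContentState.FINISHED, ContentState.ERRORED}


def wait_until_done(fetch_state, poll_seconds=2.0, max_polls=100):
    """Poll fetch_state() until a terminal state is seen.

    fetch_state is a hypothetical zero-argument callable that returns
    the current state name as a string.
    """
    for _ in range(max_polls):
        state = ContentState(fetch_state())
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError("content did not reach a terminal state")


# Stand-in for a real state query: replays a fixed sequence of states.
fake_states = iter(["CREATED", "INGESTED", "PREPARED", "FINISHED"])
final = wait_until_done(lambda: next(fake_states), poll_seconds=0)
print(final.name)  # FINISHED
```

If the returned state is ERRORED, the caller would then inspect the error field and decide whether to restart the workflow, as described above.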
Schema Reference