Graphlit Platform

Ingest Markdown Text

Ingest Markdown text into Graphlit

Last updated 1 year ago


Software documentation is often written in Markdown format, or stored in applications like Obsidian.

You can ingest Markdown into Graphlit, then run semantic search and hold LLM chat conversations over the text.

Internally, Graphlit generates what we call a text mezzanine, which stores the extracted text of the content, segmented into semantically correct pages and paragraphs.

As an example, we can use the following description of unstructured metadata, written in Markdown format.

# Metadata in Unstructured Data

The metadata of unstructured data provide a starting point for working with unstructured data. They can be classified into three levels:

- **First Order Metadata**: The data in the header of a file. It is the bare minimum of metadata that one can get out of a file. For example, you can read the EXIF data of an image, but if you are unable to read the image, you will not know what was actually captured.

- **Second Order Metadata**: The data that helps in reading the file and identifying its contents. In the case of images, models are used to detect objects and identify what was captured. Bounding boxes and their tags, often used in training machine learning models, are perfect examples of second-order metadata in images.

- **Third Order Metadata**: Data pulled from making inferences across a bunch of related data and linked databases. This data provides a framework for contextualization that creates edges, like in a knowledge graph, that connect something to something else. This can be thought of as the spider web that grows bigger as more edges are created, as more inferences are pulled.

Graphlit supports Markdown, HTML and plain text formats with the ingestText mutation. Set the textType field to the format of the text you provide; this example uses MARKDOWN.

Assuming you're logged into the Graphlit Developer Portal, you can use the embedded API explorer to test this out. For more information, see the Projects page.

Mutation:

mutation IngestText($name: String!, $text: String!, $textType: TextTypes, $uri: URL) {
  ingestText(name: $name, text: $text, textType: $textType, uri: $uri) {
    id
    name
    state
    type
    fileType
    mimeType
    uri
    text
  }
}

Variables:

{
  "name": "Unstructured Metadata",
  "text": "# Metadata in Unstructured Data\nThe metadata of unstructured data provide a starting point for working with unstructured data. They can be classified into three levels:\n- **First Order Metadata**: The data in the header of a file. It is the bare minimum of metadata that one can get out of a file. For example, you can read the EXIF data of an image, but if you are unable to read the image, you will not know what was actually captured.\n- **Second Order Metadata**: The data that helps in reading the file and identifying its contents. In the case of images, models are used to detect objects and identify what was captured. Bounding boxes and their tags, often used in training machine learning models, are perfect examples of second-order metadata in images.\n- **Third Order Metadata**: Data pulled from making inferences across a bunch of related data and linked databases. This data provides a framework for contextualization that creates edges, like in a knowledge graph, that connect something to something else. This can be thought of as the spider web that grows bigger as more edges are created, as more inferences are pulled.",
  "textType": "MARKDOWN"
}
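
If you would rather build the request in code than paste it into the API explorer, the mutation and variables above can be packaged into a standard GraphQL-over-HTTP JSON body. The following is a minimal sketch, not an official client: the helper name `build_ingest_text_request` is hypothetical, and the endpoint URL and authentication header are deliberately left out because this page does not specify them (see the API Endpoints and API Authentication pages).

```python
import json

# The IngestText mutation, copied from the example above.
INGEST_TEXT_MUTATION = """
mutation IngestText($name: String!, $text: String!, $textType: TextTypes, $uri: URL) {
  ingestText(name: $name, text: $text, textType: $textType, uri: $uri) {
    id
    name
    state
    type
    fileType
    mimeType
    uri
    text
  }
}
"""


def build_ingest_text_request(name, text, text_type="MARKDOWN", uri=None):
    """Hypothetical helper: return a GraphQL request body as a dict."""
    variables = {"name": name, "text": text, "textType": text_type}
    # $uri is optional in the mutation, so only send it when provided.
    if uri is not None:
        variables["uri"] = uri
    return {"query": INGEST_TEXT_MUTATION, "variables": variables}


# A shortened version of the Markdown sample, for illustration only.
markdown = "# Metadata in Unstructured Data\nThe metadata of unstructured data provide a starting point."

request_body = build_ingest_text_request("Unstructured Metadata", markdown)

# Serialize to JSON; this string is what you would POST to the API endpoint.
print(json.dumps(request_body, indent=2))
```

Passing the text through `json.dumps` takes care of escaping newlines and quotes in the Markdown, which is easy to get wrong when writing the variables by hand.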

Response:

{
  "type": "TEXT",
  "text": "# Metadata in Unstructured Data\nThe metadata of unstructured data provide a starting point for working with unstructured data. They can be classified into three levels:\n- **First Order Metadata**: The data in the header of a file. It is the bare minimum of metadata that one can get out of a file. For example, you can read the EXIF data of an image, but if you are unable to read the image, you will not know what was actually captured.\n- **Second Order Metadata**: The data that helps in reading the file and identifying its contents. In the case of images, models are used to detect objects and identify what was captured. Bounding boxes and their tags, often used in training machine learning models, are perfect examples of second-order metadata in images.\n- **Third Order Metadata**: Data pulled from making inferences across a bunch of related data and linked databases. This data provides a framework for contextualization that creates edges, like in a knowledge graph, that connect something to something else. This can be thought of as the spider web that grows bigger as more edges are created, as more inferences are pulled.",
  "mimeType": "text/markdown",
  "fileType": "DOCUMENT",
  "id": "ba1d5e01-6b53-4dab-b114-b4e12b2d388b",
  "name": "Unstructured Metadata",
  "state": "CREATED"
}
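
Once the response comes back, you will usually want the content `id` for later queries and a quick check that ingestion was accepted. The sketch below parses a response shaped like the one above. It assumes the fields shown on this page; the helper name `extract_ingested_content` is hypothetical, and the handling of a `data.ingestText` wrapper is an assumption based on the standard GraphQL response format rather than something this page shows.

```python
import json


def extract_ingested_content(payload):
    """Hypothetical helper: return the ingested content object."""
    # Standard GraphQL responses nest results under data.<fieldName>.
    # This page shows only the inner object, so accept either shape.
    if "data" in payload:
        return payload["data"]["ingestText"]
    return payload


# A shortened copy of the response above (text field abbreviated).
response_json = """
{
  "type": "TEXT",
  "text": "# Metadata in Unstructured Data",
  "mimeType": "text/markdown",
  "fileType": "DOCUMENT",
  "id": "ba1d5e01-6b53-4dab-b114-b4e12b2d388b",
  "name": "Unstructured Metadata",
  "state": "CREATED"
}
"""

content = extract_ingested_content(json.loads(response_json))

# Keep the id for follow-up queries, and confirm the content was created
# as Markdown text.
content_id = content["id"]
is_markdown = content["mimeType"] == "text/markdown"
print(content_id, content["state"], is_markdown)
```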

Now try this yourself in the API explorer. To copy the mutation and variables, press the copy icon that appears when you mouse over each text box.

Once you have your Markdown text ingested into Graphlit, you can explore the knowledge inside.
