Graphlit Platform
Developer PortalChangelogPlatform StatusMore InformationJoin Discord
  • Graphlit Platform
    • What is Graphlit?
    • Key Concepts
  • Getting Started
    • Sign up for Graphlit
    • Create Graphlit Project
    • For Python Developers
    • For Node.js Developers
    • For .NET Developers
  • 🚀Quickstart
    • Next.js applications
      • GitHub Code
    • Python applications
      • GitHub Code
  • Graphlit Data API
    • API Usage
      • API Endpoints
      • API Authentication
      • API Explorer
      • GraphQL 101
    • API Reference
      • Content
        • Ingest With Workflow
        • Ingest File
        • Ingest Encoded File
        • Ingest Web Page
        • Ingest Text
        • Semantic Search
          • Query All Content
          • Query Facets
          • Query By Name
          • Filter By Contents
        • Metadata Filtering
          • Filter By Observations
          • Filter By Feeds
          • Filter By Collections
          • Filter By Content Type
          • Filter By File Type
          • Filter By File Size Range
          • Filter By Date Range
        • Summarize Contents
        • Extract Contents
        • Publish Contents
      • Knowledge Graph
        • Labels
        • Categories
        • Persons
        • Organizations
        • Places
        • Events
        • Products
        • Repos
        • Software
      • Collections
      • Feeds
        • Create Feed With Workflow
        • Create RSS Feed
        • Create Podcast Feed
        • Create Web Feed
        • Create Web Search Feed
        • Create Reddit Feed
        • Create Notion Feed
        • Create YouTube Feed
        • User Storage Feeds
          • Create OneDrive Feed
          • Create Google Drive Feed
          • Create SharePoint Feed
        • Cloud Storage Feeds
          • Create Amazon S3 Feed
          • Create Azure Blob Feed
          • Create Azure File Feed
          • Create Google Blob Feed
        • Messaging Feeds
          • Create Slack Feed
          • Create Microsoft Teams Feed
          • Create Discord Feed
        • Email Feeds
          • Create Google Mail Feed
          • Create Microsoft Outlook Feed
        • Issue Feeds
          • Create Linear Feed
          • Create Jira Feed
          • Create GitHub Issues Feed
        • Configuration Options
      • Workflows
        • Ingestion
        • Indexing
        • Preparation
        • Extraction
        • Enrichment
        • Actions
      • Conversations
      • Specifications
        • Azure OpenAI
        • OpenAI
        • Anthropic
        • Mistral
        • Groq
        • Deepseek
        • Replicate
        • Configuration Options
      • Alerts
        • Create Slack Audio Alert
        • Create Slack Text Alert
      • Projects
    • API Changelog
    • Multi-tenant Applications
  • JSON Mode
    • Overview
    • Document JSON
    • Transcript JSON
  • Content Types
    • Files
      • Documents
      • Audio
      • Video
      • Images
      • Animations
      • Data
      • Emails
      • Code
      • Packages
      • Other
    • Web Pages
    • Text
    • Posts
    • Messages
    • Emails
    • Issues
  • Data Sources
    • Feeds
  • Platform
    • Developer Portal
      • Projects
    • Cloud Platform
      • Security
      • Subprocessors
  • Resources
    • Community
Powered by GitBook
On this page

Was this helpful?

  1. Graphlit Data API
  2. Quickstart
  3. Explore Content

Search For Similar

Last updated 1 year ago

Was this helpful?

In addition to the metadata that Graphlit indexes into the knowledge graph, all the text from your content is automatically added into a searchable index.

With the contents query, you can use the search field to provide text to search by.

Graphlit uses OpenAI embedding models to create vector embeddings of the text extracted from your content. The vector embeddings are added into a built-in vector database for knowledge retrieval.

Your search text also has a vector embedding created, which gets matched against the stored vector embeddings to find the most similar query results.

Vector-based Similarity Search

Here is an example of finding similar content to the search text "Unstructured Data":

Query:

query QueryContents($filter: ContentFilter!) {
  contents(filter: $filter) {
    results {
      id
      name
      creationDate
      state
      owner {
        id
      }
      originalDate
      finishedDate
      workflowDuration
      uri
      text
      type
      fileType
      mimeType
      fileName
      fileSize
      masterUri
      mezzanineUri
      transcriptUri
    }
  }
}

Variables:

{
  "filter": {
    "search": "Unstructured Data",
    "offset": 0,
    "limit": 100
  }
}

Response:

{
  "results": [
    {
      "type": "FILE",
      "mimeType": "audio/mpeg",
      "fileType": "AUDIO",
      "fileName": "Unstructured Data is Dark Data Podcast.mp3",
      "fileSize": 33008244,
      "masterUri": "https://graphlit20230701d31d9453.blob.core.windows.net/files/c0cc103d-467b-43c1-8256-8b99f346d4f3/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
      "mezzanineUri": "https://graphlit20230701d31d9453.blob.core.windows.net/files/c0cc103d-467b-43c1-8256-8b99f346d4f3/Mezzanine/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
      "transcriptUri": "https://graphlit20230701d31d9453.blob.core.windows.net/files/c0cc103d-467b-43c1-8256-8b99f346d4f3/Transcript/Unstructured%20Data%20is%20Dark%20Data%20Podcast.json",
      "uri": "https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
      "id": "c0cc103d-467b-43c1-8256-8b99f346d4f3",
      "name": "Unstructured Data is Dark Data Podcast.mp3",
      "state": "FINISHED",
      "creationDate": "2023-07-03T22:24:50Z",
      "finishedDate": "2023-07-03T22:25:46Z",
      "workflowDuration": "PT56.2314332S",
      "owner": {
        "id": "9422b73d-f8d6-4faf-b7a9-152250c862a4"
      }
    }
  ]
}
🚀
™️