Search For Similar

In addition to the metadata that Graphlit indexes into the knowledge graph, all text extracted from your content is automatically added to a searchable index.

In the contents query, use the search field of the filter to provide the text you want to search for.

Graphlit uses OpenAI™️ embedding models to create vector embeddings of the text extracted from your content. These embeddings are stored in a built-in vector database for knowledge retrieval.

Graphlit also creates a vector embedding of your search text and matches it against the stored embeddings to find the most similar content.
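
To make the matching step concrete, here is a minimal Python sketch of ranking stored embeddings against a query embedding. It is an illustration only: this page does not say which similarity metric Graphlit uses, so cosine similarity is an assumption, and the three-dimensional vectors and file names are made-up toy values (real embedding models produce vectors with far more dimensions).

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction, so treat it as dissimilar
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vec, stored, limit=100):
    """Return up to `limit` (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in stored.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:limit]

# Hypothetical toy embeddings standing in for the vector database.
stored_embeddings = {
    "podcast.mp3": [0.9, 0.1, 0.0],
    "invoice.pdf": [0.0, 0.2, 0.9],
    "notes.txt": [0.6, 0.6, 0.1],
}

# Hypothetical embedding of the search text.
query_embedding = [1.0, 0.0, 0.0]

ranked = rank_by_similarity(query_embedding, stored_embeddings, limit=2)
for name, score in ranked:
    print(f"{name}: {score:.3f}")
```

The content whose vector points in nearly the same direction as the query vector ranks first, which is the sense in which a search returns the "most similar" results.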

Vector-based Similarity Search

Here is an example of finding content similar to the search text "Unstructured Data":

Query:

query QueryContents($filter: ContentFilter!) {
  contents(filter: $filter) {
    results {
      id
      name
      creationDate
      state
      owner {
        id
      }
      originalDate
      finishedDate
      workflowDuration
      uri
      text
      type
      fileType
      mimeType
      fileName
      fileSize
      masterUri
      mezzanineUri
      transcriptUri
    }
  }
}

Variables:

{
  "filter": {
    "search": "Unstructured Data",
    "offset": 0,
    "limit": 100
  }
}

Response:

{
  "results": [
    {
      "type": "FILE",
      "mimeType": "audio/mpeg",
      "fileType": "AUDIO",
      "fileName": "Unstructured Data is Dark Data Podcast.mp3",
      "fileSize": 33008244,
      "masterUri": "https://graphlit20230701d31d9453.blob.core.windows.net/files/c0cc103d-467b-43c1-8256-8b99f346d4f3/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
      "mezzanineUri": "https://graphlit20230701d31d9453.blob.core.windows.net/files/c0cc103d-467b-43c1-8256-8b99f346d4f3/Mezzanine/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
      "transcriptUri": "https://graphlit20230701d31d9453.blob.core.windows.net/files/c0cc103d-467b-43c1-8256-8b99f346d4f3/Transcript/Unstructured%20Data%20is%20Dark%20Data%20Podcast.json",
      "uri": "https://graphlitplatform.blob.core.windows.net/samples/Unstructured%20Data%20is%20Dark%20Data%20Podcast.mp3",
      "id": "c0cc103d-467b-43c1-8256-8b99f346d4f3",
      "name": "Unstructured Data is Dark Data Podcast.mp3",
      "state": "FINISHED",
      "creationDate": "2023-07-03T22:24:50Z",
      "finishedDate": "2023-07-03T22:25:46Z",
      "workflowDuration": "PT56.2314332S",
      "owner": {
        "id": "9422b73d-f8d6-4faf-b7a9-152250c862a4"
      }
    }
  ]
}
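
The query, variables, and response above can be tied together in client code. The following Python sketch builds the JSON body that a GraphQL client would POST and reads the results back out of a response. It rests on several assumptions: the selection set is trimmed to a few of the fields shown above, the helper names (`build_request_body`, `extract_results`) are hypothetical, and the HTTP call itself is left out because this page does not give the endpoint or authentication details. GraphQL servers normally wrap the payload under a top-level "data" key, so the reader function accepts both that shape and the bare shape shown in the sample.

```python
import json

# Trimmed version of the query above; GraphQL lets a client request any subset of fields.
QUERY = """
query QueryContents($filter: ContentFilter!) {
  contents(filter: $filter) {
    results {
      id
      name
      state
      fileType
    }
  }
}
"""

def build_request_body(search, offset=0, limit=100):
    """Serialize the query and its filter variables as a GraphQL-over-HTTP JSON body."""
    variables = {"filter": {"search": search, "offset": offset, "limit": limit}}
    return json.dumps({"query": QUERY, "variables": variables})

def extract_results(response):
    """Return the list of results from either a wrapped or a bare response object."""
    payload = response.get("data", response)    # unwrap {"data": ...} if present
    payload = payload.get("contents", payload)  # unwrap {"contents": ...} if present
    return payload.get("results", [])

# Sample response copied (with fewer fields) from the example above.
sample_response = {
    "results": [
        {
            "id": "c0cc103d-467b-43c1-8256-8b99f346d4f3",
            "name": "Unstructured Data is Dark Data Podcast.mp3",
            "state": "FINISHED",
            "fileType": "AUDIO",
        }
    ]
}

body = build_request_body("Unstructured Data")
for item in extract_results(sample_response):
    print(item["name"], item["state"])
```

To page through a larger result set, the same helper can be called with a higher offset, for example `build_request_body("Unstructured Data", offset=100)`.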