Summarize Contents

Summarize multiple content items in parallel.

In addition to summarizing contents as part of your preparation workflow stage, you can summarize contents directly with the summarizeContents mutation.

Create Summarization Specification

First, you can optionally create a specification to use with summarization. You will often want a higher temperature for summarization, so that the LLM is more creative in its written response.

If no specification is assigned for summarization, Graphlit will use the Azure OpenAI GPT-3.5 Turbo 16k model by default.

Mutation:

mutation CreateSpecification($specification: SpecificationInput!) {
  createSpecification(specification: $specification) {
    id
    name
    state
    type
    serviceType
  }
}

Variables:

{
  "specification": {
    "type": "COMPLETION",
    "serviceType": "AZURE_OPEN_AI",
    "azureOpenAI": {
      "model": "GPT35_TURBO_16K",
      "temperature": 0.8,
      "probability": 0.2
    },
    "name": "GPT-3.5 Turbo Summarization"
  }
}

Response:

{
  "type": "COMPLETION",
  "serviceType": "AZURE_OPEN_AI",
  "id": "835a37ec-aba5-4715-9c87-c75fe72caa35",
  "name": "GPT-3.5 Turbo Summarization",
  "state": "ENABLED"
}

Summarize Contents

Summarizing contents is similar to querying contents, in that it takes a content filter parameter.

Graphlit queries the contents based on your filter, and then summarizes each content item separately, using the one or more summarizations you specify.

With slower LLMs such as GPT-4 Turbo 128k, you may get API timeouts when summarizing contents, especially with multiple summarizations per content or a large number of contents. If this happens, you can filter the contents to return fewer results, limit the number of summarizations executed per content, or try a different LLM.
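One way to limit the summarizations executed per request is to send each summarization strategy in its own summarizeContents call. The Python sketch below only reshapes a variables object like the one used in this guide; the helper name split_summarizations is hypothetical and the code does not call the API.

```python
from copy import deepcopy
from typing import Any


def split_summarizations(variables: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one variables object per summarization strategy.

    Each result keeps the original filter but carries a single entry in
    "summarizations", so every request asks the LLM for less work.
    """
    requests = []
    for strategy in variables.get("summarizations", []):
        single = deepcopy(variables)  # leave the caller's object untouched
        single["summarizations"] = [deepcopy(strategy)]
        requests.append(single)
    return requests
```

Each returned object can be passed as the variables of a separate summarizeContents request, at the cost of more round trips.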

Summarization performance depends on the number of pages of text, or the length of an audio/video transcript. Graphlit performs recursive summarization on longer content: it first summarizes each page (or transcript segment), and then summarizes the intermediate summaries into the final summary.
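The general idea of recursive summarization can be sketched in a few lines of Python. This is an illustration of the technique, not Graphlit's implementation: the summarize callable stands in for an LLM call, and the batch_size parameter (how many intermediate summaries are merged at a time) is an assumption.

```python
from typing import Callable, Sequence


def summarize_recursively(
    pages: Sequence[str],
    summarize: Callable[[str], str],
    batch_size: int = 4,
) -> str:
    """Summarize each page, then repeatedly summarize groups of summaries."""
    if not pages:
        return ""
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2")
    # First pass: one intermediate summary per page or transcript segment.
    summaries = [summarize(page) for page in pages]
    # Later passes: merge groups of summaries until a single one remains.
    while len(summaries) > 1:
        summaries = [
            summarize("\n\n".join(summaries[i:i + batch_size]))
            for i in range(0, len(summaries), batch_size)
        ]
    return summaries[0]
```

Because every page triggers at least one LLM call, and each merge pass adds more, the total time grows with the length of the content, which matches the performance note above.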

In this example, we've ingested the OpenAI blog via web feed, and are filtering it with a HYBRID vector search for the text "OpenAI developer conference".

We are creating summary paragraphs and follow-up questions, and running a custom LLM prompt on each piece of content to extract named entities in JSON-LD format.

Mutation:

mutation SummarizeContents($summarizations: [SummarizationStrategyInput!]!, $filter: ContentFilter) {
  summarizeContents(summarizations: $summarizations, filter: $filter) {
    specification {
      id
    }
    content {
      id
    }
    type
    items {
      text
      tokens
      summarizationTime
    }
    error
  }
}

Variables:

{
  "summarizations": [
    {
      "type": "SUMMARY",
      "specification": {
        "id": "92866ab7-1a2a-4ae0-a8ae-464a3a628824"
      },
      "items": 5
    },
    {
      "type": "QUESTIONS",
      "specification": {
        "id": "92866ab7-1a2a-4ae0-a8ae-464a3a628824"
      },
      "items": 5
    },
    {
      "type": "CUSTOM",
      "specification": {
        "id": "92866ab7-1a2a-4ae0-a8ae-464a3a628824"
      },
      "prompt": "Extract any named entities into JSON-LD format."
    }
  ],
  "filter": {
    "search": "OpenAI developer conference",
    "queryType": "SIMPLE",
    "searchType": "HYBRID"
  }
}
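
The mutation and variables above are typically sent together as one JSON body in a GraphQL-over-HTTP POST request. A minimal Python sketch of assembling that body follows. The function name build_request_body is hypothetical, and the endpoint URL and authentication headers are left out because this page does not specify them.

```python
import json

SUMMARIZE_CONTENTS = """
mutation SummarizeContents($summarizations: [SummarizationStrategyInput!]!, $filter: ContentFilter) {
  summarizeContents(summarizations: $summarizations, filter: $filter) {
    specification { id }
    content { id }
    type
    items { text tokens summarizationTime }
    error
  }
}
"""


def build_request_body(specification_id: str, search: str) -> str:
    """Build the JSON body for the SummarizeContents mutation in this guide."""
    spec = {"id": specification_id}
    variables = {
        "summarizations": [
            {"type": "SUMMARY", "specification": spec, "items": 5},
            {"type": "QUESTIONS", "specification": spec, "items": 5},
            {
                "type": "CUSTOM",
                "specification": spec,
                "prompt": "Extract any named entities into JSON-LD format.",
            },
        ],
        "filter": {"search": search, "queryType": "SIMPLE", "searchType": "HYBRID"},
    }
    # A GraphQL-over-HTTP body carries the operation text and its variables.
    return json.dumps({"query": SUMMARIZE_CONTENTS, "variables": variables})
```

The returned string can be posted with any HTTP client, using the endpoint and credentials for your own project.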

Response:

[
  {
    "content": {
      "id": "599394d0-e552-4e91-ba04-c3816536048f"
    },
    "type": "CUSTOM",
    "items": [
      {
        "text": "{\n  \"namedEntities\": [\n    {\n      \"organization\": \"OpenAI\",\n      \"type\": \"Company\"\n    },\n    {\n      \"person\": \"Justin Jay Wang\",\n      \"type\": \"Person\"\n    },\n    {\n      \"person\": \"Jie Tang\",\n      \"type\": \"Person\"\n    },\n    {\n      \"person\": \"Jack Clark\",\n      \"type\": \"Person\"\n    },\n    {\n      \"person\": \"Greg Brockman\",\n      \"type\": \"Person\"\n    },\n    {\n      \"person\": \"Aleks Kamko\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Berkeley\",\n      \"type\": \"University\"\n    },\n    {\n      \"organization\": \"Stripe\",\n      \"type\": \"Company\"\n    },\n    {\n      \"person\": \"Alex Ray\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Planet Labs\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"Airware\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"NCSU\",\n      \"type\": \"University\"\n    },\n    {\n      \"person\": \"Ankur Handa\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Imperial College London\",\n      \"type\": \"University\"\n    },\n    {\n      \"organization\": \"University of Cambridge\",\n      \"type\": \"University\"\n    },\n    {\n      \"person\": \"Bob McGrew\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Palantir\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"Stanford Artificial Intelligence Lab\",\n      \"type\": \"Organization\"\n    },\n    {\n      \"person\": \"Christopher Berner\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Facebook\",\n      \"type\": \"Company\"\n    },\n    {\n      \"person\": \"Erika Reinhardt\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"MIT\",\n      \"type\": \"University\"\n    },\n    {\n      \"person\": \"Jakub Pachocki\",\n      \"type\": \"Person\"\n    },\n    {\n     
 \"organization\": \"CMU\",\n      \"type\": \"University\"\n    },\n    {\n      \"person\": \"Jeremy Schlatter\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Mailgun\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"Google\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"AeroFS\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"Magic\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"University of Toronto\",\n      \"type\": \"University\"\n    },\n    {\n      \"organization\": \"Sunnybrook Research Institutes\",\n      \"type\": \"Organization\"\n    },\n    {\n      \"person\": \"Peter Welinder\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Dropbox\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"Anchovi Labs\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"Caltech\",\n      \"type\": \"University\"\n    },\n    {\n      \"person\": \"Rachel Fong\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Locu\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"University of Maryland\",\n      \"type\": \"University\"\n    },\n    {\n      \"organization\": \"Vicarious\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"MetaMind\",\n      \"type\": \"Company\"\n    },\n    {\n      \"person\": \"Tom Brown\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Grouper\",\n      \"type\": \"Company\"\n    },\n    {\n      \"organization\": \"MoPub\",\n      \"type\": \"Company\"\n    },\n    {\n      \"person\": \"Yaroslav Bulatov\",\n      \"type\": \"Person\"\n    },\n    {\n      \"organization\": \"Google Brain\",\n      \"type\": \"Organization\"\n    }\n  ]\n}",
        "tokens": 901,
        "summarizationTime": "PT9.1896719S"
      }
    ]
  },
  {
    "content": {
      "id": "599394d0-e552-4e91-ba04-c3816536048f"
    },
    "type": "SUMMARY",
    "items": [
      {
        "text": "The OpenAI team has grown to 45 people, working on pushing the frontier of AI capabilities through validating novel ideas, creating new software systems, and deploying machine learning on robots.\n\nThe team continues to look for creative, motivated researchers and engineers to help achieve their goals.\n\nNew team members include recent graduates and experienced professionals from diverse backgrounds, such as research, engineering, and product management at companies like Facebook, Dropbox, and Google.\n\nThe team also includes individuals with expertise in areas such as real-time camera tracking, scene understanding, optimization theory, data engineering, and model-based reinforcement learning.\n\nFormer OpenAI interns who have joined the team full-time include Catherine Olsson, Jonathan Ho, Paul Christiano, Peter Chen, Prafulla Dhariwal, Rein Houthooft, and Rocky Duan.",
        "tokens": 161,
        "summarizationTime": "PT2.127506S"
      }
    ]
  },
  {
    "content": {
      "id": "599394d0-e552-4e91-ba04-c3816536048f"
    },
    "type": "QUESTIONS",
    "items": [
      {
        "text": "What is the current size of the OpenAI team?",
        "tokens": 11,
        "summarizationTime": "PT1.2416916S"
      },
      {
        "text": "Can you provide some examples of the new team members and their backgrounds?",
        "tokens": 14,
        "summarizationTime": "PT1.2416916S"
      },
      {
        "text": "What are some of the areas of expertise that the new team members bring to OpenAI?",
        "tokens": 18,
        "summarizationTime": "PT1.2416916S"
      },
      {
        "text": "What are some of the recent research topics that the OpenAI team has been working on?",
        "tokens": 18,
        "summarizationTime": "PT1.2416916S"
      }
    ]
  },
  {
    "content": {
      "id": "c6c0daf9-966d-4470-9d84-f3616d6ccefe"
    },
    "type": "CUSTOM",
    "items": [
      {
        "text": "{\n  \"entities\": [\n    \"OpenAI\",\n    \"Scott Aaronson\",\n    \"Joshua Achiam\",\n    \"Steven Adler\",\n    \"Sandhini Agarwal\",\n    \"Lama Ahmad\",\n    \"Sam Altman\",\n    \"Dario Amodei\",\n    \"Parnian Barekatain\",\n    \"Mohammad Bavarian\",\n    \"Gabriel Bernadett-Shapiro\",\n    \"Greg Brockman\",\n    \"Jack Clark\",\n    \"Arka Dhar\",\n    \"Atty Eleti\",\n    \"Tyna Eloundou\",\n    \"Elie Georges\",\n    \"Vik Goel\",\n    \"Ian Goodfellow\",\n    \"Ryan Greene\",\n    \"Maddie Hall\",\n    \"Jeff Harris\",\n    \"Steven Heidel\",\n    \"Joanne Jang\",\n    \"Angela Jiang\",\n    \"Heewoo Jun\",\n    \"Andrej Karpathy\",\n    \"Logan Kilpatrick\",\n    \"Jan Hendrik Kirchner\",\n    \"Teddy Lee\",\n    \"Jan Leike\",\n    \"Jade Leung\",\n    \"Rachel Lim\",\n    \"Sam Manning\",\n    \"Todor Markov\",\n    \"Luke Miller\",\n    \"Pamela Mishkin\",\n    \"Igor Mordatch\",\n    \"Mira Murati\",\n    \"Elon Musk\",\n    \"Arvind Neelakantan\",\n    \"Harold Nguyen\",\n    \"Joel Parish\",\n    \"Andrew Peng\",\n    \"Ashley Pilipiszyn\",\n    \"Michelle Pokrass\",\n    \"Henrique Pond\",\n    \"Boris Power\",\n    \"Bob Rotsted\",\n    \"Ted Sanders\",\n    \"Shibani Santurkar\",\n    \"Girish Sastry\",\n    \"Larissa Schiavo\",\n    \"John Schulman\",\n    \"Ilya Sutskever\",\n    \"Jie Tang\",\n    \"Andrea Vallone\",\n    \"Peter Welinder\",\n    \"Lilian Weng\",\n    \"Michael Wu\",\n    \"Jeffrey Wu\",\n    \"Wojciech Zaremba\",\n    \"Chong Zhang\",\n    \"Democratic inputs to AI grant program\",\n    \"OpenAI is approaching 2024 worldwide elections\",\n    \"ChatGPT Team\",\n    \"GPT Store\",\n    \"OpenAI and journalism\",\n    \"Superalignment Fast Grants\",\n    \"Partnership with Axel Springer\",\n    \"Sam Altman returns as CEO\",\n    \"OpenAI announces leadership transition\",\n    \"OpenAI Data Partnerships\",\n    \"GPTs\",\n    \"Frontier risk and preparedness\",\n    \"Frontier Model Forum updates\",\n    
\"DALLE 3\",\n    \"ChatGPT Plus\",\n    \"OpenAI Red Teaming Network\",\n    \"OpenAI Dublin\",\n    \"OpenAIs first developer conference\",\n    \"Teaching with AI\"\n  ]\n}",
        "tokens": 590,
        "summarizationTime": "PT5.0455123S"
      }
    ]
  },
  {
    "content": {
      "id": "c6c0daf9-966d-4470-9d84-f3616d6ccefe"
    },
    "type": "SUMMARY",
    "items": [
      {
        "text": "OpenAI has made several announcements regarding its initiatives and partnerships. These include the introduction of the GPT Store, a collaboration with Axel Springer to enhance the use of AI in journalism, and the launch of the ChatGPT Team. Additionally, OpenAI has revealed its plans for the 2024 worldwide elections and its data partnerships.\n\nLeadership changes at OpenAI have also been announced, with Sam Altman returning as CEO and the appointment of a new initial board. The organization has also introduced new models and developer products, such as GPTs and DALLE 3, as well as hosting its first developer conference in San Francisco.\n\nIn the realm of responsible AI, OpenAI has implemented the Superalignment Fast Grants and established the Red Teaming Network. Furthermore, the organization has emphasized the importance of democratic inputs in its AI grant program and highlighted the topic of frontier risk and preparedness.\n\nNotable individuals involved in these initiatives and announcements include Sam Altman, Axel Springer, and various OpenAI team members such as Scott Aaronson, Steven Adler, and Elie Georges. The organization continues to be at the forefront of AI research and development, with a focus on ethical and beneficial use of AI technology.",
        "tokens": 242,
        "summarizationTime": "PT3.2364665S"
      }
    ]
  },
  {
    "content": {
      "id": "c6c0daf9-966d-4470-9d84-f3616d6ccefe"
    },
    "type": "QUESTIONS",
    "items": [
      {
        "text": "What is the latest announcement from OpenAI?",
        "tokens": 9,
        "summarizationTime": "PT1.5598939S"
      },
      {
        "text": "When did Sam Altman return as CEO of OpenAI?",
        "tokens": 12,
        "summarizationTime": "PT1.5598939S"
      },
      {
        "text": "What new models and developer products were announced at DevDay?",
        "tokens": 12,
        "summarizationTime": "PT1.5598939S"
      },
      {
        "text": "What is the purpose of the OpenAI Red Teaming Network?",
        "tokens": 13,
        "summarizationTime": "PT1.5598939S"
      },
      {
        "text": "When and where is OpenAI's first developer conference scheduled to take place?",
        "tokens": 15,
        "summarizationTime": "PT1.5598939S"
      }
    ]
  },
  {
    "content": {
      "id": "e2258deb-ec61-416b-936f-8caf5cfee9c0"
    },
    "type": "SUMMARY",
    "items": [
      {
        "text": "OpenAI is hosting its first developer conference, OpenAI DevDay, on November 6, 2023 in San Francisco. The event will bring together hundreds of developers to preview new tools and exchange ideas. In-person attendees will have the opportunity to join breakout sessions led by OpenAI's technical staff.\n\nSince launching its API in 2020, OpenAI has continuously updated it to include advanced models like GPT-4, GPT-3.5, DALLE, and Whisper. Over 2 million developers are currently using these models for various applications, from integrating smart assistants into existing applications to building entirely new services.\n\nSam Altman, CEO of OpenAI, expressed excitement about showing the latest work to enable developers to build new things. Those interested in attending the conference in person can sign up to receive a notification when registration opens.\n\nFor press inquiries about attending in person, they can reach out to devdaypress@openai.com. Additionally, OpenAI has released various research and system cards related to AI, including Weak-to-strong generalization, Practices for Governing Agentic AI Systems, DALLE 3 system card, and GPT-4V(ision) system card.",
        "tokens": 240,
        "summarizationTime": "PT2.7975511S"
      }
    ]
  },
  {
    "content": {
      "id": "e2258deb-ec61-416b-936f-8caf5cfee9c0"
    },
    "type": "CUSTOM",
    "items": [
      {
        "text": "{\n  \"entities\": [\n    {\n      \"name\": \"OpenAI\",\n      \"type\": \"Organization\"\n    },\n    {\n      \"name\": \"OpenAI DevDay\",\n      \"type\": \"Event\"\n    },\n    {\n      \"name\": \"San Francisco\",\n      \"type\": \"Location\"\n    },\n    {\n      \"name\": \"GPT-4\",\n      \"type\": \"AI Model\"\n    },\n    {\n      \"name\": \"GPT-3.5\",\n      \"type\": \"AI Model\"\n    },\n    {\n      \"name\": \"DALLE\",\n      \"type\": \"AI Model\"\n    },\n    {\n      \"name\": \"Whisper\",\n      \"type\": \"AI Model\"\n    },\n    {\n      \"name\": \"Sam Altman\",\n      \"type\": \"Person\",\n      \"role\": \"CEO\"\n    }\n  ]\n}",
        "tokens": 181,
        "summarizationTime": "PT2.0237664S"
      }
    ]
  },
  {
    "content": {
      "id": "e2258deb-ec61-416b-936f-8caf5cfee9c0"
    },
    "type": "QUESTIONS",
    "items": [
      {
        "text": "When and where is OpenAI's first developer conference taking place?",
        "tokens": 13,
        "summarizationTime": "PT1.1146973S"
      },
      {
        "text": "What is the purpose of OpenAI DevDay?",
        "tokens": 10,
        "summarizationTime": "PT1.1146973S"
      },
      {
        "text": "How many developers are currently using OpenAI's advanced models?",
        "tokens": 12,
        "summarizationTime": "PT1.1146973S"
      },
      {
        "text": "Where can people sign up to receive a notification when registration for OpenAI DevDay opens?",
        "tokens": 18,
        "summarizationTime": "PT1.1146973S"
      }
    ]
  }
]
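
A response shaped like the one above has one entry per content item and summarization type, so client code usually regroups it. The Python sketch below assumes the array has already been decoded from JSON; the helper names are hypothetical. It also parses the summarizationTime values, which are ISO 8601 durations, and treats CUSTOM output defensively, since the LLM is not guaranteed to return valid JSON.

```python
import json
import re
from collections import defaultdict
from typing import Any, Optional

_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def duration_seconds(value: str) -> float:
    """Convert a time-only ISO 8601 duration such as "PT9.1896719S" to seconds."""
    match = _DURATION.match(value)
    if match is None:
        raise ValueError(f"unsupported duration: {value!r}")
    hours, minutes, seconds = match.groups()
    return int(hours or 0) * 3600 + int(minutes or 0) * 60 + float(seconds or 0)


def group_by_content(results: list[dict[str, Any]]) -> dict[str, dict[str, list[str]]]:
    """Map content id -> summarization type -> list of result texts."""
    grouped: dict[str, dict[str, list[str]]] = defaultdict(dict)
    for result in results:
        texts = [item["text"] for item in result.get("items") or []]
        grouped[result["content"]["id"]][result["type"]] = texts
    return dict(grouped)


def parse_custom_json(text: str) -> Optional[Any]:
    """Return the decoded JSON from a CUSTOM result, or None if it is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None
```

In the sample response, every QUESTIONS item for a given content shares the same summarizationTime, so summing those values per item would likely overstate the time spent.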
