Create RSS Feed

Create an RSS feed to ingest textual RSS posts.

RSS feeds are a useful source of continually updated online content. RSS can deliver textual posts, which are typically consumed in an RSS reader app, but it is also used for podcast distribution.

Creating a feed to ingest a podcast from RSS is covered separately.

In this case, we will ingest an RSS feed of machine learning papers.

The createFeed mutation enables the creation of a feed by accepting the feed name, type, and RSS-specific parameters, and it returns essential details, including the ID, name, state, and type of the newly created feed.

Depending on the specified type parameter, Graphlit requires the corresponding feed parameters; for an RSS feed, this is the rss object containing the feed URI.

Mutation:

mutation CreateFeed($feed: FeedInput!) {
  createFeed(feed: $feed) {
    id
    name
    state
    type
  }
}

Variables:

{
  "feed": {
    "type": "RSS",
    "rss": {
      "uri": "https://export.arxiv.org/api/query?search_query=LLM+AND+cat:cs.CV"
    },
    "name": "ArXiV LLM"
  }
}

Response:

{
  "type": "RSS",
  "id": "a3b22d88-6d91-4ca3-97a2-d8c6c26d61e8",
  "name": "ArXiV LLM",
  "state": "ENABLED"
}
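
To run this mutation programmatically, POST the query and variables to your Graphlit GraphQL endpoint as a standard GraphQL request; the result arrives under data.createFeed in the GraphQL response envelope. Below is a minimal Python sketch; the endpoint URL and bearer token environment variables are placeholders for your own project values, not values defined by this guide.

Example (Python):

# Minimal sketch: executing the createFeed mutation as a standard GraphQL HTTP request.
# The endpoint URL and bearer token below are assumptions (placeholders), not values
# defined by this guide.
import os
import requests

GRAPHQL_ENDPOINT = os.environ["GRAPHLIT_GRAPHQL_ENDPOINT"]  # assumption: your project's GraphQL URL
GRAPHQL_TOKEN = os.environ["GRAPHLIT_TOKEN"]                # assumption: a valid bearer token

CREATE_FEED = """
mutation CreateFeed($feed: FeedInput!) {
  createFeed(feed: $feed) {
    id
    name
    state
    type
  }
}
"""

variables = {
    "feed": {
        "type": "RSS",
        "rss": {
            "uri": "https://export.arxiv.org/api/query?search_query=LLM+AND+cat:cs.CV"
        },
        "name": "ArXiV LLM"
    }
}

response = requests.post(
    GRAPHQL_ENDPOINT,
    json={"query": CREATE_FEED, "variables": variables},
    headers={"Authorization": f"Bearer {GRAPHQL_TOKEN}"},
    timeout=30,
)
response.raise_for_status()

# Standard GraphQL responses nest the mutation result under data.createFeed.
feed = response.json()["data"]["createFeed"]
print(feed["id"], feed["state"])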

RSS Format

RSS feeds are formatted as XML. Here is an example of the raw XML returned by the feed URL used above (the arXiv API returns the closely related Atom format).

An RSS feed contains a series of posts, each of which includes metadata such as the title, summary, authors, and published date.

Graphlit parses and stores the post metadata, including any hyperlinks to PDFs or other web pages.

All textual information from the RSS post will be added to the searchable Graphlit index.

<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <link href="http://arxiv.org/api/query?search_query%3DLLM%20AND%20cat%3Acs.CV%26id_list%3D%26start%3D0%26max_results%3D10" rel="self" type="application/atom+xml"/>
  <title type="html">ArXiv Query: search_query=LLM AND cat:cs.CV&amp;id_list=&amp;start=0&amp;max_results=10</title>
  <id>http://arxiv.org/api/prlIZICJV6gXJQHNWz1KUkVz50M</id>
  <updated>2023-07-05T00:00:00-04:00</updated>
  <opensearch:totalResults xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">141</opensearch:totalResults>
  <opensearch:startIndex xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">0</opensearch:startIndex>
  <opensearch:itemsPerPage xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">10</opensearch:itemsPerPage>
  <entry>
    <id>http://arxiv.org/abs/2305.15023v2</id>
    <updated>2023-06-15T07:02:41Z</updated>
    <published>2023-05-24T11:06:15Z</published>
    <title>Cheap and Quick: Efficient Vision-Language Instruction Tuning for Large
  Language Models</title>
    <summary>  Recently, growing interest has been aroused in extending the multimodal
capability of large language models (LLMs), e.g., vision-language (VL)
learning, which is regarded as the next milestone of artificial general
intelligence. However, existing solutions are prohibitively expensive, which
not only need to optimize excessive parameters, but also require another
large-scale pre-training before VL instruction tuning. In this paper, we
propose a novel and affordable solution for the effective VL adaption of LLMs,
called Mixture-of-Modality Adaptation (MMA). Instead of using large neural
networks to connect the image encoder and LLM, MMA adopts lightweight modules,
i.e., adapters, to bridge the gap between LLMs and VL tasks, which also enables
the joint optimization of the image and language models. Meanwhile, MMA is also
equipped with a routing algorithm to help LLMs achieve an automatic shift
between single- and multi-modal instructions without compromising their ability
of natural language understanding. To validate MMA, we apply it to a recent LLM
called LLaMA and term this formed large vision-language instructed model as
LaVIN. To validate MMA and LaVIN, we conduct extensive experiments under two
setups, namely multimodal science question answering and multimodal dialogue.
The experimental results not only demonstrate the competitive performance and
the superior training efficiency of LaVIN than existing multimodal LLMs, but
also confirm its great potential as a general-purpose chatbot. More
importantly, the actual expenditure of LaVIN is extremely cheap, e.g., only 1.4
training hours with 3.8M trainable parameters, greatly confirming the
effectiveness of MMA. Our project is released at
https://luogen1996.github.io/lavin.
</summary>
    <author>
      <name>Gen Luo</name>
    </author>
    <author>
      <name>Yiyi Zhou</name>
    </author>
    <author>
      <name>Tianhe Ren</name>
    </author>
    <author>
      <name>Shengxin Chen</name>
    </author>
    <author>
      <name>Xiaoshuai Sun</name>
    </author>
    <author>
      <name>Rongrong Ji</name>
    </author>
    <link href="http://arxiv.org/abs/2305.15023v2" rel="alternate" type="text/html"/>
    <link title="pdf" href="http://arxiv.org/pdf/2305.15023v2" rel="related" type="application/pdf"/>
    <arxiv:primary_category xmlns:arxiv="http://arxiv.org/schemas/atom" term="cs.CV" scheme="http://arxiv.org/schemas/atom"/>
    <category term="cs.CV" scheme="http://arxiv.org/schemas/atom"/>
  </entry>
  <entry>
    <id>http://arxiv.org/abs/2305.13655v1</id>
    <updated>2023-05-23T03:59:06Z</updated>
    <published>2023-05-23T03:59:06Z</published>
    <title>LLM-grounded Diffusion: Enhancing Prompt Understanding of Text-to-Image
  Diffusion Models with Large Language Models</title>
    <summary>  Recent advancements in text-to-image generation with diffusion models have
yielded remarkable results synthesizing highly realistic and diverse images.
However, these models still encounter difficulties when generating images from
prompts that demand spatial or common sense reasoning. We propose to equip
diffusion models with enhanced reasoning capabilities by using off-the-shelf
pretrained large language models (LLMs) in a novel two-stage generation
process. First, we adapt an LLM to be a text-guided layout generator through
in-context learning. When provided with an image prompt, an LLM outputs a scene
layout in the form of bounding boxes along with corresponding individual
descriptions. Second, we steer a diffusion model with a novel controller to
generate images conditioned on the layout. Both stages utilize frozen
pretrained models without any LLM or diffusion model parameter optimization. We
validate the superiority of our design by demonstrating its ability to
outperform the base diffusion model in accurately generating images according
to prompts that necessitate both language and spatial reasoning. Additionally,
our method naturally allows dialog-based scene specification and is able to
handle prompts in a language that is not well-supported by the underlying
diffusion model.
</summary>
    <author>
      <name>Long Lian</name>
    </author>
    <author>
      <name>Boyi Li</name>
    </author>
    <author>
      <name>Adam Yala</name>
    </author>
    <author>
      <name>Trevor Darrell</name>
    </author>
    <arxiv:comment xmlns:arxiv="http://arxiv.org/schemas/atom">Work in progress</arxiv:comment>
    <link href="http://arxiv.org/abs/2305.13655v1" rel="alternate" type="text/html"/>
    <link title="pdf" href="http://arxiv.org/pdf/2305.13655v1" rel="related" type="application/pdf"/>
    <arxiv:primary_category xmlns:arxiv="http://arxiv.org/schemas/atom" term="cs.CV" scheme="http://arxiv.org/schemas/atom"/>
    <category term="cs.CV" scheme="http://arxiv.org/schemas/atom"/>
  </entry>
</feed>
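
Graphlit performs this parsing automatically on ingestion. The following Python sketch is for illustration only, showing how the per-post metadata visible above (title, summary, authors, published date, and the linked PDF) can be read from the raw feed XML.

Example (Python):

# Illustrative sketch only: Graphlit extracts these fields automatically when it
# ingests the feed. This just shows the per-post metadata present in the raw
# Atom/RSS XML above.
import urllib.request
import xml.etree.ElementTree as ET

FEED_URI = "https://export.arxiv.org/api/query?search_query=LLM+AND+cat:cs.CV"
NS = {"atom": "http://www.w3.org/2005/Atom"}

with urllib.request.urlopen(FEED_URI) as response:
    root = ET.fromstring(response.read())

for entry in root.findall("atom:entry", NS):
    title = entry.findtext("atom:title", default="", namespaces=NS).strip()
    summary = entry.findtext("atom:summary", default="", namespaces=NS).strip()
    published = entry.findtext("atom:published", default="", namespaces=NS)
    authors = [a.findtext("atom:name", default="", namespaces=NS)
               for a in entry.findall("atom:author", NS)]
    # Hyperlinks to PDFs appear as <link> elements with type="application/pdf".
    pdf_links = [link.get("href") for link in entry.findall("atom:link", NS)
                 if link.get("type") == "application/pdf"]
    print(title, published, authors, pdf_links)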
