# Ingest URI (Basic)

## Content: Ingest URI (Basic)

### User Intent

"I want to ingest a document, web page, or file from a URL into Graphlit"

### Operation

* **SDK Method**: `graphlit.ingestUri()`
* **GraphQL**: `ingestUri` mutation
* **Entity Type**: Content
* **Common Use Cases**: PDF ingestion, web page extraction, audio/video transcription, image processing

### TypeScript (Canonical)

```typescript
import { Graphlit } from 'graphlit-client';
import { ContentState, ContentTypes, FileTypes } from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Basic ingestion (asynchronous - returns immediately)
const response = await graphlit.ingestUri(
  'https://example.com/document.pdf'
);

const contentId = response.ingestUri.id;
console.log(`Content ingestion started: ${contentId}`);

// Synchronous ingestion (waits for completion)
const syncResponse = await graphlit.ingestUri(
  'https://example.com/document.pdf',
  undefined, // name (optional)
  undefined, // id (optional)
  undefined, // identifier (optional)
  true       // isSynchronous
);

const completedContentId = syncResponse.ingestUri.id;
console.log(`Content ingested and processed: ${completedContentId}`);

// Retrieve the ingested content
const content = await graphlit.getContent(completedContentId);
console.log(`Content name: ${content.content.name}`);
console.log(`Content type: ${content.content.type}`);
```

**Python**:
```python
from graphlit import Graphlit

graphlit = Graphlit()

# Synchronous ingestion (snake_case method names)
response = await graphlit.client.ingest_uri(
    uri="https://example.com/document.pdf",
    is_synchronous=True
)

content_id = response.ingest_uri.id if response.ingest_uri else None
```

**C#**:
```csharp
using Graphlit;

var client = new Graphlit();

// Synchronous ingestion (PascalCase method names)
var response = await client.IngestUri(
    uri: "https://example.com/document.pdf",
    isSynchronous: true
);

var contentId = response.IngestUri?.Id;
```

### Parameters

#### Required

* **`uri`** (string): URL of the content to ingest
  * Supports: HTTP/HTTPS URLs
  * File types: PDF, DOCX, images, audio, video, web pages, etc.

#### Optional

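* **`name`** (string): Display name for the content (auto-generated if omitted)
* **`workflow`** (EntityReferenceInput): Workflow ID for custom extraction/preparation
* **`collections`** (EntityReferenceInput\[]): Collections to assign content to
* **`isSynchronous`** (boolean): Wait for ingestion to complete before returning (default: false)
* **`correlationId`** (string): For tracking ingestion in production systems

In the TypeScript SDK these are positional arguments. The examples on this page pass them in the order `uri`, `name`, `id`, `identifier`, `isSynchronous`, `workflow`, `collections`.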
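
Because several optional arguments are positional, a call that sets only a late argument needs a run of `undefined` placeholders. The sketch below shows one way to wrap the call in an options object. `IngestOptions`, `IngestClient`, and `ingestWithOptions` are hypothetical names, not part of the Graphlit SDK, and the argument order is an assumption taken from the Production Example on this page.

```typescript
// Minimal reference shape, mirroring EntityReferenceInput as used on this page.
interface EntityRef {
  id: string;
}

// Hypothetical options object; not part of the Graphlit SDK.
interface IngestOptions {
  name?: string;
  id?: string;
  identifier?: string;
  isSynchronous?: boolean;
  workflowId?: string;
  collectionIds?: string[];
}

// Only the slice of the client this helper relies on. The argument order is
// an assumption taken from the Production Example on this page.
interface IngestClient<R> {
  ingestUri(
    uri: string,
    name?: string,
    id?: string,
    identifier?: string,
    isSynchronous?: boolean,
    workflow?: EntityRef,
    collections?: EntityRef[]
  ): Promise<R>;
}

// Maps named options onto the positional call.
function ingestWithOptions<R>(
  client: IngestClient<R>,
  uri: string,
  options: IngestOptions = {}
): Promise<R> {
  return client.ingestUri(
    uri,
    options.name,
    options.id,
    options.identifier,
    options.isSynchronous ?? false, // documented default is false
    options.workflowId ? { id: options.workflowId } : undefined,
    options.collectionIds?.map((id) => ({ id }))
  );
}
```

With a wrapper like this, a call site might read `ingestWithOptions(graphlit, uri, { isSynchronous: true, collectionIds: [collectionId] })`, which keeps the intent visible without counting placeholders.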

### Response

```typescript
{
  ingestUri: {
    id: string;              // Content ID
    name: string;            // Extracted filename
    state: ContentState;     // AWAITING_EXTRACTION, FINISHED, ERROR
    type: ContentTypes;      // FILE, PAGE, EMAIL, etc.
    fileType: FileTypes;     // PDF, DOCX, IMAGE, AUDIO, VIDEO
    mimeType: string;        // MIME type of the content
    uri: string;             // Original URI
    markdown?: string;       // Extracted text (if available)
  }
}
```
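
The Python and C# examples above guard against a missing `ingestUri` payload before reading the ID. A comparable guard in TypeScript could look like the following sketch. `IngestUriResponse` is a simplified stand-in for the generated response type and `requireContentId` is a hypothetical helper, so treat both as assumptions.

```typescript
// Simplified stand-in for the generated mutation response type (assumption).
interface IngestUriResponse {
  ingestUri?: { id: string } | null;
}

// Returns the content ID, or throws if the mutation returned no content.
function requireContentId(response: IngestUriResponse): string {
  const id = response.ingestUri?.id;
  if (!id) {
    throw new Error("ingestUri returned no content");
  }
  return id;
}
```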

### Variations

#### 1. Asynchronous Ingestion with Polling (Production Pattern)

For high-volume ingestion, use asynchronous mode and poll for completion:

```typescript
// Start ingestion (returns immediately)
const response = await graphlit.ingestUri(
  'https://example.com/large-video.mp4',
  undefined,  // name (optional)
  undefined,  // id (optional)
  undefined,  // identifier (optional)
  false       // isSynchronous - async mode
);

const contentId = response.ingestUri.id;

// Poll for completion using isContentDone
let isDone = false;
while (!isDone) {
  const status = await graphlit.isContentDone(contentId);
  isDone = status.isContentDone?.result ?? false;
  
  if (!isDone) {
    await new Promise(resolve => setTimeout(resolve, 5000)); // Wait 5 seconds
    console.log('Still processing...');
  }
}

console.log('Content processing complete!');

// Now fetch the fully processed content
const content = await graphlit.getContent(contentId);
console.log(`Processed: ${content.content.name}`);
```
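
The loop above polls forever if processing never finishes. One possible refinement is to bound the wait with a deadline, as in this sketch. `pollUntilDone` and `PollOptions` are hypothetical names, and the completion check is passed in as a function so the helper makes no assumptions about the SDK itself.

```typescript
interface PollOptions {
  intervalMs?: number; // delay between checks
  timeoutMs?: number;  // overall deadline
}

// `check` is any function that resolves to true once processing has finished,
// for example one that wraps graphlit.isContentDone(contentId).
// Resolves to true when `check` succeeds, or false if the deadline passes first.
async function pollUntilDone(
  check: () => Promise<boolean>,
  { intervalMs = 5000, timeoutMs = 10 * 60 * 1000 }: PollOptions = {}
): Promise<boolean> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    if (await check()) return true;
    // Stop if the next wait would run past the deadline.
    if (Date.now() + intervalMs > deadline) return false;
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

A caller would treat a `false` result as "still processing or stuck" and decide whether to retry, alert, or inspect the content state.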

#### 2. Ingestion with Collections

Organize content during ingestion:

```typescript
// Create or reference a collection
const collectionResponse = await graphlit.createCollection({
  name: 'Product Documentation'
});

// Ingest into collection
const response = await graphlit.ingestUri(
  'https://example.com/user-guide.pdf',
  undefined, // name
  undefined, // id
  undefined, // identifier
  true,      // isSynchronous
  undefined, // workflow
  [{ id: collectionResponse.createCollection.id }] // collections
);
```

#### 3. Ingestion with Custom Workflow

Apply extraction or preparation during ingestion:

```typescript
// Reference a workflow (e.g., for entity extraction)
const response = await graphlit.ingestUri(
  'https://example.com/contract.pdf',
  undefined, // name
  undefined, // id
  undefined, // identifier
  true,      // isSynchronous
  { id: 'workflow-id-here' } // workflow
);

// Content will be processed through the workflow
const content = await graphlit.getContent(response.ingestUri.id);
console.log(`Entities extracted: ${content.content.observations?.length || 0}`);
```

### Common Issues

**Issue**: `Error: Failed to download content from URI`\
**Solution**: Ensure the URL is publicly accessible or provide authentication via workflow configuration.

**Issue**: `Content state is ERROR`\
**Solution**: Check `content.error` for details. Common causes:

* Unsupported file format
* File too large (check project limits)
* Corrupt file
* Network timeout

**Issue**: Synchronous ingestion timing out\
**Solution**: For large files (>100MB), use asynchronous mode and poll for completion instead.
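
The size guidance above can be applied automatically when the server reports a `Content-Length`. The following is a minimal sketch, assuming the host answers `HEAD` requests; `shouldIngestSynchronously` and `HeadFetcher` are hypothetical names, and the fetch function is injected so the logic can be exercised without network access.

```typescript
// Threshold taken from the ">100MB" guidance above.
const SYNC_SIZE_LIMIT_BYTES = 100 * 1024 * 1024;

// Minimal fetch-like signature (assumption); the global fetch satisfies it.
type HeadFetcher = (
  url: string,
  init: { method: "HEAD" }
) => Promise<{ ok: boolean; headers: { get(name: string): string | null } }>;

// Returns true only when the size is known and within the limit.
// Unknown sizes and failed requests fall back to asynchronous mode.
async function shouldIngestSynchronously(
  uri: string,
  fetcher: HeadFetcher
): Promise<boolean> {
  try {
    const res = await fetcher(uri, { method: "HEAD" });
    if (!res.ok) return false;
    const length = Number(res.headers.get("content-length"));
    return Number.isFinite(length) && length > 0 && length <= SYNC_SIZE_LIMIT_BYTES;
  } catch {
    return false;
  }
}
```

In an environment with a global `fetch`, the result of `await shouldIngestSynchronously(uri, fetch)` could be passed as the `isSynchronous` argument. Some servers omit `Content-Length` or reject `HEAD`, in which case this sketch chooses asynchronous mode.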

### Production Example

**Server-side ingestion with all options**:

```typescript
const response = await graphlit.ingestUri(
  uri,
  name,
  undefined,  // id
  undefined,  // identifier
  isSynchronous,
  workflow ? { id: workflow } : undefined,
  collections?.map(id => ({ id }))
);
```

**Conditional workflow application**:

```typescript
// Apply different workflows based on file type
// Assumes you have created workflows beforehand:
// const docWorkflow = await graphlit.createWorkflow({ name: "Document Processing", extraction: {...} });
// const documentWorkflowId = docWorkflow.createWorkflow.id;

const isDocument = uri.endsWith('.pdf') || uri.endsWith('.docx');
const workflowId = isDocument ? documentWorkflowId : undefined;

const response = await graphlit.ingestUri(
  uri,
  undefined,  // name (auto-generated)
  undefined,  // id
  undefined,  // identifier  
  true,       // isSynchronous
  workflowId ? { id: workflowId } : undefined
);
```
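
The `uri.endsWith(...)` check above misses URLs that carry a query string, a fragment, or an upper-case extension. A slightly more robust variant is sketched below. `isDocumentUri` and `DOCUMENT_EXTENSIONS` are hypothetical names introduced for illustration, and the check still relies on the URL path rather than the actual content type.

```typescript
// Extensions treated as documents, matching the example above.
const DOCUMENT_EXTENSIONS = new Set([".pdf", ".docx"]);

// Inspects only the path component, so query strings and fragments are ignored.
function isDocumentUri(uri: string): boolean {
  let pathname: string;
  try {
    pathname = new URL(uri).pathname;
  } catch {
    return false; // not an absolute URL
  }
  const dot = pathname.lastIndexOf(".");
  if (dot === -1) return false;
  return DOCUMENT_EXTENSIONS.has(pathname.slice(dot).toLowerCase());
}
```

The result can replace the `isDocument` expression in the example above.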

