# Ingest Encoded File

## User Intent

"I want to upload a file directly from memory/buffer without using a URL"

## Operation

* **SDK Method**: `graphlit.ingestEncodedFile()`
* **GraphQL**: `ingestEncodedFile` mutation
* **Entity Type**: Content
* **Common Use Cases**: File uploads from web forms, email attachments, programmatically generated files, binary data

## TypeScript (Canonical)

```typescript
import { Graphlit } from 'graphlit-client';
import { ContentState, FileTypes } from 'graphlit-client/dist/generated/graphql-types';
import { readFileSync } from 'fs';

const graphlit = new Graphlit();

// Read file from disk
const fileBuffer = readFileSync('/path/to/document.pdf');
const base64Data = fileBuffer.toString('base64');

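// Ingest encoded file. Arguments are positional; pass `undefined` to skip an optional one.
// `workflowId` and `collectionId` are IDs of entities you created beforehand.
const response = await graphlit.ingestEncodedFile(
  'document.pdf',          // name
  base64Data,              // data
  'application/pdf',       // mimeType
  undefined,               // fileCreationDate
  undefined,               // fileModifiedDate
  undefined,               // id
  undefined,               // identifier
  true,                    // isSynchronous
  { id: workflowId },      // workflow
  [{ id: collectionId }],  // collections
  undefined,               // observations
  'upload-demo'            // correlationId
);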

const contentId = response.ingestEncodedFile.id;
console.log(`File ingested: ${contentId}`);

// Retrieve the content
const content = await graphlit.getContent(contentId);
console.log(`File type: ${content.content.fileType}`);
console.log(`Markdown extracted: ${content.content.markdown?.substring(0, 100)}...`);
```

## Parameters

### Required

* **`name`** (string): Filename (including extension)
  * Used to determine file type
  * Should include proper extension (.pdf, .docx, .jpg, etc.)
* **`data`** (string): Base64-encoded file data
  * Binary file content encoded as base64 string
  * No size limit in API, but base64 encoding adds roughly 33% to the payload, so consider network and timeout constraints
* **`mimeType`** (string): MIME type of the file
  * Examples: `application/pdf`, `image/jpeg`, `text/plain`, `application/vnd.openxmlformats-officedocument.wordprocessingml.document` (DOCX)
  * Must match the actual file type
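
Because the file travels as a base64 string, the request body is larger than the file itself. The following is a rough sketch for estimating that overhead before uploading; the function names are illustrative and are not part of the Graphlit SDK.

```typescript
// Hypothetical helpers (not part of graphlit-client) for estimating payload size.

// Base64 turns every 3 input bytes into 4 output characters, padding the last group.
function base64EncodedLength(byteLength: number): number {
  return 4 * Math.ceil(byteLength / 3);
}

// Recover the original byte count from a padded base64 string.
function base64DecodedLength(base64: string): number {
  if (base64.length === 0) return 0;
  const padding = base64.endsWith('==') ? 2 : base64.endsWith('=') ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

// Number of characters that would be sent for a 10 MiB file
const tenMiB = 10 * 1024 * 1024;
console.log(base64EncodedLength(tenMiB));
```

A check like this can run before encoding, so an oversized upload is rejected without building the base64 string first.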

### Optional

* **`fileCreationDate`** (DateTime): Original file creation date
* **`fileModifiedDate`** (DateTime): Original file modification date
* **`id`** (string): Custom ID for the content
* **`identifier`** (string): Custom identifier for deduplication
* **`isSynchronous`** (boolean): Wait for processing to complete
  * **Default**: `false`
  * **Recommended**: `true` for immediate access to extracted content
* **`workflow`** (EntityReferenceInput): Workflow for extraction/preparation
* **`collections`** (EntityReferenceInput\[]): Collections to add content to
* **`observations`** (ObservationReferenceInput\[]): Observations to link
* **`correlationId`** (string): For tracking in production systems
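
Since every optional parameter is positional, it is easy to miscount the `undefined` placeholders. One possible way to avoid that is a small wrapper that accepts named options and produces the argument list. This is only a sketch: `toPositionalArgs` and `IngestEncodedFileOptions` are hypothetical names, and the order is assumed to match the canonical example on this page, so verify it against your installed client version.

```typescript
// Hypothetical helper (not part of graphlit-client): builds the positional
// argument list from named options so optional slots are never miscounted.
// ASSUMPTION: the positional order matches the canonical example on this page.

type EntityRef = { id: string };

interface IngestEncodedFileOptions {
  name: string;
  data: string;
  mimeType: string;
  fileCreationDate?: string;
  fileModifiedDate?: string;
  id?: string;
  identifier?: string;
  isSynchronous?: boolean;
  workflow?: EntityRef;
  collections?: EntityRef[];
  observations?: object[];
  correlationId?: string;
}

function toPositionalArgs(o: IngestEncodedFileOptions) {
  return [
    o.name,
    o.data,
    o.mimeType,
    o.fileCreationDate,
    o.fileModifiedDate,
    o.id,
    o.identifier,
    o.isSynchronous,
    o.workflow,
    o.collections,
    o.observations,
    o.correlationId
  ] as const;
}

const args = toPositionalArgs({
  name: 'document.pdf',
  data: 'JVBERi0=',
  mimeType: 'application/pdf',
  isSynchronous: true,
  correlationId: 'upload-demo'
});
console.log(args.length); // 12
```

The resulting tuple could then be spread into the call, for example `graphlit.ingestEncodedFile(...args)`; depending on the client's generated types, a cast may be needed.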

## Response

```typescript
{
  ingestEncodedFile: {
    id: string;              // Content ID
    name: string;            // Filename you provided
    state: ContentState;     // FINISHED (if synchronous)
    type: ContentTypes;      // Always FILE
    fileType: FileTypes;     // PDF, DOCX, IMAGE, AUDIO, VIDEO, etc.
    mimeType: string;        // MIME type you provided
    markdown?: string;       // Extracted text (for documents)
    originalData?: string;   // Base64 data (if stored)
  }
}
```
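
When `isSynchronous` is left at its default of `false`, the returned `state` may not be `FINISHED` yet. A minimal polling sketch is shown below, assuming only that `getContent()` returns the content's `state` as in the canonical example. `waitForFinished` and `ContentReader` are hypothetical names, and the interval and attempt count are arbitrary defaults.

```typescript
// Minimal shape of the client this sketch relies on; only getContent() from
// the canonical example is used.
interface ContentReader {
  getContent(id: string): Promise<{ content?: { state?: string | null } | null }>;
}

// Hypothetical helper (not part of graphlit-client): polls until the content
// reports FINISHED, or gives up after maxAttempts and returns false.
async function waitForFinished(
  client: ContentReader,
  contentId: string,
  intervalMs = 2000,
  maxAttempts = 30
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await client.getContent(contentId);
    if (response.content?.state === 'FINISHED') return true;

    // Sleep between attempts, but not after the last one
    if (attempt < maxAttempts - 1) {
      await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    }
  }
  return false;
}
```

A caller would pass the client and the ID returned by the mutation, for example `await waitForFinished(graphlit, contentId)`, and treat a `false` result as "still processing" rather than as a failure.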

## Developer Hints

### ingestEncodedFile vs ingestUri

| Aspect         | ingestEncodedFile               | ingestUri                      |
| -------------- | ------------------------------- | ------------------------------ |
| **Source**     | File in memory/buffer           | URL or file path               |
| **Encoding**   | Requires base64 encoding        | No encoding needed             |
| **Use Case**   | File uploads, email attachments | Web scraping, public URLs      |
| **Network**    | Uploads file data to Graphlit   | Graphlit downloads from URL    |
| **Size Limit** | Network/timeout constraints     | More efficient for large files |

### When to Use ingestEncodedFile

Use `ingestEncodedFile` when:

* Handling file uploads from users (web forms, mobile apps)
* Processing email attachments
* Working with programmatically generated files
* Files are in memory/buffer
* No public URL available

Use `ingestUri` when:

* File is at a public URL
* File is very large (>100MB)
* Want Graphlit to handle download
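
The guidance above can be sketched as a small decision function. This is illustrative only: `chooseIngestMethod` is a hypothetical helper, and the 100 MB cutoff is simply the figure from the list above, exposed as a parameter so it can be tuned.

```typescript
type IngestMethod = 'ingestEncodedFile' | 'ingestUri';

interface IngestChoice {
  method: IngestMethod;
  // True when the file has no URL yet and would first need to be placed
  // somewhere Graphlit can download it from.
  needsUrlFirst: boolean;
}

// ASSUMPTION: cutoff taken from the guidance above; adjust for your network.
const LARGE_FILE_BYTES = 100 * 1024 * 1024;

// Hypothetical helper (not part of graphlit-client).
function chooseIngestMethod(
  sizeInBytes: number,
  hasPublicUrl: boolean,
  largeFileBytes: number = LARGE_FILE_BYTES
): IngestChoice {
  if (hasPublicUrl) return { method: 'ingestUri', needsUrlFirst: false };
  if (sizeInBytes > largeFileBytes) return { method: 'ingestUri', needsUrlFirst: true };
  return { method: 'ingestEncodedFile', needsUrlFirst: false };
}

// A 2 MB form upload with no URL
console.log(chooseIngestMethod(2 * 1024 * 1024, false).method); // ingestEncodedFile
```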

### Base64 Encoding Guide

```typescript
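// Node.js (filesystem)
import { readFileSync } from 'fs';
const buffer = readFileSync('file.pdf');
const nodeBase64 = buffer.toString('base64');

// Browser (File input); `input` is an <input type="file"> element
const file = input.files?.[0];
if (!file) throw new Error('No file selected');
const browserBase64 = await new Promise<string>((resolve, reject) => {
  const reader = new FileReader();
  // The result is a data URL (data:<mime>;base64,<data>); keep only the part after the comma
  reader.onload = () => resolve((reader.result as string).split(',')[1]);
  reader.onerror = () => reject(reader.error);
  reader.readAsDataURL(file);
});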
```

### MIME Type Reference

Common MIME types:

* **PDF**: `application/pdf`
* **Word**: `application/vnd.openxmlformats-officedocument.wordprocessingml.document`
* **Excel**: `application/vnd.openxmlformats-officedocument.spreadsheetml.sheet`
* **PowerPoint**: `application/vnd.openxmlformats-officedocument.presentationml.presentation`
* **JPEG**: `image/jpeg`
* **PNG**: `image/png`
* **MP3**: `audio/mpeg`
* **MP4**: `video/mp4`
* **Plain Text**: `text/plain`

## Variations

### 1. Browser File Upload

Handle file uploads in web applications:

```typescript
// React/Next.js component
async function handleFileUpload(event: React.ChangeEvent<HTMLInputElement>) {
  const file = event.target.files?.[0];
  if (!file) return;

  // Convert to base64
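  const base64 = await new Promise<string>((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => {
      const result = reader.result as string;
      // Remove data URL prefix (data:mime;base64,)
      const base64Data = result.split(',')[1];
      resolve(base64Data);
    };
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });

  // Ingest file
  const response = await graphlit.ingestEncodedFile(
    file.name,
    base64,
    file.type || 'application/octet-stream',  // file.type is empty when the browser cannot detect it
    undefined,  // fileCreationDate
    undefined,  // fileModifiedDate
    undefined,  // id
    undefined,  // identifier
    true        // isSynchronous
  );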

  console.log(`File uploaded: ${response.ingestEncodedFile.id}`);
}
```

### 2. Email Attachment Processing

Ingest email attachments:

```typescript
// Process email with attachments
interface EmailAttachment {
  filename: string;
  mimeType: string;
  data: Buffer;
}

async function processEmailAttachments(attachments: EmailAttachment[]) {
  const contentIds: string[] = [];

  for (const attachment of attachments) {
    const base64Data = attachment.data.toString('base64');
    
    const response = await graphlit.ingestEncodedFile(
      attachment.filename,
      base64Data,
      attachment.mimeType,
      undefined,  // fileCreationDate
      undefined,  // fileModifiedDate
      undefined,  // id
      undefined,  // identifier
      false       // isSynchronous: async for bulk processing
    );

    contentIds.push(response.ingestEncodedFile.id);
  }

  return contentIds;
}
```

### 3. Ingesting with Workflow

Apply extraction during upload:

```typescript
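import * as fs from 'fs';
import {
  FilePreparationServiceTypes,
  FileTypes,
  WorkflowInput
} from 'graphlit-client/dist/generated/graphql-types';

// `graphlit` is the client instance created in the canonical example.
// Create workflow for document extraction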
const workflowInput: WorkflowInput = {
  name: 'Document Extraction',
  preparation: {
    jobs: [
      {
        connector: {
          type: FilePreparationServiceTypes.ModelDocument,
          modelDocument: {
            includeImages: true  // Better extraction for scanned PDFs
          },
          fileTypes: [FileTypes.Document]
        }
      }
    ]
  }
};

const workflowResponse = await graphlit.createWorkflow(workflowInput);

// Read and encode file
const fileBuffer = fs.readFileSync('contract.pdf');
const base64Data = fileBuffer.toString('base64');

// Ingest with workflow
const response = await graphlit.ingestEncodedFile(
  'contract.pdf',
  base64Data,
  'application/pdf',
  undefined,  // fileCreationDate
  undefined,  // fileModifiedDate
  undefined,  // id
  undefined,  // identifier
  true,       // isSynchronous: wait for extraction to complete
  { id: workflowResponse.createWorkflow.id }  // workflow to apply
);

// Access extracted content
const content = await graphlit.getContent(response.ingestEncodedFile.id);
console.log(`Extracted text: ${content.content.markdown}`);
```

### 4. Batch File Upload

Upload multiple files efficiently:

```typescript
import * as fs from 'fs';
import * as path from 'path';

async function batchUploadFiles(filePaths: string[]) {
  const uploadPromises = filePaths.map(async (filePath) => {
    const fileBuffer = await fs.promises.readFile(filePath);
    const base64Data = fileBuffer.toString('base64');
    const fileName = path.basename(filePath) || 'unknown';

    // Detect MIME type from the extension (simplified)
    const ext = path.extname(fileName).slice(1).toLowerCase();
    const mimeType = getMimeType(ext);

    return graphlit.ingestEncodedFile(
      fileName,
      base64Data,
      mimeType,
      undefined,  // fileCreationDate
      undefined,  // fileModifiedDate
      undefined,  // id
      undefined,  // identifier
      false       // isSynchronous: async for parallel uploads
    );
  });

  const responses = await Promise.all(uploadPromises);
  return responses.map(r => r.ingestEncodedFile.id);
}

function getMimeType(extension: string): string {
  const mimeTypes: Record<string, string> = {
    'pdf': 'application/pdf',
    'docx': 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
    'jpg': 'image/jpeg',
    'jpeg': 'image/jpeg',
    'png': 'image/png',
    'txt': 'text/plain'
  };
  return mimeTypes[extension] || 'application/octet-stream';
}
```

### 5. Ingesting Programmatically Generated Files

Upload files created in code:

```typescript
// Generate a report and ingest it
import PDFDocument from 'pdfkit';

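async function generateAndIngestReport() {
  const doc = new PDFDocument();
  const chunks: Buffer[] = [];

  // Collect the PDF stream into one buffer; resolves once the document ends
  const pdfReady = new Promise<Buffer>((resolve, reject) => {
    doc.on('data', (chunk: Buffer) => chunks.push(chunk));
    doc.on('end', () => resolve(Buffer.concat(chunks)));
    doc.on('error', reject);
  });

  // Generate PDF content
  doc.fontSize(20).text('Monthly Report', 100, 100);
  doc.fontSize(12).text('Data and analysis...', 100, 150);
  doc.end();

  // Awaiting here means callers can await the whole function,
  // and ingestion errors propagate instead of being lost in a callback
  const pdfBuffer = await pdfReady;
  const base64Data = pdfBuffer.toString('base64');

  const response = await graphlit.ingestEncodedFile(
    'monthly-report.pdf',
    base64Data,
    'application/pdf',
    undefined,  // fileCreationDate
    undefined,  // fileModifiedDate
    undefined,  // id
    undefined,  // identifier
    true        // isSynchronous
  );

  console.log(`Report ingested: ${response.ingestEncodedFile.id}`);
}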
```

## Common Issues

**Issue**: `Invalid base64 data` error\
**Solution**: Ensure data is properly base64 encoded. Remove any data URL prefixes (`data:mime;base64,`).
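
A pre-flight check along these lines can catch the most common cause before the request is sent. It is a sketch rather than a full validator: both helper names are hypothetical, and the pattern accepts only the standard base64 alphabet with correct padding and no whitespace or line breaks.

```typescript
// Hypothetical helpers (not part of graphlit-client).

// Strips a leading "data:<mime>;base64," prefix if present; otherwise returns the input unchanged.
function stripDataUrlPrefix(input: string): string {
  const match = /^data:[^,]*;base64,/i.exec(input);
  return match ? input.slice(match[0].length) : input;
}

// Checks for the standard base64 alphabet with correct '=' padding.
function isLikelyBase64(input: string): boolean {
  if (input.length === 0) return false;
  return /^(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?$/.test(input);
}

const cleaned = stripDataUrlPrefix('data:application/pdf;base64,JVBERi0=');
console.log(cleaned, isLikelyBase64(cleaned)); // JVBERi0= true
```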

**Issue**: `Unsupported MIME type`\
**Solution**: Check MIME type spelling. Use exact MIME type strings from reference list above.

**Issue**: File ingested but no text extracted\
**Solution**: Ensure file is not corrupted. For scanned PDFs, use a workflow with `useVision: true`.

**Issue**: Large file upload times out\
**Solution**: For files >50MB, consider using `ingestUri` with a temporary signed URL instead, or split into chunks.

**Issue**: Filename has no extension\
**Solution**: Add proper extension to `name` parameter. Graphlit uses extension to determine file type.
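
When the name comes from user input or an email subject, the extension may be missing even though the MIME type is known. The sketch below appends one in that case. `ensureExtension` is a hypothetical helper, and its lookup table covers only the MIME types listed in the reference on this page.

```typescript
// Hypothetical helper (not part of graphlit-client). The mapping covers only
// the MIME types from the reference list on this page.
const EXTENSION_BY_MIME: Record<string, string> = {
  'application/pdf': 'pdf',
  'application/vnd.openxmlformats-officedocument.wordprocessingml.document': 'docx',
  'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet': 'xlsx',
  'application/vnd.openxmlformats-officedocument.presentationml.presentation': 'pptx',
  'image/jpeg': 'jpg',
  'image/png': 'png',
  'audio/mpeg': 'mp3',
  'video/mp4': 'mp4',
  'text/plain': 'txt'
};

function ensureExtension(name: string, mimeType: string): string {
  // Treat a trailing ".<alphanumerics>" as an existing extension (simplified)
  if (/\.[A-Za-z0-9]+$/.test(name)) return name;

  const extension = EXTENSION_BY_MIME[mimeType.toLowerCase()];
  // Leave the name untouched when the MIME type is not in the table
  return extension ? `${name}.${extension}` : name;
}

console.log(ensureExtension('quarterly-report', 'application/pdf')); // quarterly-report.pdf
```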

## Production Example

**Email attachment ingestion**:

```typescript
const response = await graphlit.ingestEncodedFile(
  email.subject || 'Email Attachment',
  base64EncodedEmail,
  'message/rfc822',  // Email MIME type
  undefined,  // fileCreationDate
  undefined,  // fileModifiedDate
  undefined,  // id
  undefined,  // identifier
  true        // isSynchronous
);
```

**File upload API endpoint pattern**:

```typescript
// Server-side file upload handler
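await graphlit.ingestEncodedFile(
  fileName,
  base64Data,  // From multipart form upload
  mimeType,
  undefined,  // fileCreationDate
  undefined,  // fileModifiedDate
  undefined,  // id
  undefined,  // identifier
  isSynchronous,
  workflow ? { id: workflow } : undefined,
  collections?.map((id) => ({ id }))
);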
```

