Ingest Encoded File
User Intent
"I want to upload a file directly from memory/buffer without using a URL"
Operation
SDK Method: graphlit.ingestEncodedFile()
GraphQL: ingestEncodedFile mutation
Entity Type: Content
Common Use Cases: File uploads from web forms, email attachments, programmatically generated files, binary data
TypeScript (Canonical)
import { Graphlit } from 'graphlit-client';
import { ContentState, FileTypes } from 'graphlit-client/dist/generated/graphql-types';
import { readFileSync } from 'fs';

const graphlit = new Graphlit();

// workflowId and collectionId are assumed to hold the IDs of an existing workflow and collection

// Read file from disk
const fileBuffer = readFileSync('/path/to/document.pdf');
const base64Data = fileBuffer.toString('base64');

// Ingest encoded file (arguments are positional)
const response = await graphlit.ingestEncodedFile(
  'document.pdf',         // name
  base64Data,             // data
  'application/pdf',      // mimeType
  undefined,              // fileCreationDate
  undefined,              // fileModifiedDate
  undefined,              // id
  undefined,              // identifier
  true,                   // isSynchronous
  { id: workflowId },     // workflow
  [{ id: collectionId }], // collections
  undefined,              // observations
  'upload-demo'           // correlationId
);
const contentId = response.ingestEncodedFile.id;
console.log(`File ingested: ${contentId}`);

// Retrieve the content
const content = await graphlit.getContent(contentId);
console.log(`File type: ${content.content.fileType}`);
console.log(`Markdown extracted: ${content.content.markdown?.substring(0, 100)}...`);
Parameters
Required
name (string): Filename (including extension)
  Used to determine file type
  Should include proper extension (.pdf, .docx, .jpg, etc.)
data (string): Base64-encoded file data
  Binary file content encoded as base64 string
  No size limit in API, but consider network constraints
mimeType (string): MIME type of the file
  Examples: application/pdf, image/jpeg, text/plain, application/vnd.openxmlformats-officedocument.wordprocessingml.document (DOCX)
  Must match the actual file type
Optional
fileCreationDate (DateTime): Original file creation date
fileModifiedDate (DateTime): Original file modification date
id (string): Custom ID for the content
identifier (string): Custom identifier for deduplication
isSynchronous (boolean): Wait for processing to complete
  Default: false
  Recommended: true for immediate access to extracted content
workflow (EntityReferenceInput): Workflow for extraction/preparation
collections (EntityReferenceInput[]): Collections to add content to
observations (ObservationReferenceInput[]): Observations to link
correlationId (string): For tracking in production systems
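Because every optional parameter is positional, calls tend to accumulate runs of undefined, and a value can easily land in the wrong slot. The sketch below is one way to guard against that: a hypothetical IngestEncodedFileOptions object and a toIngestArgs helper (neither is part of the SDK) that lay the values out in the order used by the canonical example and the parameter list above. Observations are not modeled here, and the order should be checked against the installed SDK version.

```typescript
// Hypothetical options object; field names mirror the parameter list above.
interface IngestEncodedFileOptions {
  name: string;
  data: string; // base64-encoded file content
  mimeType: string;
  fileCreationDate?: string;
  fileModifiedDate?: string;
  id?: string;
  identifier?: string;
  isSynchronous?: boolean;
  workflow?: { id: string };
  collections?: { id: string }[];
  correlationId?: string;
}

// Positional argument tuple, in the order shown in the canonical example.
type IngestEncodedFileArgs = [
  string,                      // name
  string,                      // data
  string,                      // mimeType
  string | undefined,          // fileCreationDate
  string | undefined,          // fileModifiedDate
  string | undefined,          // id
  string | undefined,          // identifier
  boolean,                     // isSynchronous
  { id: string } | undefined,  // workflow
  { id: string }[] | undefined, // collections
  undefined,                   // observations (not modeled in this sketch)
  string | undefined           // correlationId
];

function toIngestArgs(options: IngestEncodedFileOptions): IngestEncodedFileArgs {
  return [
    options.name,
    options.data,
    options.mimeType,
    options.fileCreationDate,
    options.fileModifiedDate,
    options.id,
    options.identifier,
    options.isSynchronous ?? false, // documented default is false
    options.workflow,
    options.collections,
    undefined,
    options.correlationId,
  ];
}
```

With a helper like this, a call site could read graphlit.ingestEncodedFile(...toIngestArgs({ name, data, mimeType, isSynchronous: true })), assuming the tuple types line up with the SDK's declared parameter types.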
Response
{
  ingestEncodedFile: {
    id: string;            // Content ID
    name: string;          // Filename you provided
    state: ContentState;   // FINISHED (if synchronous)
    type: ContentTypes;    // Always FILE
    fileType: FileTypes;   // PDF, DOCX, IMAGE, AUDIO, VIDEO, etc.
    mimeType: string;      // MIME type you provided
    markdown?: string;     // Extracted text (for documents)
    originalData?: string; // Base64 data (if stored)
  }
}
Developer Hints
ingestEncodedFile vs ingestUri
| Feature | ingestEncodedFile | ingestUri |
| --- | --- | --- |
| Source | File in memory/buffer | URL or file path |
| Encoding | Requires base64 encoding | No encoding needed |
| Use Case | File uploads, email attachments | Web scraping, public URLs |
| Network | Uploads file data to Graphlit | Graphlit downloads from URL |
| Size Limit | Network/timeout constraints | More efficient for large files |
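One concrete reason the network constraint matters: base64 represents every 3 bytes as 4 characters, so the encoded payload is roughly a third larger than the file itself. The helper below is a small sketch (the name base64EncodedLength is illustrative, not part of the SDK) for estimating the upload size before sending.

```typescript
/** Number of characters in the padded base64 encoding of `byteLength` bytes. */
function base64EncodedLength(byteLength: number): number {
  // Every group of up to 3 input bytes becomes exactly 4 output characters.
  return Math.ceil(byteLength / 3) * 4;
}

const rawBytes = 50 * 1024 * 1024; // a 50 MiB file
const encodedChars = base64EncodedLength(rawBytes);
console.log(`${rawBytes} bytes encode to ${encodedChars} base64 characters`);
```

The request body also carries the filename, MIME type, and GraphQL envelope, so the real payload is slightly larger than this estimate.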
When to Use ingestEncodedFile
Use ingestEncodedFile when:
Handling file uploads from users (web forms, mobile apps)
Processing email attachments
Working with programmatically generated files
Files are in memory/buffer
No public URL available
Use ingestUri when:
File is at a public URL
File is very large (>100MB)
Want Graphlit to handle download
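The guidance above can be sketched as a small decision helper. Everything here is illustrative: FileSource and chooseIngestMethod are hypothetical names, and the 100 MB threshold is simply the figure from the list above, not a limit enforced by the API.

```typescript
type IngestMethod = 'ingestUri' | 'ingestEncodedFile';

// Hypothetical description of where a file lives.
interface FileSource {
  publicUrl?: string; // set when the file is reachable at a public URL
  sizeBytes?: number; // size of the in-memory data, when known
}

// Threshold taken from the ">100MB" guidance above (an assumption, not an API limit).
const LARGE_FILE_BYTES = 100 * 1024 * 1024;

function chooseIngestMethod(source: FileSource): { method: IngestMethod; warning?: string } {
  // A public URL lets Graphlit download the file itself.
  if (source.publicUrl) {
    return { method: 'ingestUri' };
  }
  // In-memory data has to be uploaded; flag very large payloads.
  if ((source.sizeBytes ?? 0) > LARGE_FILE_BYTES) {
    return {
      method: 'ingestEncodedFile',
      warning: 'Large file: consider hosting it at a URL and using ingestUri',
    };
  }
  return { method: 'ingestEncodedFile' };
}
```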
Base64 Encoding Guide
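// Node.js (filesystem)
import { readFileSync } from 'fs';

const buffer = readFileSync('file.pdf');
const base64 = buffer.toString('base64');

// Browser (File input); `input` is assumed to be an <input type="file"> element
const file = input.files?.[0];
if (!file) throw new Error('No file selected');
const base64 = await new Promise<string>((resolve, reject) => {
  const reader = new FileReader();
  // reader.result is a data URL; keep only the base64 part after the comma
  reader.onload = () => resolve((reader.result as string).split(',')[1]);
  reader.onerror = () => reject(reader.error);
  reader.readAsDataURL(file);
});
MIME Type Reference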
Common MIME types:
PDF: application/pdf
Word: application/vnd.openxmlformats-officedocument.wordprocessingml.document
Excel: application/vnd.openxmlformats-officedocument.spreadsheetml.sheet
PowerPoint: application/vnd.openxmlformats-officedocument.presentationml.presentation
JPEG: image/jpeg
PNG: image/png
MP3: audio/mpeg
MP4: video/mp4
Plain Text: text/plain
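Since the MIME type must match the actual file content, it can help to sanity-check a buffer before uploading. The sketch below inspects the leading "magic bytes" of three formats from the list above (PDF, PNG, JPEG). The function names are illustrative and the check is deliberately minimal; it is not how Graphlit itself detects file types.

```typescript
// True when `bytes` begins with the given signature.
function startsWithBytes(bytes: Uint8Array, signature: number[]): boolean {
  if (bytes.length < signature.length) return false;
  return signature.every((value, index) => bytes[index] === value);
}

// Returns a MIME type for a few well-known signatures, or undefined if unrecognized.
function sniffMimeType(bytes: Uint8Array): string | undefined {
  if (startsWithBytes(bytes, [0x25, 0x50, 0x44, 0x46])) return 'application/pdf'; // "%PDF"
  if (startsWithBytes(bytes, [0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return 'image/png';
  if (startsWithBytes(bytes, [0xff, 0xd8, 0xff])) return 'image/jpeg';
  return undefined;
}

// Compare a declared MIME type against the sniffed one; unknown formats pass through.
function mimeTypeLooksConsistent(bytes: Uint8Array, declared: string): boolean {
  const sniffed = sniffMimeType(bytes);
  return sniffed === undefined || sniffed === declared;
}
```

A Node.js Buffer is a Uint8Array, so the result of readFileSync can be passed to these functions directly.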
Variations
1. Browser File Upload
Handle file uploads in web applications:
// React/Next.js component
async function handleFileUpload(event: React.ChangeEvent<HTMLInputElement>) {
  const file = event.target.files?.[0];
  if (!file) return;

  // Convert to base64
  const base64 = await new Promise<string>((resolve, reject) => {
    const reader = new FileReader();
    reader.onload = () => {
      const result = reader.result as string;
      // Remove data URL prefix (data:mime;base64,)
      const base64Data = result.split(',')[1];
      resolve(base64Data);
    };
    reader.onerror = () => reject(reader.error);
    reader.readAsDataURL(file);
  });

  // Ingest file (same argument order as the canonical example above)
  const response = await graphlit.ingestEncodedFile(
    file.name,
    base64,
    file.type,
    undefined, // fileCreationDate
    undefined, // fileModifiedDate
    undefined, // id
    undefined, // identifier
    true       // isSynchronous
  );
  console.log(`File uploaded: ${response.ingestEncodedFile.id}`);
}
2. Email Attachment Processing
Ingest email attachments:
// Process email with attachments
interface EmailAttachment {
  filename: string;
  mimeType: string;
  data: Buffer;
}

async function processEmailAttachments(attachments: EmailAttachment[]) {
  const contentIds: string[] = [];
  for (const attachment of attachments) {
    const base64Data = attachment.data.toString('base64');
    // Same argument order as the canonical example above: name, data, mimeType
    const response = await graphlit.ingestEncodedFile(
      attachment.filename,
      base64Data,
      attachment.mimeType,
      undefined, // fileCreationDate
      undefined, // fileModifiedDate
      undefined, // id
      undefined, // identifier
      false      // isSynchronous: async for bulk processing
    );
    contentIds.push(response.ingestEncodedFile.id);
  }
  return contentIds;
}
3. Ingesting with Workflow
Apply extraction during upload:
import { readFileSync } from 'fs';
import { FilePreparationServiceTypes, FileTypes, WorkflowInput } from 'graphlit-client/dist/generated/graphql-types';

// Create workflow for document extraction
const workflowInput: WorkflowInput = {
  name: 'Document Extraction',
  preparation: {
    jobs: [
      {
        connector: {
          type: FilePreparationServiceTypes.ModelDocument,
          modelDocument: {
            includeImages: true // Better extraction for scanned PDFs
          },
          fileTypes: [FileTypes.Pdf]
        }
      }
    ]
  }
};
const workflowResponse = await graphlit.createWorkflow(workflowInput);

// Read and encode file
const fileBuffer = readFileSync('contract.pdf');
const base64Data = fileBuffer.toString('base64');

// Ingest with workflow (same argument order as the canonical example above)
const response = await graphlit.ingestEncodedFile(
  'contract.pdf',
  base64Data,
  'application/pdf',
  undefined, // fileCreationDate
  undefined, // fileModifiedDate
  undefined, // id
  undefined, // identifier
  true,      // isSynchronous: wait for extraction to complete
  { id: workflowResponse.createWorkflow.id } // Apply workflow
);

// Access extracted content
const content = await graphlit.getContent(response.ingestEncodedFile.id);
console.log(`Extracted text: ${content.content.markdown}`);
4. Batch File Upload
Upload multiple files efficiently:
import { readFileSync } from 'fs';
import { basename } from 'path';

async function batchUploadFiles(filePaths: string[]) {
  const uploadPromises = filePaths.map(async (filePath) => {
    const fileBuffer = readFileSync(filePath);
    const base64Data = fileBuffer.toString('base64');
    const fileName = basename(filePath) || 'unknown';

    // Detect MIME type (simplified)
    const ext = fileName.split('.').pop()?.toLowerCase();
    const mimeType = getMimeType(ext || '');

    // Same argument order as the canonical example above: name, data, mimeType
    return graphlit.ingestEncodedFile(
      fileName,
      base64Data,
      mimeType,
      undefined, // fileCreationDate
      undefined, // fileModifiedDate
      undefined, // id
      undefined, // identifier
      false      // isSynchronous: async for parallel uploads
    );
  });
  const responses = await Promise.all(uploadPromises);
  return responses.map(r => r.ingestEncodedFile.id);
}

function getMimeType(extension: string): string {
  const mimeTypes: Record<string, string> = {
    'pdf': 'application/pdf',
    'docx': 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
    'jpg': 'image/jpeg',
    'jpeg': 'image/jpeg',
    'png': 'image/png',
    'txt': 'text/plain'
  };
  return mimeTypes[extension] || 'application/octet-stream';
}
5. Ingesting Programmatically Generated Files
Upload files created in code:
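// Generate a report and ingest it
import PDFDocument from 'pdfkit';

async function generateAndIngestReport() {
  const doc = new PDFDocument();
  const chunks: Buffer[] = [];

  // Collect the PDF bytes and wait for the stream to end,
  // so the function resolves only after the report has been ingested
  const pdfBuffer = await new Promise<Buffer>((resolve, reject) => {
    doc.on('data', (chunk: Buffer) => chunks.push(chunk));
    doc.on('end', () => resolve(Buffer.concat(chunks)));
    doc.on('error', reject);

    // Generate PDF content
    doc.fontSize(20).text('Monthly Report', 100, 100);
    doc.fontSize(12).text('Data and analysis...', 100, 150);
    doc.end();
  });

  const base64Data = pdfBuffer.toString('base64');
  // Same argument order as the canonical example above: name, data, mimeType
  const response = await graphlit.ingestEncodedFile(
    'monthly-report.pdf',
    base64Data,
    'application/pdf',
    undefined, // fileCreationDate
    undefined, // fileModifiedDate
    undefined, // id
    undefined, // identifier
    true       // isSynchronous
  );
  console.log(`Report ingested: ${response.ingestEncodedFile.id}`);
  return response.ingestEncodedFile.id;
}
Common Issues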
Issue: Invalid base64 data error
Solution: Ensure data is properly base64 encoded. Remove any data URL prefixes (data:mime;base64,).
Issue: Unsupported MIME type
Solution: Check MIME type spelling. Use exact MIME type strings from reference list above.
Issue: File ingested but no text extracted
Solution: Ensure file is not corrupted. For scanned PDFs, use a workflow with useVision: true.
Issue: Large file upload times out
Solution: For files >50MB, consider using ingestUri with a temporary signed URL instead, or split into chunks.
Issue: Filename has no extension
Solution: Add proper extension to name parameter. Graphlit uses extension to determine file type.
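For the invalid base64 issue in particular, a small pre-flight cleanup can catch most problems before the request is sent. The sketch below (normalizeBase64 is a hypothetical helper, not part of the SDK) strips a data URL prefix, removes whitespace, and rejects anything that is not standard padded base64. It assumes the standard alphabet, not the URL-safe variant.

```typescript
function normalizeBase64(input: string): string {
  // Drop a data URL header such as "data:application/pdf;base64," if present.
  const commaIndex = input.indexOf(',');
  const body =
    input.startsWith('data:') && commaIndex !== -1 ? input.slice(commaIndex + 1) : input;

  // Remove line breaks and other whitespace that some encoders insert.
  const compact = body.replace(/\s+/g, '');

  // Standard padded base64: length is a multiple of 4, with at most two trailing "=".
  const looksValid =
    compact.length > 0 &&
    compact.length % 4 === 0 &&
    /^[A-Za-z0-9+/]*={0,2}$/.test(compact);
  if (!looksValid) {
    throw new Error('Input is not valid padded base64');
  }
  return compact;
}
```

Running the output of FileReader.readAsDataURL or an email library through a check like this before calling ingestEncodedFile turns a server-side error into an immediate local one.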
Production Example
Email attachment ingestion:
const response = await graphlit.ingestEncodedFile(
  email.subject || 'Email Attachment',
  base64EncodedEmail,
  'message/rfc822', // Email MIME type
  undefined,        // fileCreationDate
  undefined,        // fileModifiedDate
  undefined,        // id
  undefined,        // identifier
  true              // isSynchronous
);
File upload API endpoint pattern:
// Server-side file upload handler
await graphlit.ingestEncodedFile(
  fileName,
  base64Data, // From multipart form upload
  mimeType,
  undefined,  // fileCreationDate
  undefined,  // fileModifiedDate
  undefined,  // id
  undefined,  // identifier
  isSynchronous,
  workflow ? { id: workflow } : undefined,
  collections?.map((id) => ({ id }))
);