Document Metadata Queries
User Intent
"How do I query documents by pages, author, file type, etc.?"
Operation
SDK Method: queryContents() with document-specific patterns
GraphQL: queryContents query
Entity Type: Content (type: FILE, fileType: DOCUMENT)
Common Use Cases: Find PDFs, filter by page count, search by author, find encrypted documents
Document Metadata Structure
Documents (PDFs, Word, Excel, PowerPoint) have metadata in the document field:
interface DocumentMetadata {
  title: string;
  subject: string;
  summary: string;
  author: string;
  lastModifiedBy: string;
  publisher: string;
  description: string;
  keywords: string[];
  pageCount: number;
  worksheetCount: number; // Excel
  slideCount: number; // PowerPoint
  wordCount: number;
  lineCount: number;
  paragraphCount: number;
  isEncrypted: boolean;
  hasDigitalSignature: boolean;
}

TypeScript (Canonical)
Query Patterns
1. Filter by Document Type
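A filter for this pattern might be assembled as in the sketch below. It uses the entity type given above (type FILE, fileType DOCUMENT) plus an optional fileExtensions list. The `DocumentQueryFilter` shape and the `buildDocumentFilter` helper are hypothetical names introduced for illustration, and the raw string values stand in for whatever enums the SDK's generated types provide.

```typescript
// Hypothetical filter shape for illustration only; real code should use the
// SDK's generated filter type and enums instead of raw strings.
interface DocumentQueryFilter {
  types: string[];
  fileTypes: string[];
  fileExtensions?: string[];
}

// Build a filter matching document files, optionally narrowed to specific
// extensions such as "pdf" or "docx".
function buildDocumentFilter(extensions: string[] = []): DocumentQueryFilter {
  const filter: DocumentQueryFilter = { types: ["FILE"], fileTypes: ["DOCUMENT"] };

  // Normalize ".PDF" or " pdf " to "pdf", then drop empty and duplicate entries.
  const normalized = extensions
    .map((ext) => ext.trim().replace(/^\./, "").toLowerCase())
    .filter((ext, index, all) => ext.length > 0 && all.indexOf(ext) === index);

  if (normalized.length > 0) {
    filter.fileExtensions = normalized;
  }
  return filter;
}

const allDocumentsFilter = buildDocumentFilter();
const pdfOnlyFilter = buildDocumentFilter([".PDF"]);
console.log(JSON.stringify(allDocumentsFilter), JSON.stringify(pdfOnlyFilter));
```

The resulting object would then be passed to queryContents(); check the SDK's own types for the exact field names before relying on this shape.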
2. Filter by Page Count
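Because exact page counts are filtered client-side (see Common Issues & Solutions), the step after the query can be sketched as a plain array filter. The `PagedContent` shape below is a local stand-in that keeps only the fields used here, and `filterByPageCount` is a hypothetical helper, not an SDK method.

```typescript
// Minimal local shape for illustration; only the fields this helper reads.
interface PagedContent {
  name: string;
  document?: { pageCount?: number | null } | null;
}

// Keep items whose page count lies in [min, max], inclusive.
// Passing only `min` matches that exact page count.
// Items without document metadata or without a page count are skipped.
function filterByPageCount<T extends PagedContent>(
  items: T[],
  min: number,
  max: number = min
): T[] {
  return items.filter((item) => {
    const pages = item.document?.pageCount;
    return typeof pages === "number" && pages >= min && pages <= max;
  });
}

const sampleResults: PagedContent[] = [
  { name: "contract.pdf", document: { pageCount: 12 } },
  { name: "memo.docx", document: { pageCount: 2 } },
  { name: "scan.pdf", document: null },
];

const exactlyTwelve = filterByPageCount(sampleResults, 12);
const shortDocs = filterByPageCount(sampleResults, 1, 5);
console.log(exactlyTwelve.map((c) => c.name), shortDocs.map((c) => c.name));
```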
3. Filter by Author
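One way to match on author after a query returns is a case-insensitive substring check against document.author. This is a sketch under assumptions: `AuthoredContent` is a local stand-in type, `filterByAuthor` is a hypothetical helper, and the author value is assumed to be missing or null for files that carry no such property.

```typescript
// Minimal local shape for illustration; only the fields this helper reads.
interface AuthoredContent {
  name: string;
  document?: { author?: string | null } | null;
}

// Case-insensitive substring match on the author field.
// An empty query matches nothing rather than everything.
function filterByAuthor<T extends AuthoredContent>(items: T[], authorQuery: string): T[] {
  const needle = authorQuery.trim().toLowerCase();
  if (needle.length === 0) {
    return [];
  }
  return items.filter((item) => {
    const author = item.document?.author;
    return typeof author === "string" && author.toLowerCase().indexOf(needle) !== -1;
  });
}

const authoredSample: AuthoredContent[] = [
  { name: "q3-report.pdf", document: { author: "Jane Smith" } },
  { name: "notes.docx", document: { author: "R. Jones" } },
  { name: "untitled.pdf", document: { author: null } },
];
console.log(filterByAuthor(authoredSample, "smith").map((c) => c.name));
```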
4. Filter by File Size
5. Excel-Specific Queries
6. PowerPoint-Specific Queries
7. Encrypted Documents
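Encrypted files can be separated from readable ones using the isEncrypted flag from the metadata structure above. The sketch below works on already-fetched results; `SecuredContent` and `splitByEncryption` are hypothetical names, and a missing flag is treated as "not encrypted", which is an assumption rather than documented behavior.

```typescript
// Minimal local shape for illustration; only the fields this helper reads.
interface SecuredContent {
  name: string;
  document?: { isEncrypted?: boolean | null } | null;
}

// Partition items into encrypted and readable groups.
// Items with no metadata or no flag land in `readable` (an assumption).
function splitByEncryption<T extends SecuredContent>(
  items: T[]
): { encrypted: T[]; readable: T[] } {
  const encrypted: T[] = [];
  const readable: T[] = [];
  for (const item of items) {
    if (item.document?.isEncrypted === true) {
      encrypted.push(item);
    } else {
      readable.push(item);
    }
  }
  return { encrypted, readable };
}

const securedSample: SecuredContent[] = [
  { name: "payroll.pdf", document: { isEncrypted: true } },
  { name: "handbook.pdf", document: { isEncrypted: false } },
  { name: "legacy.doc", document: null },
];
const encryptionGroups = splitByEncryption(securedSample);
console.log(encryptionGroups.encrypted.map((c) => c.name));
```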
8. Content Analysis
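As one illustration of content analysis, wordCount and pageCount can be combined into a words-per-page figure and used to rank documents by text density. The `AnalyzedContent` shape and the `rankByWordDensity` helper are hypothetical; documents lacking either count, or reporting zero pages, are left out of the ranking.

```typescript
// Minimal local shape for illustration; only the fields this helper reads.
interface AnalyzedContent {
  name: string;
  document?: { wordCount?: number | null; pageCount?: number | null } | null;
}

interface DensityRow {
  name: string;
  wordsPerPage: number;
}

// Compute words per page for each document and sort densest first.
function rankByWordDensity(items: AnalyzedContent[]): DensityRow[] {
  const rows: DensityRow[] = [];
  for (const item of items) {
    const words = item.document?.wordCount;
    const pages = item.document?.pageCount;
    // Skip documents where the ratio would be undefined or divide by zero.
    if (typeof words === "number" && typeof pages === "number" && pages > 0) {
      rows.push({ name: item.name, wordsPerPage: words / pages });
    }
  }
  return rows.sort((a, b) => b.wordsPerPage - a.wordsPerPage);
}

const analysisSample: AnalyzedContent[] = [
  { name: "report.pdf", document: { wordCount: 3000, pageCount: 10 } },
  { name: "deck.pptx", document: { wordCount: 200, pageCount: 20 } },
  { name: "brief.docx", document: { wordCount: 900, pageCount: 2 } },
  { name: "empty.pdf", document: { wordCount: 0, pageCount: 0 } },
];
console.log(rankByWordDensity(analysisSample));
```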
# Query documents
docs = await graphlit.queryContents(
    filter=ContentFilterInput(
        types=[ContentTypes.File],
        file_types=[FileTypes.Document]
    )
)

# PDFs only
pdfs = await graphlit.queryContents(
    filter=ContentFilterInput(
        file_extensions=['pdf']
    )
)

# Access metadata
for doc in docs.contents.results:
    if doc.document:
        print(f"{doc.name}: {doc.document.page_count} pages")
        print(f"  Author: {doc.document.author}")
        print(f"  Words: {doc.document.word_count}")
Developer Hints
Page Count is Automatic
Excel vs Word vs PowerPoint
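Since the metadata structure carries worksheetCount for Excel and slideCount for PowerPoint alongside pageCount, code that reports a document's size has to pick the right field. The sketch below does so by checking which count is populated; `SizedDocument` and `describeSize` are hypothetical names, and it assumes that counts which do not apply to a format are absent, null, or zero.

```typescript
// Local subset of the metadata fields, for illustration.
interface SizedDocument {
  pageCount?: number | null;
  worksheetCount?: number | null;
  slideCount?: number | null;
}

type SizeUnit = "worksheets" | "slides" | "pages";

interface DocumentSize {
  unit: SizeUnit;
  count: number;
}

function isPositive(value: number | null | undefined): value is number {
  return typeof value === "number" && value > 0;
}

// Prefer the format-specific count, falling back to pages.
// Returns null when no usable count is present.
function describeSize(doc: SizedDocument): DocumentSize | null {
  const sheets = doc.worksheetCount;
  if (isPositive(sheets)) {
    return { unit: "worksheets", count: sheets };
  }
  const slides = doc.slideCount;
  if (isPositive(slides)) {
    return { unit: "slides", count: slides };
  }
  const pages = doc.pageCount;
  if (isPositive(pages)) {
    return { unit: "pages", count: pages };
  }
  return null;
}

console.log(describeSize({ worksheetCount: 3 }), describeSize({ pageCount: 12 }));
```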
Author from Document Properties
Common Issues & Solutions
Issue: Need to filter by exact page count
Solution: Query all documents, then filter client-side

Issue: Want PDFs only
Solution: Use the fileExtensions filter

Issue: Need to count documents by file extension
Solution: Query, then aggregate the results client-side
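The query-and-aggregate approach for counting by file extension can be sketched as follows. It assumes each result's name is the original file name, so the extension is derived from the name; `countByExtension` is a hypothetical helper, not part of the SDK.

```typescript
// Count names by lowercase extension. Names without an extension (or with
// only a leading or trailing dot) are grouped under "(none)".
function countByExtension(fileNames: string[]): Record<string, number> {
  // A prototype-free object avoids collisions with inherited keys
  // such as "constructor".
  const counts: Record<string, number> = Object.create(null);
  for (const fileName of fileNames) {
    const dot = fileName.lastIndexOf(".");
    const hasExtension = dot > 0 && dot < fileName.length - 1;
    const ext = hasExtension ? fileName.slice(dot + 1).toLowerCase() : "(none)";
    counts[ext] = (counts[ext] ?? 0) + 1;
  }
  return counts;
}

const extensionCounts = countByExtension([
  "contract.pdf",
  "Scan.PDF",
  "budget.xlsx",
  "README",
]);
console.log(JSON.stringify(extensionCounts));
```

In practice the input array would come from mapping the query results to their names.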
Production Example