Create Preparation Workflow

User Intent

"I want to extract high-quality markdown from PDFs, images, or audio/video files"

Operation

  • SDK Method: graphlit.createWorkflow() with preparation stage

  • GraphQL: createWorkflow mutation

  • Entity Type: Workflow

  • Common Use Cases: PDF markdown extraction, document OCR, audio transcription, video processing

TypeScript (Canonical)

import { Graphlit } from 'graphlit-client';
import {
  FilePreparationServiceTypes,
  ModelServiceTypes,
  OpenAiModels,
  SpecificationTypes,
  WorkflowInput
} from 'graphlit-client/dist/generated/graphql-types';

const graphlit = new Graphlit();

// Step 1: Create specification for preparation model
const specificationResponse = await graphlit.createSpecification({
  name: 'GPT-4o Vision for PDFs',
  type: SpecificationTypes.Preparation,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K
  }
});

const specId = specificationResponse.createSpecification.id;

// Step 2: Create preparation workflow
const workflowInput: WorkflowInput = {
  name: 'PDF Preparation with Vision',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.ModelDocument,
        modelDocument: {
          specification: { id: specId }
        }
      }
    }]
  }
};

const response = await graphlit.createWorkflow(workflowInput);
const workflowId = response.createWorkflow.id;

console.log(`Workflow created: ${workflowId}`);

// Step 3: Use workflow during PDF ingestion
const contentResponse = await graphlit.ingestUri(
  'https://example.com/document.pdf',
  undefined,  // name
  undefined,  // id
  undefined,  // identifier
  true,       // isSynchronous
  { id: workflowId }  // workflow
);

// Step 4: Get extracted markdown
const content = await graphlit.getContent(contentResponse.ingestUri.id);
console.log(content.content.markdown); // High-quality markdown from PDF

Python

# Create specification
spec_response = await graphlit.createSpecification(
    input_types.SpecificationInput(
        name="GPT-4o Vision for PDFs",
        type=SpecificationTypes.Preparation,
        service_type=ModelServiceTypes.OpenAi,
        open_ai=input_types.OpenAIModelPropertiesInput(
            model=OpenAiModels.Gpt4OMini_128K
        )
    )
)

# Create preparation workflow (snake_case)
workflow_input = input_types.WorkflowInput(
    name="PDF Preparation with Vision",
    preparation=input_types.PreparationWorkflowStageInput(
        jobs=[
            input_types.PreparationWorkflowJobInput(
                connector=input_types.FilePreparationConnectorInput(
                    type=FilePreparationServiceTypes.ModelDocument,
                    model_document=input_types.ModelDocumentPreparationPropertiesInput(
                        specification=input_types.EntityReferenceInput(
                            id=spec_response.create_specification.id
                        )
                    )
                )
            )
        ]
    )
)

response = await graphlit.createWorkflow(workflow_input)
workflow_id = response.create_workflow.id


C#

using Graphlit;

var graphlit = new Graphlit();

// Create specification
var specResponse = await graphlit.CreateSpecification(new SpecificationInput {
    Name = "GPT-4o Vision for PDFs",
    Type = SpecificationTypes.Preparation,
    ServiceType = ModelServiceTypes.OpenAi,
    OpenAI = new OpenAIModelPropertiesInput {
        Model = OpenAiModels.Gpt4O_128K
    }
});

// Create preparation workflow (PascalCase)
var workflowInput = new WorkflowInput {
    Name = "PDF Preparation with Vision",
    Preparation = new PreparationWorkflowStageInput {
        Jobs = new[] {
            new PreparationWorkflowJobInput {
                Connector = new FilePreparationConnectorInput {
                    Type = FilePreparationServiceTypes.ModelDocument,
                    ModelDocument = new ModelDocumentPreparationPropertiesInput {
                        Specification = new EntityReferenceInput {
                            Id = specResponse.CreateSpecification.Id
                        }
                    }
                }
            }
        }
    }
};

var response = await graphlit.CreateWorkflow(workflowInput);
var workflowId = response.CreateWorkflow.Id;

Parameters

WorkflowInput (Required)

  • name (string): Workflow name

  • preparation (PreparationWorkflowStageInput): Preparation configuration

PreparationWorkflowStageInput

  • jobs (PreparationWorkflowJobInput[]): Array of preparation jobs

    • Multiple jobs for different file types

PreparationWorkflowJobInput

  • connector (FilePreparationConnectorInput): Preparation connector configuration

FilePreparationConnectorInput

  • type (FilePreparationServiceTypes): Preparation service type

    • MODEL_DOCUMENT - Vision models for PDFs/images (recommended)

    • DEEPGRAM - Audio transcription

    • ASSEMBLY_AI - Audio transcription

    • AZURE_DOCUMENT_INTELLIGENCE - Azure OCR

  • modelDocument (ModelDocumentPreparationPropertiesInput): Vision model config

    • specification (EntityReferenceInput): Reference to preparation specification

    • includeImages (boolean): Include images in markdown (default: true)

    • includeTables (boolean): Extract tables (default: true)

  • deepgram (DeepgramAudioPreparationPropertiesInput): Audio transcription config

    • model (DeepgramModels): e.g., NOVA_2

  • assemblyAI (AssemblyAIAudioPreparationPropertiesInput): Audio transcription config
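
Because each connector carries a type plus a matching properties block, a small consistency check can catch mismatches before the mutation is sent. The following is a minimal sketch under stated assumptions: the `*Sketch` types are local stand-ins trimmed to the fields listed above (not the SDK's generated types), and `lintWorkflow` is a hypothetical helper, not part of graphlit-client.

```typescript
// Local stand-in types (hypothetical): trimmed copies of the input shapes
// listed above, so the sketch compiles without the SDK installed.
type PreparationServiceType =
  | 'MODEL_DOCUMENT'
  | 'DEEPGRAM'
  | 'ASSEMBLY_AI'
  | 'AZURE_DOCUMENT_INTELLIGENCE';

interface ConnectorSketch {
  type: PreparationServiceType;
  modelDocument?: { specification?: { id: string }; includeImages?: boolean; includeTables?: boolean };
  deepgram?: { model?: string };
  assemblyAI?: Record<string, unknown>;
}

interface WorkflowSketch {
  name: string;
  preparation?: { jobs: { connector: ConnectorSketch }[] };
}

// Which properties block belongs to which service type, per the list above.
const blockForType: Partial<Record<PreparationServiceType, keyof ConnectorSketch>> = {
  MODEL_DOCUMENT: 'modelDocument',
  DEEPGRAM: 'deepgram',
  ASSEMBLY_AI: 'assemblyAI',
};

// Returns human-readable problems; an empty array means the input looks consistent.
function lintWorkflow(input: WorkflowSketch): string[] {
  const problems: string[] = [];
  if (!input.name.trim()) problems.push('name is empty');
  const jobs = input.preparation?.jobs ?? [];
  if (jobs.length === 0) problems.push('preparation.jobs is empty');
  jobs.forEach(({ connector }, i) => {
    const expected = blockForType[connector.type];
    for (const block of ['modelDocument', 'deepgram', 'assemblyAI'] as const) {
      // Flag a properties block that does not belong to the declared type.
      if (connector[block] !== undefined && block !== expected) {
        problems.push(`jobs[${i}]: ${block} is set but type is ${connector.type}`);
      }
    }
  });
  return problems;
}

const consistent: WorkflowSketch = {
  name: 'PDF Preparation with Vision',
  preparation: {
    jobs: [{ connector: { type: 'MODEL_DOCUMENT', modelDocument: { specification: { id: 'spec-id' } } } }],
  },
};

const mismatched: WorkflowSketch = {
  name: 'Audio Transcription',
  preparation: {
    jobs: [{ connector: { type: 'DEEPGRAM', modelDocument: { specification: { id: 'spec-id' } } } }],
  },
};

console.log(lintWorkflow(consistent)); // []
console.log(lintWorkflow(mismatched)); // ["jobs[0]: modelDocument is set but type is DEEPGRAM"]
```

The check is a local convention only; the service may accept or reject inputs by different rules.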

Response

{
  createWorkflow: {
    id: string;                           // Workflow ID
    name: string;                         // Workflow name
    state: EntityState;                   // ENABLED
    preparation: {
      jobs: PreparationWorkflowJob[];
    }
  }
}
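
When reading the workflow ID out of this response, a defensive accessor can fail early instead of passing `undefined` to a later ingestion call. This is a sketch only: `CreateWorkflowResultSketch` is a hypothetical, trimmed shape based on the fields shown above, and it assumes the generated SDK types may mark `createWorkflow` as nullable.

```typescript
// Hypothetical, trimmed response shape based on the fields shown above.
interface CreateWorkflowResultSketch {
  createWorkflow?: { id: string; name: string; state?: string | null } | null;
}

// Returns the workflow ID, or throws if the response carries none.
function requireWorkflowId(response: CreateWorkflowResultSketch): string {
  const id = response.createWorkflow?.id;
  if (!id) {
    throw new Error('createWorkflow returned no workflow ID');
  }
  return id;
}

// Usage with a hand-written object standing in for the SDK response.
const sampleResponse: CreateWorkflowResultSketch = {
  createWorkflow: { id: 'workflow-123', name: 'PDF Preparation with Vision', state: 'ENABLED' },
};

console.log(requireWorkflowId(sampleResponse)); // workflow-123
```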

Developer Hints

Vision Models vs Traditional OCR

Vision Models (recommended for PDFs):

  • Use multimodal LLMs (GPT-4o, Claude Sonnet, Gemini)

  • Better layout understanding

  • Handles complex documents (tables, charts, multi-column)

  • Higher quality markdown output

  • More expensive per page

Traditional OCR:

  • Faster, cheaper

  • Good for simple text documents

  • May struggle with complex layouts

// Vision model (best quality)
const visionWorkflow = {
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.ModelDocument,
        modelDocument: {
          specification: { id: gpt4oSpecId }
        }
      }
    }]
  }
};

// Azure OCR (faster, cheaper)
const ocrWorkflow = {
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.AzureDocumentIntelligence,
        azureDocument: {
          model: AzureDocumentIntelligenceModels.Layout
        }
      }
    }]
  }
};
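
The trade-off above can be written down as a simple rule of thumb for choosing a connector type per document. The sketch below is an illustration, not a recommendation from the SDK: `DocumentTraits` and `choosePdfPreparation` are hypothetical names, and the decision rule (complex layouts always go to the vision model, simple text goes to OCR only when cost matters) is an assumption.

```typescript
type PdfPreparationChoice = 'MODEL_DOCUMENT' | 'AZURE_DOCUMENT_INTELLIGENCE';

interface DocumentTraits {
  hasComplexLayout: boolean; // tables, charts, multi-column pages
  costSensitive: boolean;    // per-page cost or latency matters more than fidelity
}

// Hypothetical rule of thumb derived from the comparison above.
function choosePdfPreparation(traits: DocumentTraits): PdfPreparationChoice {
  if (traits.hasComplexLayout) {
    return 'MODEL_DOCUMENT'; // vision models handle complex layouts better
  }
  return traits.costSensitive ? 'AZURE_DOCUMENT_INTELLIGENCE' : 'MODEL_DOCUMENT';
}

console.log(choosePdfPreparation({ hasComplexLayout: true, costSensitive: true }));
console.log(choosePdfPreparation({ hasComplexLayout: false, costSensitive: true }));
```

The returned string matches the service type names listed under Parameters; in real code it would be mapped to the corresponding `FilePreparationServiceTypes` member.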

Best Vision Models for Preparation

Best for PDFs:

  • GPT-4o - Best overall, good speed/quality balance

  • Claude 3.7 Sonnet - Excellent for complex documents

  • Gemini 2.0 Flash - Fast, good quality, lower cost

Best for Audio/Video:

  • Deepgram Nova 2 - Fast, accurate transcription

  • Assembly AI - Good for speaker diarization

// GPT-4o for PDFs
const gpt4oSpec = await graphlit.createSpecification({
  name: 'GPT-4o Preparation',
  type: SpecificationTypes.Preparation,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K
  }
});

// Deepgram for audio
const audioWorkflow = {
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Deepgram,
        deepgram: {
          model: DeepgramModels.Nova2
        }
      }
    }]
  }
};

Include Images and Tables

const workflowInput: WorkflowInput = {
  name: 'PDF with Images and Tables',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.ModelDocument,
        modelDocument: {
          specification: { id: specId },
          includeImages: true,   // Extract images
          includeTables: true    // Parse tables into markdown
        }
      }
    }]
  }
};

Multi-Job Preparation

// Different preparation for different file types
const workflowInput: WorkflowInput = {
  name: 'Multi-Format Preparation',
  preparation: {
    jobs: [
      {
        // Job 1: PDFs and images with vision model
        connector: {
          type: FilePreparationServiceTypes.ModelDocument,
          modelDocument: {
            specification: { id: visionSpecId }
          }
        }
      },
      {
        // Job 2: Audio/video with Deepgram
        connector: {
          type: FilePreparationServiceTypes.Deepgram,
          deepgram: {
            model: DeepgramModels.Nova2
          }
        }
      }
    ]
  }
};

Variations

1. Basic PDF Preparation

Simplest preparation workflow:

const workflowInput: WorkflowInput = {
  name: 'Basic PDF Prep',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.ModelDocument,
        modelDocument: {
          specification: { id: specId }
        }
      }
    }]
  }
};

const response = await graphlit.createWorkflow(workflowInput);

2. Audio Transcription with Deepgram

Transcribe audio files:

const workflowInput: WorkflowInput = {
  name: 'Audio Transcription',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.Deepgram,
        deepgram: {
          model: DeepgramModels.Nova2,
          enableSpeakerDiarization: true  // Identify speakers
        }
      }
    }]
  }
};

3. Combined Preparation + Extraction

Prepare content then extract entities:

const workflowInput: WorkflowInput = {
  name: 'Prepare and Extract',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.ModelDocument,
        modelDocument: {
          specification: { id: visionSpecId }
        }
      }
    }]
  },
  extraction: {
    jobs: [{
      connector: {
        type: EntityExtractionServiceTypes.ModelText,
        modelText: {
          specification: { id: extractionSpecId }
        }
      }
    }]
  }
};

// Preparation runs first, extraction after

4. High-Quality PDF Extraction with Claude

Use Claude for best quality:

// Create Claude specification
const claudeSpec = await graphlit.createSpecification({
  name: 'Claude Sonnet 3.7',
  type: SpecificationTypes.Preparation,
  serviceType: ModelServiceTypes.Anthropic,
  anthropic: {
    model: AnthropicModels.Claude_3_7Sonnet
  }
});

// Create workflow
const workflowInput: WorkflowInput = {
  name: 'High-Quality PDF Prep',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.ModelDocument,
        modelDocument: {
          specification: { id: claudeSpec.createSpecification.id },
          includeImages: true,
          includeTables: true
        }
      }
    }]
  }
};

5. Budget-Friendly with Gemini Flash

Lower cost with Gemini:

const geminiSpec = await graphlit.createSpecification({
  name: 'Gemini 2.0 Flash',
  type: SpecificationTypes.Preparation,
  serviceType: ModelServiceTypes.Google,
  google: {
    model: GoogleModels.Gemini_2_0_Flash
  }
});

const workflowInput: WorkflowInput = {
  name: 'Budget PDF Prep',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.ModelDocument,
        modelDocument: {
          specification: { id: geminiSpec.createSpecification.id }
        }
      }
    }]
  }
};

6. Azure Document Intelligence

Traditional OCR approach:

const workflowInput: WorkflowInput = {
  name: 'Azure OCR',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.AzureDocumentIntelligence,
        azureDocument: {
          model: AzureDocumentIntelligenceModels.Layout
        }
      }
    }]
  }
};

Common Issues

Issue: Poor markdown quality from PDFs
Solution: Use a vision model (GPT-4o, Claude) instead of traditional OCR, and enable includeImages and includeTables.

Issue: "Specification not found" error
Solution: Create the specification with type: SpecificationTypes.Preparation before creating the workflow, then reference its ID in modelDocument.specification.

Issue: Preparation is too slow
Solution: Use a faster model (Gemini Flash, GPT-4o-mini), or ingest with isSynchronous set to false and let processing finish in the background.

Issue: Tables not extracted properly
Solution: Set includeTables: true and use a vision model. Traditional OCR struggles with tables.

Issue: Workflow doesn't apply to content
Solution: Pass the workflow to ingestUri(). Workflows apply only at ingestion time, not retroactively to content that was already ingested.
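
For the slow-preparation case, asynchronous ingestion usually means polling until the content is ready before reading its markdown. The helper below is a minimal sketch of that waiting loop, assuming nothing about the SDK: `pollUntilDone`, `PollOptions`, and `makeFakeCheck` are hypothetical names, and `isDone` stands in for whatever status check your client exposes (for example, a wrapper around a content-status query, if your SDK version provides one).

```typescript
interface PollOptions {
  intervalMs: number;  // delay between checks
  maxAttempts: number; // give up after this many checks
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Calls `isDone` until it resolves to true or the attempts run out.
// Resolves to true if the work finished, false if polling gave up.
async function pollUntilDone(
  isDone: () => Promise<boolean>,
  { intervalMs, maxAttempts }: PollOptions
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    if (await isDone()) return true;
    if (attempt < maxAttempts) await sleep(intervalMs);
  }
  return false;
}

// Fake status check for illustration: reports done on the nth call.
function makeFakeCheck(doneOnCall: number): () => Promise<boolean> {
  let calls = 0;
  return async () => ++calls >= doneOnCall;
}

pollUntilDone(makeFakeCheck(3), { intervalMs: 10, maxAttempts: 5 }).then((finished) =>
  console.log(`finished: ${finished}`)
);
```

In real use, `isDone` would query the content by the ID returned from ingestion, and the interval would be seconds rather than milliseconds.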

Production Example

Complete preparation pipeline:

// 1. Create vision specification
const spec = await graphlit.createSpecification({
  name: 'GPT-4o Vision',
  type: SpecificationTypes.Preparation,
  serviceType: ModelServiceTypes.OpenAi,
  openAI: {
    model: OpenAiModels.Gpt4O_128K
  }
});

// 2. Create preparation workflow
const workflow = await graphlit.createWorkflow({
  name: 'PDF Preparation',
  preparation: {
    jobs: [{
      connector: {
        type: FilePreparationServiceTypes.ModelDocument,
        modelDocument: {
          specification: { id: spec.createSpecification.id },
          includeImages: true,
          includeTables: true
        }
      }
    }]
  }
});

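// 3. Ingest PDF with workflow
const pdfUri = 'https://example.com/document.pdf';
const content = await graphlit.ingestUri(
  pdfUri,
  undefined,  // name
  undefined,  // id
  undefined,  // identifier
  true,       // isSynchronous
  { id: workflow.createWorkflow.id }  // workflow
);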

// 4. Get extracted markdown
const result = await graphlit.getContent(content.ingestUri.id);
console.log(result.content.markdown); // High-quality markdown
