Document Processing
Overview
The Document Processing L3 constructs provide a layered architectural approach for intelligent document processing workflows. The system offers multiple implementation levels so that you can customize at whichever layer suits your requirements, from abstract base classes to fully featured agentic processing with tool integration.
You can leverage the following constructs:
- BaseDocumentProcessing: Abstract foundation requiring custom step implementations
- BedrockDocumentProcessing: Ready-to-use genAI document processing implementation with Amazon Bedrock
- AgenticDocumentProcessing: Advanced agentic capabilities with S3 tool storage
All implementations share common infrastructure: Step Functions workflow, DynamoDB metadata storage, EventBridge integration, and built-in observability.
Components
The following are the key components of this L3 Construct:
Ingress Adapter
The ingress adapter is an interface that lets you define where the source data comes from. A default implementation, QueuedS3Adapter, is available and can be used as a reference.
The QueuedS3Adapter does the following:
- Creates a new S3 Bucket (if one is not provided during instantiation)
- Creates two SQS queues: the primary queue, which receives events from the S3 bucket, and a dead-letter queue in case of processing failure.
- Creates a Lambda function that will consume from SQS and trigger the Document Processing State Machine.
- Provides State Machine chain to handle both success and failure scenarios. In the case of the QueuedS3Adapter, the following are the expected behaviors:
  - Success: move the file to the processed prefix and delete from the raw prefix.
  - Failure: move the file to the failed prefix and delete from the raw prefix.
- IAM PolicyStatement and KMS encrypt and decrypt permissions for the classification/processing Lambda functions as well as the State Machine role.
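The success and failure handling above can be sketched as a small key-mapping helper. This is only an illustration, not the adapter's actual code: it assumes the prefixes are the literal strings raw/, processed/ and failed/, and the names destination_key and move_plan are hypothetical. Because S3 has no native move operation, the sketch expresses a move as a copy followed by a delete.

```python
from typing import Dict, List

# Assumed prefix for newly uploaded documents (illustrative value).
RAW_PREFIX = "raw/"


def destination_key(key: str, succeeded: bool) -> str:
    """Map a key under raw/ to its processed/ or failed/ counterpart."""
    if not key.startswith(RAW_PREFIX):
        raise ValueError(f"expected a key under {RAW_PREFIX!r}, got {key!r}")
    target_prefix = "processed/" if succeeded else "failed/"
    return target_prefix + key[len(RAW_PREFIX):]


def move_plan(bucket: str, key: str, succeeded: bool) -> List[Dict[str, str]]:
    """Describe the move as the two S3 operations it consists of."""
    return [
        {
            "action": "copy",
            "bucket": bucket,
            "source": key,
            "destination": destination_key(key, succeeded),
        },
        # The original under raw/ is removed once the copy exists.
        {"action": "delete", "bucket": bucket, "key": key},
    ]
```

For example, move_plan("my-bucket", "raw/sample-invoice.jpg", True) describes a copy to processed/sample-invoice.jpg followed by a delete of the raw/ object.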
If no Ingress Adapter is provided, the Document Processing workflow uses the QueuedS3Adapter as the default implementation. In that case, S3 is the input point that triggers the document processing workflow.
Supporting other types of ingress (e.g. streaming, micro-batching, or even on-premises data sources) requires implementing the IAdapter interface. Once implemented, instantiate the new ingress adapter and pass it to the document processing L3 construct.
Workflow
At a high level, regardless of which implementation you use, the core workflow is structured as follows:
- Classification: Determines document type/category for routing decisions
- Processing / Extraction: Extracts and processes information from the document
- Enrichment: Enhances extracted data with additional context or validation
- Post Processing: Final processing for formatting output or triggering downstream systems
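The ordering of the four stages can be pictured with a plain Python sketch. It does not reflect how the Step Functions workflow is built; run_workflow and the stage names used as dictionary keys are hypothetical. It assumes classification and processing are always present while enrichment and post processing may be omitted.

```python
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
Step = Callable[[State], Any]


def run_workflow(
    document: State,
    classify: Step,
    process: Step,
    enrich: Optional[Step] = None,
    post_process: Optional[Step] = None,
) -> Tuple[State, List[str]]:
    """Run the stages in order, skipping optional stages that are not set."""
    stages = [
        ("classification", classify),
        ("processing", process),
        ("enrichment", enrich),
        ("post_processing", post_process),
    ]
    state = dict(document)
    executed: List[str] = []
    for name, step in stages:
        if step is None:
            continue  # optional stage not configured
        # Each stage sees everything produced by the stages before it.
        state = {**state, name: step(state)}
        executed.append(name)
    return state, executed
```

Each stage receives the accumulated state, which is why a processing step can base its behavior on the classification produced earlier.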
Here is an example of the workflow and customisability points:
Payload Structure
For S3-based ingress, the following is an example payload that would be sent to the state machine:
{
"documentId": "auto-generated document id",
"contentType": "file",
"content": {
"location": "s3",
"bucket": "s3 bucket name",
"key": "s3 key including prefix",
"filename": "filename"
},
"eventTime": "s3 event time",
"eventName": "s3 event name",
"source": "sqs-consumer"
}
For non-file-based ingress (e.g. streaming), the following is an example payload:
{
"documentId": "auto-generated document id",
"contentType": "data",
"content": {
"data": "<content>"
},
"eventTime": "s3 event time",
"eventName": "s3 event name",
"source": "sqs-consumer"
}
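A step that consumes these payloads needs to branch on contentType. The sketch below builds a file payload with the fields shown above and resolves either kind of payload to something a downstream step could use. The helper names file_payload and resolve_content are hypothetical, and deriving filename from the last segment of the key is an assumption based on the example values.

```python
from typing import Any, Dict


def file_payload(
    document_id: str, bucket: str, key: str, event_time: str, event_name: str
) -> Dict[str, Any]:
    """Build a payload shaped like the S3-based example."""
    return {
        "documentId": document_id,
        "contentType": "file",
        "content": {
            "location": "s3",
            "bucket": bucket,
            "key": key,
            # Assumption: the filename is the last path segment of the key.
            "filename": key.rsplit("/", 1)[-1],
        },
        "eventTime": event_time,
        "eventName": event_name,
        "source": "sqs-consumer",
    }


def resolve_content(payload: Dict[str, Any]) -> str:
    """Return an S3 URI for file payloads, or the inline data otherwise."""
    content_type = payload["contentType"]
    content = payload["content"]
    if content_type == "file":
        return f"s3://{content['bucket']}/{content['key']}"
    if content_type == "data":
        return content["data"]
    raise ValueError(f"unsupported contentType: {content_type!r}")
```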
Events (via EventBridge)
If an EventBridge broker is configured as part of the parameters of the document processing L3 construct, the deployed workflow automatically includes points at which it sends events to the configured event bus.
The following are example structures of the events:
Successful
{
"Detail": {
"documentId": "sample-invoice-1759811188513",
"classification": "INVOICE",
"contentType": "file",
"content": "{\"location\":\"s3\",\"bucket\":\"bedrockdocumentprocessing-bedrockdocumentprocessin-24sh7hz30zoi\",\"key\":\"raw/sample-invoice.jpg\",\"filename\":\"sample-invoice.jpg\"}"
},
"DetailType": "document-processed-successful",
"EventBusName": "<ARN of the event bus>",
"Source": "intelligent-document-processing"
}
Failure
{
"Detail": {
"documentId": "sample-invoice-1759811188513",
"contentType": "file",
"content": "{\"location\":\"s3\",\"bucket\":\"bedrockdocumentprocessing-bedrockdocumentprocessin-24sh7hz30zoi\",\"key\":\"raw/sample-invoice.jpg\",\"filename\":\"sample-invoice.jpg\"}"
},
"DetailType": "document-processing-failed",
"EventBusName": "<ARN of the event bus>",
"Source": "intelligent-document-processing"
}
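Note that in both examples the content field inside Detail is a JSON-encoded string rather than a nested object, so a consumer has to decode it a second time. A minimal sketch of that decoding, working on the event shape exactly as shown above (parse_event is a hypothetical helper name):

```python
import json
from typing import Any, Dict


def parse_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the useful fields from an event shaped like the examples."""
    detail = event["Detail"]
    content = detail["content"]
    if isinstance(content, str):
        # content arrives as a JSON string and needs its own decode step.
        content = json.loads(content)
    return {
        "succeeded": event["DetailType"] == "document-processed-successful",
        "documentId": detail["documentId"],
        # classification is only present in the successful example.
        "classification": detail.get("classification"),
        "content": content,
    }
```

A target invoked through an EventBridge rule typically receives lowercase detail and detail-type keys instead, so the lookups would need adjusting for that case.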
BaseDocumentProcessing Construct
The BaseDocumentProcessing construct is the foundational abstract class for all document processing implementations. It provides complete serverless document processing infrastructure and takes care of the following:
- Initializes and calls the necessary hooks to properly integrate the Ingress Adapter
- Initializes the DynamoDB metadata table
- Initializes and applies the various observability-related configurations
- Provides the core workflow scaffolding
Implementation Requirements
If you're directly extending this abstract class, you must provide concrete implementations of the following:
- classificationStep(): Document type classification (required)
  - ResultPath should be $.classificationResult
- processingStep(): Data extraction and processing (required)
  - ResultPath should be $.processingResult
- enrichmentStep(): Optional data enrichment
  - ResultPath should be $.enrichedResult
- postProcessingStep(): Optional final processing
  - ResultPath should be $.postProcessedResult
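The ResultPath values matter because Step Functions inserts a state's result into the state's input at that path and passes the combined object on, so later steps can still read the original payload alongside earlier results. The sketch below imitates that merge for simple dotted paths only; apply_result_path is a hypothetical helper, not part of the construct, and the result values are made up for illustration.

```python
import copy
from typing import Any, Dict


def apply_result_path(
    state_input: Dict[str, Any], result: Any, result_path: str
) -> Dict[str, Any]:
    """Return a copy of state_input with result placed at result_path."""
    if not result_path.startswith("$."):
        raise ValueError("this sketch only supports paths of the form $.a.b")
    keys = result_path[2:].split(".")
    output = copy.deepcopy(state_input)  # leave the caller's input untouched
    node = output
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = result
    return output


# Illustrative values only: each step adds its result next to the payload.
state = {"documentId": "doc-1", "contentType": "file"}
state = apply_result_path(state, "INVOICE", "$.classificationResult")
state = apply_result_path(state, {"total": 42}, "$.processingResult")
```

After both calls, state still holds documentId and contentType, with classificationResult and processingResult added beside them.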
Each function must return one of the following:
Configuration Options
- Ingress Adapter: Custom trigger mechanism (default: QueuedS3Adapter)
- Workflow Timeout: Maximum execution time (default: 30 minutes)
- Network: Optional VPC deployment with subnet selection
- Encryption Key: Custom KMS key or auto-generated
- EventBridge Broker: Optional event publishing for integration
- Observability: Enable logging, tracing, and metrics
BedrockDocumentProcessing Construct
The BedrockDocumentProcessing construct extends BaseDocumentProcessing and uses Amazon Bedrock's InvokeModel for the classification and processing steps.
Key Features
- Inherits: All base infrastructure (S3, SQS, DynamoDB, Step Functions)
- Implements: Classification and processing steps using Bedrock models
- Adds: Cross-region inference, custom prompts, Lambda integration
Configuration Options
You can customize the following:
- Classification Model: Bedrock model for document classification (default: Claude 3.7 Sonnet)
- Processing Model: Bedrock model for data extraction (default: Claude 3.7 Sonnet)
- Custom Prompts: Override default classification and processing prompts
- Cross-Region Inference: Enable inference profiles for high availability
- Step Timeouts: Individual step timeout configuration (default: 5 minutes)
- Lambda Functions: Optional enrichment and post-processing functions
Example Implementations
AgenticDocumentProcessing Construct
The AgenticDocumentProcessing construct extends BedrockDocumentProcessing to provide advanced agentic capabilities with dynamic tool integration.
Key Features
- Inherits: All Bedrock functionality (models, prompts, cross-region inference)
- Reuses: Classification step from parent class unchanged
- Overrides: Processing step with agentic capabilities and tool integration
- Enhances: Memory allocation (1024MB) for complex tool operations
Tools (and their dependencies) can be passed as parameters to this L3 construct, expanding what the agent can do.
Configuration Options
- Tools Bucket: S3 bucket containing processing tools and utilities
- Tools Location: Array of S3 paths to specific tool sets
- Agent System Prompt: Custom instructions for agent behavior
- Lambda Layers: Additional dependencies for tool execution
- Processing Prompt: Override default processing instructions