@cdklabs/genai-idp-bedrock-llm-processor
Constructs
BedrockLlmProcessor
- Implements: IBedrockLlmProcessor
This processor implements an intelligent document processing workflow that uses Amazon Bedrock with Nova or Claude models for both page classification/grouping and information extraction.
The workflow consists of three main processing steps:
- OCR processing using Amazon Textract
- Page classification and grouping using a Nova or Claude model via Amazon Bedrock
- Field extraction using a Nova or Claude model via Amazon Bedrock
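The three steps can be pictured as a sequential pipeline. The sketch below is only an illustration with stubbed stages: runOcr, classifyPages, extractFields, and processDocument are hypothetical stand-ins, not functions exported by this package, and the stub logic (a keyword check, a page count) merely takes the place of the Textract and Bedrock calls.

```typescript
// Hypothetical stand-in types; none of these names come from the library.
interface Page { pageNumber: number; text: string }
interface Section { docClass: string; pages: Page[] }
interface ExtractionResult { docClass: string; fields: Record<string, string> }

// Step 1: OCR (Amazon Textract in the real workflow). Stubbed: wraps raw text.
function runOcr(rawPages: string[]): Page[] {
  return rawPages.map((text, i) => ({ pageNumber: i + 1, text }));
}

// Step 2: classification and grouping (a Bedrock model in the real workflow).
// Stubbed: a keyword check; consecutive pages of one class form a section.
function classifyPages(pages: Page[]): Section[] {
  const sections: Section[] = [];
  for (const page of pages) {
    const docClass = page.text.includes('Invoice') ? 'invoice' : 'other';
    const last = sections[sections.length - 1];
    if (last !== undefined && last.docClass === docClass) {
      last.pages.push(page);
    } else {
      sections.push({ docClass, pages: [page] });
    }
  }
  return sections;
}

// Step 3: field extraction (a Bedrock model in the real workflow).
// Stubbed: records only the page count of each section.
function extractFields(section: Section): ExtractionResult {
  return {
    docClass: section.docClass,
    fields: { pageCount: String(section.pages.length) },
  };
}

// Run the three steps in order for one document.
function processDocument(rawPages: string[]): ExtractionResult[] {
  return classifyPages(runOcr(rawPages)).map(extractFields);
}
```

Calling processDocument with two invoice pages followed by a cover letter yields two sections, which mirrors how grouping precedes extraction in the workflow described above.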
Initializers
import { BedrockLlmProcessor } from '@cdklabs/genai-idp-bedrock-llm-processor'
new BedrockLlmProcessor(scope: Construct, id: string, props: BedrockLlmProcessorProps)
Name | Type | Description |
---|---|---|
scope | constructs.Construct | No description. |
id | string | No description. |
props | BedrockLlmProcessorProps | No description. |
scope
Required
- Type: constructs.Construct
id
Required
- Type: string
props
Required
- Type: BedrockLlmProcessorProps
Methods
Name | Description |
---|---|
toString | Returns a string representation of this construct. |
metricBedrockEmbeddingMaxRetriesExceeded | Creates a CloudWatch metric for Bedrock embedding requests that exceeded max retries. |
metricBedrockEmbeddingNonRetryableErrors | Creates a CloudWatch metric for Bedrock embedding non-retryable errors. |
metricBedrockEmbeddingRequestLatency | Creates a CloudWatch metric for Bedrock embedding request latency. |
metricBedrockEmbeddingRequestsFailed | Creates a CloudWatch metric for failed Bedrock embedding requests. |
metricBedrockEmbeddingRequestsSucceeded | Creates a CloudWatch metric for successful Bedrock embedding requests. |
metricBedrockEmbeddingRequestsTotal | Creates a CloudWatch metric for total Bedrock embedding requests. |
metricBedrockEmbeddingThrottles | Creates a CloudWatch metric for Bedrock embedding request throttles. |
metricBedrockEmbeddingUnexpectedErrors | Creates a CloudWatch metric for Bedrock embedding unexpected errors. |
metricBedrockMaxRetriesExceeded | Creates a CloudWatch metric for Bedrock requests that exceeded max retries. |
metricBedrockNonRetryableErrors | Creates a CloudWatch metric for Bedrock non-retryable errors. |
metricBedrockRequestLatency | Creates a CloudWatch metric for Bedrock request latency. |
metricBedrockRequestsFailed | Creates a CloudWatch metric for failed Bedrock requests. |
metricBedrockRequestsSucceeded | Creates a CloudWatch metric for successful Bedrock requests. |
metricBedrockRequestsTotal | Creates a CloudWatch metric for total Bedrock requests. |
metricBedrockRetrySuccess | Creates a CloudWatch metric for successful Bedrock request retries. |
metricBedrockThrottles | Creates a CloudWatch metric for Bedrock request throttles. |
metricBedrockTotalLatency | Creates a CloudWatch metric for total Bedrock request latency. |
metricBedrockUnexpectedErrors | Creates a CloudWatch metric for Bedrock unexpected errors. |
metricCacheReadInputTokens | Creates a CloudWatch metric for cache read input tokens. |
metricCacheWriteInputTokens | Creates a CloudWatch metric for cache write input tokens. |
metricInputDocumentPages | Creates a CloudWatch metric for input document pages processed. |
metricInputDocuments | Creates a CloudWatch metric for input documents processed. |
metricInputTokens | Creates a CloudWatch metric for input tokens consumed. |
metricOutputTokens | Creates a CloudWatch metric for output tokens generated. |
metricTotalTokens | Creates a CloudWatch metric for total tokens used. |
toString
public toString(): string
Returns a string representation of this construct.
metricBedrockEmbeddingMaxRetriesExceeded
public metricBedrockEmbeddingMaxRetriesExceeded(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock embedding requests that exceeded max retries.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockEmbeddingNonRetryableErrors
public metricBedrockEmbeddingNonRetryableErrors(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock embedding non-retryable errors.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockEmbeddingRequestLatency
public metricBedrockEmbeddingRequestLatency(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock embedding request latency.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockEmbeddingRequestsFailed
public metricBedrockEmbeddingRequestsFailed(props?: MetricOptions): Metric
Creates a CloudWatch metric for failed Bedrock embedding requests.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockEmbeddingRequestsSucceeded
public metricBedrockEmbeddingRequestsSucceeded(props?: MetricOptions): Metric
Creates a CloudWatch metric for successful Bedrock embedding requests.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockEmbeddingRequestsTotal
public metricBedrockEmbeddingRequestsTotal(props?: MetricOptions): Metric
Creates a CloudWatch metric for total Bedrock embedding requests.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockEmbeddingThrottles
public metricBedrockEmbeddingThrottles(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock embedding request throttles.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockEmbeddingUnexpectedErrors
public metricBedrockEmbeddingUnexpectedErrors(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock embedding unexpected errors.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockMaxRetriesExceeded
public metricBedrockMaxRetriesExceeded(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock requests that exceeded max retries.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockNonRetryableErrors
public metricBedrockNonRetryableErrors(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock non-retryable errors.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockRequestLatency
public metricBedrockRequestLatency(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock request latency.
Measures individual request processing time.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockRequestsFailed
public metricBedrockRequestsFailed(props?: MetricOptions): Metric
Creates a CloudWatch metric for failed Bedrock requests.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockRequestsSucceeded
public metricBedrockRequestsSucceeded(props?: MetricOptions): Metric
Creates a CloudWatch metric for successful Bedrock requests.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockRequestsTotal
public metricBedrockRequestsTotal(props?: MetricOptions): Metric
Creates a CloudWatch metric for total Bedrock requests.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockRetrySuccess
public metricBedrockRetrySuccess(props?: MetricOptions): Metric
Creates a CloudWatch metric for successful Bedrock request retries.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockThrottles
public metricBedrockThrottles(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock request throttles.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockTotalLatency
public metricBedrockTotalLatency(props?: MetricOptions): Metric
Creates a CloudWatch metric for total Bedrock request latency.
Measures total request processing time including retries.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricBedrockUnexpectedErrors
public metricBedrockUnexpectedErrors(props?: MetricOptions): Metric
Creates a CloudWatch metric for Bedrock unexpected errors.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricCacheReadInputTokens
public metricCacheReadInputTokens(props?: MetricOptions): Metric
Creates a CloudWatch metric for cache read input tokens.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricCacheWriteInputTokens
public metricCacheWriteInputTokens(props?: MetricOptions): Metric
Creates a CloudWatch metric for cache write input tokens.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricInputDocumentPages
public metricInputDocumentPages(props?: MetricOptions): Metric
Creates a CloudWatch metric for input document pages processed.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricInputDocuments
public metricInputDocuments(props?: MetricOptions): Metric
Creates a CloudWatch metric for input documents processed.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricInputTokens
public metricInputTokens(props?: MetricOptions): Metric
Creates a CloudWatch metric for input tokens consumed.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricOutputTokens
public metricOutputTokens(props?: MetricOptions): Metric
Creates a CloudWatch metric for output tokens generated.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricTotalTokens
public metricTotalTokens(props?: MetricOptions): Metric
Creates a CloudWatch metric for total tokens used.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
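Every metric method above takes optional MetricOptions and returns a Metric. A common way to implement this shape in CDK constructs is to merge caller overrides over per-metric defaults. The sketch below mimics that pattern with local stand-in types so that it runs without aws-cdk-lib. The namespace, the metric name, the default statistic and period, and the simplified periodSeconds field are all assumptions for illustration, not values taken from this library.

```typescript
// Local stand-ins for a small subset of aws-cdk-lib's MetricOptions and Metric.
// The real MetricOptions uses a Duration for the period; a number is used here.
interface MetricOptionsLike {
  statistic?: string;
  periodSeconds?: number;
  label?: string;
}

interface MetricLike extends MetricOptionsLike {
  namespace: string;
  metricName: string;
}

// Hypothetical namespace; the value used by the construct is not documented here.
const ASSUMED_NAMESPACE = 'ExampleIdpNamespace';

// Build a metric description: defaults first, caller overrides last.
function makeMetric(metricName: string, props: MetricOptionsLike = {}): MetricLike {
  return {
    namespace: ASSUMED_NAMESPACE,
    metricName,
    statistic: 'Sum',
    periodSeconds: 300,
    ...props,
  };
}

// Mirrors the signature style of metricBedrockThrottles(props?).
// The metric name string is an assumption.
function metricBedrockThrottlesSketch(props?: MetricOptionsLike): MetricLike {
  return makeMetric('BedrockThrottles', props);
}

// No overrides: the assumed defaults apply.
const defaultThrottles = metricBedrockThrottlesSketch();

// Overrides replace only the fields that are supplied.
const minuteThrottles = metricBedrockThrottlesSketch({
  periodSeconds: 60,
  label: 'Throttles per minute',
});
```

With the real construct, the equivalent call would pass aws-cdk-lib MetricOptions (for example a statistic or a period) to one of the metric methods and use the returned Metric in an alarm or dashboard.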
Static Functions
Name | Description |
---|---|
isConstruct | Checks if x is a construct. |
isConstruct
import { BedrockLlmProcessor } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessor.isConstruct(x: any)
Checks if x is a construct.
Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.
Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid instanceof and to use this type-testing method instead.
x
Required
- Type: any
Any object.
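The instanceof pitfall described for isConstruct can be reproduced without any dependencies. In the sketch below, ConstructCopyA and ConstructCopyB are hypothetical stand-ins for the Construct class as loaded from two separate copies of the constructs library. The Symbol.for brand is one way such a type test can be implemented; it is an assumption about the general approach, not a quotation of the library's code, and the symbol key is made up.

```typescript
// Symbol.for() returns the same symbol for the same key everywhere in the
// process, so separately loaded copies of a library can share it.
const BRAND = Symbol.for('example.Construct');

// Two classes standing in for Construct from two installed copies.
class ConstructCopyA {
  constructor() {
    Object.defineProperty(this, BRAND, { value: true });
  }
}

class ConstructCopyB {
  constructor() {
    Object.defineProperty(this, BRAND, { value: true });
  }
}

// Brand-based type test: independent of which copy created the object.
function isConstructSketch(x: unknown): boolean {
  return typeof x === 'object' && x !== null && BRAND in x;
}

const fromCopyB = new ConstructCopyB();

// instanceof compares prototype chains, so the other copy's class does not match.
const viaInstanceof = fromCopyB instanceof ConstructCopyA;

// The brand check looks only for the shared symbol, so it still succeeds.
const viaBrand = isConstructSketch(fromCopyB);
```

Here viaInstanceof is false while viaBrand is true, which is the behaviour difference that motivates calling BedrockLlmProcessor.isConstruct(x) rather than testing with instanceof.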
Properties
Name | Type | Description |
---|---|---|
node | constructs.Node | The tree node. |
environment | @cdklabs/genai-idp.IProcessingEnvironment | The processing environment that provides shared infrastructure and services. |
maxProcessingConcurrency | number | The maximum number of documents that can be processed concurrently. |
stateMachine | aws-cdk-lib.aws_stepfunctions.IStateMachine | The Step Functions state machine that orchestrates the document processing workflow. |
node
Required
public readonly node: Node;
- Type: constructs.Node
The tree node.
environment
Required
public readonly environment: IProcessingEnvironment;
- Type: @cdklabs/genai-idp.IProcessingEnvironment
The processing environment that provides shared infrastructure and services.
Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.
maxProcessingConcurrency
Required
public readonly maxProcessingConcurrency: number;
- Type: number
The maximum number of documents that can be processed concurrently.
Controls the throughput and resource utilization of the document processing system.
stateMachine
Required
public readonly stateMachine: IStateMachine;
- Type: aws-cdk-lib.aws_stepfunctions.IStateMachine
The Step Functions state machine that orchestrates the document processing workflow.
Manages the sequence of processing steps and handles error conditions. This state machine is triggered for each document that needs processing and coordinates the entire extraction pipeline.
Structs
BedrockLlmProcessorConfigurationDefinitionOptions
Options for configuring the Bedrock LLM processor configuration definition.
Allows customization of classification, extraction, evaluation, summarization, and OCR stages.
Initializer
import { BedrockLlmProcessorConfigurationDefinitionOptions } from '@cdklabs/genai-idp-bedrock-llm-processor'
const bedrockLlmProcessorConfigurationDefinitionOptions: BedrockLlmProcessorConfigurationDefinitionOptions = { ... }
Properties
Name | Type | Description |
---|---|---|
assessmentModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the assessment stage. |
classificationMethod | ClassificationMethod | Optional classification method to use for document categorization. |
classificationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the classification stage. |
evaluationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the evaluation stage. |
extractionModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the extraction stage. |
ocrModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the OCR stage when using Bedrock-based OCR. |
summarizationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the summarization stage. |
assessmentModel
Optional
public readonly assessmentModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional model for the assessment stage.
classificationMethod
Optional
public readonly classificationMethod: ClassificationMethod;
- Type: ClassificationMethod
Optional classification method to use for document categorization.
Determines how documents are analyzed and categorized before extraction.
classificationModel
Optional
public readonly classificationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional model for the classification stage.
evaluationModel
Optional
public readonly evaluationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional model for the evaluation stage.
extractionModel
Optional
public readonly extractionModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional model for the extraction stage.
ocrModel
Optional
public readonly ocrModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional model for the OCR stage when using Bedrock-based OCR.
Only used when the OCR backend is set to 'bedrock' in the configuration.
summarizationModel
Optional
public readonly summarizationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional model for the summarization stage.
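One way to picture how these options interact with a base configuration is a merge that overrides only the stages for which a model is supplied, and that honours the OCR model only when the OCR backend is 'bedrock'. The sketch below uses plain strings in place of IInvokable models and a hypothetical StageConfig shape (including the ocrBackend field and its 'textract' value). The package's actual configuration schema is not shown in this reference, so treat every name here as illustrative.

```typescript
// Stand-in for the options struct: model IDs as strings instead of IInvokable.
interface StageModelOptions {
  classificationModel?: string;
  extractionModel?: string;
  summarizationModel?: string;
  ocrModel?: string;
}

// Hypothetical resolved configuration; field names are illustrative only.
interface StageConfig {
  ocrBackend: 'textract' | 'bedrock';
  classificationModel: string;
  extractionModel: string;
  summarizationModel: string;
  ocrModel?: string;
}

// Override only the stages for which the caller supplied a model.
function applyStageOptions(base: StageConfig, options: StageModelOptions = {}): StageConfig {
  const merged: StageConfig = {
    ...base,
    classificationModel: options.classificationModel ?? base.classificationModel,
    extractionModel: options.extractionModel ?? base.extractionModel,
    summarizationModel: options.summarizationModel ?? base.summarizationModel,
  };
  // The OCR model is only relevant when the OCR backend is 'bedrock'.
  if (base.ocrBackend === 'bedrock' && options.ocrModel !== undefined) {
    merged.ocrModel = options.ocrModel;
  }
  return merged;
}
```

With a 'textract' backend, an ocrModel passed in the options is ignored by this sketch, matching the note that the OCR model is only used for Bedrock-based OCR.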
BedrockLlmProcessorProps
Configuration properties for the Bedrock LLM document processor.
Bedrock LLM Processor uses custom extraction with Amazon Bedrock models, providing flexible document processing capabilities for a wide range of document types. This processor is ideal when you need more control over the extraction process and want to implement custom classification and extraction logic using foundation models directly.
Bedrock LLM Processor offers a balance between customization and implementation complexity, allowing you to define custom extraction schemas and prompts while leveraging the power of Amazon Bedrock foundation models.
Initializer
import { BedrockLlmProcessorProps } from '@cdklabs/genai-idp-bedrock-llm-processor'
const bedrockLlmProcessorProps: BedrockLlmProcessorProps = { ... }
Properties
Name | Type | Description |
---|---|---|
environment | @cdklabs/genai-idp.IProcessingEnvironment | The processing environment that provides shared infrastructure and services. |
maxProcessingConcurrency | number | The maximum number of documents that can be processed concurrently. |
configuration | IBedrockLlmProcessorConfiguration | Configuration for the Bedrock LLM document processor. |
assessmentGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to assessment model interactions. |
classificationGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to classification model interactions. |
classificationMaxWorkers | number | The maximum number of concurrent workers for document classification. |
customPromptGenerator | @cdklabs/genai-idp.ICustomPromptGenerator | Optional custom prompt generator for injecting business logic into extraction processing. |
enableHitl | boolean | Enable Human In The Loop (A2I) for document review. |
evaluationBaselineBucket | aws-cdk-lib.aws_s3.IBucket | Optional S3 bucket containing baseline documents for evaluation. |
extractionGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to extraction model interactions. |
ocrGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to OCR model interactions. |
ocrMaxWorkers | number | The maximum number of concurrent workers for OCR processing. |
sageMakerA2IReviewPortalUrl | string | Optional SageMaker A2I Review Portal URL for HITL workflows. |
summarizationGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to summarization model interactions. |
environment
Required
public readonly environment: IProcessingEnvironment;
- Type: @cdklabs/genai-idp.IProcessingEnvironment
The processing environment that provides shared infrastructure and services.
Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.
maxProcessingConcurrency
Optional
public readonly maxProcessingConcurrency: number;
- Type: number
- Default: 100 concurrent workflows
The maximum number of documents that can be processed concurrently.
Controls the throughput and resource utilization of the document processing system.
configuration
Required
public readonly configuration: IBedrockLlmProcessorConfiguration;
- Type: IBedrockLlmProcessorConfiguration
Configuration for the Bedrock LLM document processor.
Provides customization options for the processing workflow, including schema definitions, prompts, and evaluation settings.
assessmentGuardrail
Optional
public readonly assessmentGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to assessment model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
classificationGuardrail
Optional
public readonly classificationGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to classification model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
classificationMaxWorkers
Optional
public readonly classificationMaxWorkers: number;
- Type: number
- Default: 20
The maximum number of concurrent workers for document classification.
Controls parallelism during the classification phase to optimize throughput while managing resource utilization.
customPromptGenerator
Optional
public readonly customPromptGenerator: ICustomPromptGenerator;
- Type: @cdklabs/genai-idp.ICustomPromptGenerator
- Default: No custom prompt generator is used
Optional custom prompt generator for injecting business logic into extraction processing.
When provided, this Lambda function will be called to customize prompts based on document content, business rules, or external system integrations.
enableHitl
Optional
public readonly enableHitl: boolean;
- Type: boolean
- Default: false
Enable Human In The Loop (A2I) for document review.
evaluationBaselineBucket
Optional
public readonly evaluationBaselineBucket: IBucket;
- Type: aws-cdk-lib.aws_s3.IBucket
- Default: No evaluation baseline bucket is configured
Optional S3 bucket containing baseline documents for evaluation.
Used as ground truth when evaluating extraction accuracy by comparing extraction results against known correct values.
extractionGuardrail
Optional
public readonly extractionGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to extraction model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
ocrGuardrail
Optional
public readonly ocrGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to OCR model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
ocrMaxWorkers
Optional
public readonly ocrMaxWorkers: number;
- Type: number
- Default: 20
The maximum number of concurrent workers for OCR processing.
Controls parallelism during the text extraction phase to optimize throughput while managing resource utilization.
sageMakerA2IReviewPortalUrl
Optional
public readonly sageMakerA2IReviewPortalUrl: string;
- Type: string
- Default: No A2I review portal URL is configured
Optional SageMaker A2I Review Portal URL for HITL workflows.
Used to provide human reviewers with access to the A2I review interface for document validation and correction workflows.
summarizationGuardrail
Optional
public readonly summarizationGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to summarization model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
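The documented defaults for the optional tuning properties of BedrockLlmProcessorProps can be summarized as a small resolver. This is a sketch with a local stand-in interface that covers only four of the properties; resolveTuningDefaults is a hypothetical helper, and the construct's internal handling of defaults may differ.

```typescript
// Local stand-in for a subset of BedrockLlmProcessorProps.
interface TuningProps {
  maxProcessingConcurrency?: number;
  classificationMaxWorkers?: number;
  ocrMaxWorkers?: number;
  enableHitl?: boolean;
}

// Fill in the defaults listed in the property reference:
// 100 concurrent workflows, 20 classification workers, 20 OCR workers, HITL off.
function resolveTuningDefaults(props: TuningProps = {}): Required<TuningProps> {
  return {
    maxProcessingConcurrency: props.maxProcessingConcurrency ?? 100,
    classificationMaxWorkers: props.classificationMaxWorkers ?? 20,
    ocrMaxWorkers: props.ocrMaxWorkers ?? 20,
    enableHitl: props.enableHitl ?? false,
  };
}

// Nothing supplied: every documented default applies.
const resolvedDefaults = resolveTuningDefaults();

// Supplied values win; everything else falls back to its default.
const resolvedCustom = resolveTuningDefaults({ ocrMaxWorkers: 5, enableHitl: true });
```

Using the nullish coalescing operator rather than || keeps an explicit false for enableHitl from being replaced by a default.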
Classes
BedrockLlmProcessorConfiguration
- Implements: IBedrockLlmProcessorConfiguration
Configuration management for Bedrock LLM document processing using custom extraction with Bedrock models.
This construct creates and manages the configuration for Bedrock LLM document processing, including schema definitions, classification prompts, extraction prompts, and configuration values. It provides a centralized way to manage document classes, extraction schemas, and model parameters.
Initializers
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
new BedrockLlmProcessorConfiguration(definition: IBedrockLlmProcessorConfigurationDefinition)
Name | Type | Description |
---|---|---|
definition | IBedrockLlmProcessorConfigurationDefinition | The configuration definition instance. |
definition
Required
- Type: IBedrockLlmProcessorConfigurationDefinition
The configuration definition instance.
Methods
Name | Description |
---|---|
bind | Binds the configuration to a processor instance. |
bind
public bind(processor: IBedrockLlmProcessor): IBedrockLlmProcessorConfigurationDefinition
Binds the configuration to a processor instance.
This method applies the configuration to the processor.
processor
Required
- Type: IBedrockLlmProcessor
Static Functions
Name | Description |
---|---|
bankStatementSample | Creates a configuration for bank statement processing. |
checkboxedAttributesExtraction | Creates a configuration for checkbox extraction. |
criteriaValidation | Creates a configuration for criteria validation. |
fewShotExampleWithMultimodalPageClassification | Creates a configuration with few-shot examples and multimodal page classification. |
fromFile | Creates a configuration from a YAML file. |
lendingPackageSample | Creates a configuration for lending package processing. |
medicalRecordsSummarization | Creates a configuration for medical records summarization. |
rvlCdipPackageSample | Creates a configuration for RVL-CDIP package processing. |
rvlCdipPackageSampleWithFewShotExamples | Creates a configuration for RVL-CDIP package processing with few-shot examples. |
bankStatementSample
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.bankStatementSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration for bank statement processing.
options
Optional
Optional configuration options.
checkboxedAttributesExtraction
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.checkboxedAttributesExtraction(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration for checkbox extraction.
options
Optional
Optional configuration options.
criteriaValidation
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.criteriaValidation(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration for criteria validation.
options
Optional
Optional configuration options.
fewShotExampleWithMultimodalPageClassification
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.fewShotExampleWithMultimodalPageClassification(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration with few-shot examples and multimodal page classification.
options
Optional
Optional configuration options.
fromFile
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.fromFile(filePath: string, options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration from a YAML file.
filePath
Required
- Type: string
Path to the YAML configuration file.
options
Optional
Optional configuration options to override file settings.
lendingPackageSample
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.lendingPackageSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration for lending package processing.
options
Optional
Optional configuration options.
medicalRecordsSummarization
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.medicalRecordsSummarization(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration for medical records summarization.
options
Optional
Optional configuration options.
rvlCdipPackageSample
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.rvlCdipPackageSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration for RVL-CDIP package processing.
options
Optional
Optional configuration options.
rvlCdipPackageSampleWithFewShotExamples
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfiguration.rvlCdipPackageSampleWithFewShotExamples(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration for RVL-CDIP package processing with few-shot examples.
options
Optional
Optional configuration options.
BedrockLlmProcessorConfigurationDefinition
Configuration definition for Bedrock LLM (Pattern 2) document processing.
Provides methods to create and customize configuration for Bedrock LLM processing.
Initializers
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
new BedrockLlmProcessorConfigurationDefinition()
Static Functions
Name | Description |
---|---|
bankStatementSample | Creates a configuration definition for bank statement sample processing. |
checkboxedAttributesExtraction | Creates a configuration definition optimized for checkbox attribute extraction. |
criteriaValidation | Creates a configuration definition for criteria validation processing. |
fewShotExampleWithMultimodalPageClassification | Creates a configuration definition with few-shot examples for multimodal page classification. |
fromFile | Creates a configuration definition from a YAML file. |
lendingPackageSample | Creates a configuration definition for lending package sample processing. |
medicalRecordsSummarization | Creates a configuration definition optimized for medical records summarization. |
rvlCdipPackageSample | Creates a configuration definition for RVL-CDIP package sample processing. |
rvlCdipPackageSampleWithFewShotExamples | Creates a configuration definition for RVL-CDIP package sample with few-shot examples. |
bankStatementSample
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.bankStatementSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition for bank statement sample processing.
This configuration includes settings for classification, extraction, evaluation, and summarization optimized for bank statement documents.
options
Optional
Optional customization for processing stages.
checkboxedAttributesExtraction
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.checkboxedAttributesExtraction(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition optimized for checkbox attribute extraction.
This configuration includes specialized prompts and settings for detecting and extracting checkbox states from documents.
options
Optional
Optional customization for processing stages.
criteriaValidation
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.criteriaValidation(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition for criteria validation processing.
This configuration includes settings for validating documents against specific criteria and requirements.
options
Optional
Optional customization for processing stages.
fewShotExampleWithMultimodalPageClassification
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.fewShotExampleWithMultimodalPageClassification(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition with few-shot examples for multimodal page classification.
This configuration includes example prompts that demonstrate how to classify document pages using both visual and textual information.
options
Optional
Optional customization for processing stages.
fromFile
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.fromFile(filePath: string, options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition from a YAML file.
Allows users to provide custom configuration files for document processing.
filePath
Required
- Type: string
Path to the YAML configuration file.
options
Optional
Optional customization for processing stages.
lendingPackageSample
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.lendingPackageSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition for lending package sample processing.
This configuration includes settings for classification, extraction, evaluation, and summarization optimized for lending documents.
options
Optional
Optional customization for processing stages.
medicalRecordsSummarization
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.medicalRecordsSummarization(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition optimized for medical records summarization.
This configuration includes specialized prompts and settings for extracting and summarizing key information from medical documents.
options
Optional
Optional customization for processing stages.
rvlCdipPackageSample
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.rvlCdipPackageSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition for RVL-CDIP package sample processing.
This configuration includes settings for classification, extraction, evaluation, and summarization optimized for RVL-CDIP documents.
options
Optional
Optional customization for processing stages.
rvlCdipPackageSampleWithFewShotExamples
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'
BedrockLlmProcessorConfigurationDefinition.rvlCdipPackageSampleWithFewShotExamples(options?: BedrockLlmProcessorConfigurationDefinitionOptions)
Creates a configuration definition for RVL-CDIP package sample with few-shot examples.
This configuration includes few-shot examples to improve classification and extraction accuracy for RVL-CDIP documents.
options
Optional
Optional customization for processing stages.
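The preset factories above share one shape: each takes an optional options object and returns a configuration definition. The following is a minimal sketch of selecting a preset by name. `PRESET_NAMES`, `PresetFactory`, and `createFromPreset` are hypothetical helpers, not part of the library, and the options and definition types are left generic because this section does not spell them out.

```typescript
// Names of the option-only preset factories listed in this reference.
// fromFile is excluded because it takes an extra filePath argument.
const PRESET_NAMES = [
  "checkboxedAttributesExtraction",
  "criteriaValidation",
  "fewShotExampleWithMultimodalPageClassification",
  "lendingPackageSample",
  "medicalRecordsSummarization",
  "rvlCdipPackageSample",
  "rvlCdipPackageSampleWithFewShotExamples",
] as const;

type PresetName = (typeof PRESET_NAMES)[number];

// Structural type (hypothetical): any object exposing the preset factories.
type PresetFactory<Options, Definition> = Record<
  PresetName,
  (options?: Options) => Definition
>;

// Looks up the named factory and calls it with the optional customization.
function createFromPreset<Options, Definition>(
  factory: PresetFactory<Options, Definition>,
  name: PresetName,
  options?: Options,
): Definition {
  return factory[name](options);
}
```

With the package installed, `BedrockLlmProcessorConfigurationDefinition` itself would be the natural `factory` argument, since its static methods follow this shape. Treat that as an assumption to check against the installed version.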
BedrockLlmProcessorConfigurationSchema
- Implements: IBedrockLlmProcessorConfigurationSchema
Schema definition for Bedrock LLM processor configuration. Provides JSON Schema validation rules for the configuration UI and API.
This class defines the structure, validation rules, and UI presentation for the Bedrock LLM processor configuration, including document classes, attributes, classification settings, extraction parameters, evaluation criteria, and summarization options.
Initializers
import { BedrockLlmProcessorConfigurationSchema } from '@cdklabs/genai-idp-bedrock-llm-processor'
new BedrockLlmProcessorConfigurationSchema()
Methods
Name | Description |
---|---|
bind | Binds the configuration schema to a processor instance. |
bind
public bind(processor: IBedrockLlmProcessor): void
Binds the configuration schema to a processor instance.
Creates a custom resource that updates the schema in the configuration table.
processor
Required
- Type: IBedrockLlmProcessor
The Bedrock LLM document processor to apply the schema to.
Protocols
IBedrockLlmProcessor
- Extends: @cdklabs/genai-idp.IDocumentProcessor
- Implemented By: BedrockLlmProcessor, IBedrockLlmProcessor
Interface for Bedrock LLM document processor implementation.
Bedrock LLM Processor uses custom extraction with Amazon Bedrock models for flexible document processing. This processor provides more control over the extraction process and is ideal for custom document types or complex extraction needs that require fine-grained control over the processing workflow.
Use Bedrock LLM Processor when:
- Processing custom or complex document types not well-handled by BDA Processor
- You need more control over the extraction process and prompting
- You want to leverage foundation models directly with custom prompts
- You need to implement custom classification logic
Properties
Name | Type | Description |
---|---|---|
node | constructs.Node | The tree node. |
environment | @cdklabs/genai-idp.IProcessingEnvironment | The processing environment that provides shared infrastructure and services. |
maxProcessingConcurrency | number | The maximum number of documents that can be processed concurrently. |
stateMachine | aws-cdk-lib.aws_stepfunctions.IStateMachine | The Step Functions state machine that orchestrates the document processing workflow. |
node
Required
public readonly node: Node;
- Type: constructs.Node
The tree node.
environment
Required
public readonly environment: IProcessingEnvironment;
- Type: @cdklabs/genai-idp.IProcessingEnvironment
The processing environment that provides shared infrastructure and services.
Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.
maxProcessingConcurrency
Required
public readonly maxProcessingConcurrency: number;
- Type: number
The maximum number of documents that can be processed concurrently.
Controls the throughput and resource utilization of the document processing system.
stateMachine
Required
public readonly stateMachine: IStateMachine;
- Type: aws-cdk-lib.aws_stepfunctions.IStateMachine
The Step Functions state machine that orchestrates the document processing workflow.
Manages the sequence of processing steps and handles error conditions. This state machine is triggered for each document that needs processing and coordinates the entire extraction pipeline.
IBedrockLlmProcessorConfiguration
- Implemented By: BedrockLlmProcessorConfiguration, IBedrockLlmProcessorConfiguration
Interface for Bedrock LLM document processor configuration.
Provides configuration management for custom extraction with Bedrock models.
Methods
Name | Description |
---|---|
bind | Binds the configuration to a processor instance. |
bind
public bind(processor: IBedrockLlmProcessor): IBedrockLlmProcessorConfigurationDefinition
Binds the configuration to a processor instance.
This method applies the configuration to the processor.
processor
Required
- Type: IBedrockLlmProcessor
The Bedrock LLM document processor to apply the configuration to.
IBedrockLlmProcessorConfigurationDefinition
- Extends: @cdklabs/genai-idp.IConfigurationDefinition
- Implemented By: IBedrockLlmProcessorConfigurationDefinition
Properties
Name | Type | Description |
---|---|---|
classificationMethod | ClassificationMethod | The method used for document classification. |
classificationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | The invokable model used for document classification. |
extractionModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | The invokable model used for information extraction. |
ocrBackend | string | OCR backend to use for text extraction. |
assessmentModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional invokable model used for evaluating assessment results. |
evaluationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional invokable model used for evaluating extraction results. |
ocrModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional invokable model used for OCR when using Bedrock-based OCR. |
summarizationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional invokable model used for document summarization. |
classificationMethod
Required
public readonly classificationMethod: ClassificationMethod;
- Type: ClassificationMethod
- Default: as defined in the definition file
The method used for document classification.
Determines how documents are analyzed and categorized before extraction. Different methods offer varying levels of accuracy and performance.
classificationModel
Required
public readonly classificationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
- Default: as defined in the definition file
The invokable model used for document classification.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Determines document types and categories based on content analysis, enabling targeted extraction strategies for different document types.
extractionModel
Required
public readonly extractionModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
- Default: as defined in the definition file
The invokable model used for information extraction.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Extracts structured data from documents based on defined schemas, transforming unstructured content into structured information.
ocrBackend
Required
public readonly ocrBackend: string;
- Type: string
- Default: "textract"
OCR backend to use for text extraction.
Determines whether to use Amazon Textract or Bedrock for OCR processing.
assessmentModel
Optional
public readonly assessmentModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
- Default: as defined in the definition file
Optional invokable model used for evaluating assessment results.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Used to assess the quality and accuracy of extracted information by comparing assessment results against expected values.
evaluationModel
Optional
public readonly evaluationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
- Default: as defined in the definition file
Optional invokable model used for evaluating extraction results.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Used to assess the quality and accuracy of extracted information by comparing extraction results against expected values.
ocrModel
Optional
public readonly ocrModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
- Default: as defined in the definition file
Optional invokable model used for OCR when using Bedrock-based OCR.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Only used when the OCR backend is set to 'bedrock' in the configuration. Provides vision-based text extraction capabilities for document processing.
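The interaction between `ocrBackend` and `ocrModel` described above can be sketched as a small check. This is illustrative only: `OcrSettingsSketch`, `OcrPlan`, and `planOcr` are hypothetical names, and the model is represented as a plain string, whereas the library expects an `IInvokable`.

```typescript
// Hypothetical stand-in for the two OCR-related settings.
interface OcrSettingsSketch {
  ocrBackend?: string; // the reference lists "textract" as the default
  ocrModel?: string; // stand-in for an IInvokable model
}

interface OcrPlan {
  backend: string;
  usesOcrModel: boolean;
  note?: string;
}

function planOcr(settings: OcrSettingsSketch): OcrPlan {
  // Fall back to the documented default backend when none is given.
  const backend = settings.ocrBackend ?? "textract";

  if (backend === "bedrock") {
    // Bedrock-based OCR is the only case in which ocrModel is used.
    if (settings.ocrModel === undefined) {
      return {
        backend,
        usesOcrModel: true,
        note: "no ocrModel supplied; the definition file default applies",
      };
    }
    return { backend, usesOcrModel: true };
  }

  // For any other backend a supplied ocrModel has no effect.
  if (settings.ocrModel !== undefined) {
    return {
      backend,
      usesOcrModel: false,
      note: "ocrModel is only used when ocrBackend is 'bedrock'",
    };
  }
  return { backend, usesOcrModel: false };
}
```

A check like this can surface a supplied `ocrModel` that would otherwise go unused because the backend was left at its default.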
summarizationModel
Optional
public readonly summarizationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
- Default: as defined in the definition file
Optional invokable model used for document summarization.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. When provided, enables automatic generation of document summaries that capture key information from processed documents.
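Each model property above lists "as defined in the definition file" as its default. One way to read that, shown here as a minimal sketch, is that an explicitly supplied model takes precedence and the definition file fills in the rest. That precedence is an assumption, and `ModelSettingsSketch` and `resolveModels` are hypothetical names. Models are plain strings here rather than `IInvokable` instances.

```typescript
// Hypothetical stand-in mirroring the model property names in this section.
interface ModelSettingsSketch {
  classificationModel: string;
  extractionModel: string;
  assessmentModel?: string;
  evaluationModel?: string;
  ocrModel?: string;
  summarizationModel?: string;
}

// Assumed precedence: explicit override first, then the definition file value.
function resolveModels(
  fromDefinitionFile: ModelSettingsSketch,
  overrides: Partial<ModelSettingsSketch>,
): ModelSettingsSketch {
  return {
    classificationModel:
      overrides.classificationModel ?? fromDefinitionFile.classificationModel,
    extractionModel:
      overrides.extractionModel ?? fromDefinitionFile.extractionModel,
    assessmentModel:
      overrides.assessmentModel ?? fromDefinitionFile.assessmentModel,
    evaluationModel:
      overrides.evaluationModel ?? fromDefinitionFile.evaluationModel,
    ocrModel: overrides.ocrModel ?? fromDefinitionFile.ocrModel,
    summarizationModel:
      overrides.summarizationModel ?? fromDefinitionFile.summarizationModel,
  };
}
```

Optional models that appear in neither place stay `undefined` in this sketch.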
IBedrockLlmProcessorConfigurationSchema
Interface for Bedrock LLM configuration schema.
Defines the structure and validation rules for Bedrock LLM processor configuration.
Methods
Name | Description |
---|---|
bind | Binds the configuration schema to a processor instance. |
bind
public bind(processor: IBedrockLlmProcessor): void
Binds the configuration schema to a processor instance.
This method applies the schema definition to the processor's configuration table.
processor
Required
- Type: IBedrockLlmProcessor
The Bedrock LLM document processor to apply the schema to.
Enums
ClassificationMethod
Defines the methods available for document classification in Bedrock LLM (Pattern 2) processing.
Document classification is a critical step in the IDP workflow that determines how documents are categorized and processed. Different classification methods offer varying levels of accuracy, performance, and capabilities.
Members
Name | Description |
---|---|
MULTIMODAL_PAGE_LEVEL_CLASSIFICATION | Uses multimodal models to classify documents at the page level. |
TEXTBASED_HOLISTIC_CLASSIFICATION | Uses text-based analysis to classify the entire document holistically. Considers the full document text content for classification decisions. |
MULTIMODAL_PAGE_LEVEL_CLASSIFICATION
Uses multimodal models to classify documents at the page level.
Analyzes both text and visual elements on each page for classification.
This method is effective for documents where each page may belong to a different document type or category. It provides high accuracy for complex layouts by considering both textual content and visual structure of each page individually.
TEXTBASED_HOLISTIC_CLASSIFICATION
Uses text-based analysis to classify the entire document holistically. Considers the full document text content for classification decisions.
This method is more efficient and cost-effective as it only processes the extracted text. It works well for text-heavy documents where the document type is consistent across all pages and visual elements are less important for classification.
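The trade-off between the two members can be summarized as a small decision helper. This is a sketch under stated assumptions: `ClassificationMethodSketch` is a local stand-in that only mirrors the member names of `ClassificationMethod`, and `DocumentTraits` and `chooseClassificationMethod` are hypothetical names, not part of the library.

```typescript
// Local stand-in mirroring the member names of ClassificationMethod.
// In a real stack the enum would be imported from the package instead.
enum ClassificationMethodSketch {
  MULTIMODAL_PAGE_LEVEL_CLASSIFICATION,
  TEXTBASED_HOLISTIC_CLASSIFICATION,
}

// Hypothetical description of the documents being processed.
interface DocumentTraits {
  // True when pages within one file may belong to different document types.
  mixedDocumentTypesPerFile: boolean;
  // True when visual layout carries information needed for classification.
  layoutMattersForClassification: boolean;
}

function chooseClassificationMethod(
  traits: DocumentTraits,
): ClassificationMethodSketch {
  // Page-level multimodal classification suits mixed packets and
  // layout-dependent documents.
  if (traits.mixedDocumentTypesPerFile || traits.layoutMattersForClassification) {
    return ClassificationMethodSketch.MULTIMODAL_PAGE_LEVEL_CLASSIFICATION;
  }
  // Otherwise the cheaper text-only holistic method is a reasonable default.
  return ClassificationMethodSketch.TEXTBASED_HOLISTIC_CLASSIFICATION;
}
```

The helper only encodes the guidance given in the member descriptions. Actual accuracy and cost depend on the models and documents involved.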