@cdklabs/genai-idp-bedrock-llm-processor

Constructs

BedrockLlmProcessor

This processor implements an intelligent document processing workflow that uses Amazon Bedrock with Nova or Claude models for both page classification/grouping and information extraction.

The workflow consists of three main processing steps:

  • OCR processing using Amazon Textract
  • Page classification and grouping using a Nova or Claude model via Amazon Bedrock
  • Field extraction using a Nova or Claude model via Amazon Bedrock
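
The classification-and-grouping step can be pictured with a small sketch. It assumes, which this reference does not state, that grouping means merging runs of consecutive pages that received the same class label. The `PageLabel` and `Section` types and the `groupPages` function are hypothetical names used only for illustration; they are not part of the library.

```typescript
// Hypothetical types for illustration; not part of the library's API.
interface PageLabel {
  page: number;     // 1-based page number
  docClass: string; // class label assigned by the classification step
}

interface Section {
  docClass: string;
  pages: number[];
}

// Assumption: consecutive pages sharing a class label form one section.
function groupPages(labels: PageLabel[]): Section[] {
  const sections: Section[] = [];
  for (const { page, docClass } of labels) {
    const last = sections[sections.length - 1];
    if (last !== undefined && last.docClass === docClass) {
      last.pages.push(page); // extend the current run
    } else {
      sections.push({ docClass, pages: [page] }); // start a new run
    }
  }
  return sections;
}

const grouped = groupPages([
  { page: 1, docClass: 'invoice' },
  { page: 2, docClass: 'invoice' },
  { page: 3, docClass: 'bank-statement' },
]);
console.log(JSON.stringify(grouped));
```

Under this assumption, each resulting section would then be passed to the extraction step as one logical document.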

Initializers

import { BedrockLlmProcessor } from '@cdklabs/genai-idp-bedrock-llm-processor'

new BedrockLlmProcessor(scope: Construct, id: string, props: BedrockLlmProcessorProps)
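
| Name | Type | Description |
| --- | --- | --- |
| scope | constructs.Construct | The parent construct in which this construct is defined. |
| id | string | The construct ID, unique within its scope. |
| props | BedrockLlmProcessorProps | Configuration properties for the processor. |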

scopeRequired
  • Type: constructs.Construct

idRequired
  • Type: string

propsRequired
  • Type: BedrockLlmProcessorProps

Methods

| Name | Description |
| --- | --- |
| toString | Returns a string representation of this construct. |
| metricBedrockEmbeddingMaxRetriesExceeded | Creates a CloudWatch metric for Bedrock embedding requests that exceeded max retries. |
| metricBedrockEmbeddingNonRetryableErrors | Creates a CloudWatch metric for Bedrock embedding non-retryable errors. |
| metricBedrockEmbeddingRequestLatency | Creates a CloudWatch metric for Bedrock embedding request latency. |
| metricBedrockEmbeddingRequestsFailed | Creates a CloudWatch metric for failed Bedrock embedding requests. |
| metricBedrockEmbeddingRequestsSucceeded | Creates a CloudWatch metric for successful Bedrock embedding requests. |
| metricBedrockEmbeddingRequestsTotal | Creates a CloudWatch metric for total Bedrock embedding requests. |
| metricBedrockEmbeddingThrottles | Creates a CloudWatch metric for Bedrock embedding request throttles. |
| metricBedrockEmbeddingUnexpectedErrors | Creates a CloudWatch metric for Bedrock embedding unexpected errors. |
| metricBedrockMaxRetriesExceeded | Creates a CloudWatch metric for Bedrock requests that exceeded max retries. |
| metricBedrockNonRetryableErrors | Creates a CloudWatch metric for Bedrock non-retryable errors. |
| metricBedrockRequestLatency | Creates a CloudWatch metric for Bedrock request latency. |
| metricBedrockRequestsFailed | Creates a CloudWatch metric for failed Bedrock requests. |
| metricBedrockRequestsSucceeded | Creates a CloudWatch metric for successful Bedrock requests. |
| metricBedrockRequestsTotal | Creates a CloudWatch metric for total Bedrock requests. |
| metricBedrockRetrySuccess | Creates a CloudWatch metric for successful Bedrock request retries. |
| metricBedrockThrottles | Creates a CloudWatch metric for Bedrock request throttles. |
| metricBedrockTotalLatency | Creates a CloudWatch metric for total Bedrock request latency. |
| metricBedrockUnexpectedErrors | Creates a CloudWatch metric for Bedrock unexpected errors. |
| metricCacheReadInputTokens | Creates a CloudWatch metric for cache read input tokens. |
| metricCacheWriteInputTokens | Creates a CloudWatch metric for cache write input tokens. |
| metricInputDocumentPages | Creates a CloudWatch metric for input document pages processed. |
| metricInputDocuments | Creates a CloudWatch metric for input documents processed. |
| metricInputTokens | Creates a CloudWatch metric for input tokens consumed. |
| metricOutputTokens | Creates a CloudWatch metric for output tokens generated. |
| metricTotalTokens | Creates a CloudWatch metric for total tokens used. |

toString
public toString(): string

Returns a string representation of this construct.

metricBedrockEmbeddingMaxRetriesExceeded
public metricBedrockEmbeddingMaxRetriesExceeded(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock embedding requests that exceeded max retries.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockEmbeddingNonRetryableErrors
public metricBedrockEmbeddingNonRetryableErrors(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock embedding non-retryable errors.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockEmbeddingRequestLatency
public metricBedrockEmbeddingRequestLatency(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock embedding request latency.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockEmbeddingRequestsFailed
public metricBedrockEmbeddingRequestsFailed(props?: MetricOptions): Metric

Creates a CloudWatch metric for failed Bedrock embedding requests.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockEmbeddingRequestsSucceeded
public metricBedrockEmbeddingRequestsSucceeded(props?: MetricOptions): Metric

Creates a CloudWatch metric for successful Bedrock embedding requests.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockEmbeddingRequestsTotal
public metricBedrockEmbeddingRequestsTotal(props?: MetricOptions): Metric

Creates a CloudWatch metric for total Bedrock embedding requests.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockEmbeddingThrottles
public metricBedrockEmbeddingThrottles(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock embedding request throttles.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockEmbeddingUnexpectedErrors
public metricBedrockEmbeddingUnexpectedErrors(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock embedding unexpected errors.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockMaxRetriesExceeded
public metricBedrockMaxRetriesExceeded(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock requests that exceeded max retries.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockNonRetryableErrors
public metricBedrockNonRetryableErrors(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock non-retryable errors.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockRequestLatency
public metricBedrockRequestLatency(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock request latency.

Measures individual request processing time.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockRequestsFailed
public metricBedrockRequestsFailed(props?: MetricOptions): Metric

Creates a CloudWatch metric for failed Bedrock requests.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockRequestsSucceeded
public metricBedrockRequestsSucceeded(props?: MetricOptions): Metric

Creates a CloudWatch metric for successful Bedrock requests.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockRequestsTotal
public metricBedrockRequestsTotal(props?: MetricOptions): Metric

Creates a CloudWatch metric for total Bedrock requests.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockRetrySuccess
public metricBedrockRetrySuccess(props?: MetricOptions): Metric

Creates a CloudWatch metric for successful Bedrock request retries.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockThrottles
public metricBedrockThrottles(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock request throttles.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockTotalLatency
public metricBedrockTotalLatency(props?: MetricOptions): Metric

Creates a CloudWatch metric for total Bedrock request latency.

Measures total request processing time including retries.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricBedrockUnexpectedErrors
public metricBedrockUnexpectedErrors(props?: MetricOptions): Metric

Creates a CloudWatch metric for Bedrock unexpected errors.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricCacheReadInputTokens
public metricCacheReadInputTokens(props?: MetricOptions): Metric

Creates a CloudWatch metric for cache read input tokens.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricCacheWriteInputTokens
public metricCacheWriteInputTokens(props?: MetricOptions): Metric

Creates a CloudWatch metric for cache write input tokens.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricInputDocumentPages
public metricInputDocumentPages(props?: MetricOptions): Metric

Creates a CloudWatch metric for input document pages processed.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricInputDocuments
public metricInputDocuments(props?: MetricOptions): Metric

Creates a CloudWatch metric for input documents processed.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricInputTokens
public metricInputTokens(props?: MetricOptions): Metric

Creates a CloudWatch metric for input tokens consumed.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricOutputTokens
public metricOutputTokens(props?: MetricOptions): Metric

Creates a CloudWatch metric for output tokens generated.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


metricTotalTokens
public metricTotalTokens(props?: MetricOptions): Metric

Creates a CloudWatch metric for total tokens used.

propsOptional
  • Type: aws-cdk-lib.aws_cloudwatch.MetricOptions

Optional metric configuration properties.


Static Functions

| Name | Description |
| --- | --- |
| isConstruct | Checks if x is a construct. |

isConstruct
import { BedrockLlmProcessor } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessor.isConstruct(x: any)

Checks if x is a construct.

Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.

Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool; in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid instanceof and use this type-testing method instead.

xRequired
  • Type: any

Any object.
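
The failure mode described above can be reproduced without any installed packages. The sketch below is an illustration of the general technique (a marker keyed by a globally registered symbol), not necessarily how the constructs library implements `isConstruct`. The factory `makeConstructClass` and the symbol key `example.Construct` are hypothetical; calling the factory twice stands in for loading two copies of the same library.

```typescript
// Each call simulates loading a separate copy of the same library from disk.
function makeConstructClass() {
  // Symbol.for() looks the key up in a global registry, so every copy of the
  // module obtains the very same symbol.
  const marker = Symbol.for('example.Construct'); // hypothetical key

  return class Construct {
    constructor() {
      // Tag each instance with the shared marker.
      Object.defineProperty(this, marker, { value: true });
    }

    // Type test that does not depend on class identity.
    static isConstruct(x: unknown): boolean {
      return typeof x === 'object' && x !== null && marker in x;
    }
  };
}

const CopyA = makeConstructClass();
const CopyB = makeConstructClass();
const fromA = new CopyA();

// instanceof compares prototype chains, and CopyB.prototype is not in fromA's.
const viaInstanceof = fromA instanceof CopyB;
// The marker check only asks whether the shared symbol is present.
const viaMarker = CopyB.isConstruct(fromA);

console.log(viaInstanceof, viaMarker); // prints: false true
```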


Properties

| Name | Type | Description |
| --- | --- | --- |
| node | constructs.Node | The tree node. |
| environment | @cdklabs/genai-idp.IProcessingEnvironment | The processing environment that provides shared infrastructure and services. |
| maxProcessingConcurrency | number | The maximum number of documents that can be processed concurrently. |
| stateMachine | aws-cdk-lib.aws_stepfunctions.IStateMachine | The Step Functions state machine that orchestrates the document processing workflow. |

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


environmentRequired
public readonly environment: IProcessingEnvironment;
  • Type: @cdklabs/genai-idp.IProcessingEnvironment

The processing environment that provides shared infrastructure and services.

Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.


maxProcessingConcurrencyRequired
public readonly maxProcessingConcurrency: number;
  • Type: number

The maximum number of documents that can be processed concurrently.

Controls the throughput and resource utilization of the document processing system.


stateMachineRequired
public readonly stateMachine: IStateMachine;
  • Type: aws-cdk-lib.aws_stepfunctions.IStateMachine

The Step Functions state machine that orchestrates the document processing workflow.

Manages the sequence of processing steps and handles error conditions. This state machine is triggered for each document that needs processing and coordinates the entire extraction pipeline.


Structs

BedrockLlmProcessorConfigurationDefinitionOptions

Options for configuring the Bedrock LLM processor configuration definition.

Allows customization of the assessment, classification, extraction, evaluation, summarization, and OCR stages.

Initializer

import { BedrockLlmProcessorConfigurationDefinitionOptions } from '@cdklabs/genai-idp-bedrock-llm-processor'

const bedrockLlmProcessorConfigurationDefinitionOptions: BedrockLlmProcessorConfigurationDefinitionOptions = { ... }

Properties

| Name | Type | Description |
| --- | --- | --- |
| assessmentModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the assessment stage. |
| classificationMethod | ClassificationMethod | Optional classification method to use for document categorization. |
| classificationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the classification stage. |
| evaluationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the evaluation stage. |
| extractionModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the extraction stage. |
| ocrModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the OCR stage when using Bedrock-based OCR. |
| summarizationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional model for the summarization stage. |

assessmentModelOptional
public readonly assessmentModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable

Optional model for the assessment stage.


classificationMethodOptional
public readonly classificationMethod: ClassificationMethod;
  • Type: ClassificationMethod

Optional classification method to use for document categorization.

Determines how documents are analyzed and categorized before extraction.


classificationModelOptional
public readonly classificationModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable

Optional model for the classification stage.


evaluationModelOptional
public readonly evaluationModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable

Optional model for the evaluation stage.


extractionModelOptional
public readonly extractionModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable

Optional model for the extraction stage.


ocrModelOptional
public readonly ocrModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable

Optional model for the OCR stage when using Bedrock-based OCR.

Only used when the OCR backend is set to 'bedrock' in the configuration.


summarizationModelOptional
public readonly summarizationModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable

Optional model for the summarization stage.


BedrockLlmProcessorProps

Configuration properties for the Bedrock LLM document processor.

Bedrock LLM Processor uses custom extraction with Amazon Bedrock models, providing flexible document processing capabilities for a wide range of document types. This processor is ideal when you need more control over the extraction process and want to implement custom classification and extraction logic using foundation models directly.

Bedrock LLM Processor offers a balance between customization and implementation complexity, allowing you to define custom extraction schemas and prompts while leveraging the power of Amazon Bedrock foundation models.

Initializer

import { BedrockLlmProcessorProps } from '@cdklabs/genai-idp-bedrock-llm-processor'

const bedrockLlmProcessorProps: BedrockLlmProcessorProps = { ... }

Properties

| Name | Type | Description |
| --- | --- | --- |
| environment | @cdklabs/genai-idp.IProcessingEnvironment | The processing environment that provides shared infrastructure and services. |
| maxProcessingConcurrency | number | The maximum number of documents that can be processed concurrently. |
| configuration | IBedrockLlmProcessorConfiguration | Configuration for the Bedrock LLM document processor. |
| assessmentGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to assessment model interactions. |
| classificationGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to classification model interactions. |
| classificationMaxWorkers | number | The maximum number of concurrent workers for document classification. |
| customPromptGenerator | @cdklabs/genai-idp.ICustomPromptGenerator | Optional custom prompt generator for injecting business logic into extraction processing. |
| enableHitl | boolean | Enables human-in-the-loop (HITL) document review using Amazon Augmented AI (A2I). |
| evaluationBaselineBucket | aws-cdk-lib.aws_s3.IBucket | Optional S3 bucket containing baseline documents for evaluation. |
| extractionGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to extraction model interactions. |
| ocrGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to OCR model interactions. |
| ocrMaxWorkers | number | The maximum number of concurrent workers for OCR processing. |
| sageMakerA2IReviewPortalUrl | string | Optional SageMaker A2I Review Portal URL for HITL workflows. |
| summarizationGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to summarization model interactions. |

environmentRequired
public readonly environment: IProcessingEnvironment;
  • Type: @cdklabs/genai-idp.IProcessingEnvironment

The processing environment that provides shared infrastructure and services.

Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.


maxProcessingConcurrencyOptional
public readonly maxProcessingConcurrency: number;
  • Type: number
  • Default: 100 concurrent workflows

The maximum number of documents that can be processed concurrently.

Controls the throughput and resource utilization of the document processing system.


configurationRequired
public readonly configuration: IBedrockLlmProcessorConfiguration;
  • Type: IBedrockLlmProcessorConfiguration

Configuration for the Bedrock LLM document processor.

Provides customization options for the processing workflow, including schema definitions, prompts, and evaluation settings.


assessmentGuardrailOptional
public readonly assessmentGuardrail: IGuardrail;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
  • Default: No guardrail is applied

Optional Bedrock guardrail to apply to assessment model interactions.

Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.


classificationGuardrailOptional
public readonly classificationGuardrail: IGuardrail;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
  • Default: No guardrail is applied

Optional Bedrock guardrail to apply to classification model interactions.

Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.


classificationMaxWorkersOptional
public readonly classificationMaxWorkers: number;
  • Type: number
  • Default: 20

The maximum number of concurrent workers for document classification.

Controls parallelism during the classification phase to optimize throughput while managing resource utilization.


customPromptGeneratorOptional
public readonly customPromptGenerator: ICustomPromptGenerator;
  • Type: @cdklabs/genai-idp.ICustomPromptGenerator
  • Default: No custom prompt generator is used

Optional custom prompt generator for injecting business logic into extraction processing.

When provided, this Lambda function will be called to customize prompts based on document content, business rules, or external system integrations.


enableHitlOptional
public readonly enableHitl: boolean;
  • Type: boolean
  • Default: false

Enables human-in-the-loop (HITL) document review using Amazon Augmented AI (A2I).


evaluationBaselineBucketOptional
public readonly evaluationBaselineBucket: IBucket;
  • Type: aws-cdk-lib.aws_s3.IBucket
  • Default: No evaluation baseline bucket is configured

Optional S3 bucket containing baseline documents for evaluation.

Used as ground truth when evaluating extraction accuracy by comparing extraction results against known correct values.


extractionGuardrailOptional
public readonly extractionGuardrail: IGuardrail;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
  • Default: No guardrail is applied

Optional Bedrock guardrail to apply to extraction model interactions.

Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.


ocrGuardrailOptional
public readonly ocrGuardrail: IGuardrail;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
  • Default: No guardrail is applied

Optional Bedrock guardrail to apply to OCR model interactions.

Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.


ocrMaxWorkersOptional
public readonly ocrMaxWorkers: number;
  • Type: number
  • Default: 20

The maximum number of concurrent workers for OCR processing.

Controls parallelism during the text extraction phase to optimize throughput while managing resource utilization.


sageMakerA2IReviewPortalUrlOptional
public readonly sageMakerA2IReviewPortalUrl: string;
  • Type: string
  • Default: No A2I review portal URL is configured

Optional SageMaker A2I Review Portal URL for HITL workflows.

Used to provide human reviewers with access to the A2I review interface for document validation and correction workflows.


summarizationGuardrailOptional
public readonly summarizationGuardrail: IGuardrail;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
  • Default: No guardrail is applied

Optional Bedrock guardrail to apply to summarization model interactions.

Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
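
The documented defaults for the optional scalar properties can be summarized in a short sketch. The `ScalarPropsSketch` interface and the `withDocumentedDefaults` function are hypothetical stand-ins written for this illustration; they mirror only the four properties whose numeric or boolean defaults are listed above and make no claim about how the construct resolves defaults internally.

```typescript
// Hypothetical subset of BedrockLlmProcessorProps, limited to the optional
// scalar properties whose defaults are documented above.
interface ScalarPropsSketch {
  maxProcessingConcurrency?: number;
  classificationMaxWorkers?: number;
  ocrMaxWorkers?: number;
  enableHitl?: boolean;
}

// Fill in the documented default for any property left undefined.
function withDocumentedDefaults(
  props: ScalarPropsSketch,
): Required<ScalarPropsSketch> {
  return {
    maxProcessingConcurrency: props.maxProcessingConcurrency ?? 100,
    classificationMaxWorkers: props.classificationMaxWorkers ?? 20,
    ocrMaxWorkers: props.ocrMaxWorkers ?? 20,
    enableHitl: props.enableHitl ?? false,
  };
}

// Only ocrMaxWorkers is set explicitly; everything else falls back.
const resolved = withDocumentedDefaults({ ocrMaxWorkers: 5 });
console.log(resolved);
```

The guardrail, bucket, prompt generator, and portal URL properties are omitted because their documented default is simply that nothing is configured.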


Classes

BedrockLlmProcessorConfiguration

Configuration management for Bedrock LLM document processing using custom extraction with Bedrock models.

This construct creates and manages the configuration for Bedrock LLM document processing, including schema definitions, classification prompts, extraction prompts, and configuration values. It provides a centralized way to manage document classes, extraction schemas, and model parameters.

Initializers

import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

new BedrockLlmProcessorConfiguration(definition: IBedrockLlmProcessorConfigurationDefinition)

| Name | Type | Description |
| --- | --- | --- |
| definition | IBedrockLlmProcessorConfigurationDefinition | The configuration definition instance. |

definitionRequired
  • Type: IBedrockLlmProcessorConfigurationDefinition

The configuration definition instance.


Methods

| Name | Description |
| --- | --- |
| bind | Binds the configuration to a processor instance. |

bind
public bind(processor: IBedrockLlmProcessor): IBedrockLlmProcessorConfigurationDefinition

Binds the configuration to a processor instance.

This method applies the configuration to the processor.

processorRequired
  • Type: IBedrockLlmProcessor

Static Functions

| Name | Description |
| --- | --- |
| bankStatementSample | Creates a configuration for bank statement processing. |
| checkboxedAttributesExtraction | Creates a configuration for checkbox extraction. |
| criteriaValidation | Creates a configuration for criteria validation. |
| fewShotExampleWithMultimodalPageClassification | Creates a configuration with few-shot examples and multimodal page classification. |
| fromFile | Creates a configuration from a YAML file. |
| lendingPackageSample | Creates a configuration for lending package processing. |
| medicalRecordsSummarization | Creates a configuration for medical records summarization. |
| rvlCdipPackageSample | Creates a configuration for RVL-CDIP package processing. |
| rvlCdipPackageSampleWithFewShotExamples | Creates a configuration for RVL-CDIP package processing with few-shot examples. |

bankStatementSample
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.bankStatementSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration for bank statement processing.

optionsOptional

Optional configuration options.


checkboxedAttributesExtraction
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.checkboxedAttributesExtraction(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration for checkbox extraction.

optionsOptional

Optional configuration options.


criteriaValidation
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.criteriaValidation(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration for criteria validation.

optionsOptional

Optional configuration options.


fewShotExampleWithMultimodalPageClassification
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.fewShotExampleWithMultimodalPageClassification(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration with few-shot examples and multimodal page classification.

optionsOptional

Optional configuration options.


fromFile
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.fromFile(filePath: string, options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration from a YAML file.

filePathRequired
  • Type: string

Path to the YAML configuration file.


optionsOptional

Optional configuration options to override file settings.
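
The override behaviour can be sketched as a simple merge. The shapes below are hypothetical: the real YAML schema is not described in this reference, and the real options carry model constructs, not strings. The sketch only assumes that a value supplied in the options replaces the corresponding value from the file, while stages not mentioned in the options keep the file's setting.

```typescript
// Hypothetical shapes for illustration; the real schema and option types differ.
type Stage = 'classification' | 'extraction' | 'summarization';
type StageModels = Partial<Record<Stage, string>>;

// Assumption: a value supplied in options wins over the value from the file.
function applyOverrides(
  fromFile: StageModels,
  options: StageModels = {},
): StageModels {
  const merged: StageModels = { ...fromFile };
  for (const stage of Object.keys(options) as Stage[]) {
    const override = options[stage];
    if (override !== undefined) {
      merged[stage] = override; // option replaces the file setting
    }
  }
  return merged;
}

const effective = applyOverrides(
  { classification: 'model-a', extraction: 'model-a' }, // as read from the file
  { extraction: 'model-b' },                            // caller's override
);
console.log(effective);
```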


lendingPackageSample
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.lendingPackageSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration for lending package processing.

optionsOptional

Optional configuration options.


medicalRecordsSummarization
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.medicalRecordsSummarization(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration for medical records summarization.

optionsOptional

Optional configuration options.


rvlCdipPackageSample
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.rvlCdipPackageSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration for RVL-CDIP package processing.

optionsOptional

Optional configuration options.


rvlCdipPackageSampleWithFewShotExamples
import { BedrockLlmProcessorConfiguration } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfiguration.rvlCdipPackageSampleWithFewShotExamples(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration for RVL-CDIP package processing with few-shot examples.

optionsOptional

Optional configuration options.


BedrockLlmProcessorConfigurationDefinition

Configuration definition for Bedrock LLM (Pattern 2) document processing.

Provides methods to create and customize configuration for Bedrock LLM processing.

Initializers

import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

new BedrockLlmProcessorConfigurationDefinition()

Static Functions

| Name | Description |
| --- | --- |
| bankStatementSample | Creates a configuration definition for bank statement sample processing. |
| checkboxedAttributesExtraction | Creates a configuration definition optimized for checkbox attribute extraction. |
| criteriaValidation | Creates a configuration definition for criteria validation processing. |
| fewShotExampleWithMultimodalPageClassification | Creates a configuration definition with few-shot examples for multimodal page classification. |
| fromFile | Creates a configuration definition from a YAML file. |
| lendingPackageSample | Creates a configuration definition for lending package sample processing. |
| medicalRecordsSummarization | Creates a configuration definition optimized for medical records summarization. |
| rvlCdipPackageSample | Creates a configuration definition for RVL-CDIP package sample processing. |
| rvlCdipPackageSampleWithFewShotExamples | Creates a configuration definition for RVL-CDIP package sample with few-shot examples. |

bankStatementSample
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.bankStatementSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition for bank statement sample processing.

This configuration includes settings for classification, extraction, evaluation, and summarization optimized for bank statement documents.

optionsOptional

Optional customization for processing stages.


checkboxedAttributesExtraction
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.checkboxedAttributesExtraction(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition optimized for checkbox attribute extraction.

This configuration includes specialized prompts and settings for detecting and extracting checkbox states from documents.

optionsOptional

Optional customization for processing stages.


criteriaValidation
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.criteriaValidation(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition for criteria validation processing.

This configuration includes settings for validating documents against specific criteria and requirements.

optionsOptional

Optional customization for processing stages.


fewShotExampleWithMultimodalPageClassification
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.fewShotExampleWithMultimodalPageClassification(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition with few-shot examples for multimodal page classification.

This configuration includes example prompts that demonstrate how to classify document pages using both visual and textual information.

optionsOptional

Optional customization for processing stages.


fromFile
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.fromFile(filePath: string, options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition from a YAML file.

Allows users to provide custom configuration files for document processing.

filePathRequired
  • Type: string

Path to the YAML configuration file.


optionsOptional

Optional customization for processing stages.


lendingPackageSample
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.lendingPackageSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition for lending package sample processing.

This configuration includes settings for classification, extraction, evaluation, and summarization optimized for lending documents.

optionsOptional

Optional customization for processing stages.


medicalRecordsSummarization
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.medicalRecordsSummarization(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition optimized for medical records summarization.

This configuration includes specialized prompts and settings for extracting and summarizing key information from medical documents.

optionsOptional

Optional customization for processing stages.


rvlCdipPackageSample
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.rvlCdipPackageSample(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition for RVL-CDIP package sample processing.

This configuration includes settings for classification, extraction, evaluation, and summarization optimized for RVL-CDIP documents.

optionsOptional

Optional customization for processing stages.


rvlCdipPackageSampleWithFewShotExamples
import { BedrockLlmProcessorConfigurationDefinition } from '@cdklabs/genai-idp-bedrock-llm-processor'

BedrockLlmProcessorConfigurationDefinition.rvlCdipPackageSampleWithFewShotExamples(options?: BedrockLlmProcessorConfigurationDefinitionOptions)

Creates a configuration definition for RVL-CDIP package sample with few-shot examples.

This configuration includes few-shot examples to improve classification and extraction accuracy for RVL-CDIP documents.

optionsOptional

Optional customization for processing stages.
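The factories above all share one shape: a preset takes optional options, and `fromFile` additionally takes a path. As a minimal sketch of how a stack might pick between them from a single setting, the helper below treats a value ending in `.yaml` or `.yml` as a custom file and anything else as a preset name. `ConfigurationFactories`, `PRESET_NAMES`, and `resolveConfiguration` are hypothetical names introduced for illustration, not part of the library, and only the presets documented in this section are listed.

```typescript
// Structural stand-in for the static factory methods documented above.
// D is whatever type the factories return; it is left generic on purpose.
interface ConfigurationFactories<D> {
  checkboxedAttributesExtraction(): D;
  criteriaValidation(): D;
  fewShotExampleWithMultimodalPageClassification(): D;
  lendingPackageSample(): D;
  medicalRecordsSummarization(): D;
  rvlCdipPackageSample(): D;
  rvlCdipPackageSampleWithFewShotExamples(): D;
  fromFile(filePath: string): D;
}

// Preset factory names taken from this section of the reference.
const PRESET_NAMES = [
  'checkboxedAttributesExtraction',
  'criteriaValidation',
  'fewShotExampleWithMultimodalPageClassification',
  'lendingPackageSample',
  'medicalRecordsSummarization',
  'rvlCdipPackageSample',
  'rvlCdipPackageSampleWithFewShotExamples',
] as const;
type PresetName = (typeof PRESET_NAMES)[number];

function isPresetName(value: string): value is PresetName {
  return (PRESET_NAMES as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: YAML paths go to fromFile, known names go to the preset.
function resolveConfiguration<D>(
  factories: ConfigurationFactories<D>,
  presetOrPath: string,
): D {
  if (/\.ya?ml$/i.test(presetOrPath)) {
    return factories.fromFile(presetOrPath);
  }
  if (isPresetName(presetOrPath)) {
    return factories[presetOrPath]();
  }
  throw new Error(
    `Unknown preset "${presetOrPath}"; expected one of: ${PRESET_NAMES.join(', ')}`,
  );
}
```

Because the interface only names methods documented above, passing the `BedrockLlmProcessorConfigurationDefinition` class itself as `factories` should type-check structurally, although that is not verified here.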


BedrockLlmProcessorConfigurationSchema

Schema definition for Bedrock LLM processor configuration. Provides JSON Schema validation rules for the configuration UI and API.

This class defines the structure, validation rules, and UI presentation for the Bedrock LLM processor configuration, including document classes, attributes, classification settings, extraction parameters, evaluation criteria, and summarization options.

Initializers

import { BedrockLlmProcessorConfigurationSchema } from '@cdklabs/genai-idp-bedrock-llm-processor'

new BedrockLlmProcessorConfigurationSchema()

Methods

Name Description
bind Binds the configuration schema to a processor instance.

bind
public bind(processor: IBedrockLlmProcessor): void

Binds the configuration schema to a processor instance.

Creates a custom resource that updates the schema in the configuration table.

processorRequired

The Bedrock LLM document processor to apply the schema to.


Protocols

IBedrockLlmProcessor

Interface for Bedrock LLM document processor implementation.

Bedrock LLM Processor uses custom extraction with Amazon Bedrock models for flexible document processing. This processor provides more control over the extraction process and is ideal for custom document types or complex extraction needs that require fine-grained control over the processing workflow.

Use Bedrock LLM Processor when:
  • You are processing custom or complex document types that the BDA Processor does not handle well
  • You need more control over the extraction process and prompting
  • You want to leverage foundation models directly with custom prompts
  • You need to implement custom classification logic

Properties

Name Type Description
node constructs.Node The tree node.
environment @cdklabs/genai-idp.IProcessingEnvironment The processing environment that provides shared infrastructure and services.
maxProcessingConcurrency number The maximum number of documents that can be processed concurrently.
stateMachine aws-cdk-lib.aws_stepfunctions.IStateMachine The Step Functions state machine that orchestrates the document processing workflow.

nodeRequired
public readonly node: Node;
  • Type: constructs.Node

The tree node.


environmentRequired
public readonly environment: IProcessingEnvironment;
  • Type: @cdklabs/genai-idp.IProcessingEnvironment

The processing environment that provides shared infrastructure and services.

Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.


maxProcessingConcurrencyRequired
public readonly maxProcessingConcurrency: number;
  • Type: number

The maximum number of documents that can be processed concurrently.

Controls the throughput and resource utilization of the document processing system.
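To make the throughput trade-off concrete, here is a rough back-of-the-envelope sketch. It assumes an idealized model in which documents run in full waves of `maxProcessingConcurrency` and every document takes the same time. `estimateDrainSeconds` is a hypothetical helper, not a library function, and real workloads add queueing, throttling, and retry overhead that this ignores.

```typescript
// Idealized estimate of how long a backlog takes to drain.
// Assumes uniform per-document duration and no scheduling overhead.
function estimateDrainSeconds(
  backlog: number,
  maxProcessingConcurrency: number,
  secondsPerDocument: number,
): number {
  if (
    maxProcessingConcurrency < 1 ||
    Math.floor(maxProcessingConcurrency) !== maxProcessingConcurrency
  ) {
    throw new RangeError('maxProcessingConcurrency must be a positive integer');
  }
  // Each wave processes up to maxProcessingConcurrency documents in parallel.
  const waves = Math.ceil(backlog / maxProcessingConcurrency);
  return waves * secondsPerDocument;
}
```

Under these assumptions, a backlog of 1,000 documents at 30 seconds each drains in 10 waves (300 seconds) with a limit of 100, and in 20 waves (600 seconds) with a limit of 50.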


stateMachineRequired
public readonly stateMachine: IStateMachine;
  • Type: aws-cdk-lib.aws_stepfunctions.IStateMachine

The Step Functions state machine that orchestrates the document processing workflow.

Manages the sequence of processing steps and handles error conditions. This state machine is triggered for each document that needs processing and coordinates the entire extraction pipeline.


IBedrockLlmProcessorConfiguration

Interface for Bedrock LLM document processor configuration.

Provides configuration management for custom extraction with Bedrock models.

Methods

Name Description
bind Binds the configuration to a processor instance.

bind
public bind(processor: IBedrockLlmProcessor): IBedrockLlmProcessorConfigurationDefinition

Binds the configuration to a processor instance.

This method applies the configuration to the processor.

processorRequired

The Bedrock LLM document processor to apply the configuration to.


IBedrockLlmProcessorConfigurationDefinition

Properties

Name Type Description
classificationMethod ClassificationMethod The method used for document classification.
classificationModel @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable The invokable model used for document classification.
extractionModel @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable The invokable model used for information extraction.
ocrBackend string OCR backend to use for text extraction.
assessmentModel @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable Optional invokable model used for evaluating assessment results.
evaluationModel @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable Optional invokable model used for evaluating extraction results.
ocrModel @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable Optional invokable model used for OCR when using Bedrock-based OCR.
summarizationModel @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable Optional invokable model used for document summarization.

classificationMethodRequired
public readonly classificationMethod: ClassificationMethod;

The method used for document classification.

Determines how documents are analyzed and categorized before extraction. Different methods offer varying levels of accuracy and performance.


classificationModelRequired
public readonly classificationModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
  • Default: as defined in the definition file

The invokable model used for document classification.

Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Determines document types and categories based on content analysis, enabling targeted extraction strategies for different document types.


extractionModelRequired
public readonly extractionModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
  • Default: as defined in the definition file

The invokable model used for information extraction.

Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Extracts structured data from documents based on defined schemas, transforming unstructured content into structured information.


ocrBackendRequired
public readonly ocrBackend: string;
  • Type: string
  • Default: "textract"

OCR backend to use for text extraction.

Determines whether to use Amazon Textract or Bedrock for OCR processing.


assessmentModelOptional
public readonly assessmentModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
  • Default: as defined in the definition file

Optional invokable model used for evaluating assessment results.

Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Used to assess the quality and accuracy of extracted information by comparing assessment results against expected values.


evaluationModelOptional
public readonly evaluationModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
  • Default: as defined in the definition file

Optional invokable model used for evaluating extraction results.

Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Used to assess the quality and accuracy of extracted information by comparing extraction results against expected values.


ocrModelOptional
public readonly ocrModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
  • Default: as defined in the definition file

Optional invokable model used for OCR when using Bedrock-based OCR.

Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Only used when the OCR backend is set to 'bedrock' in the configuration. Provides vision-based text extraction capabilities for document processing.


summarizationModelOptional
public readonly summarizationModel: IInvokable;
  • Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
  • Default: as defined in the definition file

Optional invokable model used for document summarization.

Can be a Bedrock foundation model, Bedrock inference profile, or custom model. When provided, enables automatic generation of document summaries that capture key information from processed documents.
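The properties above interact in two ways: `ocrModel` only matters when `ocrBackend` is `'bedrock'`, and the optional models switch on extra stages. The sketch below illustrates that reading with a local stand-in type. `DefinitionLike` and `describeStages` are hypothetical, `unknown` stands in for `IInvokable`, and the stage labels are made up for display. Two points are assumptions rather than documented behavior: that a missing `ocrModel` with the `'bedrock'` backend should be treated as an error, and that assessment and evaluation are skipped when their models are absent (the reference only states this explicitly for summarization).

```typescript
// Local stand-in mirroring the documented properties; not the library type.
interface DefinitionLike {
  readonly ocrBackend: string;
  readonly classificationModel: unknown;
  readonly extractionModel: unknown;
  readonly assessmentModel?: unknown;
  readonly evaluationModel?: unknown;
  readonly ocrModel?: unknown;
  readonly summarizationModel?: unknown;
}

// Hypothetical helper: lists the stages a definition appears to enable.
function describeStages(def: DefinitionLike): string[] {
  // Assumption: Bedrock-based OCR needs an OCR model to be present.
  if (def.ocrBackend === 'bedrock' && def.ocrModel === undefined) {
    throw new Error("ocrBackend is 'bedrock' but no ocrModel is set");
  }
  // OCR, classification, and extraction are the three core steps.
  const stages = [`ocr (${def.ocrBackend})`, 'classification', 'extraction'];
  // Assumption: each optional stage runs only when its model is provided.
  if (def.assessmentModel !== undefined) stages.push('assessment');
  if (def.evaluationModel !== undefined) stages.push('evaluation');
  if (def.summarizationModel !== undefined) stages.push('summarization');
  return stages;
}
```

A check like this could run in a unit test against a bound definition to catch a configuration that selects Bedrock OCR without naming a model.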


IBedrockLlmProcessorConfigurationSchema

Interface for Bedrock LLM configuration schema.

Defines the structure and validation rules for Bedrock LLM processor configuration.

Methods

Name Description
bind Binds the configuration schema to a processor instance.

bind
public bind(processor: IBedrockLlmProcessor): void

Binds the configuration schema to a processor instance.

This method applies the schema definition to the processor's configuration table.

processorRequired

The Bedrock LLM document processor to apply the schema to.


Enums

ClassificationMethod

Defines the methods available for document classification in Bedrock LLM processor workflows (referred to as Pattern 2).

Document classification is a critical step in the IDP workflow that determines how documents are categorized and processed. Different classification methods offer varying levels of accuracy, performance, and capabilities.

Members

Name Description
MULTIMODAL_PAGE_LEVEL_CLASSIFICATION Uses multimodal models to classify documents at the page level.
TEXTBASED_HOLISTIC_CLASSIFICATION Uses text-based analysis to classify the entire document holistically. Considers the full document text content for classification decisions.

MULTIMODAL_PAGE_LEVEL_CLASSIFICATION

Uses multimodal models to classify documents at the page level.

Analyzes both text and visual elements on each page for classification.

This method is effective for documents where each page may belong to a different document type or category. It provides high accuracy for complex layouts by considering both textual content and visual structure of each page individually.


TEXTBASED_HOLISTIC_CLASSIFICATION

Uses text-based analysis to classify the entire document holistically.

Considers the full document text content for classification decisions.

This method is more efficient and cost-effective than page-level multimodal classification because it processes only the extracted text. It works well for text-heavy documents where the document type is consistent across all pages and visual elements are less important for classification.
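The guidance for the two members can be condensed into a small decision rule. The sketch below is one possible reading of it, not library behavior: `DocumentTraits` and `suggestClassificationMethod` are hypothetical, and the string union only mirrors the enum member names so the example stays self-contained.

```typescript
// Stand-in for the enum member names documented above.
type ClassificationMethodName =
  | 'MULTIMODAL_PAGE_LEVEL_CLASSIFICATION'
  | 'TEXTBASED_HOLISTIC_CLASSIFICATION';

// Hypothetical description of a document workload.
interface DocumentTraits {
  // Pages within one file may belong to different document types.
  mixedDocumentTypes: boolean;
  // Visual structure (forms, tables, logos) carries classification signal.
  layoutMatters: boolean;
}

// Prefer page-level multimodal classification when pages vary or layout
// matters; otherwise fall back to the cheaper text-only holistic method.
function suggestClassificationMethod(
  traits: DocumentTraits,
): ClassificationMethodName {
  return traits.mixedDocumentTypes || traits.layoutMatters
    ? 'MULTIMODAL_PAGE_LEVEL_CLASSIFICATION'
    : 'TEXTBASED_HOLISTIC_CLASSIFICATION';
}
```

For example, a scanned lending package that mixes pay stubs and bank statements would map to the page-level method, while a long text-only contract of a single type would map to the holistic one.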