@cdklabs/genai-idp-sagemaker-udop-processor
Constructs
BasicSagemakerClassifier
A basic SageMaker-based document classifier for the Pattern 3 document processor.
This construct provides a simple way to deploy a SageMaker endpoint with a document classification model that can categorize documents based on their content and structure. It supports models like RVL-CDIP or UDOP for specialized document classification tasks.
The basic classifier includes standard auto-scaling capabilities and sensible defaults for common use cases. For more advanced configurations, consider creating your own SageMaker endpoint and passing it directly to the SagemakerUdopProcessor.
Example
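import { InstanceType, ModelData } from '@aws-cdk/aws-sagemaker-alpha';
import { BasicSagemakerClassifier, SagemakerUdopProcessor } from '@cdklabs/genai-idp-sagemaker-udop-processor';

// Deploy the classification model behind a SageMaker endpoint.
// `bucket` and `environment` are assumed to be defined elsewhere in the stack.
const classifier = new BasicSagemakerClassifier(this, 'Classifier', {
  outputBucket: bucket,
  modelData: ModelData.fromAsset('./model'),
  instanceType: InstanceType.ML_G4DN_XLARGE,
});
// Pass the classifier's endpoint to the processor.
const processor = new SagemakerUdopProcessor(this, 'Processor', {
  environment,
  classifierEndpoint: classifier.endpoint,
  // ... other configuration
});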
Initializers
import { BasicSagemakerClassifier } from '@cdklabs/genai-idp-sagemaker-udop-processor'
new BasicSagemakerClassifier(scope: Construct, id: string, props: BasicSagemakerClassifierProps)
Name | Type | Description |
---|---|---|
scope | constructs.Construct | No description. |
id | string | No description. |
props | BasicSagemakerClassifierProps | No description. |
scope
Required
- Type: constructs.Construct
id
Required
- Type: string
props
Required
- Type: BasicSagemakerClassifierProps
Methods
Name | Description |
---|---|
toString | Returns a string representation of this construct. |
toString
public toString(): string
Returns a string representation of this construct.
Static Functions
Name | Description |
---|---|
isConstruct | Checks if x is a construct. |
isConstruct
import { BasicSagemakerClassifier } from '@cdklabs/genai-idp-sagemaker-udop-processor'
BasicSagemakerClassifier.isConstruct(x: any)
Checks if x is a construct.
Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.
Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid instanceof and use this type-testing method instead.
x
Required
- Type: any
Any object.
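The failure mode described above can be reproduced without any library. The following is a minimal sketch that simulates two copies of a library by calling a factory twice. The names makeLibrary and MARKER and the symbol key 'example.Construct' are hypothetical and chosen for illustration; they are not presented as the internals of the constructs library.

```typescript
// Hypothetical marker key. Symbol.for() returns the same symbol for the same
// key everywhere in the process, so every "copy" of the library shares it.
const MARKER = Symbol.for('example.Construct');

// Each call simulates loading a separate copy of the same library from disk.
function makeLibrary() {
  class Construct {
    constructor() {
      // Tag every instance with the shared marker.
      Object.defineProperty(this, MARKER, { value: true });
    }

    // Type test that does not depend on class identity.
    static isConstruct(x: unknown): boolean {
      return typeof x === 'object' && x !== null && MARKER in x;
    }
  }
  return { Construct };
}

const copyA = makeLibrary();
const copyB = makeLibrary();
const instanceFromA = new copyA.Construct();

// instanceof compares class identity, so it fails across copies.
const viaInstanceof = instanceFromA instanceof copyB.Construct;

// The marker check succeeds regardless of which copy created the instance.
const viaMarker = copyB.Construct.isConstruct(instanceFromA);
```

Here viaInstanceof is false even though both classes are textually identical, while viaMarker is true. This is the reason to prefer isConstruct over instanceof in symlinked or monorepo setups.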
Properties
Name | Type | Description |
---|---|---|
node | constructs.Node | The tree node. |
endpoint | @aws-cdk/aws-sagemaker-alpha.IEndpoint | The SageMaker endpoint that hosts the document classification model. |
node
Required
public readonly node: Node;
- Type: constructs.Node
The tree node.
endpoint
Required
public readonly endpoint: IEndpoint;
- Type: @aws-cdk/aws-sagemaker-alpha.IEndpoint
The SageMaker endpoint that hosts the document classification model.
This endpoint is invoked during document processing to determine document types and categories.
SagemakerUdopProcessor
- Implements: ISagemakerUdopProcessor
SageMaker UDOP document processor implementation that uses specialized models for document processing.
This processor implements an intelligent document processing workflow that uses specialized models such as UDOP (Universal Document Processing) or RVL-CDIP, deployed on SageMaker, for document classification, followed by foundation models for information extraction.
SageMaker UDOP Processor is ideal for specialized document types that require custom classification models beyond what's possible with foundation models alone, such as complex forms, technical documents, or domain-specific content. It provides the highest level of customization for document classification while maintaining the flexibility of foundation models for extraction.
Initializers
import { SagemakerUdopProcessor } from '@cdklabs/genai-idp-sagemaker-udop-processor'
new SagemakerUdopProcessor(scope: Construct, id: string, props: SagemakerUdopProcessorProps)
Name | Type | Description |
---|---|---|
scope | constructs.Construct | No description. |
id | string | No description. |
props | SagemakerUdopProcessorProps | No description. |
scope
Required
- Type: constructs.Construct
id
Required
- Type: string
props
Required
- Type: SagemakerUdopProcessorProps
Methods
Name | Description |
---|---|
toString | Returns a string representation of this construct. |
metricClassificationRequestsTotal | Creates a CloudWatch metric for total classification requests. |
metricInputDocumentPages | Creates a CloudWatch metric for input document pages processed. |
metricInputDocuments | Creates a CloudWatch metric for input documents processed. |
toString
public toString(): string
Returns a string representation of this construct.
metricClassificationRequestsTotal
public metricClassificationRequestsTotal(props?: MetricOptions): Metric
Creates a CloudWatch metric for total classification requests.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricInputDocumentPages
public metricInputDocumentPages(props?: MetricOptions): Metric
Creates a CloudWatch metric for input document pages processed.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
metricInputDocuments
public metricInputDocuments(props?: MetricOptions): Metric
Creates a CloudWatch metric for input documents processed.
props
Optional
- Type: aws-cdk-lib.aws_cloudwatch.MetricOptions
Optional metric configuration properties.
Static Functions
Name | Description |
---|---|
isConstruct | Checks if x is a construct. |
isConstruct
import { SagemakerUdopProcessor } from '@cdklabs/genai-idp-sagemaker-udop-processor'
SagemakerUdopProcessor.isConstruct(x: any)
Checks if x is a construct.
Use this method instead of instanceof to properly detect Construct instances, even when the construct library is symlinked.
Explanation: in JavaScript, multiple copies of the constructs library on disk are seen as independent, completely different libraries. As a consequence, the class Construct in each copy of the constructs library is seen as a different class, and an instance of one class will not test as instanceof the other class. npm install will not create installations like this, but users may manually symlink construct libraries together or use a monorepo tool: in those cases, multiple copies of the constructs library can be accidentally installed, and instanceof will behave unpredictably. It is safest to avoid instanceof and use this type-testing method instead.
x
Required
- Type: any
Any object.
Properties
Name | Type | Description |
---|---|---|
node | constructs.Node | The tree node. |
environment | @cdklabs/genai-idp.IProcessingEnvironment | The processing environment that provides shared infrastructure and services. |
maxProcessingConcurrency | number | The maximum number of documents that can be processed concurrently. |
stateMachine | aws-cdk-lib.aws_stepfunctions.IStateMachine | The Step Functions state machine that orchestrates the document processing workflow. |
node
Required
public readonly node: Node;
- Type: constructs.Node
The tree node.
environment
Required
public readonly environment: IProcessingEnvironment;
- Type: @cdklabs/genai-idp.IProcessingEnvironment
The processing environment that provides shared infrastructure and services.
Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.
maxProcessingConcurrency
Required
public readonly maxProcessingConcurrency: number;
- Type: number
The maximum number of documents that can be processed concurrently.
Controls the throughput and resource utilization of the document processing system.
stateMachine
Required
public readonly stateMachine: IStateMachine;
- Type: aws-cdk-lib.aws_stepfunctions.IStateMachine
The Step Functions state machine that orchestrates the document processing workflow.
Manages the sequence of processing steps and handles error conditions. This state machine is triggered for each document that needs processing and coordinates the entire extraction pipeline.
Structs
BasicSagemakerClassifierProps
Configuration properties for the basic SageMaker-based document classifier.
This classifier uses a SageMaker endpoint to categorize documents based on their content and structure, enabling targeted extraction strategies.
Initializer
import { BasicSagemakerClassifierProps } from '@cdklabs/genai-idp-sagemaker-udop-processor'
const basicSagemakerClassifierProps: BasicSagemakerClassifierProps = { ... }
Properties
Name | Type | Description |
---|---|---|
instanceType | @aws-cdk/aws-sagemaker-alpha.InstanceType | The instance type to use for the SageMaker endpoint. |
modelData | @aws-cdk/aws-sagemaker-alpha.ModelData | The model data for the SageMaker endpoint. |
outputBucket | aws-cdk-lib.aws_s3.IBucket | The S3 bucket where classification outputs will be stored. |
key | aws-cdk-lib.aws_kms.IKey | Optional KMS key for encrypting classifier resources. |
maxInstanceCount | number | The maximum number of instances for the SageMaker endpoint. |
minInstanceCount | number | The minimum number of instances for the SageMaker endpoint. |
scaleInCooldown | aws-cdk-lib.Duration | The cooldown period after scaling in before another scale-in action can occur. |
scaleOutCooldown | aws-cdk-lib.Duration | The cooldown period after scaling out before another scale-out action can occur. |
targetInvocationsPerInstancePerMinute | number | The target number of invocations per instance per minute. |
instanceType
Required
public readonly instanceType: InstanceType;
- Type: @aws-cdk/aws-sagemaker-alpha.InstanceType
The instance type to use for the SageMaker endpoint.
Determines the computational resources available for document classification. For deep learning models, GPU instances are typically recommended.
modelData
Required
public readonly modelData: ModelData;
- Type: @aws-cdk/aws-sagemaker-alpha.ModelData
The model data for the SageMaker endpoint.
Contains the trained model artifacts that will be deployed to the endpoint. This can be a pre-trained document classification model like RVL-CDIP or UDOP.
outputBucket
Required
public readonly outputBucket: IBucket;
- Type: aws-cdk-lib.aws_s3.IBucket
The S3 bucket where classification outputs will be stored.
Contains intermediate results from the document classification process.
key
Optional
public readonly key: IKey;
- Type: aws-cdk-lib.aws_kms.IKey
Optional KMS key for encrypting classifier resources.
When provided, ensures data security for the SageMaker endpoint and associated resources.
maxInstanceCount
Optional
public readonly maxInstanceCount: number;
- Type: number
- Default: 4
The maximum number of instances for the SageMaker endpoint.
Controls the maximum capacity for document classification during high load.
minInstanceCount
Optional
public readonly minInstanceCount: number;
- Type: number
- Default: 1
The minimum number of instances for the SageMaker endpoint.
Controls the baseline capacity for document classification.
scaleInCooldown
Optional
public readonly scaleInCooldown: Duration;
- Type: aws-cdk-lib.Duration
- Default: cdk.Duration.minutes(5)
The cooldown period after scaling in before another scale-in action can occur.
Prevents rapid fluctuations in endpoint capacity.
scaleOutCooldown
Optional
public readonly scaleOutCooldown: Duration;
- Type: aws-cdk-lib.Duration
- Default: cdk.Duration.minutes(1)
The cooldown period after scaling out before another scale-out action can occur.
Prevents rapid fluctuations in endpoint capacity.
targetInvocationsPerInstancePerMinute
Optional
public readonly targetInvocationsPerInstancePerMinute: number;
- Type: number
- Default: 20
The target number of invocations per instance per minute.
Used to determine when to scale the endpoint in or out.
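To see how the scaling properties above interact, it can help to work through the steady-state arithmetic. The sketch below is a rough estimate only: it assumes the endpoint settles on just enough instances to keep each one at or below the target, and it ignores cooldowns and alarm evaluation. ScalingSettings and estimateInstanceCount are hypothetical names, not part of this package.

```typescript
// Hypothetical shape mirroring the scaling-related props documented above.
interface ScalingSettings {
  minInstanceCount: number;
  maxInstanceCount: number;
  targetInvocationsPerInstancePerMinute: number;
}

// Defaults as listed in this reference (1, 4, and 20).
const documentedDefaults: ScalingSettings = {
  minInstanceCount: 1,
  maxInstanceCount: 4,
  targetInvocationsPerInstancePerMinute: 20,
};

// Rough steady-state estimate: enough instances to keep each one at or below
// the target rate, clamped to the [min, max] range.
function estimateInstanceCount(
  invocationsPerMinute: number,
  settings: ScalingSettings = documentedDefaults,
): number {
  const needed = Math.ceil(
    invocationsPerMinute / settings.targetInvocationsPerInstancePerMinute,
  );
  return Math.min(
    settings.maxInstanceCount,
    Math.max(settings.minInstanceCount, needed),
  );
}
```

With the documented defaults, a sustained 50 invocations per minute works out to 3 instances, and any load above 80 invocations per minute is capped at maxInstanceCount (4). If the classifier regularly saturates at the cap, raising maxInstanceCount or choosing a larger instanceType are the two levers this construct exposes.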
SagemakerUdopProcessorConfigurationDefinitionOptions
Options for configuring the SageMaker UDOP processor configuration definition.
Allows customization of extraction, evaluation, and summarization stages.
Initializer
import { SagemakerUdopProcessorConfigurationDefinitionOptions } from '@cdklabs/genai-idp-sagemaker-udop-processor'
const sagemakerUdopProcessorConfigurationDefinitionOptions: SagemakerUdopProcessorConfigurationDefinitionOptions = { ... }
Properties
Name | Type | Description |
---|---|---|
assessmentModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional invokable model used for evaluating assessment results. |
evaluationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional configuration for the evaluation stage. |
extractionModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional configuration for the extraction stage. |
summarizationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional configuration for the summarization stage. |
assessmentModel
Optional
public readonly assessmentModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
- Default: as defined in the definition file
Optional invokable model used for evaluating assessment results.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Used to assess the quality and accuracy of extracted information by comparing assessment results against expected values.
evaluationModel
Optional
public readonly evaluationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional configuration for the evaluation stage.
Defines the model and parameters used for evaluating extraction accuracy.
extractionModel
Optional
public readonly extractionModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional configuration for the extraction stage.
Defines the model and parameters used for information extraction.
summarizationModel
Optional
public readonly summarizationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional configuration for the summarization stage.
Defines the model and parameters used for generating document summaries.
SagemakerUdopProcessorProps
Configuration properties for the SageMaker UDOP document processor.
SageMaker UDOP Processor uses specialized document processing with SageMaker endpoints for document classification, combined with foundation models for extraction. This processor is ideal for specialized document types that require custom classification models for accurate document categorization before extraction.
SageMaker UDOP Processor offers the highest level of customization for document processing, allowing you to deploy and use specialized models for document classification while still leveraging foundation models for extraction tasks. This processor is particularly useful for domain-specific document processing needs.
Initializer
import { SagemakerUdopProcessorProps } from '@cdklabs/genai-idp-sagemaker-udop-processor'
const sagemakerUdopProcessorProps: SagemakerUdopProcessorProps = { ... }
Properties
Name | Type | Description |
---|---|---|
environment | @cdklabs/genai-idp.IProcessingEnvironment | The processing environment that provides shared infrastructure and services. |
maxProcessingConcurrency | number | The maximum number of documents that can be processed concurrently. |
classifierEndpoint | @aws-cdk/aws-sagemaker-alpha.IEndpoint | The SageMaker endpoint used for document classification. |
configuration | ISagemakerUdopProcessorConfiguration | Configuration for the SageMaker UDOP document processor. |
assessmentGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to assessment model interactions. |
classificationGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to classification model interactions. |
customPromptGenerator | @cdklabs/genai-idp.ICustomPromptGenerator | Optional custom prompt generator for injecting business logic into extraction processing. |
evaluationBaselineBucket | aws-cdk-lib.aws_s3.IBucket | Optional S3 bucket containing baseline documents for evaluation. |
evaluationEnabled | boolean | Controls whether extraction results are evaluated for accuracy. |
extractionGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to extraction model interactions. |
ocrMaxWorkers | number | The maximum number of concurrent workers for OCR processing. |
summarizationGuardrail | @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail | Optional Bedrock guardrail to apply to summarization model interactions. |
environment
Required
public readonly environment: IProcessingEnvironment;
- Type: @cdklabs/genai-idp.IProcessingEnvironment
The processing environment that provides shared infrastructure and services.
Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.
maxProcessingConcurrency
Optional
public readonly maxProcessingConcurrency: number;
- Type: number
- Default: 100 concurrent workflows
The maximum number of documents that can be processed concurrently.
Controls the throughput and resource utilization of the document processing system.
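The effect of a concurrency cap on throughput can be illustrated with a toy model. The sketch below is not the processor's actual scheduling logic; simulateCompletionTime, the abstract time units, and the first-free-slot policy are all assumptions made for illustration.

```typescript
// Toy model: each document occupies one processing slot for `duration` time
// units, and at most `maxConcurrency` documents run at the same time.
function simulateCompletionTime(
  durations: number[],
  maxConcurrency: number,
): number {
  if (maxConcurrency < 1) {
    throw new Error('maxConcurrency must be at least 1');
  }
  // freeAt[i] is the time at which slot i becomes available again.
  const freeAt: number[] = new Array<number>(maxConcurrency).fill(0);
  for (const duration of durations) {
    // Start the next document on whichever slot frees up first.
    let earliest = 0;
    for (let i = 1; i < freeAt.length; i++) {
      if (freeAt[i] < freeAt[earliest]) {
        earliest = i;
      }
    }
    freeAt[earliest] += duration;
  }
  // The batch is done when the busiest slot finishes.
  return Math.max(...freeAt);
}
```

In this model, four documents that each take 3 time units finish in 12 units with a cap of 1, in 6 units with a cap of 2, and in 3 units with a cap of 4. Raising the cap beyond the number of queued documents yields no further gain, which is why the setting trades throughput against downstream resource usage rather than speeding up any single document.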
classifierEndpoint
Required
public readonly classifierEndpoint: IEndpoint;
- Type: @aws-cdk/aws-sagemaker-alpha.IEndpoint
The SageMaker endpoint used for document classification.
Determines document types based on content and structure analysis using specialized models like RVL-CDIP or UDOP deployed on SageMaker.
This is a key component of Pattern 3, enabling specialized document classification beyond what's possible with foundation models alone. Users can create their own SageMaker endpoint using any method (CDK constructs, existing endpoints, etc.) and pass it directly to the processor.
configuration
Required
public readonly configuration: ISagemakerUdopProcessorConfiguration;
- Type: ISagemakerUdopProcessorConfiguration
Configuration for the SageMaker UDOP document processor.
Provides customization options for the processing workflow, including schema definitions, prompts, and evaluation settings.
assessmentGuardrail
Optional
public readonly assessmentGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to assessment model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
classificationGuardrail
Optional
public readonly classificationGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to classification model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
customPromptGenerator
Optional
public readonly customPromptGenerator: ICustomPromptGenerator;
- Type: @cdklabs/genai-idp.ICustomPromptGenerator
- Default: No custom prompt generator is used
Optional custom prompt generator for injecting business logic into extraction processing.
When provided, this Lambda function will be called to customize prompts based on document content, business rules, or external system integrations.
evaluationBaselineBucket
Optional
public readonly evaluationBaselineBucket: IBucket;
- Type: aws-cdk-lib.aws_s3.IBucket
- Default: No evaluation baseline bucket is configured
Optional S3 bucket containing baseline documents for evaluation.
Used as ground truth when evaluating extraction accuracy by comparing extraction results against known correct values.
Required when evaluationEnabled is true.
evaluationEnabled
Optional
public readonly evaluationEnabled: boolean;
- Type: boolean
- Default: false
Controls whether extraction results are evaluated for accuracy.
When enabled, compares extraction results against expected values to measure extraction quality and identify improvement areas.
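Because evaluationBaselineBucket is required whenever evaluationEnabled is true, it can be useful to check the pair before constructing the processor. The sketch below is one way to do that in your own code; this reference does not state whether the construct performs the same check itself. EvaluationSettings and checkEvaluationSettings are hypothetical, and a bucket name string stands in for the IBucket prop to keep the example self-contained.

```typescript
// Hypothetical, trimmed-down view of the two related props.
interface EvaluationSettings {
  evaluationEnabled?: boolean;
  // Stand-in for the evaluationBaselineBucket (IBucket) prop.
  evaluationBaselineBucketName?: string;
}

// Returns a list of human-readable problems; an empty list means the
// combination of settings is consistent.
function checkEvaluationSettings(settings: EvaluationSettings): string[] {
  const problems: string[] = [];
  // evaluationEnabled defaults to false according to this reference.
  const enabled = settings.evaluationEnabled ?? false;
  if (enabled && settings.evaluationBaselineBucketName === undefined) {
    problems.push(
      'evaluationBaselineBucket is required when evaluationEnabled is true',
    );
  }
  return problems;
}
```

Calling checkEvaluationSettings({ evaluationEnabled: true }) returns one problem, while omitting both props or supplying both returns an empty list.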
extractionGuardrail
Optional
public readonly extractionGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to extraction model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
ocrMaxWorkers
Optional
public readonly ocrMaxWorkers: number;
- Type: number
- Default: 20
The maximum number of concurrent workers for OCR processing.
Controls parallelism during the text extraction phase to optimize throughput while managing resource utilization.
summarizationGuardrail
Optional
public readonly summarizationGuardrail: IGuardrail;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IGuardrail
- Default: No guardrail is applied
Optional Bedrock guardrail to apply to summarization model interactions.
Helps ensure model outputs adhere to content policies and guidelines by filtering inappropriate content and enforcing usage policies.
Classes
SagemakerUdopProcessorConfiguration
- Implements: ISagemakerUdopProcessorConfiguration
Configuration management for SageMaker UDOP document processing, which uses SageMaker for classification.
This construct creates and manages the configuration for SageMaker UDOP document processing, including schema definitions, extraction prompts, and configuration values. It provides a centralized way to manage document classes, extraction schemas, and model parameters for specialized document processing with SageMaker.
Initializers
import { SagemakerUdopProcessorConfiguration } from '@cdklabs/genai-idp-sagemaker-udop-processor'
new SagemakerUdopProcessorConfiguration(definition: ISagemakerUdopProcessorConfigurationDefinition)
Name | Type | Description |
---|---|---|
definition | ISagemakerUdopProcessorConfigurationDefinition | The configuration definition instance. |
definition
Required
- Type: ISagemakerUdopProcessorConfigurationDefinition
The configuration definition instance.
Methods
Name | Description |
---|---|
bind | Binds the configuration to a processor instance. |
bind
public bind(processor: ISagemakerUdopProcessor): ISagemakerUdopProcessorConfigurationDefinition
Binds the configuration to a processor instance.
This method applies the configuration to the processor.
processor
Required
- Type: ISagemakerUdopProcessor
Static Functions
Name | Description |
---|---|
fromFile | Creates a configuration from a YAML file. |
rvlCdipPackageSample | Creates a default configuration with standard settings. |
fromFile
import { SagemakerUdopProcessorConfiguration } from '@cdklabs/genai-idp-sagemaker-udop-processor'
SagemakerUdopProcessorConfiguration.fromFile(filePath: string, options?: SagemakerUdopProcessorConfigurationDefinitionOptions)
Creates a configuration from a YAML file.
filePath
Required
- Type: string
Path to the YAML configuration file.
options
Optional
- Type: SagemakerUdopProcessorConfigurationDefinitionOptions
Optional configuration options to override file settings.
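The override behavior of options can be pictured as a per-field fallback: a value supplied in options wins, and anything left unset comes from the file. The sketch below illustrates that idea under stated assumptions. It uses plain strings in place of IInvokable models, and StageModels and resolveStageModels are hypothetical names rather than part of this package.

```typescript
// Hypothetical stand-ins: model identifiers as strings instead of IInvokable.
interface StageModels {
  extractionModel: string;
  assessmentModel?: string;
  evaluationModel?: string;
  summarizationModel?: string;
}

type StageModelOverrides = Partial<StageModels>;

// Values passed in `overrides` win; anything left undefined falls back to
// what the configuration file defined.
function resolveStageModels(
  fromFile: StageModels,
  overrides: StageModelOverrides = {},
): StageModels {
  return {
    extractionModel: overrides.extractionModel ?? fromFile.extractionModel,
    assessmentModel: overrides.assessmentModel ?? fromFile.assessmentModel,
    evaluationModel: overrides.evaluationModel ?? fromFile.evaluationModel,
    summarizationModel:
      overrides.summarizationModel ?? fromFile.summarizationModel,
  };
}

const resolved = resolveStageModels(
  { extractionModel: 'model-from-file', summarizationModel: 'summary-from-file' },
  { extractionModel: 'model-from-options' },
);
```

In this example resolved.extractionModel comes from the options, resolved.summarizationModel comes from the file, and the stages that neither source sets remain undefined.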
rvlCdipPackageSample
import { SagemakerUdopProcessorConfiguration } from '@cdklabs/genai-idp-sagemaker-udop-processor'
SagemakerUdopProcessorConfiguration.rvlCdipPackageSample(options?: SagemakerUdopProcessorConfigurationDefinitionOptions)
Creates a default configuration with standard settings.
options
Optional
- Type: SagemakerUdopProcessorConfigurationDefinitionOptions
Optional configuration options.
SagemakerUdopProcessorConfigurationDefinition
Configuration definition for SageMaker UDOP document processing.
Provides methods to create and customize configuration for SageMaker UDOP processing.
Initializers
import { SagemakerUdopProcessorConfigurationDefinition } from '@cdklabs/genai-idp-sagemaker-udop-processor'
new SagemakerUdopProcessorConfigurationDefinition()
Static Functions
Name | Description |
---|---|
fromFile | Creates a configuration definition from a YAML file. |
rvlCdipPackageSample | Creates a default configuration definition for SageMaker UDOP processing. |
fromFile
import { SagemakerUdopProcessorConfigurationDefinition } from '@cdklabs/genai-idp-sagemaker-udop-processor'
SagemakerUdopProcessorConfigurationDefinition.fromFile(filePath: string, options?: SagemakerUdopProcessorConfigurationDefinitionOptions)
Creates a configuration definition from a YAML file.
Allows users to provide custom configuration files for document processing.
filePath
Required
- Type: string
Path to the YAML configuration file.
options
Optional
- Type: SagemakerUdopProcessorConfigurationDefinitionOptions
Optional customization for processing stages.
rvlCdipPackageSample
import { SagemakerUdopProcessorConfigurationDefinition } from '@cdklabs/genai-idp-sagemaker-udop-processor'
SagemakerUdopProcessorConfigurationDefinition.rvlCdipPackageSample(options?: SagemakerUdopProcessorConfigurationDefinitionOptions)
Creates a default configuration definition for SageMaker UDOP processing.
This configuration includes basic settings for extraction, evaluation, and summarization when using SageMaker for document classification.
options
Optional
- Type: SagemakerUdopProcessorConfigurationDefinitionOptions
Optional customization for processing stages.
SagemakerUdopProcessorConfigurationSchema
- Implements: ISagemakerUdopProcessorConfigurationSchema
Schema definition for SageMaker UDOP processor configuration. Provides JSON Schema validation rules for the configuration UI and API.
This class defines the structure, validation rules, and UI presentation for the SageMaker UDOP processor configuration, including document classes, attributes, extraction parameters, evaluation criteria, and summarization options. It's specialized for use with SageMaker endpoints for document classification.
Initializers
import { SagemakerUdopProcessorConfigurationSchema } from '@cdklabs/genai-idp-sagemaker-udop-processor'
new SagemakerUdopProcessorConfigurationSchema()
Methods
Name | Description |
---|---|
bind | Binds the configuration schema to a processor instance. |
bind
public bind(processor: SagemakerUdopProcessor): void
Binds the configuration schema to a processor instance.
Creates a custom resource that updates the schema in the configuration table.
processor
Required
- Type: SagemakerUdopProcessor
The SageMaker UDOP document processor to apply the schema to.
Protocols
ISagemakerUdopProcessor
- Extends: @cdklabs/genai-idp.IDocumentProcessor
- Implemented By: SagemakerUdopProcessor, ISagemakerUdopProcessor
Interface for SageMaker UDOP document processor implementation.
SageMaker UDOP Processor uses specialized document processing with SageMaker endpoints for document classification, combined with foundation models for extraction. This processor is ideal for specialized document types that require custom classification models like RVL-CDIP or UDOP for accurate document categorization before extraction.
Use SageMaker UDOP Processor when:
- Processing highly specialized or complex document types
- You need custom classification models beyond what foundation models can provide
- You have domain-specific document types requiring specialized handling
- You want to leverage fine-tuned models for specific document domains
Properties
Name | Type | Description |
---|---|---|
node | constructs.Node | The tree node. |
environment | @cdklabs/genai-idp.IProcessingEnvironment | The processing environment that provides shared infrastructure and services. |
maxProcessingConcurrency | number | The maximum number of documents that can be processed concurrently. |
stateMachine | aws-cdk-lib.aws_stepfunctions.IStateMachine | The Step Functions state machine that orchestrates the document processing workflow. |
node
Required
public readonly node: Node;
- Type: constructs.Node
The tree node.
environment
Required
public readonly environment: IProcessingEnvironment;
- Type: @cdklabs/genai-idp.IProcessingEnvironment
The processing environment that provides shared infrastructure and services.
Contains input/output buckets, tracking tables, API endpoints, and other resources needed for document processing operations.
maxProcessingConcurrency
Required
public readonly maxProcessingConcurrency: number;
- Type: number
The maximum number of documents that can be processed concurrently.
Controls the throughput and resource utilization of the document processing system.
stateMachine
Required
public readonly stateMachine: IStateMachine;
- Type: aws-cdk-lib.aws_stepfunctions.IStateMachine
The Step Functions state machine that orchestrates the document processing workflow.
Manages the sequence of processing steps and handles error conditions. This state machine is triggered for each document that needs processing and coordinates the entire extraction pipeline.
ISagemakerUdopProcessorConfiguration
- Implemented By: SagemakerUdopProcessorConfiguration, ISagemakerUdopProcessorConfiguration
Interface for SageMaker UDOP document processor configuration.
Provides configuration management for specialized document processing with SageMaker.
Methods
Name | Description |
---|---|
bind | Binds the configuration to a processor instance. |
bind
public bind(processor: ISagemakerUdopProcessor): ISagemakerUdopProcessorConfigurationDefinition
Binds the configuration to a processor instance.
This method applies the configuration to the processor.
processor
Required
- Type: ISagemakerUdopProcessor
The SageMaker UDOP document processor to apply to.
ISagemakerUdopProcessorConfigurationDefinition
- Extends: @cdklabs/genai-idp.IConfigurationDefinition
- Implemented By: ISagemakerUdopProcessorConfigurationDefinition
Interface for SageMaker UDOP processor configuration definition.
Defines the structure and capabilities of configuration for SageMaker UDOP processing.
Properties
Name | Type | Description |
---|---|---|
extractionModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | The invokable model used for information extraction. |
assessmentModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional invokable model used for document assessment. |
evaluationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional invokable model used for evaluating extraction results. |
summarizationModel | @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable | Optional invokable model used for document summarization. |
extractionModel
Required
public readonly extractionModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
The invokable model used for information extraction.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Extracts structured data from documents based on defined schemas, transforming unstructured content into structured information.
assessmentModel
Optional
public readonly assessmentModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional invokable model used for document assessment.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model.
evaluationModel
Optional
public readonly evaluationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional invokable model used for evaluating extraction results.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. Used to assess the quality and accuracy of extracted information by comparing extraction results against expected values.
summarizationModel
Optional
public readonly summarizationModel: IInvokable;
- Type: @cdklabs/generative-ai-cdk-constructs.bedrock.IInvokable
Optional invokable model used for document summarization.
Can be a Bedrock foundation model, Bedrock inference profile, or custom model. When provided, enables automatic generation of document summaries that capture key information from processed documents.
ISagemakerUdopProcessorConfigurationSchema
- Implemented By: SagemakerUdopProcessorConfigurationSchema, ISagemakerUdopProcessorConfigurationSchema
Interface for SageMaker UDOP configuration schema.
Defines the structure and validation rules for SageMaker UDOP processor configuration.
Methods
Name | Description |
---|---|
bind | Binds the configuration schema to a processor instance. |
bind
public bind(processor: SagemakerUdopProcessor): void
Binds the configuration schema to a processor instance.
This method applies the schema definition to the processor's configuration table.
processor
Required
- Type: SagemakerUdopProcessor
The SageMaker UDOP document processor to apply the schema to.