# Tasks for AWS Step Functions
[AWS Step Functions](https://docs.aws.amazon.com/step-functions/latest/dg/welcome.html) is a web service that enables you to coordinate the
components of distributed applications and microservices using visual workflows.
You build applications from individual components that each perform a discrete
function, or task, allowing you to scale and change applications quickly.
A [Task](https://docs.aws.amazon.com/step-functions/latest/dg/amazon-states-language-task-state.html) state represents a single unit of work performed by a state machine.
All work in your state machine is performed by tasks. This module contains a collection of classes that allow you to call various AWS services
from your Step Functions state machine.
Be sure to familiarize yourself with the [`aws-stepfunctions` module documentation](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-cdk-lib.aws_stepfunctions-readme.html) first.
This module is part of the [AWS Cloud Development Kit](https://github.com/aws/aws-cdk) project.
## Table Of Contents
- [Tasks for AWS Step Functions](#tasks-for-aws-step-functions)
  - [Table Of Contents](#table-of-contents)
  - [Paths](#paths)
  - [Evaluate Expression](#evaluate-expression)
  - [API Gateway](#api-gateway)
    - [Call REST API Endpoint](#call-rest-api-endpoint)
    - [Call HTTP API Endpoint](#call-http-api-endpoint)
  - [AWS SDK](#aws-sdk)
    - [Cross-region AWS API call](#cross-region-aws-api-call)
  - [Athena](#athena)
    - [StartQueryExecution](#startqueryexecution)
    - [GetQueryExecution](#getqueryexecution)
    - [GetQueryResults](#getqueryresults)
    - [StopQueryExecution](#stopqueryexecution)
  - [Batch](#batch)
    - [SubmitJob](#submitjob)
  - [Bedrock](#bedrock)
    - [InvokeModel](#invokemodel)
  - [CodeBuild](#codebuild)
    - [StartBuild](#startbuild)
    - [StartBuildBatch](#startbuildbatch)
  - [DynamoDB](#dynamodb)
    - [GetItem](#getitem)
    - [PutItem](#putitem)
    - [DeleteItem](#deleteitem)
    - [UpdateItem](#updateitem)
  - [ECS](#ecs)
    - [RunTask](#runtask)
      - [EC2](#ec2)
      - [Fargate](#fargate)
  - [EMR](#emr)
    - [Create Cluster](#create-cluster)
    - [Termination Protection](#termination-protection)
    - [Terminate Cluster](#terminate-cluster)
    - [Add Step](#add-step)
    - [Cancel Step](#cancel-step)
    - [Modify Instance Fleet](#modify-instance-fleet)
    - [Modify Instance Group](#modify-instance-group)
  - [EMR on EKS](#emr-on-eks)
    - [Create Virtual Cluster](#create-virtual-cluster)
    - [Delete Virtual Cluster](#delete-virtual-cluster)
    - [Start Job Run](#start-job-run)
  - [EKS](#eks)
    - [Call](#call)
  - [EventBridge](#eventbridge)
    - [Put Events](#put-events)
  - [EventBridge Scheduler](#eventbridge-scheduler)
    - [Create Scheduler](#create-scheduler)
  - [Glue](#glue)
    - [StartJobRun](#startjobrun)
    - [StartCrawlerRun](#startcrawlerrun)
  - [Glue DataBrew](#glue-databrew)
    - [Start Job Run](#start-job-run-1)
  - [Invoke HTTP API](#invoke-http-api)
  - [Lambda](#lambda)
    - [Invoke](#invoke)
  - [MediaConvert](#mediaconvert)
    - [Create Job](#create-job)
  - [SageMaker](#sagemaker)
    - [Create Training Job](#create-training-job)
    - [Create Transform Job](#create-transform-job)
    - [Create Endpoint](#create-endpoint)
    - [Create Endpoint Config](#create-endpoint-config)
    - [Create Model](#create-model)
    - [Update Endpoint](#update-endpoint)
  - [SNS](#sns)
    - [Publish](#publish)
  - [Step Functions](#step-functions)
    - [Start Execution](#start-execution)
    - [Invoke Activity](#invoke-activity)
  - [SQS](#sqs)
    - [Send Message](#send-message)
## Paths
Learn more about input and output processing in Step Functions [here](https://docs.aws.amazon.com/step-functions/latest/dg/concepts-input-output-filtering.html).
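Most task constructs in this module accept the common `inputPath`, `resultSelector`, `resultPath`, and `outputPath` properties to shape what a state consumes and produces. A minimal sketch, assuming an existing Lambda function; the function and the paths used here are hypothetical:

```ts
declare const fn: lambda.Function;

const processTask = new tasks.LambdaInvoke(this, 'Process', {
  lambdaFunction: fn,
  // Select only the `detail` subtree of the state input as the task input.
  inputPath: '$.detail',
  // Pick fields out of the raw task result.
  resultSelector: {
    status: sfn.JsonPath.stringAt('$.Payload.status'),
  },
  // Merge the selected result into the original input under `$.processResult`.
  resultPath: '$.processResult',
});
```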
## Evaluate Expression
Use the `EvaluateExpression` task to perform simple operations referencing state paths. The
`expression` referenced in the task is evaluated in a Lambda function
(`eval()`). This saves you from writing Lambda code for simple operations.
Example: convert a wait time from milliseconds to seconds, concatenate it into a message, and wait:
```ts
const convertToSeconds = new tasks.EvaluateExpression(this, 'Convert to seconds', {
  expression: '$.waitMilliseconds / 1000',
  resultPath: '$.waitSeconds',
});

const createMessage = new tasks.EvaluateExpression(this, 'Create message', {
  // Note: this is a string inside a string.
  expression: '`Now waiting ${$.waitSeconds} seconds...`',
  runtime: lambda.Runtime.NODEJS_LATEST,
  resultPath: '$.message',
});

const publishMessage = new tasks.SnsPublish(this, 'Publish message', {
  topic: new sns.Topic(this, 'cool-topic'),
  message: sfn.TaskInput.fromJsonPathAt('$.message'),
  resultPath: '$.sns',
});

const wait = new sfn.Wait(this, 'Wait', {
  time: sfn.WaitTime.secondsPath('$.waitSeconds'),
});

new sfn.StateMachine(this, 'StateMachine', {
  definition: convertToSeconds
    .next(createMessage)
    .next(publishMessage)
    .next(wait),
});
```
`EvaluateExpression` supports a `runtime` prop to specify the Lambda
runtime used to evaluate the expression. Currently, only runtimes
of the Node.js family are supported.
## API Gateway
Step Functions supports [API Gateway](https://docs.aws.amazon.com/step-functions/latest/dg/connect-api-gateway.html) through the service integration pattern.
HTTP APIs are designed for low-latency, cost-effective integrations with AWS services, including AWS Lambda, and HTTP endpoints.
HTTP APIs support OIDC and OAuth 2.0 authorization, and come with built-in support for CORS and automatic deployments.
Previous-generation REST APIs currently offer more features. More details can be found [here](https://docs.aws.amazon.com/apigateway/latest/developerguide/http-api-vs-rest.html).
### Call REST API Endpoint
The `CallApiGatewayRestApiEndpoint` calls the REST API endpoint.
```ts
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
const restApi = new apigateway.RestApi(this, 'MyRestApi');

const invokeTask = new tasks.CallApiGatewayRestApiEndpoint(this, 'Call REST API', {
  api: restApi,
  stageName: 'prod',
  method: tasks.HttpMethod.GET,
});
```
By default, the API endpoint URI will be constructed using the AWS region of
the stack in which the provided `api` is created.
To construct the endpoint with a different region, use the `region` parameter:
```ts
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
const restApi = new apigateway.RestApi(this, 'MyRestApi');

const invokeTask = new tasks.CallApiGatewayRestApiEndpoint(this, 'Call REST API', {
  api: restApi,
  stageName: 'prod',
  method: tasks.HttpMethod.GET,
  region: 'us-west-2',
});
```
Be aware that the header values must be arrays. When passing the Task Token
in the headers field of a `WAIT_FOR_TASK_TOKEN` integration, use
`JsonPath.array()` to wrap the token in an array:
```ts
import * as apigateway from 'aws-cdk-lib/aws-apigateway';
declare const api: apigateway.RestApi;

new tasks.CallApiGatewayRestApiEndpoint(this, 'Endpoint', {
  api,
  stageName: 'Stage',
  method: tasks.HttpMethod.PUT,
  integrationPattern: sfn.IntegrationPattern.WAIT_FOR_TASK_TOKEN,
  headers: sfn.TaskInput.fromObject({
    TaskToken: sfn.JsonPath.array(sfn.JsonPath.taskToken),
  }),
});
```
### Call HTTP API Endpoint
The `CallApiGatewayHttpApiEndpoint` calls the HTTP API endpoint.
```ts
import * as apigatewayv2 from 'aws-cdk-lib/aws-apigatewayv2';
const httpApi = new apigatewayv2.HttpApi(this, 'MyHttpApi');

const invokeTask = new tasks.CallApiGatewayHttpApiEndpoint(this, 'Call HTTP API', {
  apiId: httpApi.apiId,
  apiStack: Stack.of(httpApi),
  method: tasks.HttpMethod.GET,
});
```
## AWS SDK
Step Functions supports calling [AWS services' API actions](https://docs.aws.amazon.com/step-functions/latest/dg/supported-services-awssdk.html)
through the service integration pattern.
You can use Step Functions' AWS SDK integrations to call any of the over two hundred AWS services
directly from your state machine, giving you access to over nine thousand API actions.
```ts
declare const myBucket: s3.Bucket;

const getObject = new tasks.CallAwsService(this, 'GetObject', {
  service: 's3',
  action: 'getObject',
  parameters: {
    Bucket: myBucket.bucketName,
    Key: sfn.JsonPath.stringAt('$.key'),
  },
  iamResources: [myBucket.arnForObjects('*')],
});
```
Use camelCase for actions and PascalCase for parameter names.
The task automatically adds an IAM statement to the state machine role's policy based on the
service and action called. The resources for this statement must be specified in `iamResources`.
Use the `iamAction` prop to manually specify the IAM action name in the case where the IAM
action name does not match with the API service/action name:
```ts
const listBuckets = new tasks.CallAwsService(this, 'ListBuckets', {
  service: 's3',
  action: 'listBuckets',
  iamResources: ['*'],
  iamAction: 's3:ListAllMyBuckets',
});
```
Use the `additionalIamStatements` prop to pass additional IAM statements that will be added to the
state machine role's policy. Use it in the case where the call requires more than a single statement
to be executed:
```ts
const detectLabels = new tasks.CallAwsService(this, 'DetectLabels', {
  service: 'rekognition',
  action: 'detectLabels',
  iamResources: ['*'],
  additionalIamStatements: [
    new iam.PolicyStatement({
      actions: ['s3:getObject'],
      resources: ['arn:aws:s3:::amzn-s3-demo-bucket/*'],
    }),
  ],
});
```
### Cross-region AWS API call
You can call an AWS API in a different region from your state machine by using the `CallAwsServiceCrossRegion` construct. In addition to the properties of the `CallAwsService` construct, you must set the `region` property to specify where the API call is made.
```ts
declare const myBucket: s3.Bucket;

const getObject = new tasks.CallAwsServiceCrossRegion(this, 'GetObject', {
  region: 'ap-northeast-1',
  service: 's3',
  action: 'getObject',
  parameters: {
    Bucket: myBucket.bucketName,
    Key: sfn.JsonPath.stringAt('$.key'),
  },
  iamResources: [myBucket.arnForObjects('*')],
});
```
Other properties such as `additionalIamStatements` can be used in the same way as the `CallAwsService` task.
Note that when you use the `WAIT_FOR_TASK_TOKEN` integration pattern, the task output is nested under the `Payload` property.
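For example, a callback-style cross-region call might look like the following sketch. The queue, region, and message shape here are hypothetical, and the external worker that receives the message is assumed to report completion via `SendTaskSuccess` with the embedded token:

```ts
declare const queue: sqs.Queue;

const sendAndWait = new tasks.CallAwsServiceCrossRegion(this, 'SendMessageAndWait', {
  region: 'ap-northeast-1',
  service: 'sqs',
  action: 'sendMessage',
  integrationPattern: sfn.IntegrationPattern.WAIT_FOR_TASK_TOKEN,
  parameters: {
    QueueUrl: queue.queueUrl,
    // The external worker uses this token to call SendTaskSuccess.
    MessageBody: sfn.JsonPath.taskToken,
  },
  iamResources: [queue.queueArn],
});

// The result sent with SendTaskSuccess is then available under `$.Payload`,
// e.g. sfn.JsonPath.stringAt('$.Payload.status') in a later state.
```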
## Athena
Step Functions supports [Athena](https://docs.aws.amazon.com/step-functions/latest/dg/connect-athena.html) through the service integration pattern.
### StartQueryExecution
The [StartQueryExecution](https://docs.aws.amazon.com/athena/latest/APIReference/API_StartQueryExecution.html) API runs the SQL query statement.
```ts
const startQueryExecutionJob = new tasks.AthenaStartQueryExecution(this, 'Start Athena Query', {
  queryString: sfn.JsonPath.stringAt('$.queryString'),
  queryExecutionContext: {
    databaseName: 'mydatabase',
  },
  resultConfiguration: {
    encryptionConfiguration: {
      encryptionOption: tasks.EncryptionOption.S3_MANAGED,
    },
    outputLocation: {
      bucketName: 'amzn-s3-demo-bucket',
      objectKey: 'folder',
    },
  },
  executionParameters: ['param1', 'param2'],
});
```
You can reuse the query results by setting the `resultReuseConfigurationMaxAge` property.
```ts
const startQueryExecutionJob = new tasks.AthenaStartQueryExecution(this, 'Start Athena Query', {
  queryString: sfn.JsonPath.stringAt('$.queryString'),
  queryExecutionContext: {
    databaseName: 'mydatabase',
  },
  resultConfiguration: {
    encryptionConfiguration: {
      encryptionOption: tasks.EncryptionOption.S3_MANAGED,
    },
    outputLocation: {
      bucketName: 'query-results-bucket',
      objectKey: 'folder',
    },
  },
  executionParameters: ['param1', 'param2'],
  resultReuseConfigurationMaxAge: Duration.minutes(100),
});
```
### GetQueryExecution
The [GetQueryExecution](https://docs.aws.amazon.com/athena/latest/APIReference/API_GetQueryExecution.html) API gets information about a single execution of a query.
```ts
const getQueryExecutionJob = new tasks.AthenaGetQueryExecution(this, 'Get Query Execution', {
  queryExecutionId: sfn.JsonPath.stringAt('$.QueryExecutionId'),
});
```
### GetQueryResults
The [GetQueryResults](https://docs.aws.amazon.com/athena/latest/APIReference/API_GetQueryResults.html) API streams the results of a single query execution specified by `QueryExecutionId` from S3.
```ts
const getQueryResultsJob = new tasks.AthenaGetQueryResults(this, 'Get Query Results', {
  queryExecutionId: sfn.JsonPath.stringAt('$.QueryExecutionId'),
});
```
### StopQueryExecution
The [StopQueryExecution](https://docs.aws.amazon.com/athena/latest/APIReference/API_StopQueryExecution.html) API stops a query execution.
```ts
const stopQueryExecutionJob = new tasks.AthenaStopQueryExecution(this, 'Stop Query Execution', {
  queryExecutionId: sfn.JsonPath.stringAt('$.QueryExecutionId'),
});
```
## Batch
Step Functions supports [Batch](https://docs.aws.amazon.com/step-functions/latest/dg/connect-batch.html) through the service integration pattern.
### SubmitJob
The [SubmitJob](https://docs.aws.amazon.com/batch/latest/APIReference/API_SubmitJob.html) API submits an AWS Batch job from a job definition.
```ts
import * as batch from 'aws-cdk-lib/aws-batch';
declare const batchJobDefinition: batch.EcsJobDefinition;
declare const batchQueue: batch.JobQueue;

const task = new tasks.BatchSubmitJob(this, 'Submit Job', {
  jobDefinitionArn: batchJobDefinition.jobDefinitionArn,
  jobName: 'MyJob',
  jobQueueArn: batchQueue.jobQueueArn,
});
```
## Bedrock
Step Functions supports [Bedrock](https://docs.aws.amazon.com/step-functions/latest/dg/connect-bedrock.html) through the service integration pattern.
### InvokeModel
The [InvokeModel](https://docs.aws.amazon.com/bedrock/latest/APIReference/API_runtime_InvokeModel.html) API
invokes the specified Bedrock model to run inference using the input provided.
The format of the input body and the response body depend on the model selected.
```ts
import * as bedrock from 'aws-cdk-lib/aws-bedrock';

const model = bedrock.FoundationModel.fromFoundationModelId(
  this,
  'Model',
  bedrock.FoundationModelIdentifier.AMAZON_TITAN_TEXT_G1_EXPRESS_V1,
);

const task = new tasks.BedrockInvokeModel(this, 'Prompt Model', {
  model,
  body: sfn.TaskInput.fromObject({
    inputText: 'Generate a list of five first names.',
    textGenerationConfig: {
      maxTokenCount: 100,
      temperature: 1,
    },
  }),
  resultSelector: {
    names: sfn.JsonPath.stringAt('$.Body.results[0].outputText'),
  },
});
```
### Using Input Path for S3 URI
You can provide an S3 URI as the input to or output location of a model invocation.
To specify the S3 URI as a JSON path to your input or output fields, use the `s3InputUri` and `s3OutputUri` props of `BedrockInvokeModelProps` and set the
feature flag `@aws-cdk/aws-stepfunctions-tasks:useNewS3UriParametersForBedrockInvokeModelTask` to `true`.
If this flag is not enabled, the S3 URI is populated from the `InputPath` and `OutputPath` fields, which is not recommended.
```ts
import * as bedrock from 'aws-cdk-lib/aws-bedrock';

const model = bedrock.FoundationModel.fromFoundationModelId(
  this,
  'Model',
  bedrock.FoundationModelIdentifier.AMAZON_TITAN_TEXT_G1_EXPRESS_V1,
);

const task = new tasks.BedrockInvokeModel(this, 'Prompt Model', {
  model,
  input: { s3InputUri: sfn.JsonPath.stringAt('$.prompt') },
  output: { s3OutputUri: sfn.JsonPath.stringAt('$.prompt') },
});
```
### Using Input Path
Currently, the `inputPath` and `outputPath` provided in `BedrockInvokeModelProps` are rendered as the S3 URI fields in the state machine's task definition.
To modify this behavior, set the feature flag `@aws-cdk/aws-stepfunctions-tasks:useNewS3UriParametersForBedrockInvokeModelTask` to `true`.
If the feature flag is enabled, the S3 URI fields are generated from the `s3InputUri` and `s3OutputUri` props, and the given `inputPath` and `outputPath` are rendered as-is
in the JSON task definition.
If the feature flag is set to `false`, the S3 URI is populated from the `InputPath` and `OutputPath` fields, which is not recommended.
```ts
import * as bedrock from 'aws-cdk-lib/aws-bedrock';

const model = bedrock.FoundationModel.fromFoundationModelId(
  this,
  'Model',
  bedrock.FoundationModelIdentifier.AMAZON_TITAN_TEXT_G1_EXPRESS_V1,
);

const task = new tasks.BedrockInvokeModel(this, 'Prompt Model', {
  model,
  inputPath: sfn.JsonPath.stringAt('$.prompt'),
  outputPath: sfn.JsonPath.stringAt('$.prompt'),
});
```
You can apply a guardrail to the invocation by setting `guardrail`.
```ts
import * as bedrock from 'aws-cdk-lib/aws-bedrock';

const model = bedrock.FoundationModel.fromFoundationModelId(
  this,
  'Model',
  bedrock.FoundationModelIdentifier.AMAZON_TITAN_TEXT_G1_EXPRESS_V1,
);

const task = new tasks.BedrockInvokeModel(this, 'Prompt Model with guardrail', {
  model,
  body: sfn.TaskInput.fromObject({
    inputText: 'Generate a list of five first names.',
    textGenerationConfig: {
      maxTokenCount: 100,
      temperature: 1,
    },
  }),
  guardrail: tasks.Guardrail.enable('guardrailId', 1),
  resultSelector: {
    names: sfn.JsonPath.stringAt('$.Body.results[0].outputText'),
  },
});
```
## CodeBuild
Step Functions supports [CodeBuild](https://docs.aws.amazon.com/step-functions/latest/dg/connect-codebuild.html) through the service integration pattern.
### StartBuild
[StartBuild](https://docs.aws.amazon.com/codebuild/latest/APIReference/API_StartBuild.html) starts a CodeBuild Project by Project Name.
```ts
import * as codebuild from 'aws-cdk-lib/aws-codebuild';

const codebuildProject = new codebuild.Project(this, 'Project', {
  projectName: 'MyTestProject',
  buildSpec: codebuild.BuildSpec.fromObject({
    version: '0.2',
    phases: {
      build: {
        commands: [
          'echo "Hello, CodeBuild!"',
        ],
      },
    },
  }),
});

const task = new tasks.CodeBuildStartBuild(this, 'Task', {
  project: codebuildProject,
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
  environmentVariablesOverride: {
    ZONE: {
      type: codebuild.BuildEnvironmentVariableType.PLAINTEXT,
      value: sfn.JsonPath.stringAt('$.envVariables.zone'),
    },
  },
});
```
### StartBuildBatch
[StartBuildBatch](https://docs.aws.amazon.com/codebuild/latest/APIReference/API_StartBuildBatch.html) starts a batch build for a CodeBuild project by project name.
The batch build feature must be enabled on the CodeBuild project.
```ts
import * as codebuild from 'aws-cdk-lib/aws-codebuild';

const project = new codebuild.Project(this, 'Project', {
  projectName: 'MyTestProject',
  buildSpec: codebuild.BuildSpec.fromObjectToYaml({
    version: 0.2,
    batch: {
      'build-list': [
        {
          identifier: 'id',
          buildspec: 'version: 0.2\nphases:\n  build:\n    commands:\n      - echo "Hello, from small!"',
        },
      ],
    },
  }),
});
project.enableBatchBuilds();

const task = new tasks.CodeBuildStartBuildBatch(this, 'buildBatchTask', {
  project,
  integrationPattern: sfn.IntegrationPattern.REQUEST_RESPONSE,
  environmentVariablesOverride: {
    test: {
      type: codebuild.BuildEnvironmentVariableType.PLAINTEXT,
      value: 'testValue',
    },
  },
});
```
**Note**: `enableBatchBuilds()` will do nothing for imported projects.
If you are using an imported project, you must ensure that the project is already configured for batch builds.
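A minimal sketch of using an imported project, assuming a project (the name `MyBatchProject` here is hypothetical) that was already configured for batch builds outside this stack:

```ts
import * as codebuild from 'aws-cdk-lib/aws-codebuild';

// Imported projects cannot be modified by the CDK, so batch builds
// must already be enabled on the project itself.
const importedProject = codebuild.Project.fromProjectName(this, 'ImportedProject', 'MyBatchProject');

const batchTask = new tasks.CodeBuildStartBuildBatch(this, 'ImportedBuildBatchTask', {
  project: importedProject,
  integrationPattern: sfn.IntegrationPattern.REQUEST_RESPONSE,
});
```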
## DynamoDB
You can call DynamoDB APIs from a `Task` state.
Read more about calling DynamoDB APIs [here](https://docs.aws.amazon.com/step-functions/latest/dg/connect-ddb.html).
### GetItem
The [GetItem](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_GetItem.html) operation returns a set of attributes for the item with the given primary key.
```ts
declare const myTable: dynamodb.Table;

new tasks.DynamoGetItem(this, 'Get Item', {
  key: { messageId: tasks.DynamoAttributeValue.fromString('message-007') },
  table: myTable,
});
```
### PutItem
The [PutItem](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_PutItem.html) operation creates a new item, or replaces an old item with a new item.
```ts
declare const myTable: dynamodb.Table;

new tasks.DynamoPutItem(this, 'PutItem', {
  item: {
    MessageId: tasks.DynamoAttributeValue.fromString('message-007'),
    Text: tasks.DynamoAttributeValue.fromString(sfn.JsonPath.stringAt('$.bar')),
    TotalCount: tasks.DynamoAttributeValue.fromNumber(10),
  },
  table: myTable,
});
```
### DeleteItem
The [DeleteItem](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_DeleteItem.html) operation deletes a single item in a table by primary key.
```ts
declare const myTable: dynamodb.Table;

new tasks.DynamoDeleteItem(this, 'DeleteItem', {
  key: { MessageId: tasks.DynamoAttributeValue.fromString('message-007') },
  table: myTable,
  resultPath: sfn.JsonPath.DISCARD,
});
```
### UpdateItem
The [UpdateItem](https://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_UpdateItem.html) operation edits an existing item's attributes, or adds a new item
to the table if it does not already exist.
```ts
declare const myTable: dynamodb.Table;

new tasks.DynamoUpdateItem(this, 'UpdateItem', {
  key: {
    MessageId: tasks.DynamoAttributeValue.fromString('message-007'),
  },
  table: myTable,
  expressionAttributeValues: {
    ':val': tasks.DynamoAttributeValue.numberFromString(sfn.JsonPath.stringAt('$.Item.TotalCount.N')),
    ':rand': tasks.DynamoAttributeValue.fromNumber(20),
  },
  updateExpression: 'SET TotalCount = :val + :rand',
});
```
## ECS
Step Functions supports [ECS/Fargate](https://docs.aws.amazon.com/step-functions/latest/dg/connect-ecs.html) through the service integration pattern.
### RunTask
[RunTask](https://docs.aws.amazon.com/step-functions/latest/dg/connect-ecs.html) starts a new task using the specified task definition.
#### EC2
The EC2 launch type allows you to run your containerized applications on a cluster
of Amazon EC2 instances that you manage.
When a task that uses the EC2 launch type is launched, Amazon ECS must determine where
to place the task based on the requirements specified in the task definition, such as
CPU and memory. Similarly, when you scale down the task count, Amazon ECS must determine
which tasks to terminate. You can apply task placement strategies and constraints to
customize how Amazon ECS places and terminates tasks. Learn more about [task placement](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/task-placement.html)
The latest ACTIVE revision of the passed task definition is used for running the task.
The following example runs a job from a task definition on EC2:
```ts
const vpc = ec2.Vpc.fromLookup(this, 'Vpc', {
  isDefault: true,
});

const cluster = new ecs.Cluster(this, 'Ec2Cluster', { vpc });
cluster.addCapacity('DefaultAutoScalingGroup', {
  instanceType: new ec2.InstanceType('t2.micro'),
  vpcSubnets: { subnetType: ec2.SubnetType.PUBLIC },
});

const taskDefinition = new ecs.TaskDefinition(this, 'TD', {
  compatibility: ecs.Compatibility.EC2,
});

taskDefinition.addContainer('TheContainer', {
  image: ecs.ContainerImage.fromRegistry('foo/bar'),
  memoryLimitMiB: 256,
});

const runTask = new tasks.EcsRunTask(this, 'Run', {
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
  cluster,
  taskDefinition,
  launchTarget: new tasks.EcsEc2LaunchTarget({
    placementStrategies: [
      ecs.PlacementStrategy.spreadAcrossInstances(),
      ecs.PlacementStrategy.packedByCpu(),
      ecs.PlacementStrategy.randomly(),
    ],
    placementConstraints: [
      ecs.PlacementConstraint.memberOf('blieptuut'),
    ],
  }),
  propagatedTagSource: ecs.PropagatedTagSource.TASK_DEFINITION,
});
```
#### Fargate
AWS Fargate is a serverless compute engine for containers that works with Amazon
Elastic Container Service (ECS). Fargate makes it easy for you to focus on building
your applications. Fargate removes the need to provision and manage servers, lets you
specify and pay for resources per application, and improves security through application
isolation by design. Learn more about [Fargate](https://aws.amazon.com/fargate/)
The Fargate launch type allows you to run your containerized applications without the need
to provision and manage the backend infrastructure. Just register your task definition and
Fargate launches the container for you. The latest ACTIVE revision of the passed
task definition is used for running the task. Learn more about
[Fargate Versioning](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_DescribeTaskDefinition.html)
The following example runs a job from a task definition on Fargate:
```ts
const vpc = ec2.Vpc.fromLookup(this, 'Vpc', {
  isDefault: true,
});

const cluster = new ecs.Cluster(this, 'FargateCluster', { vpc });

const taskDefinition = new ecs.TaskDefinition(this, 'TD', {
  memoryMiB: '512',
  cpu: '256',
  compatibility: ecs.Compatibility.FARGATE,
});

const containerDefinition = taskDefinition.addContainer('TheContainer', {
  image: ecs.ContainerImage.fromRegistry('foo/bar'),
  memoryLimitMiB: 256,
});

const runTask = new tasks.EcsRunTask(this, 'RunFargate', {
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
  cluster,
  taskDefinition,
  assignPublicIp: true,
  containerOverrides: [{
    containerDefinition,
    environment: [{ name: 'SOME_KEY', value: sfn.JsonPath.stringAt('$.SomeKey') }],
  }],
  launchTarget: new tasks.EcsFargateLaunchTarget(),
  propagatedTagSource: ecs.PropagatedTagSource.TASK_DEFINITION,
});
```
#### Override CPU and Memory Parameter
By setting the `cpu` or `memoryMiB` properties, you can override the Fargate or EC2 task instance size at runtime.
See the [TaskOverride API](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_TaskOverride.html) for details.
```ts
const vpc = ec2.Vpc.fromLookup(this, 'Vpc', {
  isDefault: true,
});

const cluster = new ecs.Cluster(this, 'ECSCluster', { vpc });

const taskDefinition = new ecs.TaskDefinition(this, 'TD', {
  compatibility: ecs.Compatibility.FARGATE,
  cpu: '256',
  memoryMiB: '512',
});

taskDefinition.addContainer('TheContainer', {
  image: ecs.ContainerImage.fromRegistry('foo/bar'),
});

const runTask = new tasks.EcsRunTask(this, 'Run', {
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
  cluster,
  taskDefinition,
  launchTarget: new tasks.EcsFargateLaunchTarget(),
  cpu: '1024',
  memoryMiB: '1048',
});
```
#### ECS enable Exec
By setting the property [`enableExecuteCommand`](https://docs.aws.amazon.com/AmazonECS/latest/APIReference/API_RunTask.html#ECS-RunTask-request-enableExecuteCommand) to `true`, you can enable the [ECS Exec feature](https://docs.aws.amazon.com/AmazonECS/latest/developerguide/ecs-exec.html) for the task for either Fargate or EC2 launch types.
```ts
const vpc = ec2.Vpc.fromLookup(this, 'Vpc', {
  isDefault: true,
});

const cluster = new ecs.Cluster(this, 'ECSCluster', { vpc });

const taskDefinition = new ecs.TaskDefinition(this, 'TD', {
  compatibility: ecs.Compatibility.EC2,
});

taskDefinition.addContainer('TheContainer', {
  image: ecs.ContainerImage.fromRegistry('foo/bar'),
  memoryLimitMiB: 256,
});

const runTask = new tasks.EcsRunTask(this, 'Run', {
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
  cluster,
  taskDefinition,
  launchTarget: new tasks.EcsEc2LaunchTarget(),
  enableExecuteCommand: true,
});
```
## EMR
Step Functions supports Amazon EMR through the service integration pattern.
The service integration APIs correspond to Amazon EMR APIs but differ in the
parameters that are used.
[Read more](https://docs.aws.amazon.com/step-functions/latest/dg/connect-emr.html) about the differences when using these service integrations.
### Create Cluster
Creates and starts running a cluster (job flow).
Corresponds to the [`runJobFlow`](https://docs.aws.amazon.com/emr/latest/APIReference/API_RunJobFlow.html) API in EMR.
```ts
const clusterRole = new iam.Role(this, 'ClusterRole', {
  assumedBy: new iam.ServicePrincipal('ec2.amazonaws.com'),
});

const serviceRole = new iam.Role(this, 'ServiceRole', {
  assumedBy: new iam.ServicePrincipal('elasticmapreduce.amazonaws.com'),
});

const autoScalingRole = new iam.Role(this, 'AutoScalingRole', {
  assumedBy: new iam.ServicePrincipal('elasticmapreduce.amazonaws.com'),
});

autoScalingRole.assumeRolePolicy?.addStatements(
  new iam.PolicyStatement({
    effect: iam.Effect.ALLOW,
    principals: [
      new iam.ServicePrincipal('application-autoscaling.amazonaws.com'),
    ],
    actions: [
      'sts:AssumeRole',
    ],
  }),
);

new tasks.EmrCreateCluster(this, 'Create Cluster', {
  instances: {},
  clusterRole,
  name: sfn.TaskInput.fromJsonPathAt('$.ClusterName').value,
  serviceRole,
  autoScalingRole,
});
```
You can use the launch specification for On-Demand and Spot instances in the fleet.
```ts
new tasks.EmrCreateCluster(this, 'OnDemandSpecification', {
  instances: {
    instanceFleets: [{
      instanceFleetType: tasks.EmrCreateCluster.InstanceRoleType.MASTER,
      launchSpecifications: {
        onDemandSpecification: {
          allocationStrategy: tasks.EmrCreateCluster.OnDemandAllocationStrategy.LOWEST_PRICE,
        },
      },
    }],
  },
  name: 'OnDemandCluster',
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
});

new tasks.EmrCreateCluster(this, 'SpotSpecification', {
  instances: {
    instanceFleets: [{
      instanceFleetType: tasks.EmrCreateCluster.InstanceRoleType.MASTER,
      launchSpecifications: {
        spotSpecification: {
          allocationStrategy: tasks.EmrCreateCluster.SpotAllocationStrategy.CAPACITY_OPTIMIZED,
          timeoutAction: tasks.EmrCreateCluster.SpotTimeoutAction.TERMINATE_CLUSTER,
          timeout: Duration.minutes(5),
        },
      },
    }],
  },
  name: 'SpotCluster',
  integrationPattern: sfn.IntegrationPattern.RUN_JOB,
});
```
If you want to run multiple steps in [parallel](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-concurrent-steps.html),
you can specify the `stepConcurrencyLevel` property. The concurrency range is between 1
and 256 inclusive, where the default concurrency of 1 means no step concurrency is allowed.
`stepConcurrencyLevel` requires the EMR release label to be 5.28.0 or above.
```ts
new tasks.EmrCreateCluster(this, 'Create Cluster', {
  instances: {},
  name: sfn.TaskInput.fromJsonPathAt('$.ClusterName').value,
  stepConcurrencyLevel: 10,
});
```
If you want to use an [auto-termination policy](https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-auto-termination-policy.html),
specify the `autoTerminationPolicyIdleTimeout` property. It defines the amount of idle time after which the cluster automatically terminates,
from a minimum of 60 seconds to a maximum of 604,800 seconds (seven days).
```ts
new tasks.EmrCreateCluster(this, 'Create Cluster', {
  instances: {},
  name: 'ClusterName',
  autoTerminationPolicyIdleTimeout: Duration.seconds(100),
});
```
### Termination Protection
Locks a cluster (job flow) so the EC2 instances in the cluster cannot be
terminated by user intervention, an API call, or a job-flow error.
Corresponds to the [`setTerminationProtection`](https://docs.aws.amazon.com/step-functions/latest/dg/connect-emr.html) API in EMR.
```ts
new tasks.EmrSetClusterTerminationProtection(this, 'Task', {
  clusterId: 'ClusterId',
  terminationProtected: false,
});
```
### Terminate Cluster
Shuts down a cluster (job flow).
Corresponds to the [`terminateJobFlows`](https://docs.aws.amazon.com/emr/latest/APIReference/API_TerminateJobFlows.html) API in EMR.
```ts
new tasks.EmrTerminateCluster(this, 'Task', {
  clusterId: 'ClusterId',
});
```
### Add Step
Adds a new step to a running cluster.
Corresponds to the [`addJobFlowSteps`](https://docs.aws.amazon.com/emr/latest/APIReference/API_AddJobFlowSteps.html) API in EMR.
```ts
new tasks.EmrAddStep(this, 'Task', {
  clusterId: 'ClusterId',
  name: 'StepName',
  jar: 'Jar',
  actionOnFailure: tasks.ActionOnFailure.CONTINUE,
});
```
To specify a custom runtime role use the `executionRoleArn` property.
**Note:** The EMR cluster must be created with a security configuration and the runtime role must have a specific trust policy.
See this [blog post](https://aws.amazon.com/blogs/big-data/introducing-runtime-roles-for-amazon-emr-steps-use-iam-roles-and-aws-lake-formation-for-access-control-with-amazon-emr/) for more details.
```ts
import * as emr from 'aws-cdk-lib/aws-emr';

const cfnSecurityConfiguration = new emr.CfnSecurityConfiguration(this, 'EmrSecurityConfiguration', {
  name: 'AddStepRuntimeRoleSecConfig',
  securityConfiguration: JSON.parse(`
    {
      "AuthorizationConfiguration": {
        "IAMConfiguration": {
          "EnableApplicationScopedIAMRole": true,
          "ApplicationScopedIAMRoleConfiguration": {
            "PropagateSourceIdentity": true
          }
        },
        "LakeFormationConfiguration": {
          "AuthorizedSessionTagValue": "Amazon EMR"
        }
      }
    }`),
});

const task = new tasks.EmrCreateCluster(this, 'Create Cluster', {
  instances: {},
  name: sfn.TaskInput.fromJsonPathAt('$.ClusterName').value,
  securityConfiguration: cfnSecurityConfiguration.name,
});

const executionRole = new iam.Role(this, 'Role', {
  assumedBy: new iam.ArnPrincipal(task.clusterRole.roleArn),
});

executionRole.assumeRolePolicy?.addStatements(
  new iam.PolicyStatement({
    effect: iam.Effect.ALLOW,
    principals: [
      task.clusterRole,
    ],
    actions: [
      'sts:SetSourceIdentity',
    ],
  }),
  new iam.PolicyStatement({
    effect: iam.Effect.ALLOW,
    principals: [
      task.clusterRole,
    ],
    actions: [
      'sts:TagSession',
    ],
    conditions: {
      StringEquals: {
        'aws:RequestTag/LakeFormationAuthorizedCaller': 'Amazon EMR',
      },
    },
  }),
);

new tasks.EmrAddStep(this, 'Task', {
  clusterId: 'ClusterId',
  executionRoleArn: executionRole.roleArn,
  name: 'StepName',
  jar: 'Jar',
  actionOnFailure: tasks.ActionOnFailure.CONTINUE,
});
```
### Cancel Step
Cancels a pending step in a running cluster.
Corresponds to the [`cancelSteps`](https://docs.aws.amazon.com/emr/latest/APIReference/API_CancelSteps.html) API in EMR.
```ts
new tasks.EmrCancelStep(this, 'Task', {
  clusterId: 'ClusterId',
  stepId: 'StepId',
});
```
### Modify Instance Fleet
Modifies the target On-Demand and target Spot capacities for the instance
fleet with the specified InstanceFleetName.
Corresponds to the [`modifyInstanceFleet`](https://docs.aws.amazon.com/emr/latest/APIReference/API_ModifyInstanceFleet.html) API in EMR.
```ts
new tasks.EmrModifyInstanceFleetByName(this, 'Task', {
  clusterId: 'ClusterId',
  instanceFleetName: 'InstanceFleetName',
  targetOnDemandCapacity: 2,
  targetSpotCapacity: 0,
});
```
### Modify Instance Group
Modifies the number of nodes and configuration settings of an instance group.
Corresponds to the [`modifyInstanceGroups`](https://docs.aws.amazon.com/emr/latest/APIReference/API_ModifyInstanceGroups.html) API in EMR.
```ts
new tasks.EmrModifyInstanceGroupByName(this, 'Task', {
  clusterId: 'ClusterId',
  instanceGroupName: sfn.JsonPath.stringAt('$.InstanceGroupName'),
  instanceGroup: {
    instanceCount: 1,
  },
});
```
## EMR on EKS
Step Functions supports Amazon EMR on EKS through the service integration pattern.
The service integration APIs correspond to Amazon EMR on EKS APIs, but differ in the parameters that are used.
[Read more](https://docs.aws.amazon.com/step-functions/latest/dg/connect-emr-eks.html) about the differences when using these service integrations.
[Setting up](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up.html) the EKS cluster is required.
### Create Virtual Cluster
The [CreateVirtualCluster](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_CreateVirtualCluster.html) API creates a single virtual cluster that's mapped to a single Kubernetes namespace.
The EKS cluster containing the Kubernetes namespace where the virtual cluster will be mapped can be passed in from the task input.
```ts
new tasks.EmrContainersCreateVirtualCluster(this, 'Create a Virtual Cluster', {
  eksCluster: tasks.EksClusterInput.fromTaskInput(sfn.TaskInput.fromText('clusterId')),
});
```
The EKS cluster can also be passed in directly.
```ts
import * as eks from 'aws-cdk-lib/aws-eks';
declare const eksCluster: eks.Cluster;

new tasks.EmrContainersCreateVirtualCluster(this, 'Create a Virtual Cluster', {
  eksCluster: tasks.EksClusterInput.fromCluster(eksCluster),
});
```
By default, the Kubernetes namespace that a virtual cluster maps to is "default", but a specific namespace within an EKS cluster can be selected.
```ts
new tasks.EmrContainersCreateVirtualCluster(this, 'Create a Virtual Cluster', {
  eksCluster: tasks.EksClusterInput.fromTaskInput(sfn.TaskInput.fromText('clusterId')),
  eksNamespace: 'specified-namespace',
});
```
### Delete Virtual Cluster
The [DeleteVirtualCluster](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_DeleteVirtualCluster.html) API deletes a virtual cluster.
```ts
new tasks.EmrContainersDeleteVirtualCluster(this, 'Delete a Virtual Cluster', {
  virtualClusterId: sfn.TaskInput.fromJsonPathAt('$.virtualCluster'),
});
```
### Start Job Run
The [StartJobRun](https://docs.aws.amazon.com/emr-on-eks/latest/APIReference/API_StartJobRun.html) API starts a job run. A job is a unit of work that you submit to Amazon EMR on EKS for execution. The work performed by the job can be defined by a Spark jar, PySpark script, or SparkSQL query. A job run is an execution of the job on the virtual cluster.
Required setup:
- If not done already, follow the [steps](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up.html) to set up EMR on EKS and [create an EKS cluster](https://docs.aws.amazon.com/cdk/api/latest/docs/aws-eks-readme.html#quick-start).
- Enable [Cluster access](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-cluster-access.html)
- Enable [IAM Role access](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-enable-IAM.html)
The following actions must be performed if the virtual cluster ID is supplied from the task input. Otherwise, if it is supplied statically in the state machine definition, these actions will be done automatically.
- Create an [IAM role](https://docs.aws.amazon.com/cdk/api/latest/docs/@aws-cdk_aws-iam.Role.html)
- Update the [Role Trust Policy](https://docs.aws.amazon.com/emr/latest/EMR-on-EKS-DevelopmentGuide/setting-up-trust-policy.html) of the Job Execution Role.
The job can be configured with spark submit parameters:
```ts
new tasks.EmrContainersStartJobRun(this, 'EMR Containers Start Job Run', {
  virtualCluster: tasks.VirtualClusterInput.fromVirtualClusterId('de92jdei2910fwedz'),
  releaseLabel: tasks.ReleaseLabel.EMR_6_2_0,
  jobDriver: {
    sparkSubmitJobDriver: {
      entryPoint: sfn.TaskInput.fromText('local:///usr/lib/spark/examples/src/main/python/pi.py'),
      sparkSubmitParameters: '--conf spark.executor.instances=2 --conf spark.executor.memory=2G --conf spark.executor.cores=2 --conf spark.driver.cores=1',
    },
  },
});
```
Configuring the job can also be done via application configuration:
```ts
new tasks.EmrContainersStartJobRun(this, 'EMR Containers Start Job Run', {
  virtualCluster: tasks.VirtualClusterInput.fromVirtualClusterId('de92jdei2910fwedz'),
  releaseLabel: tasks.ReleaseLabel.EMR_6_2_0,
  jobName: 'EMR-Containers-Job',
  jobDriver: {
    sparkSubmitJobDriver: {
      entryPoint: sfn.TaskInput.fromText('local:///usr/lib/spark/examples/src/main/python/pi.py'),
    },
  },
  applicationConfig: [{
    classification: tasks.Classification.SPARK_DEFAULTS,
    properties: {
      'spark.executor.instances': '1',
      'spark.executor.memory': '512M',
    },
  }],
});
```
Job monitoring can be enabled by setting `monitoring.logging` to `true`. This automatically generates an S3 bucket and a CloudWatch log group.
```ts
new tasks.EmrContainersStartJobRun(this, 'EMR Containers Start Job Run', {
  virtualCluster: tasks.VirtualClusterInput.fromVirtualClusterId('de92jdei2910fwedz'),
  releaseLabel: tasks.ReleaseLabel.EMR_6_2_0,
  jobDriver: {
    sparkSubmitJobDriver: {
      entryPoint: sfn.TaskInput.fromText('local:///usr/lib/spark/examples/src/main/python/pi.py'),
      sparkSubmitParameters: '--conf spark.executor.instances=2 --conf spark.executor.memory=2G --conf spark.executor.cores=2 --conf spark.driver.cores=1',
    },
  },
  monitoring: {
    logging: true,
  },
});
```
Alternatively, you can provide an existing log group and log bucket for job monitoring.
```ts
import * as logs from 'aws-cdk-lib/aws-logs';

const logGroup = new logs.LogGroup(this, 'Log Group');
const logBucket = new s3.Bucket(this, 'S3 Bucket');

new tasks.EmrContainersStartJobRun(this, 'EMR Containers Start Job Run', {
  virtualCluster: tasks.VirtualClusterInput.fromVirtualClusterId('de92jdei2910fwedz'),
  releaseLabel: tasks.ReleaseLabel.EMR_6_2_0,
  jobDriver: {
    sparkSubmitJobDriver: {
      entryPoint: sfn.TaskInput.fromText('local:///usr/lib/spark/examples/src/main/python/pi.py'),
      sparkSubmitParameters: '--conf spark.executor.instances=2 --conf spark.executor.memory=2G --conf spark.executor.cores=2 --conf spark.driver.cores=1',
    },
  },
  monitoring: {
    logGroup: logGroup,
    logBucket: logBucket,
  },
});
```
Users can provide their own existing Job Execution Role.
```ts
new tasks.EmrContainersStartJobRun(this, 'EMR Containers Start Job Run', {
  virtualCluster: tasks.VirtualClusterInput.fromTaskInput(sfn.TaskInput.fromJsonPathAt('$.VirtualClusterId')),
  releaseLabel: tasks.ReleaseLabel.EMR_6_2_0,
  jobName: 'EMR-Containers-Job',
  executionRole: iam.Role.fromRoleArn(this, 'Job-Execution-Role', 'arn:aws:iam::xxxxxxxxxxxx:role/JobExecutionRole'),
  jobDriver: {
    sparkSubmitJobDriver: {
      entryPoint: sfn.TaskInput.fromText('local:///usr/lib/spark/examples/src/main/python/pi.py'),
      sparkSubmitParameters: '--conf spark.executor.instances=2 --conf spark.executor.memory=2G --conf spark.executor.cores=2 --conf spark.driver.cores=1',
    },
  },
});
```
## EKS
Step Functions supports Amazon EKS through the service integration pattern.
The service integration APIs correspond to Amazon EKS APIs.
[Read more](https://docs.aws.amazon.com/step-functions/latest/dg/connect-eks.html) about the differences when using these service integrations.
### Call
Read and write Kubernetes resource objects via a Kubernetes API endpoint.
Corresponds to the [`call`](https://docs.aws.amazon.com/step-functions/latest/dg/connect-eks.html) API in Step Functions Connector.
The following code snippet includes a Task state that uses eks:call to list the pods.
```ts
import * as eks from 'aws-cdk-lib/aws-eks';
import { KubectlV32Layer } from '@aws-cdk/lambda-layer-kubectl-v32';

const myEksCluster = new eks.Cluster(this, 'my sample cluster', {
  version: eks.KubernetesVersion.V1_32,
  clusterName: 'myEksCluster',
  kubectlLayer: new KubectlV32Layer(this, 'kubectl'),
});

new tasks.EksCall(this, 'Call a EKS Endpoint', {
  cluster: myEksCluster,
  httpMethod: tasks.HttpMethods.GET,
  httpPath: '/api/v1/namespaces/default/pods',
});
```
## EventBridge
Step Functions supports Amazon EventBridge through the service integration pattern.
The service integration APIs correspond to Amazon EventBridge APIs.
[Read more](https://docs.aws.amazon.com/step-functions/latest/dg/connect-eventbridge.html) about the differences when using these service integrations.
### Put Events
Send events to an EventBridge bus.
Corresponds to the [`put-events`](https://docs.aws.amazon.com/step-functions/latest/dg/connect-eventbridge.html) API in Step Functions Connector.
The following code snippet includes a Task state that uses events:putevents to send an event to the default bus.
```ts
import * as events from 'aws-cdk-lib/aws-events';

const myEventBus = new events.EventBus(this, 'EventBus', {
  eventBusName: 'MyEventBus1',
});

new tasks.EventBridgePutEvents(this, 'Send an event to EventBridge', {
  entries: [{
    detail: sfn.TaskInput.fromObject({
      Message: 'Hello from Step Functions!',
    }),
    eventBus: myEventBus,
    detailType: 'MessageFromStepFunctions',
    source: 'step.functions',
  }],
});
```
## EventBridge Scheduler
You can call EventBridge Scheduler APIs from a `Task` state.
Read more about calling Scheduler APIs [here](https://docs.aws.amazon.com/scheduler/latest/APIReference/API_Operations.html).
### Create Scheduler
The [CreateSchedule](https://docs.aws.amazon.com/scheduler/latest/APIReference/API_CreateSchedule.html) API creates a new schedule.
Here is an example of how to create a schedule that sends a message to an SQS queue every 5 minutes:
```ts
import * as scheduler from 'aws-cdk-lib/aws-scheduler';
import * as kms from 'aws-cdk-lib/aws-kms';

declare const key: kms.Key;
declare const scheduleGroup: scheduler.CfnScheduleGroup;
declare const targetQueue: sqs.Queue;
declare const deadLetterQueue: sqs.Queue;

const schedulerRole = new iam.Role(this, 'SchedulerRole', {
  assumedBy: new iam.ServicePrincipal('scheduler.amazonaws.com'),
});

// To send the message to the queue.
// This policy changes depending on the type of target.
schedulerRole.addToPrincipalPolicy(new iam.PolicyStatement({
  actions: ['sqs:SendMessage'],
  resources: [targetQueue.queueArn],
}));

const createScheduleTask1 = new tasks.EventBridgeSchedulerCreateScheduleTask(this, 'createSchedule', {
  scheduleName: 'TestSchedule',
  actionAfterCompletion: tasks.ActionAfterCompletion.NONE,
  clientToken: 'testToken',
  description: 'TestDescription',
  startDate: new Date(),
  endDate: new Date(new Date().getTime() + 1000 * 60 * 60),
  flexibleTimeWindow: Duration.minutes(5),
  groupName: scheduleGroup.ref,
  kmsKey: key,
  schedule: tasks.Schedule.rate(Duration.minutes(5)),
  timezone: 'UTC',
  enabled: true,
  target: new tasks.EventBridgeSchedulerTarget({
    arn: targetQueue.queueArn,
    role: schedulerRole,
    retryPolicy: {
      maximumRetryAttempts: 2,
      maximumEventAge: Duration.minutes(5),
    },
    deadLetterQueue,
  }),
});
```
## Glue
Step Functions supports [AWS Glue](https://docs.aws.amazon.com/step-functions/latest/dg/connect-glue.html) through the service integration pattern.
### StartJobRun
You can call the [`StartJobRun`](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-jobs-runs.html#aws-glue-api-jobs-runs-StartJobRun) API from a `Task` state.
```ts
new tasks.GlueStartJobRun(this, 'Task', {
  glueJobName: 'my-glue-job',
  arguments: sfn.TaskInput.fromObject({
    key: 'value',
  }),
  taskTimeout: sfn.Timeout.duration(Duration.minutes(30)),
  notifyDelayAfter: Duration.minutes(5),
});
```
You can configure workers by setting the `workerTypeV2` and `numberOfWorkers` properties.
`workerType` is deprecated and no longer recommended. Use `workerTypeV2`, an enum-like class
that allows more powerful worker configuration using either pre-defined or dynamic values.
```ts
new tasks.GlueStartJobRun(this, 'Task', {
  glueJobName: 'my-glue-job',
  workerConfiguration: {
    workerTypeV2: tasks.WorkerTypeV2.G_1X, // Worker type
    numberOfWorkers: 2, // Number of Workers
  },
});
```
To configure the worker type or number of workers dynamically from the state machine's input,
use JSON path values with `workerTypeV2`:
```ts
new tasks.GlueStartJobRun(this, 'Glue Job Task', {
  glueJobName: 'my-glue-job',
  workerConfiguration: {
    workerTypeV2: tasks.WorkerTypeV2.of(sfn.JsonPath.stringAt('$.glue_jobs_configs.executor_type')),
    numberOfWorkers: sfn.JsonPath.numberAt('$.glue_jobs_configs.max_number_workers'),
  },
});
```
You can choose the execution class by setting the `executionClass` property.
```ts
new tasks.GlueStartJobRun(this, 'Task', {
  glueJobName: 'my-glue-job',
  executionClass: tasks.ExecutionClass.FLEX,
});
```
### StartCrawlerRun
You can call the [`StartCrawler`](https://docs.aws.amazon.com/glue/latest/dg/aws-glue-api-crawler-crawling.html#aws-glue-api-crawler-crawling-StartCrawler) API from a `Task` state through AWS SDK service integrations.
```ts
import * as glue from 'aws-cdk-lib/aws-glue';
declare const myCrawler: glue.CfnCrawler;

// You can get the crawler name from `crawler.ref`
new tasks.GlueStartCrawlerRun(this, 'Task1', {
  crawlerName: myCrawler.ref,
});

// Of course, you can also specify the crawler name directly.
new tasks.GlueStartCrawlerRun(this, 'Task2', {
  crawlerName: 'my-crawler-job',
});
```
## Glue DataBrew
Step Functions supports [AWS Glue DataBrew](https://docs.aws.amazon.com/step-functions/latest/dg/connect-databrew.html) through the service integration pattern.
### Start Job Run
You can call the [`StartJobRun`](https://docs.aws.amazon.com/databrew/latest/dg/API_StartJobRun.html) API from a `Task` state.
```ts
new tasks.GlueDataBrewStartJobRun(this, 'Task', {
  name: 'databrew-job',
});
```
## Invoke HTTP API
Step Functions supports [calling third-party APIs](https://docs.aws.amazon.com/step-functions/latest/dg/connect-third-party-apis.html) with credentials managed by Amazon EventBridge [Connections](https://docs.aws.amazon.com/eventbridge/latest/APIReference/API_Connection.html).
The following snippet creates a new API destination connection, and uses it to make a POST request to the specified URL. The endpoint response is available at the `$.ResponseBody` path.
```ts
import * as events from 'aws-cdk-lib/aws-events';

const connection = new events.Connection(this, 'Connection', {
  authorization: events.Authorization.basic('username', SecretValue.unsafePlainText('password')),
});

new tasks.HttpInvoke(this, 'Invoke HTTP API', {
  apiRoot: 'https://api.example.com',
  apiEndpoint: sfn.TaskInput.fromText('path/to/resource'),
  body: sfn.TaskInput.fromObject({ foo: 'bar' }),
  connection,
  headers: sfn.TaskInput.fromObject({ 'Content-Type': 'application/json' }),
  method: sfn.TaskInput.fromText('POST'),
  queryStringParameters: sfn.TaskInput.fromObject({ id: '123' }),
  urlEncodingFormat: tasks.URLEncodingFormat.BRACKETS,
});
```