Connect with us


Patterns for multi-account, hub-and-spoke Amazon SageMaker model registry

Data science workflows have to pass multiple stages as they progress from the experimentation to production pipeline. A common approach involves separate accounts dedicated to different phases of the AI/ML workflow (experimentation, development, and production). In addition, issues related to data access control may also mandate that workflows for different AI/ML applications be hosted on…



Data science workflows have to pass multiple stages as they progress from the experimentation to production pipeline. A common approach involves separate accounts dedicated to different phases of the AI/ML workflow (experimentation, development, and production).

In addition, issues related to data access control may also mandate that workflows for different AI/ML applications be hosted on separate, isolated AWS accounts. Managing these stages and multiple accounts is complex and challenging.

When it comes to model deployment, however, it often makes sense to have a central repository of approved models to keep track of what is being used for production-grade inference. The Amazon SageMaker Model Registry is the natural choice for this kind of inference-oriented metadata store. In this post, we showcase how to set up such a centralized repository.


The workflow we address here is the one common to many data science projects. A data scientist in a dedicated data science account experiments on models, creates model artifacts on Amazon Simple Storage Service (Amazon S3), keeps track of the association between model artifacts and Amazon Elastic Container Registry (Amazon ECR) images using SageMaker model packages, and groups model versions into model package groups. The following diagram gives an overview of the structure of the SageMaker Model Registry.

A typical scenario has the following components:

  • One or more spoke environments are used for experimenting and for training ML models
  • Segregation between the spoke environments and a centralized environment is needed
  • We want to promote a machine learning (ML) model from the spokes to the centralized environment by creating a model package (version) in the centralized environment, and optionally moving the generated artifact model.tar.gz to an S3 bucket to serve as a centralized model store
  • Tracking and versioning of promoted ML models is done in the centralized environment from which, for example, deployment can be performed

This post illustrates how to build federated, hub-and-spoke model registries, where multiple spoke accounts use the SageMaker Model Registry from a hub account to register their model package groups and versions.

The following diagram illustrates two possible patterns: a push-based approach and a pull-based approach.

In the push-based approach, a user or role from a spoke account assumes a role in the central account. They then register the model packages or versions directly into the central registry. This is the simplest approach, both to set up and operate. However, you must give the spoke accounts write access (through the assumed role) to the central hub, which in some setups may not be possible or desirable.

In the pull-based approach, the spoke account registers model package groups or versions in the local SageMaker Model Registry. Amazon EventBridge notifies the hub account of the modification, which triggers a process that pulls the modification and replicates it to the hub’s registry. In this setup, spoke accounts don’t have any access to the central registry. Instead, the central account has read access to the spoke registries.

In the following sections, we illustrate example configurations for simple, two-account setups:

  • A data science (DS) account used for performing isolated experimentation using AWS services, such as SageMaker, the SageMaker Model Registry, Amazon S3, and Amazon ECR
  • A hub account used for storing the central model registry, and optionally also ML model binaries and Amazon ECR model images.

In real-life scenarios, multiple DS accounts would be associated to a single hub account.

Strictly connected to the operation of a model registry is the topic of model lineage, which is the possibility to trace a deployed model all the way back to the exact experiment and training job or data that generated it. Amazon SageMaker ML Lineage Tracking creates and stores information about the steps of an ML workflow (from data preparation to model deployment) in the accounts where the different steps are originally run. Exporting this information to different accounts is possible as of this writing using dedicated model metadata. Model metadata can be exchanged through different mechanisms (for example by emitting and forwarding a custom EventBridge event, or by writing to an Amazon DynamoDB table). A detailed description of these processes is beyond the scope of this post.

Access to model artifacts, Amazon ECR, and basic model registry permissions

Full cross-account operation of the model registry requires three main components:

  • Access from the hub account to model artifacts on Amazon S3 and to Amazon ECR images (either in the DS accounts or in a centralized Amazon S3 and Amazon ECR location)
  • Same-account operations on the model registry
  • Cross-account operations on the model registry

We can achieve the first component using resource policies. We provide examples of cross-account read-only policies for Amazon S3 and Amazon ECR in this section. In addition to these settings, the principals in the following policies must act using a role where the corresponding actions are allowed. For example, it’s not enough to have a resource policy that allows the DS account to read a bucket. The account must also do so from a role where Amazon S3 reads are allowed. This basic Amazon S3 and Amazon ECR configuration is not detailed here; links to the relevant documentation are provided at the end of this post.

Careful consideration must also be given to the location where model artifacts and Amazon ECR images are stored. If a central location is desired, it seems like a natural choice to let the hub account also serve as an artifact and image store. In this case, as part of the promotion process, model artifacts and Amazon ECR images must be copied from the DS accounts to the hub account. This is a normal copy operation, and can be done using both push-to-hub and pull-from-DS patterns, which aren’t detailed in this post. However, the attached code for the push-based pattern shows a complete example, including the code to handle the Amazon S3 copy of the artifacts. The example assumes that such a central store exists, that it coincides with the hub account, and that the necessary copy operations are in place.

In this context, versioning (of model images and of model artifacts) is also an important building block. It is required to improve the security profile of the setup and make sure that no accidental overwriting or deletion occurs. In real-life scenarios, the operation of the setups described here is fully automated, and steered by CI/CD pipelines that use unique build-ids to generate unique identifiers for all archived resources (unique object keys for Amazon S3, unique image tags for Amazon ECR). An additional level of robustness can be added by activating versioning on the relevant S3 buckets, as detailed in the resources provided at the end of this post.

Amazon S3 bucket policy

The following resource policy allows the DS account to get objects inside a defined S3 bucket in the hub account. As already mentioned, in this scenario, the hub account also serves as a model store, keeping a copy of the model artifacts. The case where the model store is disjointed from the hub account would have a similar configuration: the relevant bucket must allow read operations from the hub and DS accounts.

{ “Version”:”2012-10-17″, “Statement”:[ { “Sid”:”S3CrossAccountRead”, “Effect”:”Allow”, “Action”:”s3:GetObject”, “Resource”: [ “arn::s3:::{HUB_BUCKET_NAME}/*model.tar.gz” ], “Principal”:{ “AWS”:[ “arn:aws:iam::{DS_ACCOUNT_ID}:role/{DS_ACCOUNT_ROLE}” ] } } ] }

Amazon ECR repository policy

The following resource policy allows the DS account to get images from a defined Amazon ECR repository in the hub account, because in this example the hub account also serves as the central Amazon ECR registry. In case a separate central registry is desired, the configuration is similar: the hub or DS account needs to be given read access to the central registry. Optionally, you can also restrict the access to specific resources, such as enforce a specific pattern for tagging cross-account images.

{ “Version”:”2012-10-17″, “Statement”:[ { “Sid”:”S3CrossAccountRead”, “Effect”:”Allow”, “Action”: [ “ecr:BatchGetImage”, “ecr:GetDownloadUrlForLayer” ] “Principal”:{ “AWS”:[ “arn:aws:iam::{DS_ACCOUNT_ID}:role/{DS_ACCOUNT_ROLE}” ] } } ] }

IAM policy for SageMaker Model Registry

Operations on the model registry within an account are regulated by normal AWS Identity and Access Management (IAM) policies. The following example allows basic actions on the model registry:

{ “Version”: “2012-10-17”, “Statement”: [ { “Action”: [ “sagemaker:CreateModelPackage*”, “sagemaker:DescribeModelPackage”, “sagemaker:DescribeModelPackageGroup”, “sagemaker:ListModelPackages”, “sagemaker:ListModelPackageGroups” ], “Resource”: [ “*” ], “Effect”: “Allow” } ] }

We now detail how to configure cross-account operations on the model registry.

SageMaker Model Registry configuration: Push-based approach

The following diagram shows the architecture of the push-based approach.

In this approach, users in the DS account can read from the hub account, thanks to resource-based policies. However, to gain write access to central registry, the DS account must assume a role in the hub account with the appropriate permissions.

The minimal setup of this architecture requires the following:

  • Read access to the model artifacts on Amazon S3 and to the Amazon ECR images, using resource-based policies, as outlined in the previous section.
  • IAM policies in the hub account allowing it to write the objects into the chosen S3 bucket and create model packages into the SageMaker model package groups.
  • An IAM role in the hub account with the previous policies attached with a cross-account AssumeRole rule. The DS account assumes this role to write the model.tar.gz in the S3 bucket and create a model package. For example, this operation could be carried out by an AWS Lambda function.
  • A second IAM role, in the DS account, that can read the model.tar.gz artifact from the S3 bucket, and assume the role in the hub account mentioned above. This role is used for reads from the registry. For example, this could be used as the run role of a Lambda function.

Create a resource policy for model package groups

The following is an example policy to be attached to model package groups in the hub account. It allows read operations on a package group and on all package versions it contains.

{ ‘Version’: ‘2012-10-17’, ‘Statement’: [ { ‘Sid’: ‘AddPermModelPackageGroup’, ‘Effect’: ‘Allow’, ‘Principal’: { ‘AWS’: [ ‘arn:aws:iam::{DS_ACCOUNT_ID}:role/service-role/{LAMBDA_ROLE}’ ] }, ‘Action’: [ ‘sagemaker:DescribeModelPackageGroup’ ], ‘Resource’: ‘arn:aws:sagemaker:{REGION}:{HUB_ACCOUNT_ID}:model-package-group/{NAME}’ }, { ‘Sid’: ‘AddPermModelPackageVersion’, ‘Effect’: ‘Allow’, ‘Principal’: { ‘AWS’: ‘arn:aws:iam::{DS_ACCOUNT_ID}:role/service-role/{LAMBDA_ROLE}’ }, ‘Action’: [ “sagemaker:DescribeModelPackage”, “sagemaker:ListModelPackages”, ], ‘Resource’: ‘arn:aws:sagemaker:{REGION}:{HUB_ACCOUNT_ID}:model-package/{NAME}/*’ } ] }

You can’t associate this policy with the package group via the AWS Management Console. You need SDK or AWS Command Line Interface (AWS CLI) access. For example, the following code uses Python and Boto3:

sm_client = boto3.client(‘sagemaker’) sm_client.put_model_package_group_policy( ModelPackageGroupName = model_package_group_name, ResourcePolicy = model_pacakge_group_policy)

Cross-account policy for the DS account to the Hub account

This policy allows the users and services in the DS account to assume the relevant role in the hub account. For example, the following policy allows a Lambda execution role in the DS account to assume the role in the hub account:

{ “Version”: “2012-10-17”, “Statement”: [ { “Action”: [ “sts:AssumeRole” ], “Resource”: [ “arn:aws:iam::{HUB_ACCOUNT_ID}:role/SagemakerModelRegistryRole” ], “Effect”: “Allow” } ] }

Example workflow

Now that all permissions are configured, we can illustrate the workflow using a Lambda function that assumes the hub account role previously defined, copies the artifact model.tar.gz created into the hub account S3 bucket, and creates the model package linked to the previously copied artifact.

In the following code snippets, we illustrate how to create a model package in the target account after assuming the relevant role. The complete code needed for operation (including manipulation of Amazon S3 and Amazon ECR assets) is attached to this post.

Copy the artifact

To maintain a centralized approach in the hub account, the first operation described is copying the artifact in the centralized S3 bucket.

The method requires as input the DS source bucket name, the hub target bucket name, and the path to the model.tar.gz. After you copy the artifact into the target bucket, it returns the new Amazon S3 path that is used from the model package. As discussed earlier, you need to run this code from a role that has read (write) access to the source (destination) Amazon S3 location. You set this up, for example, in the execution role of a Lambda function, whose details are beyond the scope of this document. See the following code:

def copy_artifact(ds_bucket_name, hub_bucket_name, model_path): try: s3_client = boto3.client(“s3”) source_response = s3_client.get_object( Bucket=ds_bucket_name, Key=model_path ) # HERE we are assuming the role for copying into the target S3 bucket s3_client = assume_dev_role_s3() s3_client.upload_fileobj( source_response[“Body”], hub_bucket_name, model_path ) new_model_path = “s3://{}/{}”.format(hub_bucket_name, model_path) return new_model_path except Exception as e: stacktrace = traceback.format_exc() LOGGER.error(“{}”.format(stacktrace)) raise e

Create a model package

This method registers the model version in a model package group that you already created in the hub account. The method requires as input a Boto3 SageMaker client instantiated after assuming the role in the hub account, the Amazon ECR image URI to use in the model package, the model URL created after copying the artifact in the target S3 bucket, the model package group name used for creating the new model package version, and the approval status to be assigned to the new version created:

def create_model_package(sm_client, image_uri, model_path, model_package_group_name, approval_status): try: modelpackage_inference_specification = { “InferenceSpecification”: { “Containers”: [ { “Image”: image_uri, “ModelDataUrl”: model_path } ], # use correct types here “SupportedContentTypes”: [“text/csv”], “SupportedResponseMIMETypes”: [“text/csv”], } } create_model_package_input_dict = { “ModelPackageGroupName”: model_package_group_name, “ModelPackageDescription”: f”Model for {model_package_group_name}”, “ModelApprovalStatus”: approval_status } create_model_package_input_dict.update(modelpackage_inference_specification) create_mode_package_response = sm_client.create_model_package( **create_model_package_input_dict) model_package_arn = create_mode_package_response[“ModelPackageArn”] return model_package_arn except Exception as e: stacktrace = traceback.format_exc() LOGGER.error(“{}”.format(stacktrace)) raise e

A Lambda handler orchestrates all the actions needed to operate the central registry. The mandatory parameters in this example are as follows:

  • image_uri – The Amazon ECR image URI used in the model package
  • model_path – The source path of the artifact in the S3 bucket
  • model_package_group_name – The model package group name used for creating the new model package version
  • ds_bucket_name – The name of the source S3 bucket
  • hub_bucket_name – The name of the target S3 bucket
  • approval_status – The status to assign to the model package version

See the following code:

def lambda_handler(event, context): image_uri = event.get(“image_uri”, None) model_path = event.get(“model_path”, None) model_package_group_name = event.get(“model_package_group_name”, None) ds_bucket_name = event.get(“ds_bucket_name”, None) hub_bucket_name = event.get(“hub_bucket_name”, None) approval_status = event.get(“approval_status”, None) # copy the S3 assets from DS to Hub model_path = copy_artifact(ds_bucket_name, hub_bucket_name, model_path) # assume a role in the Hub account, retrieve the sagemaker client sm_client = assume_hub_role_sagemaker() # create the model package in the Hub account model_package_arn = create_model_package(sm_client, image_uri, model_path, model_package_group_name, approval_status) response = { “statusCode”: “200”, “model_arn”: model_package_arn } return response

SageMaker Model Registry configuration: Pull-based approach

The following diagram illustrates the architecture for the pull-based approach.

This approach is better suited for cases where write access to the account hosting the central registry is restricted. The preceding diagram shows a minimal setup, with a hub and just one spoke.

A typical workflow is as follows:

  1. A data scientist is working on a dedicated account. The local model registry is used to keep track of model packages and deployment.
  2. Each time a model package is created, an event “SageMaker Model Package State Change” is emitted.
  3. The EventBridge rule in the DS account forwards the event to the hub account, where it triggers actions. In this example, a Lambda function with cross-account read access to the DS model registry can retrieve the needed information and copy it to the central registry.

The minimal setup of this architecture requires the following:

  • Model package groups in the DS account need to have a resource policy, allowing read access from the Lambda execution role in the hub account.
  • The EventBridge rule in the DS account must be configured to forward relevant events to the hub account.
  • The hub account must allow the DS EventBridge rule to send events over.
  • Access to the S3 bucket storing the model artifacts, as well as to Amazon ECR for model images, must be granted to a role in the hub account. These configurations follow the lines of what we outlined in the first section, and are not further elaborated on here.

If the hub account is also in charge of deployment in addition to simple bookkeeping, read access to the model artifacts on Amazon S3 and to the model images on Amazon ECR must also be set up. This can be done by either archiving resources to the hub account or with read-only cross-account access, as already outlined earlier in this post.

Create a resource policy for model package groups

The following is an example policy to attach to model package groups in the DS account. It allows read operations on a package group and on all package versions it contains:

{ ‘Version’: ‘2012-10-17’, ‘Statement’: [ { ‘Sid’: ‘AddPermModelPackageGroup’, ‘Effect’: ‘Allow’, ‘Principal’: { ‘AWS’: ‘arn:aws:iam::{HUB_ACCOUNT_ID}:role/service-role/{LAMBDA_ROLE}’ }, ‘Action’: [‘sagemaker:DescribeModelPackageGroup’], ‘Resource’: ‘arn:aws:sagemaker:{REGION}:{DS_ACCOUNT_ID}:model-package-group/{NAME}’ }, { ‘Sid’: ‘AddPermModelPackageVersion’, ‘Effect’: ‘Allow’, ‘Principal’: { ‘AWS’: ‘arn:aws:iam::{HUB_ACCOUNT_ID}:role/service-role/{LAMBDA_ROLE}’ }, ‘Action’: [ “sagemaker:DescribeModelPackage”, “sagemaker:ListModelPackages”, ], ‘Resource’: ‘arn:aws:sagemaker:{REGION}:{DS_ACCOUNT_ID}:model-package/{NAME}/*’ } ] }

You can’t associate this policy to the package group via the console. The SDK or AWS CLI is required. For example, the following code uses Python and Boto3:

sm_client = boto3.client(‘sagemaker’) sm_client.put_model_package_group_policy( ModelPackageGroupName = model_package_group_name, ResourcePolicy = model_pacakge_group_policy)

Configure an EventBridge rule in the DS account

In the DS account, you must configure a rule for EventBridge:

  1. On the EventBridge console, choose Rules.
  2. Choose the event bus you want to add the rule to (for example, the default bus).
  3. Choose Create rule.
  4. Select Event Pattern, and navigate your way to through the drop-down menus to choose Predefined pattern, AWS, SageMaker¸ and SageMaker Model Package State Change.

You can refine the event pattern as you like. For example, to forward only events related to approved models within a specific package group, use the following code:

{ “source”: [“aws.sagemaker”], “detail-type”: [“SageMaker Model Package State Change”], “detail”: { “ModelPackageGroupName”: [“ExportPackageGroup”], “ModelApprovalStatus”: [“Approved”], } }

  1. In the Target section, choose Event Bus in another AWS account.
  2. Enter the ARN of the event bus in the hub account that receives the events.
  3. Finish creating the rule.
  4. In the hub account, open the EventBridge console, choose the event bus that receives the events from the DS account, and edit the Permissions field so that it contains the following code:

{ “Version”: “2012-10-17”, “Statement”: [{ “Sid”: “sid1”, “Effect”: “Allow”, “Principal”: { “AWS”: “arn:aws:iam::{DS_ACCOUNT_ID}:root” }, “Action”: “events:*”, “Resource”: “arn:aws:events:{REGION}:{HUB_ACCOUNT_ID}:event-bus/{BUS_NAME}” }] }

Configure an EventBridge rule in the hub account

Now events can flow from the DS account to the hub account. You must configure the hub account to properly handle the events:

  1. On the EventBridge console, choose Rules.
  2. Choose Create rule.
  3. Similarly to the previous section, create a rule for the relevant event type.
  4. Connect it to the appropriate target—in this case, a Lambda function.

In the following example code, we process the event, extract the model package ARN, and retrieve its details. The event from EventBridge already contains all the information from the model package in the DS account. In principle, the resource policy for the model package group isn’t even needed when the copy operation is triggered by EventBridge.

import boto3 sm_client = boto3.client(‘sagemaker’) # this is meant to be triggered by events in the bus def lambda_handler(event, context): # users need to implement the function get_model_details # to extract info from the event received from EventBridge model_arn, model_spec, model_desc = get_model_details(event) target_group_name = ‘targetGroupName’ # copy the model package to the hub registry create_model_package_args = { ‘InferenceSpecification’: model_spec, ‘ModelApprovalStatus’: ‘PendingManualApproval’, ‘ModelPackageDescription’: model_desc, ‘ModelPackageGroupName’: target_group_name} return sm_client.create_model_package(**create_model_package_args)


SageMaker model registries are a native AWS tool to track model versions and lineage. The implementation overhead is minimal, in particular when compared with a fully custom metadata store, and they integrate with the rest of the tools within SageMaker. As we demonstrated in this post, even in complex multi-account setups with strict segregation between accounts, model registries are a viable solution to track operations of AI/ML workflows.


To learn more, refer to the following resources:

About the Authors

Andrea Di Simone is a Data Scientist in the Professional Services team based in Munich, Germany. He helps customers to develop their AI/ML products and workflows, leveraging AWS tools. He enjoys reading, classical music and hiking.



Bruno Pistone is a Machine Learning Engineer for AWS based in Milan. He works with enterprise customers on helping them to productionize Machine Learning solutions and to follow best practices using AWS AI/ML services. His field of expertise are Machine Learning Industrialization and MLOps. He enjoys spending time with his friends and exploring new places around Milan, as well as traveling to new destinations.


Matteo Calabrese is a Data and ML engineer in the Professional Services team based in Milan (Italy).
He works with large enterprises on AI/ML projects, helping them in proposition, deliver, scale, and optimize ML solutions . His goal is shorten their time to value and accelerate business outcomes by providing AWS best practices. In his spare time, he enjoys hiking and traveling.



Giuseppe Angelo Porcelli is a Principal Machine Learning Specialist Solutions Architect for Amazon Web Services. With several years software engineering an ML background, he works with customers of any size to deeply understand their business and technical needs and design AI and Machine Learning solutions that make the best use of the AWS Cloud and the Amazon Machine Learning stack. He has worked on projects in different domains, including MLOps, Computer Vision, NLP, and involving a broad set of AWS services. In his free time, Giuseppe enjoys playing football.




Continue Reading
Click to comment

Leave a Reply

Your email address will not be published.


Enhance the caller experience with hints in Amazon Lex

We understand speech input better if we have some background on the topic of conversation. Consider a customer service agent at an auto parts wholesaler helping with orders. If the agent knows that the customer is looking for tires, they’re more likely to recognize responses (for example, “Michelin”) on the phone. Agents often pick up…




We understand speech input better if we have some background on the topic of conversation. Consider a customer service agent at an auto parts wholesaler helping with orders. If the agent knows that the customer is looking for tires, they’re more likely to recognize responses (for example, “Michelin”) on the phone. Agents often pick up such clues or hints based on their domain knowledge and access to business intelligence dashboards. Amazon Lex now supports a hints capability to enhance the recognition of relevant phrases in a conversation. You can programmatically provide phrases as hints during a live interaction to influence the transcription of spoken input. Better recognition drives efficient conversations, reduces agent handling time, and ultimately increases customer satisfaction.

In this post, we review the runtime hints capability and use it to implement verification of callers based on their mother’s maiden name.

Overview of the runtime hints capability

You can provide a list of phrases or words to help your bot with the transcription of speech input. You can use these hints with built-in slot types such as first and last names, street names, city, state, and country. You can also configure these for your custom slot types.

You can use the capability to transcribe names that may be difficult to pronounce or understand. For example, in the following sample conversation, we use it to transcribe the name “Loreck.”

Conversation 1

IVR: Welcome to ACME bank. How can I help you today?

Caller: I want to check my account balance.

IVR: Sure. Which account should I pull up?

Caller: Checking

IVR: What is the account number?

Caller: 1111 2222 3333 4444

IVR: For verification purposes, what is your mother’s maiden name?

Caller: Loreck

IVR: Thank you. The balance on your checking account is 123 dollars.

Words provided as hints are preferred over other similar words. For example, in the second sample conversation, the runtime hint (“Smythe”) is selected over a more common transcription (“Smith”).

Conversation 2

IVR: Welcome to ACME bank. How can I help you today?

Caller: I want to check my account balance.

IVR: Sure. Which account should I pull up?

Caller: Checking

IVR: What is the account number?

Caller: 5555 6666 7777 8888

IVR: For verification purposes, what is your mother’s maiden name?

Caller: Smythe

IVR: Thank you. The balance on your checking account is 456 dollars.

If the name doesn’t match the runtime hint, you can fail the verification and route the call to an agent.

Conversation 3

IVR: Welcome to ACME bank. How can I help you today?

Caller: I want to check my account balance.

IVR: Sure. Which account should I pull up?

Caller: Savings

IVR: What is the account number?

Caller: 5555 6666 7777 8888

IVR: For verification purposes, what is your mother’s maiden name?

Caller: Jane

IVR: There is an issue with your account. For support, you will be forwarded to an agent.

Solution overview

Let’s review the overall architecture for the solution (see the following diagram):

  • We use an Amazon Lex bot integrated with an Amazon Connect contact flow to deliver the conversational experience.
  • We use a dialog codehook in the Amazon Lex bot to invoke an AWS Lambda function that provides the runtime hint at the previous turn of the conversation.
  • For the purposes of this post, the mother’s maiden name data used for authentication is stored in an Amazon DynamoDB table.
  • After the caller is authenticated, the control is passed to the bot to perform transactions (for example, check balance)

In addition to the Lambda function, you can also send runtime hints to Amazon Lex V2 using the PutSession, RecognizeText, RecognizeUtterance, or StartConversation operations. The runtime hints can be set at any point in the conversation and are persisted at every turn until cleared.

Deploy the sample Amazon Lex bot

To create the sample bot and configure the runtime phrase hints, perform the following steps. This creates an Amazon Lex bot called BankingBot, and one slot type (accountNumber).

  1. Download the Amazon Lex bot.
  2. On the Amazon Lex console, choose Actions, Import.
  3. Choose the file that you downloaded, and choose Import.
  4. Choose the bot BankingBot on the Amazon Lex console.
  5. Choose the language English (GB).
  6. Choose Build.
  7. Download the supporting Lambda code.
  8. On the Lambda console, create a new function and select Author from scratch.
  9. For Function name, enter BankingBotEnglish.
  10. For Runtime, choose Python 3.8.
  11. Choose Create function.
  12. In the Code source section, open and delete the existing code.
  13. Download the function code and open it in a text editor.
  14. Copy the code and enter it into the empty function code field.
  15. Choose deploy.
  16. On the Amazon Lex console, select the bot BankingBot.
  17. Choose Deployment and then Aliases, then choose the alias TestBotAlias.
  18. On the Aliases page, choose Languages and choose English (GB).
  19. For Source, select the bot BankingBotEnglish.
  20. For Lambda version or alias, enter $LATEST.
  21. On the DynamoDB console, choose Create table.
  22. Provide the name as customerDatabase.
  23. Provide the partition key as accountNumber.
  24. Add an item with accountNumber: “1111222233334444” and mothersMaidenName “Loreck”.
  25. Add item with accountNumber: “5555666677778888” and mothersMaidenName “Smythe”.
  26. Make sure the Lambda function has permissions to read from the DynamoDB table customerDatabase.
  27. On the Amazon Connect console, choose Contact flows.
  28. In the Amazon Lex section, select your Amazon Lex bot and make it available for use in the Amazon Connect contact flow.
  29. Download the contact flow to integrate with the Amazon Lex bot.
  30. Choose the contact flow to load it into the application.
  31. Make sure the right bot is configured in the “Get Customer Input” block.
  32. Choose a queue in the “Set working queue” block.
  33. Add a phone number to the contact flow.
  34. Test the IVR flow by calling in to the phone number.

Test the solution

You can now call in to the Amazon Connect phone number and interact with the bot.


Runtime hints allow you to influence the transcription of words or phrases dynamically in the conversation. You can use business logic to identify the hints as the conversation evolves. Better recognition of the user input allows you to deliver an enhanced experience. You can configure runtime hints via the Lex V2 SDK. The capability is available in all AWS Regions where Amazon Lex operates in the English (Australia), English (UK), and English (US) locales.

To learn more, refer to runtime hints.

About the Authors

Kai Loreck is a professional services Amazon Connect consultant. He works on designing and implementing scalable customer experience solutions. In his spare time, he can be found playing sports, snowboarding, or hiking in the mountains.

Anubhav Mishra is a Product Manager with AWS. He spends his time understanding customers and designing product experiences to address their business challenges.

Sravan Bodapati is an Applied Science Manager at AWS Lex. He focuses on building cutting edge Artificial Intelligence and Machine Learning solutions for AWS customers in ASR and NLP space. In his spare time, he enjoys hiking, learning economics, watching TV shows and spending time with his family.


Continue Reading


The Intel®3D Athlete Tracking (3DAT) scalable architecture deploys pose estimation models using Amazon Kinesis Data Streams and Amazon EKS

This blog post is co-written by Jonathan Lee, Nelson Leung, Paul Min, and Troy Squillaci from Intel.  In Part 1 of this post, we discussed how Intel®3DAT collaborated with AWS Machine Learning Professional Services (MLPS) to build a scalable AI SaaS application. 3DAT uses computer vision and AI to recognize, track, and analyze over 1,000…




This blog post is co-written by Jonathan Lee, Nelson Leung, Paul Min, and Troy Squillaci from Intel. 

In Part 1 of this post, we discussed how Intel®3DAT collaborated with AWS Machine Learning Professional Services (MLPS) to build a scalable AI SaaS application. 3DAT uses computer vision and AI to recognize, track, and analyze over 1,000 biomechanics data points from standard video. It allows customers to create rich and powerful biomechanics-driven products, such as web and mobile applications with detailed performance data and three-dimensional visualizations.

In Part 2 of this post, we dive deeper into each stage of the architecture. We explore the AWS services used to meet the 3DAT design requirements, including Amazon Kinesis Data Streams and Amazon Elastic Kubernetes Service (Amazon EKS), in order to scalably deploy the necessary pose estimation models for this software as a service (SaaS) application.

Architecture overview

The primary goal of the MLPS team was to productionalize the 2D and 3D pose estimation model pipelines and create a functional and scalable application. The following diagram illustrates the solution architecture.

The complete architecture is broken down into five major components:

  • User application interface layers
  • Database
  • Workflow orchestration
  • Scalable pose estimation inference generation
  • Operational monitoring

Let’s go into detail on each component, their interactions, and the rationale behind the design choices.

User application interface layers

The following diagram shows the application interface layers that provide user access and control of the application and its resources.

These access points support different use cases based on different customer personas. For example, an application user can submit a job via the CLI, whereas a developer can build an application using the Python SDK and embed pose estimation intelligence into their applications. The CLI and SDK are built as modular components—both layers are wrappers of the API layer, which is built using Amazon API Gateway to resolve the API calls and associated AWS Lambda functions, which take care of the backend logic associated with each API call. These layers were a crucial component for the Intel OTG team because it opens up a broad base of customers that can effectively use this SaaS application.

API layer

The solution has a core set of nine APIs, which correspond to the types of objects that operate on this platform. Each API has a Python file that defines the API actions that can be run. The creation of new objects is automatically assigned an object ID sequentially. The attributes of these objects are stored and tracked in the Amazon Aurora Serverless database using this ID. Therefore, the API actions tie back to functions that are defined in a central file that contains the backend logic for querying the Aurora database. This backend logic uses the Boto3 Amazon RDS DataService client to access the database cluster.

The one exception is the /job API, which has a create_job method that handles video submission for creating a new processing job. This method starts the AWS Step Functions workflow logic for running the job. By passing in a job_id, this method uses the Boto3 Step Functions client to call the start_execution method for a specified stateMachineARN (Amazon Resource Name).

The eight object APIs have the methods and similar access pattern as summarized in the following table.

Method Type Function Name Description
GET list_[object_name]s Selects all objects of this type from the database and displays.
POST create_[object] Inserts a new object record with required inputs into the database.
GET get_[object] Selects object attributes based on the object ID from the database and displays.
PUT update_[object] Updates an existing object record with the required inputs.
DELETE delete_[object] Deletes an existing object record from the database based on object ID.

The details of the nine APIs are as follows:

  1. /user – A user is the identity of someone authorized to submit jobs to this application. The creation of a user requires a user name, user email, and group ID that the user belongs to.
  2. /user_group – A user group is a collection of users. Every user group is mapped to one project and one pipeline parameter set. To have different tiers (in terms of infrastructural resources and pipeline parameters), users are divided into user groups. Each user can belong to only one user group. The creation of a user group requires a project ID, pipeline parameter set ID, user group name, and user group description. Note that user groups are different from user roles defined in the AWS account. The latter is used to provide different level of access based on their access roles (for example admin).
  3. /project – A project is used to group different sets of infrastructural resources together. A project is associated with a single project_cluster_url (Aurora cluster) for recording users, jobs, and other metadata, a project_queue_arn (Kinesis Data Streams ARN), and a compute runtime environment (currently controlled via Cortex) used for running inference on the frame batches or postprocessing on the videos. Each user group is associated to one project, and this mechanism is how different tiers are enabled in terms of latency and compute power for different groups of users. The creation of a project requires a project name, project cluster URL, and project queue ARN.
  4. /pipeline – A pipeline is associated with a single configuration for a sequence of processing containers that perform video processing in the Amazon EKS inference generation cluster coordinated by Cortex (see the section on video processing inference generation for more details). Typically, this consists of three containers: preprocessing and decoding, object detection, and pose estimation. For example, the decode and object detection step are the same for the 2D and 3D pipelines, but swapping out the last container using either HRNet or 3DMPPE results in the parameter set for 2D vs. 3D processing pipelines. You can create new configurations to define possible pipelines that can be used for processing, and it requires as input a new Python file in the Cortex repo that details the sequence of model endpoints call that define that pipeline (see the section on video processing inference generation). The pipeline endpoint is the Cortex endpoint that is called to process a single frame. The creation of a pipeline requires a pipeline name, pipeline description, and pipeline endpoint.
  5. /pipeline_parameter_set – A pipeline parameter set is a flexible JSON collection of multiple parameters (a pipeline configuration runtime) for a particular pipeline, and is added to provide flexibility for future customization when multiple pipeline configuration runtimes are required. User groups can be associated with a particular pipeline parameter set, and its purpose is to have different groups of parameters per user group and per pipeline. This was an important forward-looking addition for Intel OTG to build in customization that supports portability as different customers, particularly ISVs, start using the application.
  6. /pipeline_parameters – A single collection of pipeline parameters is an instantiation of a pipeline parameter set. This makes it a 1:many mapping of a pipeline parameter set to pipeline parameters. This API requires a pipeline ID to associate with the set of pipeline parameters that enables the creation of a pipeline for a 1:1 mapping of pipeline parameters to the pipeline. The other inputs required by this API are a pipeline parameter set ID, pipeline parameters value, and pipeline parameters name.
  7. /video – A video object is used to define individual videos that make up a .zip package submitted during a job. This file is broken up into multiple videos after submission. A video is related to the job_id for the job where the .zip package is submitted, and Amazon Simple Storage Service (Amazon S3) paths for the location of the raw separated videos and the postprocessing results of each video. The video object also contains a video progress percentage, which is consistently updated during processing of individual frame batches of that video, as well as a video status flag for complete or not complete. The creation of a video requires a job ID, video path, video results path, video progress percentage, and video status.
  8. /frame_batch – A frame_batch object is a mini-batch of frames created by sampling a single video. Separating a video into regular-sized frame batches provides a lever to tune latency and increases parallelization and throughput. This is the granular unit that is run through Kinesis Data Streams for inference. A creation of a frame batch requires a video ID, frame batch start number, frame batch end number, frame batch input path, frame batch results path, and frame batch status.
  9. /job – This interaction API is used for file submission to start a processing job. This API has a different function from other object APIs because it’s the direct path to interact with the video processing backend Step Functions workflow coordination and Amazon EKS cluster. This API requires a user ID, project ID, pipeline ID, pipeline parameter set ID, job parameters, and job status. In the job parameters, an input file path is specified, which is the location in Amazon S3 where the .zip package of videos to be processed is located. File upload is handled with the upload_handler method, which generates a presigned S3 URL for the user to place a file. A WORKFLOW_STATEMACHINE_ARN is an environment variable that is passed to the create_job API to specify where a video .zip package with an input file path is submitted to start a job.

The following table summarizes the API’s functions.

Method Type Function Description
GET list_jobs Selects all jobs from the database and displays.
POST create_ job Inserts a new job record with user ID, project ID, pipeline ID, pipeline parameter set ID, job results path, job parameters, and job status.
GET get_ job Selects job attributes based on job ID from the database and displays.
GET upload_handler Generates a presigned S3 URL as the location for the .zip file upload. Requires an S3 bucket name and expects an application/zip file type.

Python SDK layer

Building upon the APIs, the team created a Python SDK client library as a wrapper to make it easier for developers to access the API methods. They used the open-source Poetry, which handles Python packaging and dependency management. They created a file that contains functions wrapping each of the APIs using the Python requests library to handle API requests and exceptions.

For developers to launch the Intel 3DAT SDK, they need to install and build the Poetry package. Then, they can add a simple Python import of intel_3dat_sdk to any Python code.

To use the client, you can create an instance of the client, specifying the API endpoint:

client = intel_3dat_sdk.client(endpoint_url=”API Endpoint Here”)

You can then use the client to call the individual methods such as the create_pipeline method (see the following code), taking in the proper arguments such as pipeline name and pipeline description.

client.create_pipeline({ “pipeline_name”: “pipeline_1”, “pipeline_description”: “first instance of a pipeline” })

CLI layer

Similarly, the team built on the APIs to create a command line interface for users who want to access the API methods with a straightforward interface without needing to write Python code. They used the open-source Python package Click (Command Line Interface Creation Kit). The benefits of this framework are the arbitrary nesting of commands, automatic help page generation, and support of lazy loading of subcommands at runtime. In the same file as in the SDK, each SDK client method was wrapped using Click and the required method arguments were converted to command line flags. The flag inputs are then used when calling the SDK command.

To launch the CLI, you can use the CLI configure command. You’re prompted for the endpoint URL:

$ intel-cli configure Endpoint url:

Now you can use the CLI to call different commands related to the API methods, for example:

$ intel-cli list-users [ { “user_id”: 1, “group_id”: 1, “user_name”: “Intel User”, “user_email”: } ]


As a database, this application uses Aurora Serverless to store metadata associated with each of the APIs with MYSQL as the database engine. Choosing the Aurora Serverless database service adheres to the design principle to minimize infrastructural overhead by utilizing serverless AWS services when possible. The following diagram illustrates this architecture.

The serverless engine mode meets the intermittent usage pattern because this application scales up to new customers and workloads are still uncertain. When launching a database endpoint, a specific DB instance size isn’t required, only a minimum and maximum range for cluster capacity. Aurora Serverless handles the appropriate provisioning of a router fleet and distributes the workload amongst the resources. Aurora Serverless automatically performs backup retention for a minimum of 1 day up to 35 days. The team optimized for safety by setting the default to the maximum value of 35.

In addition, the team used the Data API to handle access to the Aurora Serverless cluster, which doesn’t require a persistent connection, and instead provides a secure HTTP endpoint and integration with AWS SDKs. This feature uses AWS Secrets Manager to store the database credentials so there is no need to explicitly pass credentials. CREATE TABLE scripts were written in .sql files for each of the nine tables that correspond to the nine APIs. Because this database contained all the metadata and state of objects in the system, the API methods were run using the appropriate SQL commands (for example select * from Job for the list_jobs API) and passed to the execute_statement method from the Amazon RDS client in the Data API.

Workflow orchestration

The functional backbone of the application was handled using Step Functions to coordinate the workflow, as shown in the following diagram.

The state machine consisted of a sequence of four Lambda functions, which starts when a job is submitted using the create_job method from the job API. The user ID, project ID, pipeline ID, pipeline parameter set ID, job results path, job parameters, and job status are required for job creation. You can first upload a .zip package of video files using the upload_handler method from the job API to generate a presigned S3 URL. During job submission, the input file path is passed via the job parameters, to specify the location of the file. This starts the run of the workflow state machine, triggering four main steps:

  • Initializer Lambda function
  • Submitter Lambda function
  • Completion Check Lambda function
  • Collector Lambda function

Initializer Lambda function

The main function of the Initializer is to separate the .zip package into individual video files and prepare them for the Submitter. First, the .zip file is downloaded, and then each individual file, including video files, is unzipped and extracted. The videos, preferably in .mp4 format, are uploaded back into an S3 bucket. Using the create_video method in the video API, a video object is created with the video path as input. This inserts data on each video into the Aurora database. Any other file types, such as JSON files, are considered metadata and similarly uploaded, but no video object is created. A list of the names of files and video files extracted is passed to the next step.

Submitter Lambda function

The Submitter function takes the video files that were extracted by the Initializer and creates mini-batches of video frames as images. Most current computer vision models in production have been trained on images so even when video is processed, they’re first separated into image frames before model inference. This current solution using a state-of-the-art pose estimation model is no different—the frame batches from the Submitter are passed to Kinesis Data Streams to initiate the inference generation step.

First, the video file is downloaded by the Lambda function. The frame rate and number of frames is calculated using the FileVideoStream module from the processing library. The frames are extracted and grouped according to a specified mini-batch size, which is one of the key tunable parameters of this pipeline. Using the Python pickle library, the data is serialized and uploaded to Amazon S3. Subsequently, a frame batch object is created and the metadata entry is created in the Aurora database. This Lambda function was built using a Dockerfile with dependencies on opencv-python, numpy, and imutils libraries.

Completion Check Lambda function

The Completion Check function continues to query the Aurora database to see, for each video in the .zip package for this current job, how many frame batches are in the COMPLETED status. When all frame batches for all videos are complete, then this check process is complete.

Collector Lambda function

The Collector function takes the outputs of the inferences that were performed on each frame during the Consumer stage and combines them across a frame batch and across a video. The combined merged data is then upload to an S3 bucket. The function then invokes the Cortex postprocessing API for a particular ML pipeline in order to perform any postprocessing computations, and adds the aggregated results by video to the output bucket. Many of these metrics are calculated across frames, such as speed, acceleration, and joint angle, so this calculation needs to be performed on the aggregated data. The main outputs include body key points data (aggregated into CSV format), BMA calculations (such as acceleration), and visual overlay of key points added to each frame in an image file.

Scalable pose estimation inference generation

The processing engine that powers the scaling of ML inference happens in this stage. It involves three main pieces, each with have their own concurrency levers that can be tuned for latency tradeoffs (see the following diagram).

This architecture allows for experimentation in testing latency gains, as well as flexibility for the future when workloads may change with different mixes of end-user segments that access the application.

Kinesis Data Streams

The team chose Kinesis Data Streams because it’s typically used to handle streaming data, and in this case is a good fit because it can handle frame batches in a similar way to provide scalability and parallelization. In the Submitter Lambda function, the Kinesis Boto3 client is used, with the put_record method passing in the metadata associated with a single frame batch, such as the frame batch ID, frame batch starting frame, frame batch ending frame, image shape, frame rate, and video ID.

We defined various job queue and Kinesis data stream configurations to set levels of throughput that tie back to the priority level of different user groups. Access to different levels of processing power is linked by passing a project queue ARN when creating a new project using the project API. Every user group is then linked to a particular project during user group creation. Three default stream configurations are defined in the AWS Serverless Application Model (AWS SAM) infrastructure template:

  • Standard – JobStreamShardCount
  • Priority – PriorityJobStreamShardCount
  • High priority – HighPriorityJobStreamShardCount

The team used a few different levers to differentiate the processing power of each stream or tune the latency of the system, as summarized in the following table.

Lever Description Default value
Shard A shard is native to Kinesis Data Streams; it’s the base unit of throughput for ingestion. The default is 1MB/sec, which equates to 1,000 data records per second. 2
KinesisBatchSize The maximum number of records that Kinesis Data Streams retrieves in a single batch before invoking the consumer Lambda function. 1
KinesisParallelizationFactor The number of batches to process from each shard concurrently. 1
Enhanced fan-out Data consumers who have enhanced fan-out activated have a dedicated ingestion throughput per consumer (such as the default 1MB/sec) instead of sharing throughput across consumers. Off

Consumer Lambda function

From the perspective of Kinesis Data Streams, a data consumer is an AWS service that retrieves data from a data stream shard as data is generated in a stream. This application uses a Consumer Lambda function, which is invoked when messages are passed from the data stream queues. Each Consumer function processes one frame batch by performing the following steps. First, a call is made to the Cortex processor API synchronously, which is the endpoint that hosts the model inference pipeline (see the next section regarding Amazon EKS with Cortex for more details). The results are stored in Amazon S3, and an update is made to the database by changing the status of the processed frame batch to Complete. Error handling is built in to manage the Cortex API call with a retry loop to handle any 504 errors from the Cortex cluster, with number of retries set to 5.

Amazon EKS with Cortex for ML inference

The team used Amazon EKS, a managed Kubernetes service in AWS, as the compute engine for ML inference. A design choice was made to use Amazon EKS to host ML endpoints, giving the flexibility of running upstream Kubernetes with the option of clusters both fully managed in AWS via AWS Fargate, or on-premises hardware via Amazon EKS Anywhere. This was a critical piece of functionality desired by Intel OTG, which provided the option to hook up this application to specialized on-premises hardware, for example.

The three ML models that were the building blocks for constructing the inference pipelines were a custom Yolo model (for object detection), a custom HRNet model (for 2D pose estimation), and a 3DMPPE model (for 3D pose estimation) (see the previous ML section for more details). They used the open-source Cortex library to handle deployment and management of ML inference pipeline endpoints, and launching and deployment of the Amazon EKS clusters. Each of these models were packaged up into Docker containers—model files were stored in Amazon S3 and model images were stored in Amazon Elastic Container Registry (Amazon ECR)—and deployed as Cortex Realtime APIs. Versions of the model containers that run on CPU and GPU were created to provide flexibility for the type of compute hardware. In the future, if additional models or model pipelines need to be added, they can simply create additional Cortex Realtime APIs.

They then constructed inference pipelines by composing together the Cortex Realtime model APIs into Cortex Realtime pipeline APIs. A single Realtime pipeline API consisted of calling a sequence of Realtime model APIs. The Consumer Lambda functions treated a pipeline API as a black box, using a single API call to retrieve the final inference output for an image. Two pipelines were created: a 2D pipeline and a 3D pipeline.

The 2D pipeline combines a decoding preprocessing step, object detection using a custom Yolo model to locate the athlete and produce bounding boxes, and finally a custom HRNet model for creating 2D key points for pose estimation.

The 3D pipeline combines a decoding preprocessing step, object detection using a custom Yolo model to locate the athlete and produce bounding boxes, and finally a 3DMPPE model for creating 3D key points for pose estimation.

After generating inferences on a batch of frames, each pipeline also includes a separate postprocessing Realtime Cortex endpoint that generates three main outputs:

  • Aggregated body key points data into a single CSV file
  • BMA calculations (such as acceleration)
  • Visual overlay of key points added to each frame in an image file

The Collector Lambda function submits the appropriate metadata associated with a particular video, such as the frame IDs and S3 locations of the pose estimation inference outputs, to the endpoint to generate these postprocessing outputs.

Cortex is designed to be integrated with Amazon EKS, and only requires a cluster configuration file and a simple command to launch a Kubernetes cluster:

cortex cluster up –config cluster_config_file.yaml

Another lever for performance tuning was the instance configuration for the compute clusters. Three tiers were created with varying mixes of M5 and G4dn instances, codified as .yaml files with specifications such as cluster name, Region, and instance configuration and mix. M5 instances are lower-cost CPU-based and G4dn are higher cost GPU-based to provide some cost/performance tradeoffs.

Operational monitoring

To maintain operational logging standards, all Lambda functions include code to record and ingest logs via Amazon Kinesis Data Firehose. For example, every frame batch processed from the Submitter Lambda function is logged with the timestamp, name of action, and Lambda function response JSON and saved to Amazon S3. The following diagram illustrates this step in the architecture.


Deployment is handled using AWS SAM, an open-source framework for building serverless applications in AWS. AWS SAM enables infrastructure design, including functions, APIs, databases, and event source mappings to be codified and easily deployed in new AWS environments. During deployment, the AWS SAM syntax is translated into AWS CloudFormation to handle the infrastructure provisioning.

A template.yaml file contains the infrastructure specifications along with tunable parameters, such as Kinesis Data Streams latency levers detailed in the preceding sections. A samconfig.toml file contains deployment parameters such as stack name, S3 bucket name where application files like Lambda function code is stored, and resource tags for tracking cost. A shell script with the simple commands is all that is required to build and deploy the entire template:

sam build -t template.yaml sam deploy –config-env “env_name” –profile “profile_name”

User work flow

To sum up, after the infrastructure has been deployed, you can follow this workflow to get started:

  1. Create an Intel 3DAT client using the client library.
  2. Use the API to create a new instance of a pipeline corresponding to the type of processing that is required, such as one for 3D pose estimation.
  3. Create a new instance of a project, passing in the cluster ARN and Kinesis queue ARN.
  4. Create a new instance of a pipeline parameter set.
  5. Create a new instance of pipeline parameters that map to the pipeline parameter set.
  6. Create a new user group that is associated with a project ID and a pipeline parameter set ID.
  7. Create a new user that is associated with the user group.
  8. Upload a .zip file of videos to Amazon S3 using a presigned S3 URL generated by the upload function in the job API.
  9. Submit a create_job API call, with job parameters that specify location of the video files. This starts the processing job.


The application is now live and ready to be tested with athletes and coaches alike. Intel OTG is excited to make innovative pose estimation technology using computer vision accessible for a variety of users, from developers to athletes to software vendor partners.

The AWS team is passionate about helping customers like Intel OTG accelerate their ML journey, from the ideation and discovery stage with ML Solutions Lab to the hardening and deployment stage with AWS ML ProServe. We will all be watching closely at the 2021 Tokyo Olympics this summer to envision all the progress that ML can unlock in sports.

Get started today! Explore your use case with the services mentioned in this post and many others on the AWS Management Console.

About the Authors

Han Man is a Senior Manager- Machine Learning & AI at AWS based in San Diego, CA. He has a PhD in engineering from Northwestern University and has several years of experience as a management consultant advising clients in manufacturing, financial services, and energy. Today he is passionately working with customers from a variety of industries to develop and implement machine learning & AI solutions on AWS. He enjoys following the NBA and playing basketball in his spare time.

Iman Kamyabi is an ML Engineer with AWS Professional Services. He has worked with a wide range of AWS customers to champion best practices in setting up repeatable and reliable ML pipelines.

Jonathan Lee is the Director of Sports Performance Technology, Olympic Technology Group at Intel. He studied the application of machine learning to health as an undergrad at UCLA and during his graduate work at University of Oxford. His career has focused on algorithm and sensor development for health and human performance. He now leads the 3D Athlete Tracking project at Intel.

Nelson Leung is the Platform Architect in the Sports Performance CoE at Intel, where he defines end-to-end architecture for cutting-edge products that enhance athlete performance. He also leads the implementation, deployment and productization of these machine learning solutions at scale to different Intel partners.

Troy Squillaci is a DecSecOps engineer at Intel where he delivers professional software solutions to customers through DevOps best practices. He enjoys integrating AI solutions into scalable platforms in a variety of domains.

Paul Min is an Associate Solutions Architect Intern at Amazon Web Services (AWS), where he helps customers across different industry verticals advance their mission and accelerate their cloud adoption. Previously at Intel, he worked as a Software Engineering Intern to help develop the 3D Athlete Tracking Cloud SDK. Outside of work, Paul enjoys playing golf and can be heard singing.


Continue Reading


Intelligently search your Jira projects with Amazon Kendra Jira cloud connector

Organizations use agile project management platforms such as Atlassian Jira to enable teams to collaborate to plan, track, and ship deliverables. Jira captures organizational knowledge about the workings of the deliverables in the issues and comments logged during project implementation. However, making this knowledge easily and securely available to users is challenging due to it…




Organizations use agile project management platforms such as Atlassian Jira to enable teams to collaborate to plan, track, and ship deliverables. Jira captures organizational knowledge about the workings of the deliverables in the issues and comments logged during project implementation. However, making this knowledge easily and securely available to users is challenging due to it being fragmented across issues belonging to different projects and sprints. Additionally, because different stakeholders such as developers, test engineers, and project managers contribute to the same issue by logging it and then adding attachments and comments, traditional keyword-based search is rendered ineffective when searching for information in Jira projects.

You can now use the Amazon Kendra Jira cloud connector to index issues, comments, and attachments in your Jira projects, and search this content using Amazon Kendra intelligent search, powered by machine learning (ML).

This post shows how to use the Amazon Kendra Jira cloud connector to configure a Jira cloud instance as a data source for an Amazon Kendra index, and intelligently search the contents of the projects in it. We use an example of Jira projects where team members collaborate by creating issues and adding information to them in the form of descriptions, comments, and attachments throughout the issue lifecycle.

Solution overview

A Jira instance has one or more projects, where each project has team members working on issues in that project. Each team member has set of permissions about the operations they can perform with respect to different issues in the project they belong to. Team members can create new issues, or add more information to the issues in the form of attachments and comments, as well as change the status of an issue from its opening to closure throughout the issue lifecycle defined for that project. A project manager creates sprints, assigns issues to specific sprints, and assigns owners to issues. During the course of the project, the knowledge captured in these issues keeps evolving.

In our solution, we configure a Jira cloud instance as a data source to an Amazon Kendra search index using the Amazon Kendra Jira connector. Based on the configuration, when the data source is synchronized, the connector crawls and indexes the content from the projects in the Jira instance. Optionally, you can configure it to index the content based on the change log. The connector also collects and ingests access control list (ACL) information for each issue, comment, and attachment. The ACL information is used for user context filtering, where search results for a query are filtered by what a user has authorized access to.


To try out the Amazon Kendra connector for Jira using this post as a reference, you need the following:

Jira instance configuration

This section describes the Jira configuration used to demonstrate how to configure an Amazon Kendra data source using the Jira connector, ingest the data from the Jira projects into the Amazon Kendra index, and make search queries. You can use your own Jira instance for which you have admin access or create a new project and carry out the steps to try out the Amazon Kendra connector for Jira.

In our example Jira instance, we created two projects to demonstrate that the search queries made by users return results from only the projects to which they have access. We used data from the following public domain projects to simulate the use case of real-life software development projects:

The following is a screenshot of our Kanban-style board for project 1.

Create an API token for the Jira instance

To get the API token needed to configure the Amazon Kendra Jira connector, complete the following steps:

  1. Log in to
  2. Choose Create API token.
  3. In the dialog box that appears, enter a label for your token and choose Create.
  4. Choose Copy and enter the token on a temporary notepad.

You can’t copy this token again, and you need it to configure the Amazon Kendra Jira connector.

Configure the data source using the Amazon Kendra connector for Jira

To add a data source to your Amazon Kendra index using the Jira connector, you can use an existing index or create a new index. Then complete the following steps. For more information on this topic, refer to Amazon Kendra Developer Guide.

  1. On the Amazon Kendra console, open your index and choose Data sources in the navigation pane.
  2. Choose Add data source.
  3. Under Jira, choose Add connector.
  4. In the Specify data source details section, enter the details of your data source and choose Next.
  5. In the Define access and security section, for Jira Account URL, enter the URL of your Jira cloud instance.
  6. Under Authentication, you have two options:
    1. Choose Create to add a new secret using the Jira API token you copied from your Jira instance and use the email address used to log in to Jira as the Jira ID. (This is the option we choose for this post.)
    2. Use an existing AWS Secrets Manager secret that has the API token for the Jira instance you want the connector to access.
  7. For IAM role, choose Create a new role or choose an existing IAM role configured with appropriate IAM policies to access the Secrets Manager secret, Amazon Kendra index, and data source.
  8. Choose Next.
  9. In the Configure sync settings section, provide information about your sync scope and run schedule.
  10. Choose Next.
  11. In the Set field mappings section, you can optionally configure the field mappings, or how the Jira field names are mapped to Amazon Kendra attributes or facets.
  12. Choose Next.
  13. Review your settings and confirm to add the data source.
  14. After the data source is added, choose Data sources in the navigation pane, select the newly added data source, and choose Sync now to start data source synchronization with the Amazon Kendra index.
    The sync process can take about 10–15 minutes. Let’s now enable access control for the Amazon Kendra index.
  15. In the navigation pane, choose your index.
  16. In the middle pane, choose the User access control tab.
  17. Choose Edit settings and change the settings to look like the following screenshot.
  18. Choose Next and then choose Update.

Perform intelligent search with Amazon Kendra

Before you try searching on the Amazon Kendra console or using the API, make sure that the data source sync is complete. To check, view the data sources and verify if the last sync was successful.

  1. To start your search, on the Amazon Kendra console, choose Search indexed content in the navigation pane.
    You’re redirected to the Amazon Kendra Search console.
  2. Expand Test query with an access token and choose Apply token.
  3. For Username, enter the email address associated with your Jira account.
  4. Choose Apply.

Now we’re ready to search our index. Let’s use the query “where does boto3 store security tokens?”

In this case, Kendra provides a suggested answer from one of the cards in our Kanban project on Jira.

Note that this is also a suggested answer pointing to an issue discussing AWS security tokens and Boto3. You may also build search experience with multiple data sources including SDK documentation and wikis with Amazon Kendra, and present results and related links accordingly. The following screenshot shows another search query made against the same index.

Note that when we apply a different access token (associate the search with a different user), the search results are restricted to projects that this user has access to.

Lastly, we can also use filters relevant to Jira in our search. First, we navigate to our index’s Facet definition page and check Facetable for j_status, j_assignee, and j_project_name. For every search, we can then filter by these fields, as shown in the following screenshot.

Clean up

To avoid incurring future costs, clean up the resources you created as part of this solution. If you created a new Amazon Kendra index while testing this solution, delete it. If you only added a new data source using the Amazon Kendra connector for Jira, delete that data source.


With the Amazon Kendra Jira connector, your organization can make invaluable knowledge in your Jira projects available to your users securely using intelligent search powered by Amazon Kendra.

To learn more about the Amazon Kendra Jira connector, refer to the Amazon Kendra Jira connector section of the Amazon Kendra Developer Guide.

For more information on other Amazon Kendra built-in connectors to popular data sources, refer to Unravel the knowledge in Slack workspaces with intelligent search using the Amazon Kendra Slack connector and Search for knowledge in Quip documents with intelligent search using the Quip connector for Amazon Kendra.

About the Authors

Shreyas Subramanian is an AI/ML specialist Solutions Architect, and helps customers by using Machine Learning to solve their business challenges on the AWS Cloud.

Abhinav JawadekarAbhinav Jawadekar is a Principal Solutions Architect focused on Amazon Kendra in the AI/ML language services team at AWS. Abhinav works with AWS customers and partners to help them build intelligent search solutions on AWS.


Continue Reading


Copyright © 2021 Today's Digital.