Connect with us


Automate model retraining with Amazon SageMaker Pipelines when drift is detected

Training your machine learning (ML) model and serving predictions is usually not the end of the ML project. The accuracy of ML models can deteriorate over time, a phenomenon known as model drift. Many factors can cause model drift, such as changes in model features. The accuracy of ML models can also be affected by…



Training your machine learning (ML) model and serving predictions is usually not the end of the ML project. The accuracy of ML models can deteriorate over time, a phenomenon known as model drift. Many factors can cause model drift, such as changes in model features. The accuracy of ML models can also be affected by concept drift, the difference between data used to train models and data used during inference. Therefore, teams must make sure that their models and solutions remain relevant and continue providing value back to the business. Without proper metrics, alarms, and automation in place, the technical debt from simply maintaining existing ML models in production can become overwhelming.

Amazon SageMaker Pipelines is a native workflow orchestration tool for building ML pipelines that take advantage of direct Amazon SageMaker integration. Three components improve the operational resilience and reproducibility of your ML workflows: pipelines, model registry, and projects. These workflow automation components enable you to easily scale your ability to build, train, test, and deploy hundreds of models in production, iterate faster, reduce errors due to manual orchestration, and build repeatable mechanisms.

In this post, we discuss how to automate retraining with pipelines in SageMaker when model drift is detected.

From static models to continuous training

Static models are a great place to start when you’re experimenting with ML. However, because real-world data is always changing, static models degrade over time, and your training dataset won’t represent real behavior for long. Having an effective deployment model monitoring phase is an important step when building an MLOps pipeline. It’s also one of the most challenging aspects of MLOps because it requires having an effective feedback loop between the data captured by your production system and the data distribution used during the training phase. For model retraining to be effective, you also must be continuously updating your training dataset with new ground truth labels. You might be able to use implicit or explicit feedback from users based on the predictions you provide, such as in the case of recommendations. Alternatively, you may need to introduce a human in the loop workflow through a service like Amazon Augmented AI (Amazon A2I) to qualify the accuracy of predictions from your ML system. Other considerations are to monitor predictions for bias on a regular basis, which can be supported through Amazon SageMaker Clarify.

In this post, we propose a solution that focuses on data quality monitoring to detect concept drift in the production data and retrain your model automatically.

Solution overview

Our solution uses an open-source AWS CloudFormation template to create a model build and deployment pipeline. We use Pipelines and supporting AWS services, including AWS CodePipeline, AWS CodeBuild, and Amazon EventBridge.

The following diagram illustrates our architecture.

The following are the high-level steps for this solution:

  1. Create the new Amazon SageMaker Studio project based on the custom template.
  2. Create a SageMaker pipeline to perform data preprocessing, generate baseline statistics, and train a model.
  3. Register the model in the SageMaker Model Registry.
  4. The data scientist verifies the models metrics and performance and approves the model.
  5. The model is deployed to a real-time endpoint in staging and, after approval, to a production endpoint.
  6. Amazon SageMaker Model Monitor is configured on the production endpoint to detect a concept drift of the data with respect to the training baseline.
  7. Model Monitor is scheduled to run every hour, and publishes metrics to Amazon CloudWatch.
  8. A CloudWatch alarm is raised when metrics exceed a model-specific threshold. This results in an EventBridge rule starting the model build pipeline.
  9. The model build pipeline can also be retrained with an EventBridge rule that runs on a schedule.


This solution uses the New York City Taxi and Limousine Commission (TLC) Trip Record Data public dataset to train a model to predict taxi fare based on the information available for that trip. The available information includes the start and end location and travel date and time from which we engineer datetime features and distance traveled.

For example, in the following image, we see an example trip for location 65 (Downtown Brooklyn) to 68 (East Chelsea), which took 21 minutes and cost $20.

If you’re interested in understanding more about the dataset, you can use the exploratory data analysis notebook in the GitHub repository.

Get started

You can use the following quick start button to launch a CloudFormation stack to publish the custom SageMaker MLOps project template to the AWS Service Catalog:

Create a new project in Studio

After your MLOps project template is published, you can create a new project using your new template via the Studio UI.

  1. In the Studio sidebar, choose SageMaker Components and registries.
  2. Choose Projects on the drop-down menu.
  3. Choose Create Project.

On the Create project page, SageMaker templates is chosen by default. This option lists the built-in templates. However, you want to use the template you published for the drift detection pipeline.

  1. Choose Organization templates.
  2. Choose Amazon SageMaker drift detection template for real-time deployment.
  3. Choose Select project template.

If you have recently updated your AWS Service Catalog project, you may need to refresh Studio to make sure it finds the latest version of your template.

  1. In the Project details section, for Name, enter drift-detection.

Your project name must have 32 characters or fewer.

  1. Under Project template parameters, for RetrainSchedule, keep the default of cron(0 12 1 * ? *).
  2. Choose Create project.

When you choose Create project, a CloudFormation stack is created in your account. This takes a few minutes. If you’re interested in the components that are being deployed, navigate to the CloudFormation console. There you can find the stack that is being created.

When the page reloads, on the main project page you can find a summary of all resources created in the project. For now, we need to clone the sagemaker-drift-detection-build repository.

  1. On the Repository tab, choose clone repo… and accept the default values in the dialog box.

This clones the repository to the Studio space for your user. The notebook build-pipeline.ipynb is provided as an entry point for you to run through the solution and to help you understand how to use it.

  1. Open the notebook.
  2. Choose the Python 3 (Data Science) kernel for the notebook.

If a Studio instance isn’t already running, an instance is provisioned. This can take a couple of minutes. By default, an ml.t3.medium instance is launched, which is enough to run our notebook.

  1. When the notebook is open, edit the second cell with the actual project name you chose:

project_name = “drift-detection”

  1. Run the first couple of cells to initialize some variables we need later and to verify we have selected our project:

import sagemaker import json sess = sagemaker.session.Session() region_name = sess._region_name sm_client = sess.sagemaker_client project_id = sm_client.describe_project(ProjectName=project_name)[“ProjectId”] print(f”Project: {project_name} ({project_id})”)

Next, we must define the dataset that we’re using. In this example, we use data from NYC Taxi and Limousine Commission (TLC).

  1. Download the data from its public Amazon Simple Storage Service (Amazon S3) location and upload to the artifact bucket provisioned by the project template:

from sagemaker.s3 import S3Downloader, S3Uploader # Download to the data folder, and upload to the pipeline input uri download_uri = “s3://nyc-tlc/trip data/green_tripdata_2018-02.csv” S3Downloader().download(download_uri, “data”) # Uplaod the data to the input location artifact_bucket = f”sagemaker-project-{project_id}-{region_name}” input_data_uri = f”s3://{artifact_bucket}/{project_id}/input” S3Uploader().upload(“data”, input_data_uri) print(“Listing input files:”) for s3_uri in S3Downloader.list(input_data_uri): print(s3_uri.split(“/”)[-1])

For your own custom template, you could also upload your data directly in the input_data_uri location, because this is where the pipeline expects to find the training data.

Run the training pipeline

To run the pipeline, you can continue running the cells in the notebook. You can also start a pipeline run through the Studio interface.

  1. First, go back to the main project view page and choose the Pipelines tab.
  2. Choose the pipeline drift-detection-pipeline, which opens a tab containing a list of past runs.

When you first get to this page, you can see a previous failed run of the pipeline. This was started when the project was initialized. It failed because at the time there was no data for the pipeline to use.

You can now start a new pipeline with the data.

  1. Choose Start an execution.
  2. For Name, enter First-Pipeline-execution.
  3. Choose Start.

When the pipeline starts, it’s added to the list of pipeline runs with a status of Executing.

  1. Choose this pipeline run to open its details.

Pipelines automatically constructs a graph showing the data dependency for each step in the pipeline. Based on this, you can see the order in which the steps are completed.

If you wait for a few minutes and refresh the graph, you can notice that the BaselineJob and TrainModel steps run at the same time. This happened automatically. SageMaker understands that these steps are safe to run in parallel because there is no data dependency between them. You can explore this page by choosing the different steps and tabs on the page. Let’s look at what the different steps are doing:

  • PreprocessData – This step is responsible for preprocessing the data and transforming it into a format that is appropriate for the following ML algorithm. This step contains custom code developed for the particular use case.
  • BaselineJob – This step is responsible for generating a baseline regarding the expected type and distribution of your data. This is essential for the monitoring of the model. Model Monitor uses this baseline to compare against the latest collected data from the endpoint. This step doesn’t require custom code because it’s part of the Model Monitor offering.
  • TrainModel – This step is responsible for training an XGBoost regressor using the built-in implementation of the algorithm by SageMaker. Because we’re using the built-in model, no custom code is required.
  • EvaluateModel and CheckEvaluation – In these steps, we calculate an evaluation metric that is important for us, in this case the root mean square error (rmse) on the test set. If it’s less than the predefined threshold (7), we continue to the next step. If not, the pipeline stops. The EvaluateModel step requires custom code to compute the metric we’re interested in.
  • RegisterModel – During this step, the trained model from the TrainModel step is registered in the model registry. From there, we can centrally manage and deploy the trained models. No custom code is required at this step.

Approve and deploy a model

After the pipeline has finished running, navigate to the main project page. Go to the Model groups tab, choose the drift-detection model group, and then a registered model version. This brings up the specific model version page. You can inspect the outputs of this model including the following metrics. You can also get more insights from the XGBoost training report.

The deployment pipeline for a registered model is triggered based on its status. To deploy this model, complete the following steps:

  1. Choose Update status.
  2. Change the pending status to Approved.
  3. Choose Update status.

Approving the model generates an event in CloudWatch that gets captured by a rule in EventBridge, which starts the model deployment.

To see the deployment progress, navigate to the CodePipeline console. From the pipelines section, choose sagemaker-drift-detection-deploy to see the deployment of the approved model in progress.

This pipeline includes a build stage that gets the latest approved model version from the model registry and generates a CloudFormation template. The model is deployed to a staging SageMaker endpoint using this template.

Now you can return to the notebook to test the staging endpoint by running the following code:

predictor = wait_for_predictor(“staging”) payload = “1,-73.986114,40.685634,-73.936794,40.715370,5.318025,7,0,2” predictor.predict(data=payload)

Promote the model to production

If the staging endpoint is performing as expected, the model can be promoted to production. If this is the first time running this pipeline, you can approve this model by choosing Review in CodePipeline, entering any comments, and choosing Approve.

In our example, the approval of a model to production is a two-step process, reflecting different responsibilities and personas:

  1. First, we approved the model in the model registry to be tested on a staging endpoint. This would typically be performed by a data scientist after evaluating the model training results from a data science perspective.
  2. After the endpoint has been tested in the staging environment, the second approval is to deploy the model to production. This approval could be restricted by AWS Identity and Access Management (IAM) roles to be performed only by an operations or application team. This second approval could follow additional tests defined by these teams.

The production deployment has a few extra configuration parameters compared to the staging. Although the staging created a single instance to host the endpoint, this stage creates the endpoint with automatic scaling with two instances across multiple Availability Zones. Automatic scaling makes sure that if traffic increases, the deployed model scales out in order to meet the user request throughput. Additionally, the production variant of the deployment enables data capture, which means all requests and response predictions from the endpoint are logged to Amazon S3. A monitoring schedule, also deployed at this stage, analyzes this data. If data drift is detected, a new run of the build pipeline is started.

Monitor the model

For model monitoring, an important step is to define sensible thresholds that are relevant to your business problem. In our case, we want to be alerted if, for example, the underlying distribution of the prices of fares change. The deployment pipeline has a prod-config.json file that defines a metric and threshold for this drift detection.

  1. Navigate back to the main project page in Studio.
  2. Choose the Endpoints tab, where you can see both the staging and the prod endpoints are InService.
  3. Choose the prod endpoint to display additional details.

If you choose the prod version, the first thing you see is the monitoring schedule (which hasn’t run yet).

  1. Test the production endpoint and the data capture by sending some artificial traffic using the notebook cells under the Test Production and Inspect Data Capture sections.

This code also modifies the distribution of the input data, which causes a drift to be detected in the predicted fare amount when Model Monitor runs. This in turn raises an alarm and restarts the training pipeline to train a new model.

The monitoring schedule has been set to run hourly. After the hour, you can see that a new monitoring job is now In progress. This should take about 10 minutes to complete, at which point you should see its status change to Issue Found due to the data drift that we introduced.

To monitor the metrics emitted by the monitoring job, you can add a chart in Studio to inspect the different features over a relevant timeline.

  1. Choose the Data Quality tab of the Model Monitoring page.
  2. Choose Add chart, which reveals the chart properties.
  3. To get more insights into the monitoring job, choose the latest job to inspect the job details.

In Monitor Job Details, you can see a summary of the discovered constraint violations.

  1. To discover more, copy the long string under Processing Job ARN.
  2. Choose View Amazon SageMaker notebook, which opens a pre-populated notebook.
  3. In the notebook the cell, replace FILL-IN-PROCESSING-JOB-ARN with the ARN value you copied.
  4. Run all the notebook cells.

This notebook outputs a series of tables and graphs, including a distribution that compares the newly collected data (in blue) to the baseline metrics distribution (in green). For the geo_distance and passenger_count features for which we introduced artificial noise, you can see the shifts in distributions. Similarly, as a consequence you can notice a shift in the distribution for the fare_amount predicted value.

Retrain the model

The preceding change in data raises an alarm in response to the CloudWatch metrics published from Model Monitor exceeding the configured threshold. Thanks to the Pipelines integration with EventBridge, the model build pipeline is started to retrain the model on the latest data.

Navigating to the CloudWatch console and choosing Alarms should show that the alarm sagemaker-drift-detection-prod-threshold is in the status In Alarm. When the alarm changes to In alarm, a new run of the pipeline is started. You can see this on the pipeline tab of the main project in the Studio interface.

At this point, the new model that is generated suffers from the same drift if we use that same generated data to test the endpoint. This is because we didn’t change or update the training data. In a real production environment, for this pipeline to be effective, a process should exist to load newly labeled data to the location where the pipeline is getting the input data. This last detail is crucial when building your solution.

Clean up

The build-pipeline.ipynb notebook includes cells that you can run to clean up the following resources:

  • SageMaker prod endpoint
  • SageMaker staging endpoint
  • SageMaker pipeline workflow and model package group
  • Amazon S3 artifacts and SageMaker project

You can also clean up resources using the AWS Command Line Interface (AWS CLI):

  1. Delete the CloudFormation stack created to provision the production endpoint:
    aws cloudformation delete-stack —stack-name sagemaker-<>-deploy-prod
  1. Delete the CloudFormation stack created to provision the staging endpoint:
    aws cloudformation delete-stack —stack-name sagemaker-<>-deploy-staging
  1. Delete the CloudFormation stack created to provision the SageMaker pipeline and model package group:
    aws cloudformation delete-stack —stack-name sagemaker-<>-deploy-pipeline
  1. Empty the S3 bucket containing the artifacts output from the drift deployment pipeline:
    aws s3 rm —recursive s3://sagemaker-project-<>-region_name
  1. Delete the project, which removes the CloudFormation stack that created the deployment pipeline:
    aws sagemaker delete-project —project-name <>
  1. Delete the AWS Service Catalog project template:
    aws cloudformation delete-stack —stack-name <>


Model Monitor allows you to capture incoming data to the deployed model, detect changes, and raise alarms when significant data drift is detected. Additionally, Pipelines allows you to orchestrate building new model versions. With its integration with EventBridge, you can run a pipeline either on a schedule or on demand. The latter feature allows for integration between monitoring a model and automatically retraining a model when a drift in the incoming feature data has been detected.

You can use the code repository on GitHub as a starting point to try out this solution for your own data and use case.

Additional references

For additional information, see the following resources:

About the Author

Julian Bright is an Principal AI/ML Specialist Solutions Architect based out of Melbourne, Australia. Julian works as part of the global Amazon Machine Learning team and is passionate about helping customers realize their AI and ML journey through MLOps. In his spare time, he loves running around after his kids, playing soccer and getting outdoors.

Georgios Schinas is a Specialist Solutions Architect for AI/ML in the EMEA region. He is based in London and works closely with customers in UK. Georgios helps customers design and deploy machine learning applications in production on AWS with a particular interest in MLOps practices. In his spare time, he enjoys traveling, cooking and spending time with friends and family.

Theiss Heilker is an AI/ML Solutions Architect at AWS. He helps customer create AI/ML solutions and accelerate their Machine Learning journey. He is passionate about MLOps and in his spare time you can find him in the outdoors playing with his dog and son.

Alessandro Cerè is a Senior ML Solutions Architect at AWS based in Singapore, where he helps customers design and deploy Machine Learning solutions across the ASEAN region. Before being a data scientist, Alessandro was researching the limits of Quantum Correlation for secure communication. In his spare time, he’s a landscape and underwater photographer.


Continue Reading
Click to comment

Leave a Reply

Your email address will not be published.


AWS Week in Review – May 16, 2022

This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS! I had been on the road for the last five weeks and attended many of the AWS Summits in Europe. It was great to talk to so many of you…




This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!

I had been on the road for the last five weeks and attended many of the AWS Summits in Europe. It was great to talk to so many of you in person. The Serverless Developer Advocates are going around many of the AWS Summits with the Serverlesspresso booth. If you attend an event that has the booth, say “Hi ” to my colleagues, and have a coffee while asking all your serverless questions. You can find all the upcoming AWS Summits in the events section at the end of this post.

Last week’s launches
Here are some launches that got my attention during the previous week.

AWS Step Functions announced a new console experience to debug your state machine executions – Now you can opt-in to the new console experience of Step Functions, which makes it easier to analyze, debug, and optimize Standard Workflows. The new page allows you to inspect executions using three different views: graph, table, and event view, and add many new features to enhance the navigation and analysis of the executions. To learn about all the features and how to use them, read Ben’s blog post.

Example on how the Graph View looks

Example on how the Graph View looks

AWS Lambda now supports Node.js 16.x runtime – Now you can start using the Node.js 16 runtime when you create a new function or update your existing functions to use it. You can also use the new container image base that supports this runtime. To learn more about this launch, check Dan’s blog post.

AWS Amplify announces its Android library designed for Kotlin – The Amplify Android library has been rewritten for Kotlin, and now it is available in preview. This new library provides better debugging capacities and visibility into underlying state management. And it is also using the new AWS SDK for Kotlin that was released last year in preview. Read the What’s New post for more information.

Three new APIs for batch data retrieval in AWS IoT SiteWise – With this new launch AWS IoT SiteWise now supports batch data retrieval from multiple asset properties. The new APIs allow you to retrieve current values, historical values, and aggregated values. Read the What’s New post to learn how you can start using the new APIs.

AWS Secrets Manager now publishes secret usage metrics to Amazon CloudWatch – This launch is very useful to see the number of secrets in your account and set alarms for any unexpected increase or decrease in the number of secrets. Read the documentation on Monitoring Secrets Manager with Amazon CloudWatch for more information.

For a full list of AWS announcements, be sure to keep an eye on the What’s New at AWS page.

Other AWS News
Some other launches and news that you may have missed:

IBM signed a deal with AWS to offer its software portfolio as a service on AWS. This allows customers using AWS to access IBM software for automation, data and artificial intelligence, and security that is built on Red Hat OpenShift Service on AWS.

Podcast Charlas Técnicas de AWS – If you understand Spanish, this podcast is for you. Podcast Charlas Técnicas is one of the official AWS podcasts in Spanish. This week’s episode introduces you to Amazon DynamoDB and shares stories on how different customers use this database service. You can listen to all the episodes directly from your favorite podcast app or the podcast web page.

AWS Open Source News and Updates – Ricardo Sueiras, my colleague from the AWS Developer Relation team, runs this newsletter. It brings you all the latest open-source projects, posts, and more. Read edition #112 here.

Upcoming AWS Events
It’s AWS Summits season and here are some virtual and in-person events that might be close to you:

You can register for re:MARS to get fresh ideas on topics such as machine learning, automation, robotics, and space. The conference will be in person in Las Vegas, June 21–24.

That’s all for this week. Check back next Monday for another Week in Review!

— Marcia


Continue Reading


Personalize your machine translation results by using fuzzy matching with Amazon Translate

A person’s vernacular is part of the characteristics that make them unique. There are often countless different ways to express one specific idea. When a firm communicates with their customers, it’s critical that the message is delivered in a way that best represents the information they’re trying to convey. This becomes even more important when…




A person’s vernacular is part of the characteristics that make them unique. There are often countless different ways to express one specific idea. When a firm communicates with their customers, it’s critical that the message is delivered in a way that best represents the information they’re trying to convey. This becomes even more important when it comes to professional language translation. Customers of translation systems and services expect accurate and highly customized outputs. To achieve this, they often reuse previous translation outputs—called translation memory (TM)—and compare them to new input text. In computer-assisted translation, this technique is known as fuzzy matching. The primary function of fuzzy matching is to assist the translator by speeding up the translation process. When an exact match can’t be found in the TM database for the text being translated, translation management systems (TMSs) often have the option to search for a match that is less than exact. Potential matches are provided to the translator as additional input for final translation. Translators who enhance their workflow with machine translation capabilities such as Amazon Translate often expect fuzzy matching data to be used as part of the automated translation solution.

In this post, you learn how to customize output from Amazon Translate according to translation memory fuzzy match quality scores.

Translation Quality Match

The XML Localization Interchange File Format (XLIFF) standard is often used as a data exchange format between TMSs and Amazon Translate. XLIFF files produced by TMSs include source and target text data along with match quality scores based on the available TM. These scores—usually expressed as a percentage—indicate how close the translation memory is to the text being translated.

Some customers with very strict requirements only want machine translation to be used when match quality scores are below a certain threshold. Beyond this threshold, they expect their own translation memory to take precedence. Translators often need to apply these preferences manually either within their TMS or by altering the text data. This flow is illustrated in the following diagram. The machine translation system processes the translation data—text and fuzzy match scores— which is then reviewed and manually edited by translators, based on their desired quality thresholds. Applying thresholds as part of the machine translation step allows you to remove these manual steps, which improves efficiency and optimizes cost.

Machine Translation Review Flow

Figure 1: Machine Translation Review Flow

The solution presented in this post allows you to enforce rules based on match quality score thresholds to drive whether a given input text should be machine translated by Amazon Translate or not. When not machine translated, the resulting text is left to the discretion of the translators reviewing the final output.

Solution Architecture

The solution architecture illustrated in Figure 2 leverages the following services:

  • Amazon Simple Storage Service – Amazon S3 buckets contain the following content:
    • Fuzzy match threshold configuration files
    • Source text to be translated
    • Amazon Translate input and output data locations
  • AWS Systems Manager – We use Parameter Store parameters to store match quality threshold configuration values
  • AWS Lambda – We use two Lambda functions:
    • One function preprocesses the quality match threshold configuration files and persists the data into Parameter Store
    • One function automatically creates the asynchronous translation jobs
  • Amazon Simple Queue Service – An Amazon SQS queue triggers the translation flow as a result of new files coming into the source bucket

Solution Architecture Diagram

Figure 2: Solution Architecture

You first set up quality thresholds for your translation jobs by editing a configuration file and uploading it into the fuzzy match threshold configuration S3 bucket. The following is a sample configuration in CSV format. We chose CSV for simplicity, although you can use any format. Each line represents a threshold to be applied to either a specific translation job or as a default value to any job.

default, 75 SourceMT-Test, 80

The specifications of the configuration file are as follows:

  • Column 1 should be populated with the name of the XLIFF file—without extension—provided to the Amazon Translate job as input data.
  • Column 2 should be populated with the quality match percentage threshold. For any score below this value, machine translation is used.
  • For all XLIFF files whose name doesn’t match any name listed in the configuration file, the default threshold is used—the line with the keyword default set in Column 1.

Auto-generated parameter in Systems Manager Parameter Store

Figure 3: Auto-generated parameter in Systems Manager Parameter Store

When a new file is uploaded, Amazon S3 triggers the Lambda function in charge of processing the parameters. This function reads and stores the threshold parameters into Parameter Store for future usage. Using Parameter Store avoids performing redundant Amazon S3 GET requests each time a new translation job is initiated. The sample configuration file produces the parameter tags shown in the following screenshot.

The job initialization Lambda function uses these parameters to preprocess the data prior to invoking Amazon Translate. We use an English-to-Spanish translation XLIFF input file, as shown in the following code. It contains the initial text to be translated, broken down into what is referred to as segments, represented in the source tags.

Consent Form CONSENT FORM FORMULARIO DE CONSENTIMIENTO Screening Visit: Screening Visit Selección

The source text has been pre-matched with the translation memory beforehand. The data contains potential translation alternatives—represented as tags—alongside a match quality attribute, expressed as a percentage. The business rule is as follows:

  • Segments received with alternative translations and a match quality below the threshold are untouched or empty. This signals to Amazon Translate that they must be translated.
  • Segments received with alternative translations with a match quality above the threshold are pre-populated with the suggested target text. Amazon Translate skips those segments.

Let’s assume the quality match threshold configured for this job is 80%. The first segment with 99% match quality isn’t machine translated, whereas the second segment is, because its match quality is below the defined threshold. In this configuration, Amazon Translate produces the following output:

Consent Form FORMULARIO DE CONSENTIMIENTO CONSENT FORM FORMULARIO DE CONSENTIMIENTO Screening Visit: Visita de selección Screening Visit Selección

In the second segment, Amazon Translate overwrites the target text initially suggested (Selección) with a higher quality translation: Visita de selección.

One possible extension to this use case could be to reuse the translated output and create our own translation memory. Amazon Translate supports customization of machine translation using translation memory thanks to the parallel data feature. Text segments previously machine translated due to their initial low-quality score could then be reused in new translation projects.

In the following sections, we walk you through the process of deploying and testing this solution. You use AWS CloudFormation scripts and data samples to launch an asynchronous translation job personalized with a configurable quality match threshold.


For this walkthrough, you must have an AWS account. If you don’t have an account yet, you can create and activate one.

Launch AWS CloudFormation stack

  1. Choose Launch Stack:
  2. For Stack name, enter a name.
  3. For ConfigBucketName, enter the S3 bucket containing the threshold configuration files.
  4. For ParameterStoreRoot, enter the root path of the parameters created by the parameters processing Lambda function.
  5. For QueueName, enter the SQS queue that you create to post new file notifications from the source bucket to the job initialization Lambda function. This is the function that reads the configuration file.
  6. For SourceBucketName, enter the S3 bucket containing the XLIFF files to be translated. If you prefer to use a preexisting bucket, you need to change the value of the CreateSourceBucket parameter to No.
  7. For WorkingBucketName, enter the S3 bucket Amazon Translate uses for input and output data.
  8. Choose Next.

    Figure 4: CloudFormation stack details

  9. Optionally on the Stack Options page, add key names and values for the tags you may want to assign to the resources about to be created.
  10. Choose Next.
  11. On the Review page, select I acknowledge that this template might cause AWS CloudFormation to create IAM resources.
  12. Review the other settings, then choose Create stack.

AWS CloudFormation takes several minutes to create the resources on your behalf. You can watch the progress on the Events tab on the AWS CloudFormation console. When the stack has been created, you can see a CREATE_COMPLETE message in the Status column on the Overview tab.

Test the solution

Let’s go through a simple example.

  1. Download the following sample data.
  2. Unzip the content.

There should be two files: an .xlf file in XLIFF format, and a threshold configuration file with .cfg as the extension. The following is an excerpt of the XLIFF file.

English to French sample file extract

Figure 5: English to French sample file extract

  1. On the Amazon S3 console, upload the quality threshold configuration file into the configuration bucket you specified earlier.

The value set for test_En_to_Fr is 75%. You should be able to see the parameters on the Systems Manager console in the Parameter Store section.

  1. Still on the Amazon S3 console, upload the .xlf file into the S3 bucket you configured as source. Make sure the file is under a folder named translate (for example, /translate/test_En_to_Fr.xlf).

This starts the translation flow.

  1. Open the Amazon Translate console.

A new job should appear with a status of In Progress.

Auto-generated parameter in Systems Manager Parameter Store

Figure 6: In progress translation jobs on Amazon Translate console

  1. Once the job is complete, click into the job’s link and consult the output. All segments should have been translated.

All segments should have been translated. In the translated XLIFF file, look for segments with additional attributes named lscustom:match-quality, as shown in the following screenshot. These custom attributes identify segments where suggested translation was retained based on score.

Custom attributes identifying segments where suggested translation was retained based on score

Figure 7: Custom attributes identifying segments where suggested translation was retained based on score

These were derived from the translation memory according to the quality threshold. All other segments were machine translated.

You have now deployed and tested an automated asynchronous translation job assistant that enforces configurable translation memory match quality thresholds. Great job!


If you deployed the solution into your account, don’t forget to delete the CloudFormation stack to avoid any unexpected cost. You need to empty the S3 buckets manually beforehand.


In this post, you learned how to customize your Amazon Translate translation jobs based on standard XLIFF fuzzy matching quality metrics. With this solution, you can greatly reduce the manual labor involved in reviewing machine translated text while also optimizing your usage of Amazon Translate. You can also extend the solution with data ingestion automation and workflow orchestration capabilities, as described in Speed Up Translation Jobs with a Fully Automated Translation System Assistant.

About the Authors

Narcisse Zekpa is a Solutions Architect based in Boston. He helps customers in the Northeast U.S. accelerate their adoption of the AWS Cloud, by providing architectural guidelines, design innovative, and scalable solutions. When Narcisse is not building, he enjoys spending time with his family, traveling, cooking, and playing basketball.

Dimitri Restaino is a Solutions Architect at AWS, based out of Brooklyn, New York. He works primarily with Healthcare and Financial Services companies in the North East, helping to design innovative and creative solutions to best serve their customers. Coming from a software development background, he is excited by the new possibilities that serverless technology can bring to the world. Outside of work, he loves to hike and explore the NYC food scene.


Continue Reading


Enhance the caller experience with hints in Amazon Lex

We understand speech input better if we have some background on the topic of conversation. Consider a customer service agent at an auto parts wholesaler helping with orders. If the agent knows that the customer is looking for tires, they’re more likely to recognize responses (for example, “Michelin”) on the phone. Agents often pick up…




We understand speech input better if we have some background on the topic of conversation. Consider a customer service agent at an auto parts wholesaler helping with orders. If the agent knows that the customer is looking for tires, they’re more likely to recognize responses (for example, “Michelin”) on the phone. Agents often pick up such clues or hints based on their domain knowledge and access to business intelligence dashboards. Amazon Lex now supports a hints capability to enhance the recognition of relevant phrases in a conversation. You can programmatically provide phrases as hints during a live interaction to influence the transcription of spoken input. Better recognition drives efficient conversations, reduces agent handling time, and ultimately increases customer satisfaction.

In this post, we review the runtime hints capability and use it to implement verification of callers based on their mother’s maiden name.

Overview of the runtime hints capability

You can provide a list of phrases or words to help your bot with the transcription of speech input. You can use these hints with built-in slot types such as first and last names, street names, city, state, and country. You can also configure these for your custom slot types.

You can use the capability to transcribe names that may be difficult to pronounce or understand. For example, in the following sample conversation, we use it to transcribe the name “Loreck.”

Conversation 1

IVR: Welcome to ACME bank. How can I help you today?

Caller: I want to check my account balance.

IVR: Sure. Which account should I pull up?

Caller: Checking

IVR: What is the account number?

Caller: 1111 2222 3333 4444

IVR: For verification purposes, what is your mother’s maiden name?

Caller: Loreck

IVR: Thank you. The balance on your checking account is 123 dollars.

Words provided as hints are preferred over other similar words. For example, in the second sample conversation, the runtime hint (“Smythe”) is selected over a more common transcription (“Smith”).

Conversation 2

IVR: Welcome to ACME bank. How can I help you today?

Caller: I want to check my account balance.

IVR: Sure. Which account should I pull up?

Caller: Checking

IVR: What is the account number?

Caller: 5555 6666 7777 8888

IVR: For verification purposes, what is your mother’s maiden name?

Caller: Smythe

IVR: Thank you. The balance on your checking account is 456 dollars.

If the name doesn’t match the runtime hint, you can fail the verification and route the call to an agent.

Conversation 3

IVR: Welcome to ACME bank. How can I help you today?

Caller: I want to check my account balance.

IVR: Sure. Which account should I pull up?

Caller: Savings

IVR: What is the account number?

Caller: 5555 6666 7777 8888

IVR: For verification purposes, what is your mother’s maiden name?

Caller: Jane

IVR: There is an issue with your account. For support, you will be forwarded to an agent.

Solution overview

Let’s review the overall architecture for the solution (see the following diagram):

  • We use an Amazon Lex bot integrated with an Amazon Connect contact flow to deliver the conversational experience.
  • We use a dialog codehook in the Amazon Lex bot to invoke an AWS Lambda function that provides the runtime hint at the previous turn of the conversation.
  • For the purposes of this post, the mother’s maiden name data used for authentication is stored in an Amazon DynamoDB table.
  • After the caller is authenticated, the control is passed to the bot to perform transactions (for example, check balance)

In addition to the Lambda function, you can also send runtime hints to Amazon Lex V2 using the PutSession, RecognizeText, RecognizeUtterance, or StartConversation operations. The runtime hints can be set at any point in the conversation and are persisted at every turn until cleared.

Deploy the sample Amazon Lex bot

To create the sample bot and configure the runtime phrase hints, perform the following steps. This creates an Amazon Lex bot called BankingBot, and one slot type (accountNumber).

  1. Download the Amazon Lex bot.
  2. On the Amazon Lex console, choose Actions, Import.
  3. Choose the file that you downloaded, and choose Import.
  4. Choose the bot BankingBot on the Amazon Lex console.
  5. Choose the language English (GB).
  6. Choose Build.
  7. Download the supporting Lambda code.
  8. On the Lambda console, create a new function and select Author from scratch.
  9. For Function name, enter BankingBotEnglish.
  10. For Runtime, choose Python 3.8.
  11. Choose Create function.
  12. In the Code source section, open and delete the existing code.
  13. Download the function code and open it in a text editor.
  14. Copy the code and enter it into the empty function code field.
  15. Choose deploy.
  16. On the Amazon Lex console, select the bot BankingBot.
  17. Choose Deployment and then Aliases, then choose the alias TestBotAlias.
  18. On the Aliases page, choose Languages and choose English (GB).
  19. For Source, select the bot BankingBotEnglish.
  20. For Lambda version or alias, enter $LATEST.
  21. On the DynamoDB console, choose Create table.
  22. Provide the name as customerDatabase.
  23. Provide the partition key as accountNumber.
  24. Add an item with accountNumber: “1111222233334444” and mothersMaidenName “Loreck”.
  25. Add item with accountNumber: “5555666677778888” and mothersMaidenName “Smythe”.
  26. Make sure the Lambda function has permissions to read from the DynamoDB table customerDatabase.
  27. On the Amazon Connect console, choose Contact flows.
  28. In the Amazon Lex section, select your Amazon Lex bot and make it available for use in the Amazon Connect contact flow.
  29. Download the contact flow to integrate with the Amazon Lex bot.
  30. Choose the contact flow to load it into the application.
  31. Make sure the right bot is configured in the “Get Customer Input” block.
  32. Choose a queue in the “Set working queue” block.
  33. Add a phone number to the contact flow.
  34. Test the IVR flow by calling in to the phone number.

Test the solution

You can now call in to the Amazon Connect phone number and interact with the bot.


Runtime hints allow you to influence the transcription of words or phrases dynamically in the conversation. You can use business logic to identify the hints as the conversation evolves. Better recognition of the user input allows you to deliver an enhanced experience. You can configure runtime hints via the Lex V2 SDK. The capability is available in all AWS Regions where Amazon Lex operates in the English (Australia), English (UK), and English (US) locales.

To learn more, refer to runtime hints.

About the Authors

Kai Loreck is a professional services Amazon Connect consultant. He works on designing and implementing scalable customer experience solutions. In his spare time, he can be found playing sports, snowboarding, or hiking in the mountains.

Anubhav Mishra is a Product Manager with AWS. He spends his time understanding customers and designing product experiences to address their business challenges.

Sravan Bodapati is an Applied Science Manager at AWS Lex. He focuses on building cutting edge Artificial Intelligence and Machine Learning solutions for AWS customers in ASR and NLP space. In his spare time, he enjoys hiking, learning economics, watching TV shows and spending time with his family.


Continue Reading


Copyright © 2021 Today's Digital.