Organizations of all sizes and across all industries gather and analyze metrics or key performance indicators (KPIs) to help their businesses run effectively and efficiently. Operational metrics are used to evaluate performance, compare results, and track relevant data to improve business outcomes. For example, you can use operational metrics to determine application performance (the average time it takes to render a page for an end user) or application availability (the duration of time the application was operational). One challenge that most organizations face today is detecting anomalies in operational metrics, which are key in ensuring continuity of IT system operations.
Traditional rule-based methods are manual and look for data that falls outside of numerical ranges that have been arbitrarily defined. An example of this is an alert when transactions per hour fall below a certain number. This results in false alarms if the range is too narrow, or missed anomalies if the range is too broad. These ranges are also static. They don’t change based on evolving conditions like the time of the day, day of the week, seasons, or business cycles. When anomalies are detected, developers, analysts, and business owners can spend weeks trying to identify the root cause of the change before they can take action.
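To make this limitation concrete, here is a minimal Python sketch of a static rule. The hourly counts and both thresholds are invented for illustration; they are not taken from any real system.

```python
from typing import List


def static_rule_alerts(hourly_counts: List[int], minimum: int) -> List[int]:
    """Return the hours whose transaction count falls below a fixed minimum."""
    return [hour for hour, count in enumerate(hourly_counts) if count < minimum]


# Hypothetical data: a quiet night followed by a busy morning.
night = [40, 35, 30, 32, 38, 45]      # hours 0-5: normal overnight lull
day = [400, 420, 410, 90, 430, 415]   # hours 6-11: hour 9 is a genuine drop
counts = night + day

# A tight rule flags the real drop but also every normal night hour.
narrow = static_rule_alerts(counts, minimum=100)

# A loose rule stays quiet at night but misses the real drop.
broad = static_rule_alerts(counts, minimum=20)
```

Neither threshold works for the whole day, because the rule has no notion of what is normal at a given hour.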
Amazon Lookout for Metrics uses machine learning (ML) to automatically detect and diagnose anomalies without any prior ML experience. In a couple of clicks, you can connect Lookout for Metrics to popular data stores like Amazon Simple Storage Service (Amazon S3), Amazon Redshift, and Amazon Relational Database Service (Amazon RDS), as well as third-party software as a service (SaaS) applications (such as Salesforce, Dynatrace, Marketo, Zendesk, and ServiceNow) via Amazon AppFlow and start monitoring metrics that are important to your business.
This post demonstrates how you can connect to your IT operational infrastructure monitored by Dynatrace using Amazon AppFlow and set up an accurate anomaly detector across metrics and dimensions using Lookout for Metrics. The solution allows you to set up a continuous anomaly detector and optionally set up alerts to receive notifications when anomalies occur.
Lookout for Metrics integrates seamlessly with Dynatrace to detect anomalies within your operational metrics. Once connected, Lookout for Metrics uses ML to start monitoring data and metrics for anomalies and deviations from the norm. Dynatrace enables monitoring of your entire infrastructure, including your hosts, processes, and network. You can perform log monitoring and view information such as the total traffic of your network, the CPU usage of your hosts, the response time of your processes, and more.
Amazon AppFlow is a fully managed service that provides integration capabilities by enabling you to transfer data between SaaS applications like Datadog, Salesforce, Marketo, and Slack and AWS services like Amazon S3 and Amazon Redshift. It provides capabilities to transform, filter, and validate data to generate enriched and usable data in a few easy steps.
In this post, we demonstrate how to integrate with an environment monitored by Dynatrace and detect anomalies in the operation metrics. We also determine how application availability and performance (resource contention) were impacted.
The source data is a cluster of Amazon Elastic Compute Cloud (Amazon EC2) instances that is monitored by Dynatrace. Each EC2 instance is installed with Dynatrace OneAgent to collect all monitored telemetry data (CPU utilization, memory, network utilization, and disk I/O). Amazon AppFlow enables you to securely integrate SaaS applications like Dynatrace and automate data flows, while providing options to configure and connect to such services natively from the AWS Management Console or via API. In this post, we focus on connecting to Dynatrace as our source and Lookout for Metrics as the target, both of which are natively supported applications in Amazon AppFlow.
The solution enables you to create an Amazon AppFlow data flow from Dynatrace to Lookout for Metrics. You can then use Lookout for Metrics to detect any anomalies in the telemetry data, as shown in the following diagram. Optionally, you can send automated anomaly alerts to AWS Lambda functions, webhooks, or Amazon Simple Notification Service (Amazon SNS) topics.
The following are the high-level steps to implement the solution:
- Set up Amazon AppFlow integration with Dynatrace.
- Create an anomaly detector with Lookout for Metrics.
- Add a dataset to the detector and integrate Dynatrace metrics.
- Activate the detector.
- Create an alert.
- Review the detector and data flow status.
- Review and analyze any anomalies.
Set up Amazon AppFlow integration with Dynatrace
To set up the data flow, complete the following steps:
- On the Amazon AppFlow console, choose Create flow.
- For Flow name, enter a name.
- For Flow description, enter an optional description.
- In the Data encryption section, you can choose or create an AWS Key Management Service (AWS KMS) key.
- Choose Next.
- For Source name, choose Dynatrace.
- For Choose Dynatrace Connection, choose the connection you created.
- For Choose Dynatrace object, choose Problems (this is the only object supported as of this writing).
For more information about Dynatrace problems, see Problem overview page.
- For Destination name, choose Amazon Lookout for Metrics.
- For API token, generate an API token from the Dynatrace console.
- For Subdomain, enter your Dynatrace portal URL address.
- For Data encryption, choose the AWS KMS key.
- For Connection Name, enter a name.
- Choose Connect.
- For Flow trigger, select Run flow on schedule.
- For Repeats, choose Minutes (alternatively, you can choose hourly or daily).
- Set the trigger to repeat every 5 minutes.
- Enter a starting time.
- Enter a start date.
Dynatrace requires a date range filter that uses the is between condition.
- For Field name, choose Date range.
- For Condition, choose is between.
- For Criteria 1, choose your start date.
- For Criteria 2, choose your end date.
- Review your settings and choose Create flow.
Create an anomaly detector with Lookout for Metrics
To create your anomaly detector, complete the following steps:
- On the Lookout for Metrics console, choose Create detector.
- For Detector name, enter a name.
- For Description, enter an optional description.
- For Interval, choose the time between each analysis. This should match the interval set on the flow.
- For Encryption, create or choose an existing AWS KMS key.
- Choose Create.
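The same detector can also be created programmatically. The following is a minimal sketch using the boto3 `lookoutmetrics` client, not a definitive implementation; the detector name in the usage example is hypothetical, and the helper names `detector_request` and `create_detector` are ours.

```python
from typing import Any, Dict, Optional

# Interval values accepted by the Lookout for Metrics API (ISO 8601 durations).
FREQUENCY_BY_MINUTES = {5: "PT5M", 10: "PT10M", 60: "PT1H", 1440: "P1D"}


def detector_request(name: str, interval_minutes: int, description: str = "",
                     kms_key_arn: Optional[str] = None) -> Dict[str, Any]:
    """Build the keyword arguments for create_anomaly_detector."""
    if interval_minutes not in FREQUENCY_BY_MINUTES:
        raise ValueError(f"unsupported interval: {interval_minutes} minutes")
    request: Dict[str, Any] = {
        "AnomalyDetectorName": name,
        "AnomalyDetectorConfig": {
            "AnomalyDetectorFrequency": FREQUENCY_BY_MINUTES[interval_minutes]
        },
    }
    if description:
        request["AnomalyDetectorDescription"] = description
    if kms_key_arn:
        request["KmsKeyArn"] = kms_key_arn
    return request


def create_detector(request: Dict[str, Any]) -> str:
    """Send the request to AWS and return the ARN of the new detector."""
    import boto3  # imported lazily so the request builder works without the SDK

    client = boto3.client("lookoutmetrics")
    return client.create_anomaly_detector(**request)["AnomalyDetectorArn"]


# Hypothetical usage: a 5-minute interval to match the flow schedule.
example_request = detector_request("dynatrace-detector", interval_minutes=5)
```

Keeping the request builder separate from the API call makes it easy to check that the interval matches the flow before anything is created.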
Add a dataset to the detector and integrate Dynatrace metrics
The next step in activating your anomaly detector is to add a dataset and integrate the Dynatrace metrics.
- On the detector details, choose Add a dataset.
- For Name, enter the data source name.
- For Description, enter an optional description.
- For Timezone, choose the time zone relevant to your dataset. This should match the time zone used in Amazon AppFlow (which picks up from the browser).
- For Datasource, choose Dynatrace.
- For Amazon AppFlow flow, choose the flow that you created.
- For Permissions, choose a service role.
- Choose Next.
- For Map fields, the detector tracks up to five measures; in this example, I choose impactLevel and hasRootCause.
- For Dimensions, the detector creates segments in measure values. For this post, I choose severityLevel.
- Review the settings and choose Save dataset.
Activate the detector
Create an alert
You can create an alert to send automated anomaly alerts to Lambda functions; webhooks; cloud applications like Slack, PagerDuty, and Datadog; or to SNS topics with subscribers that use SMS, email, or push notifications.
- On the detector details, choose Add alerts.
- For Alert Name, enter the name.
- For Sensitivity threshold, enter a threshold at which the detector sends anomaly alerts.
- For Channel, choose either Amazon SNS or Lambda as the notification method. For this post, I use Amazon SNS.
- For SNS topic, create or choose an existing SNS topic.
- For Service role, choose an execution role.
- Choose Add alert.
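For reference, the alert steps above can be sketched with the boto3 `lookoutmetrics` client. This is a hedged example: the ARNs are placeholders with a dummy account ID, the threshold of 70 is arbitrary, and the helper names are ours.

```python
from typing import Any, Dict


def sns_alert_request(alert_name: str, detector_arn: str, sns_topic_arn: str,
                      role_arn: str, sensitivity_threshold: int) -> Dict[str, Any]:
    """Build the keyword arguments for create_alert with an SNS channel."""
    if not 0 <= sensitivity_threshold <= 100:
        raise ValueError("sensitivity threshold must be between 0 and 100")
    return {
        "AlertName": alert_name,
        "AlertSensitivityThreshold": sensitivity_threshold,
        "AnomalyDetectorArn": detector_arn,
        "Action": {
            "SNSConfiguration": {"RoleArn": role_arn, "SnsTopicArn": sns_topic_arn}
        },
    }


def create_alert(request: Dict[str, Any]) -> str:
    """Send the request to AWS and return the ARN of the new alert."""
    import boto3  # imported lazily so the request builder works without the SDK

    return boto3.client("lookoutmetrics").create_alert(**request)["AlertArn"]


# Placeholder ARNs for illustration only.
alert_request = sns_alert_request(
    alert_name="dynatrace-anomaly-alert",
    detector_arn="arn:aws:lookoutmetrics:us-east-1:123456789012:AnomalyDetector:dynatrace-detector",
    sns_topic_arn="arn:aws:sns:us-east-1:123456789012:anomaly-alerts",
    role_arn="arn:aws:iam::123456789012:role/LookoutMetricsSnsRole",
    sensitivity_threshold=70,
)
```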
Review the detector and flow status
Review and analyze any anomalies
On the Anomalies page, you can adjust the severity score on the threshold dial to filter anomalies above a given score.
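The threshold dial behaves like a simple filter on the severity score. The following sketch shows the idea on plain Python data; the anomaly records and their field names are hypothetical, and we assume that a score equal to the threshold is kept.

```python
from typing import Dict, List


def filter_by_severity(anomalies: List[Dict], threshold: float) -> List[Dict]:
    """Keep anomalies scoring at or above the threshold, most severe first."""
    kept = [a for a in anomalies if a["score"] >= threshold]
    return sorted(kept, key=lambda a: a["score"], reverse=True)


# Hypothetical anomaly records for illustration.
anomalies = [
    {"metric": "impactLevel", "score": 42.0},
    {"metric": "hasRootCause", "score": 98.0},
    {"metric": "impactLevel", "score": 71.5},
]

severe = filter_by_severity(anomalies, threshold=70)
```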
The following analysis represents the severity level and impacted metrics. The graph suggests anomalies detected by the detector with the availability and resource contention being impacted. The anomaly was detected on June 28 at 14:30 PDT and has a severity score of 98, indicating a high severity anomaly that needs immediate attention.
Lookout for Metrics also allows you to provide real-time feedback on the relevance of the detected anomalies, which enables a powerful human-in-the-loop mechanism. This information is fed back to the anomaly detection model to improve its accuracy continuously, in near-real time.
Anomaly detection can be very useful in identifying anomalies that could signal potential issues within your operational environment. Timely detection of anomalies can aid in troubleshooting, help avoid loss in revenue, and help maintain your company’s reputation. Lookout for Metrics automatically inspects and prepares the data, selects the best-suited ML algorithm, begins detecting anomalies, groups related anomalies together, and summarizes potential root causes.
To get started with this capability, see Amazon Lookout for Metrics. You can use this capability in all Regions where Lookout for Metrics is publicly available. For more information about Region availability, see AWS Regional Services.
About the Author
Sumeeth Siriyur is a Solutions Architect based out of AWS, Sydney. He is passionate about infrastructure services and uses AI services to influence IT infrastructure observability and management. In his spare time, he likes binge-watching and works to continually improve his outdoor sports.
Customize pronunciation using lexicons in Amazon Polly
Amazon Polly is a text-to-speech service that uses advanced deep learning technologies to synthesize natural-sounding human speech. It is used in a variety of use cases, such as contact center systems, delivering conversational user experiences with human-like voices for automated real-time status check, automated account and billing inquiries, and by news agencies like The Washington Post to allow readers to listen to news articles.
As of today, Amazon Polly provides over 60 voices in 30+ language variants. Amazon Polly also uses context to pronounce certain words differently based upon the verb tense and other contextual information. For example, “read” in “I read a book” (past tense) and “I will read a book” (future tense) is pronounced differently.
However, in some situations you may want to customize the way Amazon Polly pronounces a word. For example, you may need to match the pronunciation with local dialect or vernacular. Names of things (e.g., Tomato can be pronounced as tom-ah-to or tom-ay-to), people, streets, or places are often pronounced in many different ways.
In this post, we demonstrate how you can leverage lexicons for creating custom pronunciations. You can apply lexicons for use cases such as publishing, education, or call centers.
Customize pronunciation using SSML tag
Let’s say you stream a popular podcast from Australia and you use the Amazon Polly Australian English (Olivia) voice to convert your script into human-like speech. In one of your scripts, you want to use words that are unknown to the Amazon Polly voice. For example, you want to send Mātariki (Māori New Year) greetings to your New Zealand listeners. For such scenarios, Amazon Polly supports phonetic pronunciation, which you can use to achieve a pronunciation that is close to the correct pronunciation in the foreign language.
You can use the <phoneme> SSML tag to specify a phonetic pronunciation.
First, log in to the AWS console and search for Amazon Polly in the search bar at the top. Select Amazon Polly, and then choose Try Polly.
In the Amazon Polly console, select Australian English from the language dropdown, enter the following text in the Input text box, and then choose Listen to test the pronunciation.
Sample speech without applying phonetic pronunciation:
If you listen to the sample speech above, you can hear that the pronunciation of Mātariki, a word that is not part of Australian English, isn’t quite spot-on. Now, let’s look at how we can use phonetic pronunciation in such scenarios with the <phoneme> SSML tag.
To use SSML tags, turn on the SSML option in the Amazon Polly console. Then copy and paste the following SSML script, which contains the phonetic pronunciation for Mātariki specified inside the ph attribute of the <phoneme> tag.
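The SSML for this step can be sketched as follows. This is a reconstruction under assumptions rather than the original script: it reuses the greeting text and the X-SAMPA transcription “mA:.tA:.ri.ki” quoted elsewhere in this post, and the helper name `phoneme_ssml` is ours.

```python
from xml.sax.saxutils import escape, quoteattr


def phoneme_ssml(before: str, word: str, ph: str, after: str = "",
                 alphabet: str = "x-sampa") -> str:
    """Wrap one word in an SSML <phoneme> tag inside a <speak> document."""
    tag = (
        f"<phoneme alphabet={quoteattr(alphabet)} ph={quoteattr(ph)}>"
        f"{escape(word)}</phoneme>"
    )
    return f"<speak>{escape(before)}{tag}{escape(after)}</speak>"


ssml = phoneme_ssml(
    before="Wishing all my listeners in NZ, a very Happy ",
    word="Mātariki",
    ph="mA:.tA:.ri.ki",
)
print(ssml)
```

The printed string can be pasted into the Input text box once the SSML option is turned on.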
Sample speech after applying phonetic pronunciation:
If you listen to the sample, you’ll notice that we opted for a different pronunciation for some of the vowels (e.g., ā) so that Amazon Polly synthesizes sounds that are closer to the correct pronunciation. You might now ask: how do I generate the phonetic transcription “mA:.tA:.ri.ki” for the word Mātariki?
You can create phonetic transcriptions by referring to the Phoneme and Viseme tables for the supported languages. In the example above we have used the phonemes for Australian English.
Amazon Polly supports two phonetic alphabets: IPA and X-SAMPA. The benefit of X-SAMPA is that it uses standard ASCII characters, so it is easier to type the phonetic transcription with a normal keyboard. You can use either IPA or X-SAMPA to generate your transcriptions, but make sure to stay consistent with your choice, especially when you use a lexicon file, which we cover in the next section.
Each phoneme in the phoneme table represents a speech sound. The bolded letters in the “Example” column of the Phoneme/Viseme table in the Australian English page linked above represent the part of the word the “Phoneme” corresponds to. For example, the phoneme /j/ represents the sound that an Australian English speaker makes when pronouncing the letter “y” in “yes.”
Customize pronunciation using lexicons
Phoneme tags are suitable for customizing isolated, one-off cases, but they don’t scale. If you process a large volume of text that is managed by different editors and reviewers, we recommend using lexicons. With lexicons, you can apply custom pronunciations consistently and reduce the manual effort of inserting phoneme tags into the script.
A good practice is that after you test the custom pronunciation on the Amazon Polly console using the <phoneme> tag, you add it to a lexicon.
Create a lexicon file
A lexicon file contains the mapping between words and their phonetic pronunciations. Pronunciation Lexicon Specification (PLS) is a W3C recommendation for specifying interoperable pronunciation information. The following is an example PLS document:
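As an illustration, the sketch below generates a small PLS document in Python. It is an assumption-laden example, not the original listing: the header follows the sample in the W3C specification, the single entry reuses the Mātariki transcription from this post, and the helper name `build_lexicon` is ours.

```python
from typing import Dict
from xml.sax.saxutils import escape, quoteattr

PLS_NS = "http://www.w3.org/2005/01/pronunciation-lexicon"
PLS_SCHEMA = "http://www.w3.org/TR/2007/CR-pronunciation-lexicon-20071212/pls.xsd"


def build_lexicon(entries: Dict[str, str], lang: str = "en-AU",
                  alphabet: str = "x-sampa") -> str:
    """Render a PLS document mapping each word to its phonetic pronunciation."""
    lexemes = "\n".join(
        "  <lexeme>\n"
        f"    <grapheme>{escape(word)}</grapheme>\n"
        f"    <phoneme>{escape(ph)}</phoneme>\n"
        "  </lexeme>"
        for word, ph in entries.items()
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<lexicon version="1.0" xmlns="{PLS_NS}"\n'
        '    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"\n'
        f'    xsi:schemaLocation="{PLS_NS} {PLS_SCHEMA}"\n'
        f"    alphabet={quoteattr(alphabet)} xml:lang={quoteattr(lang)}>\n"
        f"{lexemes}\n"
        "</lexicon>\n"
    )


pls_document = build_lexicon({"Mātariki": "mA:.tA:.ri.ki"})
print(pls_document)
```

Each lexeme pairs a grapheme (the written word) with a phoneme (its transcription in the alphabet declared on the root element).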
Make sure that you use correct value for the xml:lang field. Use en-AU if you’re uploading the lexicon file to use with the Amazon Polly Australian English voice. For a complete list of supported languages, refer to Languages Supported by Amazon Polly.
You can also use
For more information on lexicon file format, see Pronunciation Lexicon Specification (PLS) Version 1.0 on the W3C website.
You can save a lexicon file as a .pls or .xml file before uploading it to Amazon Polly.
Upload and apply the lexicon file
Upload your lexicon file to Amazon Polly using the following instructions:
- On the Amazon Polly console, choose Lexicons in the navigation pane.
- Choose Upload lexicon.
- Enter a name for the lexicon and then choose a lexicon file.
- Choose the file to upload.
- Choose Upload lexicon.
If a lexicon by the same name (whether a .pls or .xml file) already exists, uploading the lexicon overwrites the existing lexicon.
Now you can apply the lexicon to customize pronunciation.
- Choose Text-to-Speech in the navigation pane.
- Expand Additional settings.
- Turn on Customize pronunciation.
- Choose the lexicon on the drop-down menu.
You can also choose Upload lexicon to upload a new lexicon file (or a new version).
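If you prefer to script these steps, the following is a minimal sketch using the boto3 `polly` client. The lexicon name, output path, and helper names are hypothetical; the name check reflects the documented constraint that lexicon names are 1 to 20 alphanumeric characters.

```python
import re

# Lexicon names must be 1-20 characters, letters and digits only.
LEXICON_NAME = re.compile(r"[0-9A-Za-z]{1,20}")


def valid_lexicon_name(name: str) -> bool:
    """Check a lexicon name against the Amazon Polly naming constraint."""
    return LEXICON_NAME.fullmatch(name) is not None


def upload_and_speak(name: str, pls_content: str, text: str,
                     out_path: str = "speech.mp3") -> None:
    """Upload a lexicon, then synthesize text with that lexicon applied."""
    import boto3  # imported lazily so the name check works without the SDK

    if not valid_lexicon_name(name):
        raise ValueError(f"invalid lexicon name: {name!r}")
    polly = boto3.client("polly")
    # Uploading under an existing name overwrites the stored lexicon.
    polly.put_lexicon(Name=name, Content=pls_content)
    response = polly.synthesize_speech(
        Text=text,
        VoiceId="Olivia",
        Engine="neural",
        OutputFormat="mp3",
        LexiconNames=[name],
    )
    with open(out_path, "wb") as audio_file:
        audio_file.write(response["AudioStream"].read())
```

A call such as `upload_and_speak("matariki", pls_text, "Wishing all my listeners in NZ, a very Happy Mātariki")` would then write the audio to `speech.mp3`, assuming valid AWS credentials and a lexicon whose xml:lang matches the voice.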
It’s a good practice to keep the lexicon file under version control in a source code repository. Keeping the custom pronunciations in a lexicon file ensures that you can consistently refer to phonetic pronunciations for certain words across the organization. Also, keep in mind the pronunciation lexicon limits listed on the Quotas in Amazon Polly page.
Test the pronunciation after applying the lexicon
Let’s perform a quick test using “Wishing all my listeners in NZ, a very Happy Mātariki” as the input text.
Before applying the lexicon:
After applying the lexicon:
In this post, we discussed how you can customize pronunciations of commonly used acronyms or words not found in the selected language in Amazon Polly. You can use
Summary of resources
About the Authors
Ratan Kumar is a Solutions Architect based out of Auckland, New Zealand. He works with large enterprise customers helping them design and build secure, cost-effective, and reliable internet scale applications using the AWS cloud. He is passionate about technology and likes sharing knowledge through blog posts and Twitch sessions.
Maciek Tegi is a Principal Audio Designer and a Product Manager for Polly Brand Voices. He has worked in a professional capacity in the tech industry, movies, commercials, and game localization. In 2013, he was the first audio engineer hired to the Alexa Text-To-Speech team. Maciek was involved in releasing 12 Alexa TTS voices across different countries, over 20 Polly voices, and 4 Alexa celebrity voices. Maciek is a triathlete and an avid acoustic guitar player.
AWS Week in Review – May 16, 2022
This post is part of our Week in Review series. Check back each week for a quick roundup of interesting news and announcements from AWS!
I had been on the road for the last five weeks and attended many of the AWS Summits in Europe. It was great to talk to so many of you in person. The Serverless Developer Advocates are going around many of the AWS Summits with the Serverlesspresso booth. If you attend an event that has the booth, say “Hi” to my colleagues, and have a coffee while asking all your serverless questions. You can find all the upcoming AWS Summits in the events section at the end of this post.
Last week’s launches
Here are some launches that got my attention during the previous week.
AWS Step Functions announced a new console experience to debug your state machine executions – Now you can opt in to the new console experience of Step Functions, which makes it easier to analyze, debug, and optimize Standard Workflows. The new page allows you to inspect executions using three different views (graph, table, and event view) and adds many new features to enhance the navigation and analysis of the executions. To learn about all the features and how to use them, read Ben’s blog post.
Example of how the graph view looks
AWS Lambda now supports Node.js 16.x runtime – Now you can start using the Node.js 16 runtime when you create a new function or update your existing functions to use it. You can also use the new container image base that supports this runtime. To learn more about this launch, check Dan’s blog post.
AWS Amplify announces its Android library designed for Kotlin – The Amplify Android library has been rewritten for Kotlin, and it is now available in preview. This new library provides better debugging capabilities and visibility into underlying state management. It also uses the new AWS SDK for Kotlin that was released last year in preview. Read the What’s New post for more information.
Three new APIs for batch data retrieval in AWS IoT SiteWise – With this new launch AWS IoT SiteWise now supports batch data retrieval from multiple asset properties. The new APIs allow you to retrieve current values, historical values, and aggregated values. Read the What’s New post to learn how you can start using the new APIs.
AWS Secrets Manager now publishes secret usage metrics to Amazon CloudWatch – This launch is very useful to see the number of secrets in your account and set alarms for any unexpected increase or decrease in the number of secrets. Read the documentation on Monitoring Secrets Manager with Amazon CloudWatch for more information.
Other AWS News
Some other launches and news that you may have missed:
IBM signed a deal with AWS to offer its software portfolio as a service on AWS. This allows customers using AWS to access IBM software for automation, data and artificial intelligence, and security that is built on Red Hat OpenShift Service on AWS.
Podcast Charlas Técnicas de AWS – If you understand Spanish, this podcast is for you. Podcast Charlas Técnicas is one of the official AWS podcasts in Spanish. This week’s episode introduces you to Amazon DynamoDB and shares stories on how different customers use this database service. You can listen to all the episodes directly from your favorite podcast app or the podcast web page.
AWS Open Source News and Updates – Ricardo Sueiras, my colleague from the AWS Developer Relations team, runs this newsletter. It brings you all the latest open-source projects, posts, and more. Read edition #112 here.
Upcoming AWS Events
It’s AWS Summits season and here are some virtual and in-person events that might be close to you:
You can register for re:MARS to get fresh ideas on topics such as machine learning, automation, robotics, and space. The conference will be in person in Las Vegas, June 21–24.
That’s all for this week. Check back next Monday for another Week in Review!
Personalize your machine translation results by using fuzzy matching with Amazon Translate
A person’s vernacular is part of the characteristics that make them unique. There are often countless different ways to express one specific idea. When a firm communicates with their customers, it’s critical that the message is delivered in a way that best represents the information they’re trying to convey. This becomes even more important when it comes to professional language translation. Customers of translation systems and services expect accurate and highly customized outputs. To achieve this, they often reuse previous translation outputs—called translation memory (TM)—and compare them to new input text. In computer-assisted translation, this technique is known as fuzzy matching. The primary function of fuzzy matching is to assist the translator by speeding up the translation process. When an exact match can’t be found in the TM database for the text being translated, translation management systems (TMSs) often have the option to search for a match that is less than exact. Potential matches are provided to the translator as additional input for final translation. Translators who enhance their workflow with machine translation capabilities such as Amazon Translate often expect fuzzy matching data to be used as part of the automated translation solution.
In this post, you learn how to customize output from Amazon Translate according to translation memory fuzzy match quality scores.
Translation Quality Match
The XML Localization Interchange File Format (XLIFF) standard is often used as a data exchange format between TMSs and Amazon Translate. XLIFF files produced by TMSs include source and target text data along with match quality scores based on the available TM. These scores—usually expressed as a percentage—indicate how close the translation memory is to the text being translated.
Some customers with very strict requirements only want machine translation to be used when match quality scores are below a certain threshold. Beyond this threshold, they expect their own translation memory to take precedence. Translators often need to apply these preferences manually either within their TMS or by altering the text data. This flow is illustrated in the following diagram. The machine translation system processes the translation data (text and fuzzy match scores), which is then reviewed and manually edited by translators, based on their desired quality thresholds. Applying thresholds as part of the machine translation step allows you to remove these manual steps, which improves efficiency and optimizes cost.
Figure 1: Machine Translation Review Flow
The solution presented in this post allows you to enforce rules based on match quality score thresholds to drive whether a given input text should be machine translated by Amazon Translate or not. When not machine translated, the resulting text is left to the discretion of the translators reviewing the final output.
The solution architecture illustrated in Figure 2 leverages the following services:
- Amazon Simple Storage Service – Amazon S3 buckets contain the following content:
- Fuzzy match threshold configuration files
- Source text to be translated
- Amazon Translate input and output data locations
- AWS Systems Manager – We use Parameter Store parameters to store match quality threshold configuration values
- AWS Lambda – We use two Lambda functions:
- One function preprocesses the quality match threshold configuration files and persists the data into Parameter Store
- One function automatically creates the asynchronous translation jobs
- Amazon Simple Queue Service – An Amazon SQS queue triggers the translation flow as a result of new files coming into the source bucket
Figure 2: Solution Architecture
You first set up quality thresholds for your translation jobs by editing a configuration file and uploading it into the fuzzy match threshold configuration S3 bucket. The following is a sample configuration in CSV format. We chose CSV for simplicity, although you can use any format. Each line represents a threshold to be applied to either a specific translation job or as a default value to any job.
default, 75
SourceMT-Test, 80
The specifications of the configuration file are as follows:
- Column 1 should be populated with the name of the XLIFF file—without extension—provided to the Amazon Translate job as input data.
- Column 2 should be populated with the quality match percentage threshold. For any score below this value, machine translation is used.
- For all XLIFF files whose name doesn’t match any name listed in the configuration file, the default threshold is used—the line with the keyword default set in Column 1.
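The lookup rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the Lambda function from the solution; the function names `load_thresholds` and `threshold_for` are ours, and we assume thresholds are whole-number percentages.

```python
import csv
import io
from typing import Dict


def load_thresholds(config_text: str) -> Dict[str, int]:
    """Parse 'name, threshold' lines into a dictionary keyed by job name."""
    thresholds: Dict[str, int] = {}
    for row in csv.reader(io.StringIO(config_text), skipinitialspace=True):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        thresholds[row[0].strip()] = int(row[1].strip())
    return thresholds


def threshold_for(file_name: str, thresholds: Dict[str, int]) -> int:
    """Return the threshold for an XLIFF file, falling back to the default."""
    job_name = file_name.rsplit(".", 1)[0]  # file name without extension
    return thresholds.get(job_name, thresholds["default"])


config = "default, 75\nSourceMT-Test, 80\n"
thresholds = load_thresholds(config)
```

With the sample configuration, `threshold_for("SourceMT-Test.xlf", thresholds)` uses the job-specific value, and any other file name falls back to the default line.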
Figure 3: Auto-generated parameter in Systems Manager Parameter Store
When a new file is uploaded, Amazon S3 triggers the Lambda function in charge of processing the parameters. This function reads and stores the threshold parameters into Parameter Store for future usage. Using Parameter Store avoids performing redundant Amazon S3 GET requests each time a new translation job is initiated. The sample configuration file produces the parameter tags shown in the following screenshot.
The job initialization Lambda function uses these parameters to preprocess the data prior to invoking Amazon Translate. We use an English-to-Spanish translation XLIFF input file, as shown in the following code. It contains the initial text to be translated, broken down into what is referred to as segments, represented in the source tags.
The source text has already been matched against the translation memory. The data contains potential translation alternatives, represented as <alt-trans> elements that carry a match quality score. The function applies the following rules:
- Segments received with alternative translations and a match quality below the threshold are untouched or empty. This signals to Amazon Translate that they must be translated.
- Segments received with alternative translations with a match quality above the threshold are pre-populated with the suggested target text. Amazon Translate skips those segments.
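The two rules above can be sketched on plain Python objects. This is a simplified illustration rather than the solution's Lambda code: the `Segment` class, the sample texts, and the scores are hypothetical, and we assume a score equal to the threshold keeps the translation memory suggestion.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Segment:
    source: str
    suggestion: Optional[str]        # best translation memory candidate, if any
    match_quality: Optional[float]   # percentage attached to the suggestion
    target: str = ""                 # empty target means "machine translate me"


def apply_threshold(segments: List[Segment], threshold: float) -> List[Segment]:
    """Pre-populate targets whose suggestion meets the threshold; clear the rest."""
    for segment in segments:
        has_match = segment.suggestion is not None and segment.match_quality is not None
        if has_match and segment.match_quality >= threshold:
            segment.target = segment.suggestion  # retained from translation memory
        else:
            segment.target = ""  # left empty so machine translation fills it in
    return segments


# Hypothetical segments for illustration.
segments = apply_threshold(
    [
        Segment("Informed consent", "Consentimiento informado", 99.0),
        Segment("Screening visit", "Selección", 60.0),
        Segment("Follow-up call", None, None),
    ],
    threshold=80,
)
```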
Let’s assume the quality match threshold configured for this job is 80%. The first segment with 99% match quality isn’t machine translated, whereas the second segment is, because its match quality is below the defined threshold. In this configuration, Amazon Translate produces the following output:
In the second segment, Amazon Translate overwrites the target text initially suggested (Selección) with a higher quality translation: Visita de selección.
One possible extension to this use case could be to reuse the translated output and create our own translation memory. Amazon Translate supports customization of machine translation using translation memory thanks to the parallel data feature. Text segments previously machine translated due to their initial low-quality score could then be reused in new translation projects.
In the following sections, we walk you through the process of deploying and testing this solution. You use AWS CloudFormation scripts and data samples to launch an asynchronous translation job personalized with a configurable quality match threshold.
Launch AWS CloudFormation stack
- Choose Launch Stack:
- For Stack name, enter a name.
- For ConfigBucketName, enter the S3 bucket containing the threshold configuration files.
- For ParameterStoreRoot, enter the root path of the parameters created by the parameters processing Lambda function.
- For QueueName, enter the SQS queue that you create to post new file notifications from the source bucket to the job initialization Lambda function. This is the function that reads the configuration file.
- For SourceBucketName, enter the S3 bucket containing the XLIFF files to be translated. If you prefer to use a preexisting bucket, you need to change the value of the CreateSourceBucket parameter to No.
- For WorkingBucketName, enter the S3 bucket Amazon Translate uses for input and output data.
- Choose Next.
Figure 4: CloudFormation stack details
- Optionally on the Stack Options page, add key names and values for the tags you may want to assign to the resources about to be created.
- Choose Next.
- On the Review page, select I acknowledge that this template might cause AWS CloudFormation to create IAM resources.
- Review the other settings, then choose Create stack.
AWS CloudFormation takes several minutes to create the resources on your behalf. You can watch the progress on the Events tab on the AWS CloudFormation console. When the stack has been created, you can see a CREATE_COMPLETE message in the Status column on the Overview tab.
Test the solution
Let’s go through a simple example.
- Download the following sample data.
- Unzip the content.
There should be two files: an .xlf file in XLIFF format, and a threshold configuration file with .cfg as the extension. The following is an excerpt of the XLIFF file.
Figure 5: English to French sample file extract
- On the Amazon S3 console, upload the quality threshold configuration file into the configuration bucket you specified earlier.
The value set for test_En_to_Fr is 75%. You should be able to see the parameters on the Systems Manager console in the Parameter Store section.
- Still on the Amazon S3 console, upload the .xlf file into the S3 bucket you configured as source. Make sure the file is under a folder named translate (for example,
This starts the translation flow.
- Open the Amazon Translate console.
A new job should appear with a status of In Progress.
Figure 6: In progress translation jobs on Amazon Translate console
- Once the job is complete, choose the job’s link and review the output. All segments should have been translated.
In the translated XLIFF file, look for segments with additional attributes named lscustom:match-quality, as shown in the following screenshot. These custom attributes identify segments where the suggested translation was retained based on its score.
Figure 7: Custom attributes identifying segments where suggested translation was retained based on score
These were derived from the translation memory according to the quality threshold. All other segments were machine translated.
You have now deployed and tested an automated asynchronous translation job assistant that enforces configurable translation memory match quality thresholds. Great job!
If you deployed the solution into your account, don’t forget to delete the CloudFormation stack to avoid any unexpected cost. You need to empty the S3 buckets manually beforehand.
In this post, you learned how to customize your Amazon Translate translation jobs based on standard XLIFF fuzzy matching quality metrics. With this solution, you can greatly reduce the manual labor involved in reviewing machine translated text while also optimizing your usage of Amazon Translate. You can also extend the solution with data ingestion automation and workflow orchestration capabilities, as described in Speed Up Translation Jobs with a Fully Automated Translation System Assistant.
About the Authors
Narcisse Zekpa is a Solutions Architect based in Boston. He helps customers in the Northeast U.S. accelerate their adoption of the AWS Cloud by providing architectural guidelines and designing innovative, scalable solutions. When Narcisse is not building, he enjoys spending time with his family, traveling, cooking, and playing basketball.
Dimitri Restaino is a Solutions Architect at AWS, based out of Brooklyn, New York. He works primarily with Healthcare and Financial Services companies in the North East, helping to design innovative and creative solutions to best serve their customers. Coming from a software development background, he is excited by the new possibilities that serverless technology can bring to the world. Outside of work, he loves to hike and explore the NYC food scene.