One Giant Leap in Automation: Amazon Step Functions

If you were ever curious how many words there are in each sentence of Wikipedia’s entry on the legendary band Queen, you are about to find out. Even better, you’re about to learn how to find this out for yourself, using the power of automation and Amazon Step Functions. Andrei Elefterescu, Levi9 JavaScript Tech Lead, has 10 years of experience with AWS and worked with Amazon Step Functions even before its launch, during the beta program. Andrei will be your guide to the melodic orchestration of triggers, functions, validations, transition states, conditional routes, and callbacks. It sounds complicated, but trust us, it’s a kind of magic.

What is Amazon Step Functions?

To find out how many words are in each sentence of Queen’s Wikipedia biography, you need a few common-sense actions: read the text, split it into sentences, count the words in each sentence, then merge the per-sentence results back into one single text.
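Before any orchestration is involved, those four actions can be sketched in plain Python. This is only an illustration, not the code Andrei used: the function names and the naive sentence splitter (a split after '.', '!' or '?') are assumptions made for the example.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Naively split text into sentences after '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def count_words(sentence: str) -> int:
    """Count the whitespace-separated words in one sentence."""
    return len(sentence.split())


def summarize(text: str) -> str:
    """Split the text, count the words per sentence, merge into one report."""
    lines = [f"{count_words(s)} words: {s}" for s in split_sentences(text)]
    return "\n".join(lines)


# Sample text invented for the illustration.
sample = "Queen are a British rock band. They formed in London in 1970."
report = summarize(sample)
```

Each function maps to one step of the workflow, which is what makes the task easy to hand over to an orchestrator later on.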

 

Surely you don’t need automation for that. It’s a straightforward, simple task. But what if you needed to count all the words in the Wikipedia entries of the Top 100 British bands of all time?

This is where Andrei Elefterescu steps in to explain what Amazon Step Functions is.

Amazon Step Functions is a serverless orchestration service that helps us integrate multiple AWS services, such as Lambda, S3, and SNS, to build applications. Going back to the Queen example, those four actions, chained together, define a “step function”: a workflow, also known as a state machine.

While a workflow is easier to visualize as a diagram, it is written in the Amazon States Language, a JSON-based language that is proprietary to AWS.
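To give a feel for that JSON-based language, here is a sketch of what a purely sequential definition of the Queen example could look like, built as a Python dictionary and serialized to JSON. The state names and the Lambda ARNs are placeholders invented for the illustration; they are not taken from Andrei’s workflow.

```python
import json

# Hypothetical Lambda ARN prefix; replace with functions from your own account.
ARN_PREFIX = "arn:aws:lambda:eu-west-1:123456789012:function:"

definition = {
    "Comment": "Count the words in each sentence of a text (illustrative sketch)",
    "StartAt": "ReadText",
    "States": {
        "ReadText": {
            "Type": "Task",
            "Resource": ARN_PREFIX + "read-text",
            "Next": "SplitIntoSentences",
        },
        "SplitIntoSentences": {
            "Type": "Task",
            "Resource": ARN_PREFIX + "split-sentences",
            "Next": "CountWords",
        },
        "CountWords": {
            "Type": "Task",
            "Resource": ARN_PREFIX + "count-words",
            "Next": "MergeResults",
        },
        "MergeResults": {
            "Type": "Task",
            "Resource": ARN_PREFIX + "merge-results",
            "End": True,
        },
    },
}


def next_targets_exist(defn: dict) -> bool:
    """Sanity check: StartAt and every Next must name a defined state."""
    states = defn["States"]
    targets = [s["Next"] for s in states.values() if "Next" in s]
    return defn["StartAt"] in states and all(t in states for t in targets)


# The JSON text is what would be supplied as the state machine definition.
asl_json = json.dumps(definition, indent=2)
```

Every state either names the state that follows it ("Next") or marks the end of the workflow ("End"), which is what the diagram view draws as boxes and arrows.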

 

Of course, Amazon Step Functions is useful for much more than counting words on a Wikipedia page. The service can be used for processing images in S3, ETL (Extract, Transform, Load) processes, machine learning, microservice orchestration, IT and security automation, as well as Continuous Integration and Continuous Deployment (CI/CD), a process that automates the integration and deployment of code.

“Amazon Step Functions has around 250 integrations with other AWS services and around 11,000 API calls that can be called from it. In 2016, when they launched, they only had an integration with S3 and 3–4 other services.”

Key concepts

Before we delve deeper into how Amazon Step Functions works, here are some key concepts, as explained by Andrei:

Main advantages: Parallel run and serverless

But is this melodic, yet complicated, orchestration needed? Couldn’t a Lambda function do the job instead?

 

For one thing, Amazon Step Functions is faster, highlights Andrei. This is due to its support for parallel runs through the Map state: the inline mode allows up to 40 concurrent iterations, while the distributed mode scales up to 10,000 parallel executions. On top of that, “Amazon Step Functions is a serverless service that allows for stateless states without having to have a separate database or cache, decoupling the logical side from the business side.” In other words, there is no need to worry about sizing the resources to fit the process. This is done automatically. Andrei also appreciates that it’s easy to see the state of your workflows and that retry mechanisms with exponential back-off are built in.
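As an illustration of the parallel run, the sketch below builds a Map state that would fan out over a list of sentences. The field names follow the Amazon States Language as we understand it, so check them against the current AWS documentation; the input path "$.sentences", the state name, and the Lambda ARN are assumptions made for the example.

```python
import json


def word_count_map_state(distributed: bool = False) -> dict:
    """Build a Map state that counts the words of every sentence in parallel."""
    if distributed:
        # Distributed mode: each iteration runs as a child workflow execution.
        processor_config = {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"}
        max_concurrency = 10_000
    else:
        # Inline mode: iterations run inside the parent execution.
        processor_config = {"Mode": "INLINE"}
        max_concurrency = 40
    return {
        "Type": "Map",
        "ItemsPath": "$.sentences",  # assumed input shape: {"sentences": [...]}
        "MaxConcurrency": max_concurrency,
        "ItemProcessor": {
            "ProcessorConfig": processor_config,
            "StartAt": "CountWords",
            "States": {
                "CountWords": {
                    "Type": "Task",
                    # Hypothetical Lambda ARN, for illustration only.
                    "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:count-words",
                    "End": True,
                }
            },
        },
        "End": True,
    }


inline_state = word_count_map_state()
distributed_state = word_count_map_state(distributed=True)
```

The only difference between the two variants is the processor configuration and the concurrency ceiling; the per-sentence work stays the same.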

 

This “Plan B” for errors plays a crucial role in workflows. For example, if we sent a blank text instead of the real Queen biography to Amazon Step Functions, an error-handling mechanism would be triggered. Without it, the workflow would simply stop working. Errors can still happen when processing large amounts of data, though, and some of them can stop the process completely. This is one of several drawbacks that Andrei mentions.

Disadvantages: Hard limits on executions and history

There is a hard limit of 25,000 events in the execution history of a single workflow run. While analyzing a Wikipedia article doesn’t seem like much, when you need to count the words in 10,000 sentences, things change. Remember, each step normally adds three entries to the history: the state before the step, the step, and the state after the step. “So, with 10,000 sentences, it’s pretty easy to get there”, emphasizes Andrei. “And if you go over 25,000, the workflow will stop, and there is nothing that you can do about it.”
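The arithmetic behind Andrei’s warning can be checked with a few lines of Python. The figure of three history entries per step is the rule of thumb quoted above; real steps may record more, so treat the result as an optimistic estimate. The function names are invented for the example.

```python
HISTORY_LIMIT = 25_000   # hard limit mentioned in the article
ENTRIES_PER_STEP = 3     # rule of thumb from the article; real steps may log more


def estimated_entries(items: int, steps_per_item: int = 1) -> int:
    """Rough number of history entries produced by processing `items`."""
    return items * steps_per_item * ENTRIES_PER_STEP


def fits_in_one_run(items: int, steps_per_item: int = 1) -> bool:
    """True if the estimate stays within the hard limit."""
    return estimated_entries(items, steps_per_item) <= HISTORY_LIMIT


# Largest item count that still fits with one step per item.
max_items = HISTORY_LIMIT // ENTRIES_PER_STEP
```

With one step per sentence, 10,000 sentences already produce an estimated 30,000 entries, well over the limit, while roughly 8,300 sentences is the most a single run can absorb under these assumptions.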

 

Also, as Amazon Step Functions decouples the business logic from the orchestration logic of the workflow, the resulting code is more complex and harder to understand. A relatively simple image processing workflow can have clear inputs and outputs, but the connections between the various microservices are challenging to follow. Another disadvantage is limited portability. The state machine is written in the Amazon States Language, which is proprietary to AWS. This means that if you want to move to Azure, for example, you need to start from scratch.

Another disadvantage is the 256 KB limit on the data that can be passed between states in the step function. Plus, the execution history is kept for only 90 days, and the maximum duration of a workflow execution is one year.

Wait, a workflow that takes one year? “It’s possible,” says Andrei. “Amazon Step Functions has callbacks, which allow the workflow to pause and wait until a human provides some input we are waiting for. Then, the callback is invoked, and the workflow resumes.”
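A callback of this kind can be sketched as a task that hands out a token and then waits. In the example below, the state sends the token to an SQS queue, and a small helper returns the human’s decision to the paused execution. The queue URL, the state names, and the helper name are placeholders; `sfn_client` is assumed to be a boto3 Step Functions client (`boto3.client("stepfunctions")`), passed in so that the sketch itself needs no AWS access.

```python
import json

# Task state that pauses the workflow until the task token is returned.
wait_for_approval = {
    "Type": "Task",
    "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
    "Parameters": {
        # Hypothetical queue URL, for illustration only.
        "QueueUrl": "https://sqs.eu-west-1.amazonaws.com/123456789012/approvals",
        "MessageBody": {
            "taskToken.$": "$$.Task.Token",  # token taken from the context object
            "input.$": "$",                  # the state input, for the reviewer
        },
    },
    "Next": "ContinueProcessing",
}


def resume_workflow(sfn_client, task_token: str, decision: dict) -> None:
    """Invoke the callback: return the human's input to the paused execution."""
    sfn_client.send_task_success(
        taskToken=task_token,
        output=json.dumps(decision),  # the output must be a JSON string
    )
```

Whatever system collects the human input (a web form, a chat bot, an email link) only needs to keep the token and call the helper once the answer arrives.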

Best practices

Most of these drawbacks can be overcome with best practices. For example, the 256 KB limit on data passed between states is restrictive, but AWS itself came up with a solution. You can write the payload to a file in S3, pass only its location between states, and process it with a Lambda function.
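A minimal sketch of that workaround, as it might look inside two Lambda functions: the first one stores an oversized payload in S3 and returns only a pointer, the second one resolves the pointer again. The helper names, the "inline"/"s3" envelope, and the key prefix are assumptions made for the example; `s3_client` is expected to be a boto3 S3 client (`boto3.client("s3")`).

```python
import json
import uuid

MAX_INLINE_BYTES = 256 * 1024  # limit on data passed between states


def to_state_output(payload: dict, s3_client, bucket: str) -> dict:
    """Return the payload inline if it is small enough, else a pointer to S3."""
    body = json.dumps(payload).encode("utf-8")
    if len(body) <= MAX_INLINE_BYTES:
        return {"inline": payload}
    key = f"payloads/{uuid.uuid4()}.json"  # hypothetical key layout
    s3_client.put_object(Bucket=bucket, Key=key, Body=body)
    return {"s3": {"bucket": bucket, "key": key}}


def from_state_input(state_input: dict, s3_client) -> dict:
    """Resolve a state input produced by to_state_output back to the payload."""
    if "inline" in state_input:
        return state_input["inline"]
    ref = state_input["s3"]
    obj = s3_client.get_object(Bucket=ref["bucket"], Key=ref["key"])
    return json.loads(obj["Body"].read())
```

In practice it is prudent to switch to S3 well below the limit, since the surrounding envelope and any other fields in the state also count toward it.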

 

The native retry mechanism is also very useful. Amazon Step Functions provides retries with exponential back-off, can catch Lambda exceptions, and can match specific error names. This also lets the workflow take a specific decision when a Lambda fails with a particular thrown error, such as “not found”.
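The sketch below shows how such a policy could be attached to a task, together with the waiting times that the exponential back-off produces. The error name "NotFoundError", the state names, and the Lambda ARN are hypothetical; the retry values are examples, not recommendations.

```python
retry_policy = {
    # Transient Lambda service errors are worth retrying.
    "ErrorEquals": ["Lambda.ServiceException", "Lambda.TooManyRequestsException"],
    "IntervalSeconds": 2,   # wait before the first retry
    "MaxAttempts": 3,       # number of retries
    "BackoffRate": 2.0,     # multiplier applied to the wait after each retry
}

count_words_task = {
    "Type": "Task",
    # Hypothetical Lambda ARN, for illustration only.
    "Resource": "arn:aws:lambda:eu-west-1:123456789012:function:count-words",
    "Retry": [retry_policy],
    "Catch": [
        # A specific decision for a specific (hypothetical) error name...
        {"ErrorEquals": ["NotFoundError"], "Next": "HandleNotFound"},
        # ...and a generic fallback for everything else.
        {"ErrorEquals": ["States.ALL"], "Next": "HandleFailure"},
    ],
    "Next": "MergeResults",
}


def retry_delays(policy: dict) -> list[float]:
    """Seconds waited before each retry: interval * rate ** retry_index."""
    return [
        policy["IntervalSeconds"] * policy["BackoffRate"] ** n
        for n in range(policy["MaxAttempts"])
    ]


delays = retry_delays(retry_policy)
```

With these values the task is retried after 2, 4, and 8 seconds; only when all retries are exhausted, or when the error is not covered by the retry policy, do the catch rules decide where the workflow goes next.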

 

There is also a way to avoid running into the limit of only 100,000 objects to process. “Yes, we did get there at some point, and it was not pleasant.” The solution might be to keep an eye on the process and trigger an alert when you are close to the limit.

Even so, the low costs offset all the small inconveniences of Amazon Step Functions. In Levi9’s experience, even complex processes for large clients do not go over $5 per month.

Published:
6 June 2023
