A deep dive into the AWS Lambda Python runtime environment

This article describes an efficient method for capturing CloudWatch metrics in Python-based Lambda functions. Recently we were involved in a project that had us extracting data from a third-party application by querying their APIs. At serious risk of DoS-ing the service, we decided to record metrics for every HTTP request to the API, measuring both the volume of calls being made and the response times from the third-party service.

We tailored our code for execution speed and maximum concurrency. This meant that any time spent initialising objects during the execution of our code had to be carefully considered, and decisions made about where best in the execution runtime to place those activities.

For example, we developed a helper module called api.py to contain all our logic for making HTTP requests. We used the Python library requests and the requests.Session() object to execute all "GET" and "POST" requests to the API endpoints. We made this logic available to the rest of our code in a little function called api_request, which had logging and error handling logic built into it.

The question is: when should you initialise your requests.Session() object? If that logic were contained in the api_request function, then every time you make an API request Python would spend a moment initialising a new Session object. This takes time. Not much time. But when you're optimising for speed and concurrency, milliseconds spent in the wrong place add up quickly: we were making several million API calls during the extraction process.
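
If you want a rough feel for this cost, a quick micro-benchmark (results will vary with your environment) might look like this:

import timeit

# Rough illustration: construct 10,000 Session objects and time it.
# Even a fraction of a millisecond each adds up over millions of calls.
elapsed = timeit.timeit("requests.Session()", setup="import requests", number=10_000)
print(f"10,000 Session objects in {elapsed:.2f}s ({elapsed / 10_000 * 1000:.3f} ms each)")

On typical hardware each Session costs a fraction of a millisecond to build, which is exactly the kind of overhead you can afford once but not a million times.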

The good thing about Session objects is that they can be configured once and then reused until your code is finished. To optimise for speed you can initialise your Session object once in the global scope of a module. Our application logic was contained in a module called app.py (for the sake of this document). After importing the modules required to execute our task, app.py would then initialise and configure a single Session object. Then, whenever a function was required to make an HTTP request, the Session object would be passed from the global scope into the api_request function as a parameter, like so:

import requests

from api import api_request

# Create a global requests Session object
my_session = requests.Session()
# configure the session as desired

# ... application code

def some_function():
  # make an api request here
  response = api_request(my_session, some_url, ... other params ...)
  return response

# ... more application code

I have since discovered that you can instead define the Session object in the scope of the api module, and not worry about it in your app logic at all. Like this:

api.py

import requests

session = requests.Session()

def api_request(url):
  r = session.get(url)
  return r

app.py

from api import api_request # python loads the api module and initialises the Session object

# ... application code

def some_function(some_url):
  # make an api request here
  response = api_request(some_url,  ... other params ...)
  return response
  
def main():
  r = api_request("my.url.com")
  return

if __name__ == "__main__":
  main()

Now the application logic doesn't need to worry about managing a Session object, and can instead focus solely on the business logic of querying API endpoints. This method will be used again when we reach our discussion on CloudWatch metrics.

The important concept from this section is that the scope in which you instantiate an object in Python can have a big impact on the efficiency of your code.

Imports, scopes and names

In order to write effective Lambda functions and properly utilise our learnings from above, we need to have an appreciation for the lifecycle of a Python program.

In your local environment you might start and test your code by running a command like python app.py. Taking the pseudocode examples from before, this command would start the Python interpreter and proceed to import the api module, binding the api_request function to the name api_request in the namespace of the app module. In the process of importing api, the session = requests.Session() line is executed and the session name is assigned in the scope of the api module. After the import and initialisation process has taken place (which includes execution of code in the global namespace of app.py), the actual work of the app module is executed (i.e. the code contained in the if __name__ == "__main__": block).
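
You can watch this import-time initialisation happen with a toy pair of modules (the file names and print statements here are hypothetical stand-ins):

# api_demo.py
print("api_demo imported: the session is initialised right now")
session = object()  # stand-in for requests.Session()

# app_demo.py
import api_demo  # the print statement above fires here, before any function runs

def main():
  print("main is running")

if __name__ == "__main__":
  main()

Running python app_demo.py prints the import message before "main is running", confirming that module-level code executes during the import phase.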

When the script is done executing, the Python interpreter closes and all the scopes and names are removed in the process.

The key consideration is this: what is the lifecycle of a Python program running inside an AWS Lambda function? It is not the same as executing python app.py.

Lambda functions

For further reading, please review the documentation: Understanding the Lambda execution environment lifecycle.

This diagram shows the general case for a Lambda function, which we know can be written in many different languages and runtimes. In the context of this article, remember that we are talking about the Python interpreter. I cannot attest to whether the following assessment holds true for other languages and runtimes like JavaScript on Node.js.

Notice that the diagram is broken into four stages: INIT, INVOKE, a second INVOKE, and SHUTDOWN.

An intuitive understanding of Python might suggest that somewhere in this chain the command python app.py needs to be run. Perhaps at both the INVOKE steps? That might make sense; after all, we want to "invoke" our application code from time to time.

However, this is not the case. Our code isn't executed using a python app.py command, instead something much more interesting happens.

Lambda handlers

From the docs:

The Lambda function handler is the method in your function code that processes events. When your function is invoked, Lambda runs the handler method.

When we define and deploy a Lambda function we get to choose a Python module (file) and a function from that module to act as the "handler".

What do you think it means for Lambda to "run the handler method"? There must be some internal mechanism inside the Lambda service that is aware of our application code and has a means to pass data down to it (the application).

Let's try to understand this process - with some hand waving and guesswork about the technical implementation (see this link for some deep Python shenanigans).

Before we go further, remember this: the Lambda service is always online. Our particular execution environment might not exist, but the service itself is waiting for a trigger to initialise an environment.

So, some request to invoke our function is sent to the Lambda service. The service assesses whether an execution environment currently exists and has capacity to serve the request; if not, it begins an initialisation process and waits for that to complete before passing the data from the request down to the application.

This is where the INIT section from the diagram takes place. Lambda has received a request and needs an environment in which to execute the request against the application code. Notice that the INIT section has a component called "FUNCTION INIT". It is in this period that our application code is initialised.

During FUNCTION INIT, the handler we defined for our application is dynamically imported into the Python interpreter of the execution environment. It is during this dynamic import process that all of our other module imports occur.

We can imagine the Lambda service works like this piece of Python pseudo code:

lambda_service.py

import importlib

if __name__ == "__main__":
  event = "hello world"
  context = None
  handler = "app.main"
  # split "app.main" into a module name and a function name
  module_name, function_name = handler.rsplit(".", 1)
  module = importlib.import_module(module_name)
  handler_function = getattr(module, function_name)
  handler_function(event, context)

In this pseudo code, the imaginary lambda_service module doesn't directly import our handler "app.main" at the top of the file. Instead it splits the handler string into a module name and a function name, dynamically imports the module via the importlib.import_module method, and looks up the function with getattr. Finally, the event data is sent to the application code in handler_function(event, context).

Using the import_module method in this way does exactly the same thing as writing import app at the top of our application code. It imports the app module, which in turn imports all of its own modules, and all the global variables in those modules are initialised.

Why does this matter? Because now we know that all global variables are initialised during the FUNCTION INIT portion of the Lambda execution environment lifecycle. This means that they technically exist before the Lambda function is invoked (technically because the invocation request is what triggers the initialisation the first time around).

To return to our earlier examples, this means the requests.Session() object in our api module is brought to life during FUNCTION INIT, and exists to serve all future invocations of the function, for the duration of the execution environment's lifespan.

This sits in tension with the AWS documentation, which instructs us to treat each invocation of a Lambda function as if it were operating in its own independent, clean environment; you should treat each execution as being "stateless".

The fact is, your Lambda invocation might not be happening in a clean environment. There could be all sorts of leftover gunk from previous executions hanging around in the Python interpreter's namespace and memory.
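
A tiny hypothetical handler makes this state carry-over easy to observe:

# warm_start_demo.py: a hypothetical handler for observing environment reuse
invocation_count = 0  # initialised once, during FUNCTION INIT

def handler(event, context):
  global invocation_count
  invocation_count += 1
  # On a warm start this number keeps climbing across invocations;
  # it only resets when Lambda creates a fresh execution environment.
  return {"invocation_count": invocation_count}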

Is this bad? No! It’s good, because we can take advantage of this knowledge to write efficient code.

Gathering CloudWatch metrics in Lambda functions

CloudWatch metrics are a fun way to gather interesting and potentially useless data about the inner workings of your application code. Instead of simply logging all errors and finding them in a CloudWatch log-group, you could post custom metrics and then build dashboards that require a wholly different skillset to digest.

We continue our story by implementing a method for gathering and posting metrics in our api.py module.

Since I have been rambling on about names and scopes, it might seem intuitive to create a boto3.client("cloudwatch") object in the scope of the api module, and hook up some method to post metrics to CloudWatch every time we make an HTTP request in the api_request function. That could look like this:

api.py

import requests
import boto3

session = requests.Session()
cloudwatch_client = boto3.client("cloudwatch")

def api_request(url):
  r = session.get(url)
  # ... some logic to gather a metric from this action
  cloudwatch_client.put_metric_data(Namespace="MyApp", MetricData=our_data)  # placeholder values
  return r

This works, but it has some drawbacks. First, adding the put_metric_data call to the api_request function slows down our code slightly. Remember that we're making up to a million HTTP requests per run, so every millisecond counts.

Second, posting individual metrics to CloudWatch is expensive. There is one dominating factor that influences the way you write code for an AWS project: cost.

API name          Cost
PutMetricData     $0.01 per 1,000 requests

The cost each time we ran our pipeline is calculated like so:

(1,000,000 requests / 1,000 requests) * $0.01 = $10

It wasn’t optimal to both slow down the pipeline and add another $10 to our costs each time we ran the machine.

Fortunately, the solution to improving the performance of our Lambda function is the same as the solution for reducing the cost of the put_metric_data API call: batching data.

The put_metric_data method actually allows us to send a batch of metric data to CloudWatch in one call. If we can utilise this capability, then we stand a chance of dramatically reducing the associated costs.
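
As a sketch of what a batched call can look like (the namespace and metric names below are placeholders, not from our project):

import boto3

cloudwatch_client = boto3.client("cloudwatch")

# One API call carrying several metric values at once.
cloudwatch_client.put_metric_data(
  Namespace="MyApp",  # placeholder namespace
  MetricData=[
    {"MetricName": "RequestCount", "Value": 120, "Unit": "Count"},
    # Values/Counts pairs pack many observations into a single datum
    {
      "MetricName": "RequestDuration",
      "Values": [0.21, 0.35, 0.18],
      "Counts": [1, 1, 1],
      "Unit": "Seconds",
    },
  ],
)

One batched request can carry many metric values, so it replaces hundreds of individual ones.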

How do we do this? With imports, scopes and names.

Optimising CloudWatch metrics

I am going to rephrase our problem scenario so we can think through the solution together.

I have application code running inside a Lambda function. Each time the function is invoked, my application code makes some number of HTTP requests. I want to gather metrics about those requests (count and duration) and minimise the number of times that data needs to be posted to CloudWatch. This will improve speed and cost.

We want to decouple the act of capturing and sending metric data from the act of making HTTP requests. This means we need to create some object in which we temporarily store metric data while the rest of our application code continues running. When the application is finished, and all our metrics have been captured safely in the object, we can then post them off to CloudWatch.

Recall that during the initialisation process for our api module we are instantiating requests.Session() and boto3.client("cloudwatch") objects. These live in the global scope of the module and can be utilised by the module's functions. What we want is another object that acts in the same way, whose purpose is to store metric data. In our project we called it the LambdaMetrics object, and it was implemented as a new Python class.

from collections import defaultdict

class LambdaMetrics:
    def __init__(self):
        # Dictionaries to store count and duration metrics separately
        self.count_metrics = {}
        self.duration_metrics = defaultdict(list)

The LambdaMetrics class had two data attributes: a dictionary of "count" based metrics and a dictionary of "duration" based metrics. It also featured three important methods: add_metric and add_duration_metric, which append data to the instance's attributes and were responsible for appropriately batching up metric data, and get_metrics, which fetches the attributes and formats them so they can be consumed by the put_metric_data function. For simplicity, we stored the code for the class in a module called metrics.py.
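
A minimal sketch of how those methods might look (the method names come from our project, but the bodies and the MetricData formatting below are my assumptions):

from collections import defaultdict

class LambdaMetrics:
    def __init__(self):
        self.count_metrics = {}
        self.duration_metrics = defaultdict(list)

    def add_metric(self, name, value=1):
        # Increment a named counter
        self.count_metrics[name] = self.count_metrics.get(name, 0) + value

    def add_duration_metric(self, name, duration):
        # Append one observed duration (in seconds) under the metric name
        self.duration_metrics[name].append(duration)

    def get_metrics(self):
        # Flatten the stored data into the MetricData list format expected by
        # cloudwatch_client.put_metric_data(Namespace=..., MetricData=...)
        metric_data = []
        for name, count in self.count_metrics.items():
            metric_data.append({"MetricName": name, "Value": count, "Unit": "Count"})
        for name, durations in self.duration_metrics.items():
            metric_data.append({
                "MetricName": name,
                "Values": durations,
                "Counts": [1] * len(durations),
                "Unit": "Seconds",
            })
        return metric_data

With those methods in place, the class was integrated into the api module like so: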

import time

import requests
from metrics import LambdaMetrics

session = requests.Session()
api_metrics = LambdaMetrics()

def api_request(url):
  start = time.perf_counter()
  r = session.get(url)
  # gather metrics from this action and store them locally
  api_metrics.add_metric("request_count")
  api_metrics.add_duration_metric("request_duration", time.perf_counter() - start)
  return r

This is great. When our function code is initialised, a new LambdaMetrics object is created and is ready to store metric data. When the Lambda function is invoked and application code executed, the add_metric and add_duration_metric functions are called, and those chunks of data are appended to various dictionaries in the LambdaMetrics object.

Appending data to dictionaries and lists is faster than calling put_metric_data constantly, so we have improved the speed of our function again. Yay!

However, we haven’t yet devised a method for posting that data to CloudWatch.

Managing long-lived objects

Let’s zoom out again to the lifecycle of our Lambda execution environment.

During the FUNCTION INIT stage our application modules are imported into the Python interpreter and their various global variables initialised. With the example code above, the api_metrics = LambdaMetrics() line is run during this time, and api_metrics is now an object that exists in the scope of the api module. When our function is invoked, the api_metrics object is filled with data.

What we want to do is surface the api_metrics object up from the api module and into a scope where we can access its data and methods. Ideally this would be in the same location where the Lambda service accesses our application code via the handler function. In the example of this article that is the app module.

app.py

import boto3

from api import api_request # python loads the api module and initialises the Session object
from api import api_metrics # this is the instance of LambdaMetrics for the api module

cloudwatch_client = boto3.client("cloudwatch")

# ... application code

def some_function(some_url):
  # make an api request here
  response = api_request(some_url, ... other params ...)
  return response

def main(event, context):
  # The lambda handler

  r = some_function("my.url.com")

  # ... some business logic
  function_metrics = api_metrics.get_metrics()
  # function_metrics now contains the data that was stored in the api_metrics object
  cloudwatch_client.put_metric_data(Namespace="MyApp", MetricData=function_metrics)  # placeholder namespace

  return

In the pseudo code above I have also imported boto3 and, assuming that the get_metrics method returns correctly formatted data, called put_metric_data at the end of the Lambda handler. So the last thing that happens during an invocation of this function is the metrics being posted to CloudWatch.

This might seem great, but there is a problem. When our function is invoked for the first time, api_metrics is filled with data. When our function is invoked for the second time, the same api_metrics object is filled with more data. This continues as long as the current execution environment for the Lambda function exists.

Notice the difference between the LambdaMetrics object and the requests.Session() object across function invocations. The properties of the Session object are set during the import and initialisation process and don't change during an invocation, whereas the properties of the LambdaMetrics object are dynamic: by definition, we wanted it to capture data specific to a single invocation.

Since data from a previous invocation is not relevant to the current execution, we implemented a reset method in the LambdaMetrics class.

from collections import defaultdict

class LambdaMetrics:
    def __init__(self):
        # Dictionary to store count and duration metrics separately
        self.count_metrics = {}
        # List to store multiple duration values for each key
        self.duration_metrics = defaultdict(list)

    # ... add_metric, add_duration_metric and get_metrics as before

    def reset(self):
        # Empty both stores at the start of each invocation
        self.count_metrics.clear()
        self.duration_metrics.clear()

We then call this method at the start of every function invocation. The final Lambda handler code looked something like this:

import boto3

from api import api_request # python loads the api module and initialises the Session object
from api import api_metrics # this is the instance of LambdaMetrics for the api module

cloudwatch_client = boto3.client("cloudwatch")

# ... application code

def some_function(some_url):
  # make an api request here
  response = api_request(some_url, ... other params ...)
  return response

def main(event, context):
  # The lambda handler

  api_metrics.reset()

  r = some_function("my.url.com")

  # ... some business logic
  function_metrics = api_metrics.get_metrics()
  # function_metrics now contains the data that was stored in the api_metrics object
  cloudwatch_client.put_metric_data(Namespace="MyApp", MetricData=function_metrics)  # placeholder namespace

  return

Now we have fully taken advantage of the long living objects that exist in the various scopes of our Python Lambda functions, and utilised them both to increase execution speed and reduce costs.

The conclusion

Understanding Python execution environments in Lambda lets you develop truly scalable solutions. In our scenario it wasn’t enough to throw more compute at the problem by running 400 simultaneous Lambda functions; we needed to make strategic decisions about where to optimise our code base so that our application logic could execute as fast as possible.

Writing batching logic into our LambdaMetrics class greatly reduced the number of put_metric_data requests we made, to the point where we couldn’t notice an impact on our account billing while the machine was running.

All in all this method was very successful, and entirely reusable. The same logic we applied to design the solution for this scenario is directly applicable to other use cases for Python Lambda functions: identify which objects can be instantiated in the global scope, and manage their state between invocations.

I hope you found this article insightful and entertaining. Cheers 👍

02/27/2025