Using CPU profiling to increase the efficiency of your AWS Lambda Functions

Mx Kas Perch
Published in IOpipe Blog
Jun 19, 2018

CPU profiling used to be for those times where you had to squeeze the most out of each CPU cycle, but it’s making a comeback in a big way in serverless computing! Our Senior Site Reliability Engineer, Mike, shows us more in this demo video:

Mike’s Profiling Video

For those who’d rather read (or like pre-written notes!), let’s take a look at what Mike is saying with this video.

CPU Profiling as a Development Tool

When you think of tools for developers…CPU profiling isn’t usually at the top of the list. Or even the middle. But Mike makes a really strong case for CPU profiling being in your Lambda dev toolkit. We’ll start with one of the first things anyone running a Lambda application thinks about: costs, and how to lower them!

Mike mentions in the video that profiling can be used to lower Lambda execution cost, and there are several ways to do this: we can lower execution time, we can make sure our memory setting (and the CPU that comes with it) is a good fit, and we can make sure our code is making the most of the resources it has. Mike’s demo shows that CPU profiling can assist in all three of these endeavors, making it an invaluable resource for Lambda developers.

Finding the Weakest Link in Your Lambda Code

A chain is only as strong as its weakest link, and a Lambda function is only as fast as its slowest line of code. While we have tracing at IOpipe to give you the high-level view of what’s taking the longest, profiling lets you really get into that data and see culprits you may not have expected!

For instance, in the video Mike describes discovering in the profiling data that one crypto library took much longer to load than the one he ended up using! Putting tracing around your require statements in Node can only tell you so much, whereas diving into the profiling data can tell you whether the import mechanics themselves, or an unoptimized library, are the real issue.
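To make the "tracing around your require statements" idea concrete, here is a minimal sketch of timing a module load by hand. The helper name `timedRequire` is made up for this example, and the built-in `crypto` module simply stands in for whatever library you are evaluating. It tells you how long the load took, but not why, which is the gap profiling fills.

```javascript
const { performance } = require('perf_hooks');

// Hypothetical helper (not part of any library): load a module and
// report how long the require() call took in milliseconds.
function timedRequire(moduleName) {
  const start = performance.now();
  const loaded = require(moduleName);
  const elapsedMs = performance.now() - start;
  return { loaded, elapsedMs };
}

// 'crypto' is only a stand-in for the library under test.
// Node caches modules, so only the first load of a module is meaningful;
// in Lambda that first load is part of the cold start.
const firstLoad = timedRequire('crypto');
console.log(`require('crypto') took ${firstLoad.elapsedMs.toFixed(2)} ms`);
```

A single number per module is a useful first signal, but it cannot separate time spent resolving files from time spent running the library's own setup code.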

If the Shoe (or memory/CPU size) fits…

One of the major cost factors for AWS Lambda functions is the memory size you allocate, which also determines how much CPU your function gets. But downsizing isn’t always the cheapest decision! Ever worn a shoe that’s too small? It’s excruciatingly painful, and the only real fix is to find a shoe that fits. It’s the same with AWS Lambda functions.

In Mike’s crypto demo, he lowered the memory from 1536MB to 512MB, and the execution time far more than tripled! What’s really interesting is what the profiling data showed about the time increase: the overhead of setting up the crypto event loops was actually faster in the lower-memory execution! It was only the crypto functions themselves that took dramatically longer in the lower-memory environment. That is the kind of fine-grained “what is actually going on when I change this variable?” data that profiling provides.
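For a rough feel of why a smaller memory setting can end up costing more, here is a back-of-the-envelope sketch. It assumes Lambda's duration charge scales with memory multiplied by execution time (GB-seconds). The price constant and both durations are illustrative assumptions, not numbers from Mike's demo, and the sketch ignores per-request charges and billing granularity.

```javascript
// Assumed price per GB-second, for illustration only; check current AWS pricing.
const ASSUMED_PRICE_PER_GB_SECOND = 0.0000166667;

// Duration cost of one invocation: (memory in GB) * (duration in seconds) * price.
function invocationCost(memoryMb, durationMs, pricePerGbSecond = ASSUMED_PRICE_PER_GB_SECOND) {
  const gbSeconds = (memoryMb / 1024) * (durationMs / 1000);
  return gbSeconds * pricePerGbSecond;
}

// Hypothetical durations: the smaller function runs four times as long.
const largeConfigCost = invocationCost(1536, 1000); // 1.5 GB-seconds
const smallConfigCost = invocationCost(512, 4000);  // 2.0 GB-seconds

console.log(`1536MB for 1s: $${largeConfigCost.toExponential(3)}`);
console.log(`512MB for 4s:  $${smallConfigCost.toExponential(3)}`);
console.log(`smaller config costs more: ${smallConfigCost > largeConfigCost}`);
```

Under this model, cutting memory to a third only saves money if the execution time less than triples, which is why "far more than tripled" is the painful case.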

Making Sure Your Code Uses Resources Efficiently

You want to be sure that the code you are running in your AWS Lambda functions is efficient: you don’t want to have to size up your memory or CPU, or deal with longer execution times, when changing the way your code runs can prevent it! There are many ways, for better or worse, to do any one task in code, so how do we figure out what’s efficient? (I’m pretty sure you’ve surmised my answer here.)

When you think of Node, you think of everything being asynchronous, because that’s, well, better in Node. Why? Well, because, we read it once somewhere and it stuck. But, just like with every paradigm, it’s not a cure-all that makes everything faster!

Mike’s demo shows a really great example of this via cryptography: when he ran the crypto library synchronously, the function executed faster than when run asynchronously. This can be a surprising turn of events for those of us who just use async versions of functions on autopilot (I’m guilty of it too). But Mike offers a really great answer as to why: crypto is a very CPU-heavy task, so there is little waiting around for async to hide. The synchronous version skips the overhead of setting up async processes, which leaves the crypto library more room (memory and CPU) to work in, so it finishes faster and the overall execution time of the function drops. That answer is backed up by the profiling data in the demo, and it’s an interesting case!
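If you want to try a comparison like this yourself, here is a minimal sketch using Node's built-in `crypto.pbkdf2Sync` and `crypto.pbkdf2`. Which crypto functions Mike used in the demo is not stated here, so the choice of PBKDF2, the parameters, and the helper names are all assumptions for illustration. Wall-clock timings like these vary by environment, so treat them as a prompt to go look at a profile rather than as a verdict.

```javascript
const crypto = require('crypto');

// Illustrative parameters only; not taken from the demo.
const PASSWORD = 'correct horse battery staple';
const SALT = 'example-salt';
const ITERATIONS = 100000;
const KEY_LENGTH = 64;
const DIGEST = 'sha512';

// Derive a key on the main thread and time it.
function deriveSync() {
  const start = process.hrtime.bigint();
  const key = crypto.pbkdf2Sync(PASSWORD, SALT, ITERATIONS, KEY_LENGTH, DIGEST);
  const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
  return { key, elapsedMs };
}

// Derive the same key via the callback API and time it.
function deriveAsync() {
  return new Promise((resolve, reject) => {
    const start = process.hrtime.bigint();
    crypto.pbkdf2(PASSWORD, SALT, ITERATIONS, KEY_LENGTH, DIGEST, (err, key) => {
      if (err) return reject(err);
      const elapsedMs = Number(process.hrtime.bigint() - start) / 1e6;
      resolve({ key, elapsedMs });
    });
  });
}

// Run both variants back to back and report the timings.
async function compare() {
  const syncRun = deriveSync();
  const asyncRun = await deriveAsync();
  return {
    syncMs: syncRun.elapsedMs,
    asyncMs: asyncRun.elapsedMs,
    sameKey: syncRun.key.equals(asyncRun.key),
  };
}

compare().then((summary) => console.log(summary)).catch(console.error);
```

Both variants produce the same derived key, so any difference in timing comes from how the work is scheduled, not from the work itself.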

There are several ways to start profiling your AWS Lambda functions. One non-vendor way is to use our open source library to get profiles from V8 Lambda functions. You need to send the profile somewhere yourself, but it’s a starting point! The other way is to use IOpipe, which takes care of sending and securing the profiling data and is available for Node.js, Python, and Java. If you have questions about either of these, or profiling in general, come talk to us on our community Slack.
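If you are curious what a raw V8 CPU profile looks like before picking a tool, Node's built-in `inspector` module can capture one. To be clear, this sketch is not the API of the open source library or of IOpipe; `profileSync` and `busyWork` are hypothetical names, the helper only covers synchronous work, and writing to the temp directory is just one assumed place to put the result.

```javascript
const inspector = require('inspector');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: run a synchronous function while the V8 CPU
// profiler is recording, then resolve with its result and the profile.
function profileSync(work) {
  return new Promise((resolve, reject) => {
    const session = new inspector.Session();
    session.connect();
    session.post('Profiler.enable', (enableErr) => {
      if (enableErr) return reject(enableErr);
      session.post('Profiler.start', (startErr) => {
        if (startErr) return reject(startErr);
        const result = work();
        session.post('Profiler.stop', (stopErr, params) => {
          session.disconnect();
          if (stopErr) return reject(stopErr);
          resolve({ result, profile: params.profile });
        });
      });
    });
  });
}

// Stand-in workload so there is something to sample.
function busyWork(iterations) {
  let total = 0;
  for (let i = 0; i < iterations; i++) total += Math.sqrt(i);
  return total;
}

profileSync(() => busyWork(1e6))
  .then(({ profile }) => {
    // A .cpuprofile file can be loaded into Chrome DevTools for inspection.
    const file = path.join(os.tmpdir(), 'handler.cpuprofile');
    fs.writeFileSync(file, JSON.stringify(profile));
    console.log(`wrote ${profile.nodes.length} profile nodes to ${file}`);
  })
  .catch(console.error);
```

Getting the file out of the function is the "send the profile somewhere" step mentioned above, and it is left out of this sketch on purpose.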

Dev🥑. Robotics Author and Maker. Yells at robots occasionally.