Debugging Serverless Apps: from monitoring invocations to observing a system of functions

Published in IOpipe Blog, Jun 29, 2018

Observability into your serverless application is crucial: for development, for operations, for all stages of the application pipeline.

While monitoring is an integral part of any ops strategy, and shows you when you have the problems you expect to have, observability shows you the problems you didn’t see coming. A mix of observability and monitoring is key to faster development and improved health of serverless applications.

Classifying your serverless functions

The need for observability becomes especially pressing as serverless applications grow: as we move from one-off functions to fully serverless applications, we need to adjust our strategies to preserve observability while keeping the data we’re observing digestible.

One approach is classification: comparing, contrasting, and sorting our functions and data into groups that help us grok what’s actually going on.

Classifying functions with labels gives us a larger view: we start to see not only how functions interact with each other, but also how they interact with outside services. We also start to see outliers: functions that don’t fit into any group or classification. Once we see these, and the application context they exist in, we can start bringing them into the larger picture, which makes for a more uniform and understandable application.

Events: the between-the-lines of serverless development

Events, specifically the sources and context that trigger your serverless functions, are a crucial part of what allows you to fully observe your serverless application. Events can even help us discover information and classifications that we didn’t expect.

As we design applications, we usually have a set picture in our head of how our functions will work, both individually and together. Events show us how running in production actually compares to that ideal design: they provide a full view of how the application really runs, including which functions call other functions or outside services. Classifying events can be as useful as classifying functions because it gives us that big-picture view of our application.
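
To make the idea of classifying events concrete, here is a minimal sketch in Python that guesses the source of a Lambda invocation from the shape of its event payload. It is an illustration only, not how the IOpipe agents do it: the function name classify_event_source and the label strings it returns are our own assumptions, and it covers only a handful of common AWS event shapes.

```python
def classify_event_source(event):
    """Return a coarse label for whatever triggered a Lambda invocation.

    Hypothetical helper for illustration; the returned label strings
    are assumptions, not IOpipe's actual labels.
    """
    if not isinstance(event, dict):
        return "unknown"

    records = event.get("Records")
    if isinstance(records, list) and records and isinstance(records[0], dict):
        first = records[0]
        # S3, SQS, Kinesis and DynamoDB records use "eventSource";
        # SNS records use "EventSource" with a capital E.
        source = first.get("eventSource") or first.get("EventSource")
        if isinstance(source, str) and source.startswith("aws:"):
            return source[len("aws:"):]  # "aws:s3" becomes "s3"

    # Scheduled rules arrive as CloudWatch Events payloads.
    if (event.get("source") == "aws.events"
            and event.get("detail-type") == "Scheduled Event"):
        return "scheduled"

    # API Gateway proxy integrations carry the HTTP request details.
    if "httpMethod" in event and "requestContext" in event:
        return "apigateway"

    return "unknown"
```

Anything the sketch doesn’t recognize falls through to "unknown", which is useful in itself: those are the outliers worth a closer look.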

Classifying functions and events for system-wide observability

By classifying your functions using events, logging additional data, and recording information about interactions between functions and between functions and outside services, you give yourself a much stronger overarching picture to observe. This is crucial for maintaining the health of your serverless application: with stronger context, it becomes much easier to see whether a function, a service, or another issue is causing your application to behave erratically. Take a third-party service going down as an example: if you’re able to quickly view all of the functions that are not working, and the information about third-party service calls is attached to each function, you have a classification that points to the source of the problem.
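
The third-party outage example can be sketched in a few lines of Python. This is a rough illustration under our own assumptions: the invocation records are plain dictionaries with hypothetical "function", "labels", and "error" fields, and labels such as "calls:payments-api" are made up for the example. The idea is simply to rank labels by how many distinct failing functions share them.

```python
from collections import defaultdict


def failing_functions_by_label(invocations):
    """Map each label to the set of functions that failed while carrying it."""
    by_label = defaultdict(set)
    for inv in invocations:
        if not inv.get("error"):
            continue  # only failed invocations are interesting here
        for label in inv.get("labels", ()):
            by_label[label].add(inv["function"])
    return dict(by_label)


def rank_suspects(invocations):
    """Sort labels by how many distinct failing functions share them."""
    by_label = failing_functions_by_label(invocations)
    # Most widely shared label first; ties are broken alphabetically.
    return sorted(by_label.items(), key=lambda item: (-len(item[1]), item[0]))


if __name__ == "__main__":
    # Hypothetical sample data, not real IOpipe output.
    sample = [
        {"function": "checkout", "labels": {"calls:payments-api", "event:apigateway"}, "error": True},
        {"function": "refund", "labels": {"calls:payments-api", "event:sqs"}, "error": True},
        {"function": "resize", "labels": {"event:s3"}, "error": False},
    ]
    for label, functions in rank_suspects(sample):
        print(label, sorted(functions))
```

A label shared by many failing functions is not proof of the cause, but it is a good first place to look.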

Auto Labeling: creating context and enabling observability

Manually labeling your functions can be daunting: it’s hard to define an overarching picture of how your application should be running, and much easier to see this information generated from collected data. But what should we label? What concepts can we grab that will make sense of all of these seemingly disparate functions?

It is in that spirit that we have begun auto-labeling invocations and functions in our Node.js and Python agents.

We automatically label event sources, whether the Lambda function ran from a schedule, cold starts, errors, and whether custom metrics or profiling/tracing data is available for the function. We even label invocations that have timed out automatically, so you can see at a glance when you’re seeing a spike in timeouts!
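
As a rough sketch of what deriving labels from collected data can look like, the Python below turns one invocation record into a set of labels. It is not the agents’ implementation: the record fields (event_source, cold_start, error, duration_ms, timeout_ms, custom_metrics, trace) and the label strings are hypothetical, chosen only to mirror the categories listed above.

```python
def auto_labels(invocation):
    """Derive a set of labels from a single invocation record.

    The field names and label strings are assumptions made for this
    sketch; they are not IOpipe's actual schema or label names.
    """
    labels = set()

    source = invocation.get("event_source")
    if source:
        labels.add("event:" + source)  # e.g. "event:scheduled"

    if invocation.get("cold_start"):
        labels.add("coldstart")

    if invocation.get("error") is not None:
        labels.add("error")

    # Treat an invocation that used up its whole time budget as a timeout.
    duration = invocation.get("duration_ms")
    timeout = invocation.get("timeout_ms")
    if duration is not None and timeout is not None and duration >= timeout:
        labels.add("timeout")

    if invocation.get("custom_metrics"):
        labels.add("metrics")

    if invocation.get("trace"):
        labels.add("trace")

    return labels
```

Because every label comes from data the invocation already produced, nobody has to remember to tag anything by hand.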

To aid classification, clicking an automated label on the invocation page takes you to a search page for all invocations with the same label. This gives you the context you need with the click of a label, instead of searching manually through logs.

In order to use this feature, you’ll need to make sure your IOpipe module is updated to 1.7.0 for Node.js if you’re using @iopipe/iopipe, 1.12.0 if you’re using @iopipe/core, 1.6.0 for the Python agent, and 1.5.0 for the Java agent. Once your agent is updated, your invocations will be automatically labeled.

We’re excited to bring you the ability to go from monitoring a function to observing your application, and the context to make sure your serverless code is working as intended.

Want to try auto-labeling? We have a 21-day free trial! Want to talk to us about growing and observing your serverless applications? Our community Slack is open and we’re happy to chat!