When I first started at RapDev, I heard about an almost mythical feature of the Datadog Agent's base check class: the Agent’s persistent cache functions. These functions let a check read from and write to a file on the host, one that persists across Agent and host restarts. This is useful when you want to monitor changes in the state of an event or check, such as an open investigation becoming closed, or the last submitted check status versus the current status.

When the Datadog Agent starts a custom check or integration, the `__init__` method runs to create an instance of that check or integration. Any variables set on this instance can be read from and written to while the check runs. These variables can also be referenced in subsequent check runs, where they hold the value last written to them.
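A minimal sketch of that behavior, using a plain Python class as a stand-in for a check. A real check would subclass `AgentCheck` from `datadog_checks.base` and its `check` method would receive the instance configuration. The class name `StatusCheck`, the attribute `last_status`, and the simplified `check` signature are made up for illustration.

```python
class StatusCheck:
    """Hypothetical stand-in for a custom check; not the real AgentCheck."""

    def __init__(self):
        # Runs once, when the check instance is created.
        self.last_status = None

    def check(self, current_status):
        # Runs on every collection; sees the value the previous run left behind.
        changed = self.last_status is not None and self.last_status != current_status
        self.last_status = current_status
        return changed


check = StatusCheck()
first_run = check.check("open")     # nothing to compare against yet
second_run = check.check("closed")  # differs from what the first run stored
```

Creating a new `StatusCheck()` starts again with `last_status` set to `None`, which is the situation a check is in after the Agent restarts.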

This is where the Datadog Agent’s persistent cache comes into play. Because the check instance is created when the Agent starts, anything stored on it is ephemeral, surviving only as long as the Agent continues to run. The persistent cache, in my opinion, removes this limitation if implemented correctly.

To implement the Agent’s persistent cache in your agent check, use the base check functions `read_persistent_cache()` and `write_persistent_cache()` to read from and write to the cache, respectively. The cache itself behaves like a dictionary, with values stored and retrieved by key, and should be handled as such.
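To show the call shape without needing an Agent, here is a sketch that mimics the two methods with a local JSON file. In a real check both methods are inherited from `AgentCheck`, and the Agent decides where the data is stored. The class `FileBackedCheck`, the file name, and the key names are hypothetical, and the empty-string result for an unknown key is my understanding of the real behavior, so verify it against your Agent version.

```python
import json
import os
import tempfile


class FileBackedCheck:
    """Hypothetical stand-in that mimics the two cache methods with a local file."""

    def __init__(self, path):
        self._path = path

    def _load(self):
        if not os.path.exists(self._path):
            return {}
        with open(self._path) as f:
            return json.load(f)

    def read_persistent_cache(self, key):
        # Assumption: an unknown key yields an empty string.
        return self._load().get(key, "")

    def write_persistent_cache(self, key, value):
        data = self._load()
        data[key] = value  # values are stored as strings
        with open(self._path, "w") as f:
            json.dump(data, f)


cache_file = os.path.join(tempfile.mkdtemp(), "cache.json")

before_restart = FileBackedCheck(cache_file)
before_restart.write_persistent_cache("last_status", "open")

# A new instance simulates the Agent restarting and re-creating the check.
after_restart = FileBackedCheck(cache_file)
restored = after_restart.read_persistent_cache("last_status")
missing = after_restart.read_persistent_cache("never_written")
```

Unlike a variable on the instance, the value written before the simulated restart is still readable afterwards.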

My recommendation for implementing these functions in your check is to add helper functions for handling your variables. Specifically, add one helper to read from the cache, which wraps `read_persistent_cache()` and handles both the first-time initialization of a check and normal check runs. In addition, I recommend a second helper to write to the cache, which wraps `write_persistent_cache()` and handles conversion from other types into a JSON-formatted string.
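One way those two helpers might look is sketched below. `InMemoryCacheCheck` is a hypothetical stand-in for `AgentCheck` so the example runs on its own; the helper names `read_cache` and `write_cache` and the key `investigations` are also made up. The sketch assumes the cache returns an empty string on the first run.

```python
import json


class InMemoryCacheCheck:
    """Hypothetical stand-in for AgentCheck: stores string values in a dict."""

    def __init__(self):
        self._store = {}

    def read_persistent_cache(self, key):
        return self._store.get(key, "")  # assumption: empty string when unset

    def write_persistent_cache(self, key, value):
        if not isinstance(value, str):
            raise TypeError("cache values must be strings")
        self._store[key] = value


class InvestigationCheck(InMemoryCacheCheck):
    CACHE_KEY = "investigations"  # hypothetical key name

    def read_cache(self):
        """Return the cached dictionary, or an empty one on the first run."""
        raw = self.read_persistent_cache(self.CACHE_KEY)
        if not raw:
            return {}
        try:
            return json.loads(raw)
        except ValueError:
            # Unreadable content: start over rather than fail the check.
            return {}

    def write_cache(self, data):
        """Serialize the dictionary to a JSON string before storing it."""
        self.write_persistent_cache(self.CACHE_KEY, json.dumps(data))


check = InvestigationCheck()
first_read = check.read_cache()       # first run: nothing stored yet
check.write_cache({"inv-1": "open"})
second_read = check.read_cache()      # later run: the dictionary comes back
```

Keeping the JSON handling inside the helpers means the rest of the check only ever works with a normal dictionary.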

In our Rapid7 investigation integration, I used the Agent’s persistent cache to track which investigations had already been submitted to the event stream, so that no duplicate events are generated.
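The snippet below is not the integration's actual code, only a sketch of the de-duplication idea. It assumes each investigation is a dictionary with hypothetical `id` and `status` fields, and that the cache maps each investigation id to the status last submitted. Treating a status change as worth a new event is my assumption.

```python
def find_unsubmitted(investigations, cache):
    """Return investigations that are new or whose status changed since the last run.

    `cache` maps investigation id -> last submitted status and is updated in place.
    """
    to_submit = []
    for inv in investigations:
        if cache.get(inv["id"]) != inv["status"]:
            to_submit.append(inv)
            cache[inv["id"]] = inv["status"]
    return to_submit


cache = {}  # in a check, this would be loaded from the persistent cache
run_1 = find_unsubmitted([{"id": "inv-1", "status": "OPEN"}], cache)
run_2 = find_unsubmitted([{"id": "inv-1", "status": "OPEN"}], cache)
run_3 = find_unsubmitted([{"id": "inv-1", "status": "CLOSED"}], cache)
```

The first run yields the investigation, the second yields nothing because it was already submitted, and the third yields it again because it went from open to closed. In a check, each returned item would be turned into an event and the updated dictionary written back to the persistent cache at the end of the run.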

I leave you with some warnings regarding caching, as incorrect usage can lead to memory and storage issues. What you write into the persistent cache should come from a bounded domain. You must ensure that anything written into the persistent cache is handled correctly and is eventually removed (popped) from the cache.
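One way to keep the cache bounded is to prune it on every run before writing it back. The sketch below is an illustration only: the function name `prune_cache` and the parameters `active_ids` and `max_entries` are hypothetical, and what counts as stale depends on your integration.

```python
def prune_cache(cache, active_ids, max_entries=1000):
    """Drop entries the source no longer reports, then cap the cache size."""
    for key in list(cache):  # copy the keys so we can pop while iterating
        if key not in active_ids:
            cache.pop(key)
    # Safety net: if the cache is still too large, drop the oldest insertions first.
    while len(cache) > max_entries:
        cache.pop(next(iter(cache)))
    return cache


cache = {"inv-1": "CLOSED", "inv-2": "OPEN", "inv-3": "OPEN"}
prune_cache(cache, active_ids={"inv-2", "inv-3"}, max_entries=1)
```

Here `inv-1` is popped because it is no longer active, and `inv-2` is popped because the cap of one entry is exceeded. The size cap relies on Python dictionaries preserving insertion order.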

Written by

Mitch Nethercott

Datadog engineer with experience in network administration and configuration, application/network performance monitoring, and automation using configuration management tools. Born and raised in Connecticut, he’s been using computers since preschool and is more than equipped to troubleshoot a wide variety of problems.
