Using Telemetry to Secure Applications

Telemetry analysis came to mind recently during a technical discussion with my friends at CIX Software, located within walking distance of my office at the Stevens Institute in Hoboken. I’ve had my eye on this small start-up for the past year, not only because Sameer Malhotra and his team are so technically capable, but also because they focus on advanced security analysis of the reams of telemetry generated by the business applications we all depend on every day.

Here is their correct observation: When a software application executes in a typical operating system environment, it leaves a reasonably predictable quantitative footprint during execution. This footprint is realized in memory usage, process management, size of executables, thread attributes, and many other indicators. CIX Software has, in fact, identified 115 different footprint categories in which to search for meaningful anomalies.
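To make the idea of a quantitative footprint a little more concrete, here is a minimal Python sketch that samples a handful of such indicators for the current process. The function name footprint_snapshot and the choice of indicators are my own illustrative assumptions; they are not the categories CIX Software uses, and a real agent would collect far more.

```python
import os
import sys
import threading

try:
    import resource  # Unix-only module; not available on Windows
except ImportError:
    resource = None


def footprint_snapshot():
    """Return a few illustrative footprint indicators for this process.

    Hypothetical helper: the indicator names are assumptions chosen for
    illustration, not the categories used by any particular product.
    """
    snapshot = {
        "thread_count": threading.active_count(),
        "loaded_modules": len(sys.modules),
    }
    # Size of the interpreter executable, when its path is known.
    if sys.executable and os.path.exists(sys.executable):
        snapshot["executable_bytes"] = os.path.getsize(sys.executable)
    if resource is not None:
        usage = resource.getrusage(resource.RUSAGE_SELF)
        # Peak resident set size: kilobytes on Linux, bytes on macOS.
        snapshot["peak_rss"] = usage.ru_maxrss
    return snapshot


snapshot = footprint_snapshot()
for name, value in snapshot.items():
    print(f"{name}: {value}")
```

Sampling a snapshot like this at regular intervals yields the kind of time series in which unexpected changes can later be searched for.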

The motivation behind the CIX Software approach is that malware such as rootkits, which we all know cannot be easily identified by anti-malware tools, can be reliably highlighted only by focusing on lower-level run-time behaviors. These behaviors are best understood by analyzing the telemetry generated during application execution, across the footprint categories referenced above.

The CIX Software product, called Bushido, relies on a collection of software agents that are deployed into application run-time systems, which can be containers, virtual machines, or traditional operating systems. The agents report their run-time telemetry to a monitoring system that offers the cyber security team a view of any execution anomalies worth investigating. This provides advantages for several different groups in the enterprise:

Security compliance teams, for example, can benefit by establishing report-based dashboards that demonstrate the absence or presence of unexpected quantitative changes to the execution environment. While this is certainly not proof of absolute security, it offers good evidence beyond the usual signature-based detection, or process-oriented analysis of the development environment.

Dev/Ops teams can also benefit from the before-and-after telemetry measurements that can be made around updates or changes to applications. Having a more predictable model of how software modifications affect application execution will bring Dev/Ops closer to the cyber security community, and will help both groups establish a common framework for sharing functional requirements.

Hosting teams can benefit from the real-time alerts and notifications raised when a possible advanced persistent threat (APT) causes an application to produce unusual telemetry. In this sense, the approach is like an APT detection system for applications, but one that is based more on deviations from a profiled norm than on any preconceived notion of what a specific APT might look like.

And finally, application users benefit from knowing that the underlying software they use is being subjected to quantitative run-time analysis. One can imagine users having some defined visibility into the telemetry, perhaps as a confidence measure to increase trust that a given application has integrity. Cloud-hosted applications would benefit in this regard.
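The deviation-from-norm idea mentioned for hosting teams above can be sketched in a few lines of Python. This is only an assumption about how such a check might look: it uses a simple z-score against a baseline of earlier samples, and the names build_profile and find_deviations, the metric names, and the threshold of 3.0 are all hypothetical. I am not describing how Bushido actually scores anomalies.

```python
from statistics import mean, stdev


def build_profile(samples):
    """Summarize baseline samples as (mean, standard deviation) per metric.

    Needs at least two samples, because stdev() is undefined for one value.
    """
    profile = {}
    for metric in samples[0]:
        values = [sample[metric] for sample in samples]
        profile[metric] = (mean(values), stdev(values))
    return profile


def find_deviations(profile, observation, threshold=3.0):
    """Return metrics whose observed value lies more than `threshold`
    standard deviations away from the baseline mean."""
    flagged = {}
    for metric, (mu, sigma) in profile.items():
        if metric not in observation:
            continue
        value = observation[metric]
        if sigma == 0:
            # A perfectly constant baseline: any change counts as a deviation.
            if value != mu:
                flagged[metric] = float("inf")
            continue
        z_score = abs(value - mu) / sigma
        if z_score > threshold:
            flagged[metric] = z_score
    return flagged


# Hypothetical baseline collected while the application behaved normally.
baseline = [
    {"thread_count": 8, "rss_mb": 200},
    {"thread_count": 9, "rss_mb": 210},
    {"thread_count": 8, "rss_mb": 205},
    {"thread_count": 9, "rss_mb": 195},
]
profile = build_profile(baseline)
flagged = find_deviations(profile, {"thread_count": 9, "rss_mb": 480})
print(sorted(flagged))  # only rss_mb is far outside the profiled norm
```

Note that nothing here encodes what a particular attack looks like; the check knows only what normal looked like, which is the point of the approach.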

Obviously, the review of quantitative execution telemetry will have some challenges. Dealing with the sheer volume of telemetry seems particularly difficult, although CIX Software has gone to great lengths to include means for administrators to establish norms-based exceptions to reduce false positives. Other challenges include dealing with applications that are changed frequently, which shifts baseline measurements on an ongoing basis.
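As a rough illustration of how norms-based exceptions might cut down false positives, the sketch below drops any alert whose observed value falls inside a range an administrator has approved for that application and metric. The alert format, the apply_exceptions name, and the example values are assumptions made up for this illustration, not the product's actual configuration model.

```python
def apply_exceptions(alerts, exceptions):
    """Drop alerts whose value falls inside an administrator-approved range.

    `exceptions` maps (app, metric) to an inclusive (low, high) range.
    Both structures are hypothetical and exist only for this sketch.
    """
    remaining = []
    for alert in alerts:
        allowed = exceptions.get((alert["app"], alert["metric"]))
        if allowed is not None:
            low, high = allowed
            if low <= alert["value"] <= high:
                continue  # Known, accepted behavior: suppress the alert.
        remaining.append(alert)
    return remaining


# Hypothetical alerts raised by a deviation check.
alerts = [
    {"app": "billing", "metric": "thread_count", "value": 64},
    {"app": "billing", "metric": "rss_mb", "value": 900},
]
# The administrator knows billing legitimately spikes to many threads.
exceptions = {("billing", "thread_count"): (1, 128)}

remaining = apply_exceptions(alerts, exceptions)
print([alert["metric"] for alert in remaining])
```

The trade-off is the usual one: every exception removes noise, but it also widens the window in which a real anomaly could hide, so the ranges deserve periodic review.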

Despite these challenges, it seems prudent to consider running a tool like this for your critical applications. Even if the software is installed in purely passive mode (which is recommended for initial deployment), you can gain previously unavailable visibility into your applications. And who knows, perhaps if your other layers of defense fail in the presence of an APT, this technique might be the only thing that prevents real consequences.

I’d recommend that you have a look.