An Engine for Search

In 1995, the great Doug Comer from Purdue published The Internet Book. The title sounds presumptuous today, but it was a useful resource to explain the burgeoning Internet. Here is a cool sentence from that book: “Because all Internet services use the Internet Protocol, a computer must have IP software before it can use the Internet.” It was that kind of book. It introduced many basic concepts that we take for granted today. It’s an IP Mozart.

I have the book open on my desk right now, and I noticed that most of its back half is related to search. Here’s a sentence from the chapter on Archie and Veronica (and if these are unfamiliar, then we belong to different generations): “To find information on a large and growing Internet, one needs a tool that automatically searches for information.” Mark Zuckerberg was eleven years old when Comer penned those prescient words.

Today, automated search is so central to our day-to-day psyche that I suspect future historians will reference this capability as initiating gradual changes in the temporal lobe of our cerebral cortex. I mean, let’s face it – why bother storing stuff in your brain when you can just click and search? And relax, this doesn’t make us dumber. Instead, it frees that portion of the brain for other matters (ahem) like processing stimuli. It’s all good.

I was thinking search-related thoughts this past week while meeting with the security analytics team from Elastic – a public company with more impact than you might realize, but perhaps less name recognition. Let me share with you my understanding of the wonderful technology they develop, with emphasis, obviously, on how their products can be an essential component in an enterprise security infrastructure.

“Elastic is a search company,” explained Mike Paquette, Director of Product for the Security Market at Elastic. “Our platforms provide underlying big data search and management capability for services like Uber, Instacart, and Tinder. Our products are also used for search-related functions at major companies such as Cisco and Sprint in their customer-facing eCommerce, retail, and application infrastructure. They rely on our technology.”

The Elastic offering is centered on its flagship Elastic Stack, which is a platform that supports search, analysis, and reporting of data from any source and in any format. As you’d expect, the Elastic Stack is a collection of tools and engines available for customers to select and use for their application, either via Elastic Cloud SaaS offerings or self-managed download options. Here are the primary components of the platform:

Elasticsearch is a distributed analytics engine, available via REST APIs and JSON, that can be used with Java, Python, .NET, SQL, and PHP and integrated with big data tools such as Hadoop. The same Elasticsearch software is used for everything from laptop prototypes to massive distributed clusters supporting complex enterprise applications. The idea, as you’d expect, is to support structured and unstructured search queries and investigative questioning.
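
To make the "REST APIs and JSON" point concrete, here is a minimal sketch of what a search request might look like from Python, using only the standard library. The host (localhost:9200), the index name (auth-logs), and the field names (message, @timestamp) are my own assumptions for illustration, not anything Elastic showed me. The sketch builds the request but does not send it, so it runs without a cluster.

```python
import json
import urllib.request


def build_search_request(base_url, index, text, since="now-1h"):
    """Build (but do not send) an Elasticsearch _search request.

    The index and field names are hypothetical; adjust them to your data.
    """
    body = {
        "query": {
            "bool": {
                # Full-text match on a free-form message field.
                "must": [{"match": {"message": text}}],
                # Structured filter restricting results to a recent window.
                "filter": [{"range": {"@timestamp": {"gte": since}}}],
            }
        }
    }
    return urllib.request.Request(
        url=f"{base_url}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Assumed local endpoint and index name, for illustration only.
request = build_search_request("http://localhost:9200", "auth-logs", "failed login")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Passing the request to urllib.request.urlopen would submit it to a running cluster. The query combines an unstructured text match with a structured time filter, which is the mix of question types described above.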

Kibana is a utility sitting over Elasticsearch that provides a user interface to the search infrastructure. Data can be visualized in Kibana as histograms, pie charts, and other patterns. The tool comes with many visualizations, including its own infographic-like creation system, Canvas, and can be extended via an open source visualization grammar called Vega that supports definition of your own visualizations. Geospatial and time series data can also be woven into the visual representations.

Beats is an ingestion platform with agents acting as connectors, thus allowing data to be ingested into Elasticsearch through standard interfaces. Beats agents enable ingestion of files, system-level telemetry (CPU usage, memory usage, etc.), network traffic (including packet data), Windows event logs, Linux audit logs, server status monitoring, and cloud service data. “Beats is Elastic’s platform for single-purpose data shippers,” explained Paquette.
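
To give a feel for what a data shipper hands to Elasticsearch, here is a rough sketch of the newline-delimited JSON body accepted by the Elasticsearch bulk API: one action line followed by one document line per event. This is not Beats code; the index name (metrics-demo) and the telemetry fields are hypothetical, and a real agent handles batching, retries, and back-pressure that I have left out.

```python
import json


def to_bulk_payload(index, events):
    """Format events as a newline-delimited JSON body for the _bulk endpoint.

    Each event becomes two lines: an "index" action, then the document itself.
    """
    lines = []
    for event in events:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(event))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical system telemetry, standing in for what an agent might collect.
events = [
    {"host": "web-01", "cpu_pct": 42.5},
    {"host": "web-02", "cpu_pct": 7.1},
]
payload = to_bulk_payload("metrics-demo", events)
print(payload, end="")
```

The resulting string would be sent as the body of a POST to the cluster's _bulk endpoint with a Content-Type of application/x-ndjson.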

Logstash is a server-side utility that collects data from various sources, and then supports parsing, structural identification, and formatting into a common representation for subsequent analysis. “The security implications of this type of capability are powerful,” Paquette explained, and I agreed that such processing would be great for the usual plethora of security feeds that require normalization for analysis and correlation.
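
Here is the normalization idea in miniature. To be clear, this is plain Python rather than a Logstash pipeline, and both feed formats and all field names are invented for illustration. The point is only that two differently shaped security events end up in one common representation that can be correlated.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical syslog-style line format, invented for this sketch.
SYSLOG_PATTERN = re.compile(
    r"^(?P<ts>\S+) (?P<host>\S+) sshd: "
    r"Failed password for (?P<user>\S+) from (?P<ip>\S+)$"
)


def normalize_syslog(line):
    """Parse a syslog-style line into the common schema, or None if no match."""
    match = SYSLOG_PATTERN.match(line)
    if match is None:
        return None
    return {
        "timestamp": match["ts"],
        "host": match["host"],
        "user": match["user"],
        "source_ip": match["ip"],
        "action": "login_failed",
    }


def normalize_json_feed(raw):
    """Map a JSON record with different field names onto the same schema."""
    record = json.loads(raw)
    ts = datetime.fromtimestamp(record["epoch"], tz=timezone.utc)
    return {
        "timestamp": ts.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "host": record["device"],
        "user": record["account"],
        "source_ip": record["src"],
        "action": "login_failed" if record["result"] == "deny" else "login_ok",
    }


syslog_event = normalize_syslog(
    "2019-04-01T12:00:00Z bastion sshd: Failed password for root from 203.0.113.7"
)
json_event = normalize_json_feed(
    '{"epoch": 1554120000, "device": "vpn-01", "account": "alice", '
    '"src": "198.51.100.9", "result": "deny"}'
)
print(syslog_event)
print(json_event)
```

Once both events share the same keys, a single query on source_ip or action covers both feeds, which is what makes correlation across sources practical.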

As you’d expect, my bias was on Elastic Stack for cyber security, and the possibilities here are interesting. Certainly, the open-source platform is well-suited for SIEM augmentation, and many enterprise teams use it in this manner. But during our discussion, Paquette was willing to admit that the Elastic Stack suite can truly serve at the heart of any security analytics platform. Apparently, a significant number of security vendors at RSA this year were demonstrating platforms that use Elastic inside.

Founded in 2012 by Shay Banon, Elastic is now a massive company, with nearly 1,500 employees around the world. Their IPO last October – they list on the NYSE – produced an estimated valuation of between two and three billion dollars. Their open source community is also massive, with something like 350 million downloads since they started. Monetization of all this open source is the obvious challenge for Elastic: They reported $160M in revenue for fiscal year 2018, which represented 81% YOY growth.

I find it hard not to be excited at the prospects for the company – less from the perspective of investment (I can barely balance a checkbook), but more from the perspective of a cyber security analyst understanding the immense size of opportunity in my area. And yes, adjacent use-cases must be incredibly enticing, perhaps even distracting, to the leadership team. My advice would be to invest in cyber – but that’s just me.

I hope you’ll find some time to reach out to Elastic, especially if you work in information security. Mike and his team are well-versed in the platform and can give you a nice demonstration of how it all works. And maybe if Elastic helps take on some of your SOC-related data analysis tasks, you can re-focus a portion of your own cerebral cortex on something else, like maybe begging users to stop clicking on phishing emails!

As always, please share with us your experiences after you speak with Elastic.