Finding and Protecting Databases

We in computing don’t pay enough attention to our past. Unlike physicists, with their innate sense of history, we tend to value our ancestry about as much as a frayed WordPerfect 4.2 manual. This is simply wrong – and it is a shame that students are not exposed to the amazing stories of the colorful pioneers who invented computer science. Few of these early scientists lived a life as interesting as the inventor of the relational database, Ted Codd.

Born in the UK, Edgar Frank Codd (known as Ted) flew seaplane bombers for the RAF during World War II. After the war, he moved to the US, and despite having studied math at Oxford, he took a job as a sales clerk at Macy’s. After teaching math in Tennessee, he found his way to IBM, where he would begin his research. His career at IBM was interrupted for a few years when he moved to Canada as an expression of his disgust for Joseph McCarthy.

When he returned to IBM, his work focused on the theoretical basis for processing data, and he wrote one of the most influential papers in the history of computing – and arguably in the history of science: “A Relational Model of Data for Large Shared Data Banks.” If you’ve never read this paper, then you should, because without it there would be no enterprise computing and, I believe, none of this hypertext processing we call the World Wide Web.

I was thinking about Codd’s amazing contributions last week while talking with Brett Helm, founder and CEO of DB CyberTech. Based in San Diego, and launched in 2012, the company focuses on finding databases and their connected applications in the enterprise, and then improving the security, compliance, and privacy aspects of the structured data. I asked Helm to share the details of their platform approach, and here is what I learned:

“We focus on discovering structured data stores including the tables, elements, and what applications are accessing them,” Helm explained. “We’ve learned that too many IT and security teams do not know where their structured data is located. Our platform uses advanced techniques, including deep protocol inspection and natural language processing, to find databases and determine the value of the data. Once an accurate inventory has been established, then we can secure the data.”

As a career CISO, I agreed with this emphasis on visibility, but I asked about the accuracy of inventories. Why, I wondered, was it so hard for CISO teams to find their databases? “Enterprise security teams struggle with their structured data inventories because they lack deep visibility into the data tier,” Helm explained. “Once DB CyberTech ‘shines a light’ on this infrastructure, a wealth of critical insights emerges.”

The discovery component of the platform operates at Layer 7 and analyzes collected data, reviewing flow patterns, source and destination information, application associations, and any other metadata offering hints that a database is being used. “Our deep protocol analysis uses machine learning techniques,” Helm said, “which allows us to train the platform to identify evidence of previously unseen database activity.”
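To make the idea of building a database inventory from network metadata concrete, here is a deliberately simplified sketch. It is not DB CyberTech’s method: instead of Layer 7 protocol inspection or machine learning, it only matches flow records against well-known default database ports, which would miss any database listening on a non-default port. The flow record fields (`src_ip`, `dst_ip`, `dst_port`) and the function name `build_inventory` are assumptions made up for this illustration.

```python
from collections import defaultdict

# Well-known default listener ports. Real deployments often move databases
# to other ports, which is one reason port matching alone is not enough.
DEFAULT_DB_PORTS = {
    1433: "Microsoft SQL Server",
    1521: "Oracle Database",
    3306: "MySQL",
    5432: "PostgreSQL",
    27017: "MongoDB",
}


def build_inventory(flows):
    """Map each suspected database endpoint to the clients that talk to it.

    `flows` is an iterable of dicts with hypothetical keys
    'src_ip', 'dst_ip', and 'dst_port'.
    """
    inventory = defaultdict(set)
    for flow in flows:
        engine = DEFAULT_DB_PORTS.get(flow["dst_port"])
        if engine is not None:
            endpoint = (flow["dst_ip"], flow["dst_port"], engine)
            inventory[endpoint].add(flow["src_ip"])
    # Sort the client lists so the result is stable and easy to read.
    return {endpoint: sorted(clients) for endpoint, clients in inventory.items()}


if __name__ == "__main__":
    sample_flows = [
        {"src_ip": "10.0.1.5", "dst_ip": "10.0.9.20", "dst_port": 5432},
        {"src_ip": "10.0.1.7", "dst_ip": "10.0.9.20", "dst_port": 5432},
        {"src_ip": "10.0.1.5", "dst_ip": "10.0.9.30", "dst_port": 443},
    ]
    for endpoint, clients in build_inventory(sample_flows).items():
        print(endpoint, "<-", clients)
```

Even this crude version shows the shape of the output that matters to a security team: a list of database endpoints, each paired with the applications or hosts that connect to it.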

The privacy component of the platform is all about data classification of discovered structured data. This requires use of novel techniques such as natural language processing (NLP) and user behavioral analytics (UBA) to accurately assess the value of business and personal information. The result is a non-intrusive classification engine that allows organizations to support all privacy-related objectives.

The security component of the platform implements behavioral analytics and machine learning algorithms. Ingested activity metadata is analyzed against autonomously modeled profiles of structured data usage. Any deviations from expected database norms are reported and used to alert security teams that a compromise might be underway. Such deviations are often the leading indicator that data loss is imminent.
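The notion of comparing activity against a modeled profile can be illustrated with a very small sketch. This is an assumption-laden stand-in, not the platform’s algorithm: it models “normal” as the mean and standard deviation of a single metric (say, rows returned per hour for one application account) and flags observations that sit more than a chosen number of standard deviations away. The function name `is_anomalous`, the metric, and the threshold of 3.0 are all hypothetical.

```python
from statistics import mean, pstdev


def is_anomalous(baseline, observed, threshold=3.0):
    """Return True if `observed` deviates strongly from the baseline profile.

    `baseline` is a non-empty list of historical values for one metric;
    the deviation is measured as a z-score against that history.
    """
    mu = mean(baseline)
    sigma = pstdev(baseline)
    if sigma == 0:
        # A perfectly flat history: any change at all is a deviation.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


if __name__ == "__main__":
    rows_per_hour = [100, 110, 95, 105, 90]   # hypothetical history
    print(is_anomalous(rows_per_hour, 104))   # ordinary activity
    print(is_anomalous(rows_per_hour, 5000))  # sudden bulk read
```

A production system would model many metrics per user, application, and table, and would update its profiles over time, but the alerting principle is the same: a sharp departure from learned behavior is what gets reported.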

The compliance component of the platform provides advanced support for General Data Protection Regulation (GDPR) articles, with emphasis on data subject rights, records of processing, and notification of breaches. “Our platform supports GDPR articles 32 and 33,” Helm explained, “and this type of compliance processing for structured data is increasingly becoming a global requirement.”

One challenge I shared with Helm is that organizations might like to use one platform for discovery and support of both structured and unstructured data. Helm acknowledged that the platform’s NLP and similar constructs would make this extension a useful feature to consider for the future. I hope they move forward, at minimum with pre-integrated partnerships that allow inclusion of unstructured data in the mix.

Ultimately, the DB CyberTech approach looks to me like a winner – and the idea that visibility can be improved to build more accurate inventories of enterprise databases also seems essential. If this sounds attractive to you – and I hope it does – then give the team at DB CyberTech a call and ask them to share details of their solution to structured database visibility and control.

Oh – and by the way, just like Ted Codd during WWII, Brett Helm flew bombers for the USAF in Desert Storm. (Must be something about databases and bombers.) Make sure to ask Helm about his career experience – and please thank him for his service – when you meet and talk.