Data Masking, Discovery and DLP
The data loss prevention space continues to confuse buyers. A question recently came in from a company that is looking for a data discovery tool, with the end goal of protecting its sensitive data. The company is reviewing technologies across several different spaces: eDiscovery, Data Masking and Data Loss Prevention. Its requirements include discovering data at rest and preventing the loss of sensitive data both in motion and at rest. While making the case that a comprehensive DLP technology was in order, I was asked to explain some of the differences between the technologies and the strengths of DLP in this regard. Below is part of my response.
Data Masking and eDiscovery point solutions have some serious drawbacks if you intend to use them to address DLP requirements. Here are just a few we see regularly:
- Accuracy and Detection Methods. Point solutions typically identify sensitive data with very basic detection methods such as regex pattern and keyword matching. The net result is a high rate of false positives, which leads to overflowing incident queues, and a high rate of false negatives. False negatives might be an even bigger problem, because you can’t quantify the impact of activity that is flat-out missed! DLP solutions, on the other hand, have many different detection methods that can be layered for more accurate results. And for PII/PCI, we can tune DLP policies for near 100% accuracy (no exaggeration).
- Detection Limited to Certain Data Types. With basic detection methods, it’s easy to identify credit card numbers, Social Security Numbers, names, email addresses, medical IDs, ABA bank routing numbers, and financial codes. But what if the data is more complex than that? Data discovery tools alone are not designed to address this. DLP tools are specifically designed to find all types of data, and in our experience they have proven to be far more effective.
- Comprehensive Coverage. Data masking and discovery tools are not part of a comprehensive data protection ecosystem. So, when you go from discovering data at rest to the next step of DLP (data in motion/at rest), you’ll end up with multiple vendors and multiple management consoles. Those who don’t live in the DLP world may not recognize the time it takes to triage and manage DLP incidents. Each incident should be seen as a trouble ticket that needs to be worked through to closure. That means investigating and confirming the incident, then bringing in other resources (individuals or departments who know the data more intimately, HR, Legal, etc.) until the incident can finally be closed. This problem is exacerbated by solutions with inaccurate detection and high false positive rates.
- Negative Impact of Multiple Solutions. With 3-4 different “DLP” solutions, you have to work across multiple consoles to create and manage the many complex policies, run comprehensive reports and manage incident queues. And if the detection capabilities of those 3-4 solutions are not up to snuff, each solution’s queue could be throwing off dozens or more incidents each day (we often see hundreds or even thousands of incidents). You end up with a terrible mess that no one can manage. If no one manages it, then you’re not effectively protecting your sensitive data and you’re only getting a fraction of the value possible from your DLP investment.
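To make the first point about layered detection more concrete, here is a minimal Python sketch. It is an illustration only and does not represent any particular DLP product: the function names, the keyword list, and the 40-character context window are all assumptions chosen for the example. It compares a regex-only scan for card numbers with a scan that also requires a valid Luhn checksum and a nearby context keyword.

```python
import re

# Candidate pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){12,15}\d\b")

# Hypothetical context keywords; a real policy would be tuned per organization.
CONTEXT_KEYWORDS = ("card", "visa", "mastercard", "amex", "cvv", "expir")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def regex_only(text: str) -> list[str]:
    """Basic approach: flag anything that looks like a card number."""
    return [m.group(0) for m in CARD_PATTERN.finditer(text)]


def layered(text: str, window: int = 40) -> list[str]:
    """Layered approach: pattern match, then checksum, then nearby keyword."""
    hits = []
    for m in CARD_PATTERN.finditer(text):
        digits = re.sub(r"\D", "", m.group(0))
        if not luhn_valid(digits):
            continue  # looks like a card number but fails the checksum
        start = max(0, m.start() - window)
        context = text[start:m.end() + window].lower()
        if any(keyword in context for keyword in CONTEXT_KEYWORDS):
            hits.append(m.group(0))
    return hits


SAMPLE = (
    "Order ref 1234567890123456 shipped. "
    "Visa card 4111 1111 1111 1111 on file."
)

print(regex_only(SAMPLE))  # flags the order reference and the card number
print(layered(SAMPLE))     # flags only the card number
```

In this sample the regex-only scan raises two incidents, one of which is just an order reference, while the layered scan raises one. Note the tradeoff: requiring a context keyword cuts false positives but can introduce false negatives when a real card number appears without any of the keywords nearby, which is why this kind of policy needs tuning.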