
What is Data Discovery?

by Nate Lord on Tuesday September 29, 2020


Learn about data discovery and the role it plays in many data protection solutions in Data Protection 101, our series on the fundamentals of data security.


Data discovery involves identifying and locating sensitive or regulated data so that it can be adequately protected or securely removed. One of the biggest business intelligence trends in recent years, data discovery is a priority for many enterprise security teams because it is a crucial component of compliance readiness. It involves auditing sensitive or regulated information, including confidential or proprietary data as well as protected data such as personally identifiable information (PII) or electronic protected health information (ePHI). Data discovery enables security teams to identify this information in order to protect it and ensure its confidentiality, integrity, and availability.


In today’s era of remote work, business is frequently conducted in the cloud, where file sharing and storage are the norm. This poses a challenge to enterprises that need to know precisely where their sensitive or regulated data resides. Because business processes are so interconnected today, data ends up in many systems, applications, databases, and shared files, making its protection, authentication, and confidentiality a challenge for enterprises. Data discovery is a way to identify a company’s data in full and to make sure that appropriate controls are in place to meet security best practices and regulatory compliance requirements.


The true goals of data discovery, therefore, are to identify and classify data so that it becomes more manageable to determine the threat, the affected resources requiring protection, and the fallout of potential data leaks. Gartner predicted the need for context-aware security back in 2012, following the growth of cloud computing, IT consumerization, and the quickly evolving threats to sensitive enterprise data. Gartner touted context-aware security as being “able to cope with emerging threats and evolving business requirements for greater openness.” Gartner also suggested that CISOs begin moving toward context-aware and adaptive security infrastructure, coupled with secure web gateways and endpoint protection platforms, to replace older, now-insufficient static security infrastructures like firewalls.

Gartner analyst Neil MacDonald described context-aware security as “the use of supplemental information to improve security decisions at the time they are made, resulting in more accurate security decisions capable of supporting dynamic business and IT environments.” Gartner was right: by having a full understanding of contextual factors such as file type, sensitivity, user, and location, security teams and the solutions they employ can make more effective and timely decisions when protecting information across a wide range of use cases. Data discovery provides many of these contextual clues by identifying sensitive and regulated data requiring protection.
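To make the idea of a context-aware decision concrete, here is a minimal Python sketch. The four fields mirror the contextual factors named above (file type, sensitivity, user, and location); the class name, the label values, the list of "opaque" file types, and the rules themselves are assumptions chosen for illustration, not any vendor's actual policy engine.

```python
from dataclasses import dataclass

# Hypothetical: archive types whose contents cannot be inspected in transit.
OPAQUE_TYPES = {"zip", "7z"}


@dataclass(frozen=True)
class AccessContext:
    """Contextual factors for one requested transfer (illustrative values)."""
    file_type: str    # e.g. "xlsx", "zip"
    sensitivity: str  # "public", "internal", or "restricted"
    user_role: str    # e.g. "finance", "contractor"
    location: str     # "corporate_network" or "external"


def decide(ctx: AccessContext) -> str:
    """Return "allow", "encrypt", or "block" based on the full context."""
    if ctx.sensitivity == "public":
        return "allow"
    external = ctx.location != "corporate_network"
    if ctx.sensitivity == "restricted":
        # Restricted data never leaves the network or reaches contractors.
        if external or ctx.user_role == "contractor":
            return "block"
        return "encrypt"
    # Everything else is treated as "internal" in this sketch.
    if external and ctx.file_type in OPAQUE_TYPES:
        return "block"
    return "encrypt" if external else "allow"
```

The same file can produce different outcomes depending on who requests it and from where, which is the difference between this approach and a static rule keyed on a single attribute.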


Enterprise data is moving from one location to another at lightning speed and is being stored in countless devices and cloud storage applications. Employees, partners, and customers are accessing this data from anywhere and at any time, so identifying, locating, and classifying that data in order to protect it is the priority of data discovery security solutions.

The benefits of data discovery and context-aware security solutions are far-reaching and include:

  • Enhancing the process of understanding the data the enterprise owns, where it is stored, who can access it and where, and how it will be transmitted.
  • Applying pre-defined classifications and protection policies to enterprise data.
  • Continuous and comprehensive monitoring of data access and activity.
  • Automatic data classification based on context.
  • Risk management and regulatory compliance.
  • Complete data visibility.
  • Identification, classification, and tracking of sensitive data.
  • The ability to apply protective controls to data in real time based on predefined policies and contextual factors.
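As a rough illustration of automatic identification and classification, the Python sketch below scans free text for a few common patterns and returns a set of labels. The label names and regular expressions are assumptions made for this example; the Luhn checksum is the standard check used to filter out digit runs that cannot be payment card numbers. Production discovery tools use far richer detection than this.

```python
from __future__ import annotations

import re

# Illustrative patterns only; real tools use many more detectors.
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL_RE = re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")
CARD_RE = re.compile(r"\b(?:\d[ -]?){12,18}\d\b")  # 13 to 19 digits


def luhn_valid(number: str) -> bool:
    """Check the Luhn checksum over the digits in `number`."""
    digits = [int(c) for c in number if c.isdigit()]
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:  # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def classify(text: str) -> set[str]:
    """Return the (hypothetical) sensitivity labels found in `text`."""
    labels = set()
    if SSN_RE.search(text):
        labels.add("pii:ssn")
    if EMAIL_RE.search(text):
        labels.add("pii:email")
    if any(luhn_valid(m.group()) for m in CARD_RE.finditer(text)):
        labels.add("cardholder_data")
    return labels
```

For example, `classify("Card 4111 1111 1111 1111, contact jane@example.com")` yields both the cardholder and email labels, while text with no matching pattern yields an empty set and could be treated as non-sensitive.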


There are several issues that cause concern for organizations that are attempting to better protect and use business intelligence. Most of these issues boil down to three areas:

  • The Sheer Amount of Data: Whether it’s a wave of new customers making transactions or emails going out to a new list of thousands of leads, a large amount of data can flow into an organization.
  • Different Data Types: In addition to the inflow of data, there are typically multiple types. Some types of data may be sensitive, while other types are less so, and some not at all. Tracking, securing, and purging becomes more difficult based on the types of intelligence being collected.
  • Managing It All: Collecting, purging, and protecting data can all be done. That said, setting up a standard operating procedure and remaining consistent across the organization can be tricky.


There are five generalized steps when it comes to data discovery.

  1. Gather: All of the data, both sensitive and non-sensitive, needs to be gathered and made easy to view. To ensure compliance with regulations, the locations of the collected information should be consolidated as much as possible and documented.
  2. Analyze: Once all the data is in a manageable environment, it’s time to analyze it. At this stage, it’s important to separate data that is sensitive (e.g., cardholder data) from data that is necessary, yet not sensitive (e.g., order history). You will also determine what data you need to keep (for SOX compliance, other regulations, or for business purposes) and what data can be discarded.
  3. Purge: All unnecessary data should be purged. A policy should be set for this data to be purged once it is no longer necessary.
  4. Protect: All data should then be protected. These protections should be both physical (e.g., storing data in a locked cabinet or room) and digital (having a firewall, encryption, etc.).
  5. Use: There could be multiple insights gained from the data discovered. These insights could be used to improve your sales practices, operations and other processes.
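The five steps above can be sketched as a small Python pipeline. Everything here is a simplified assumption for illustration: the `Record` fields, the one-year retention rule, the example data, and the use of masking as a stand-in for real protections such as encryption and access controls.

```python
from collections import Counter
from dataclasses import dataclass, replace
from datetime import date, timedelta


@dataclass(frozen=True)
class Record:
    source: str      # where the data was found
    kind: str        # e.g. "cardholder", "order_history"
    value: str
    last_used: date


SENSITIVE_KINDS = {"cardholder"}   # assumed classification
RETENTION = timedelta(days=365)    # assumed retention policy


def gather(*sources):
    """Step 1: merge every source into one inventory."""
    return [record for source in sources for record in source]


def analyze(records, today):
    """Step 2: split records into (keep, discard) by the retention rule."""
    keep = [r for r in records if today - r.last_used <= RETENTION]
    discard = [r for r in records if today - r.last_used > RETENTION]
    return keep, discard


def protect(records):
    """Step 4: mask sensitive values, leaving the last four characters."""
    out = []
    for r in records:
        if r.kind in SENSITIVE_KINDS:
            masked = "*" * max(len(r.value) - 4, 0) + r.value[-4:]
            r = replace(r, value=masked)
        out.append(r)
    return out


def use(records):
    """Step 5: a simple insight, here the record count per kind."""
    return Counter(r.kind for r in records)


today = date(2020, 9, 29)
crm = [Record("crm", "marketing_lead", "lead@example.com", date(2018, 1, 5))]
shop = [
    Record("shop", "cardholder", "4111111111111111", date(2020, 9, 1)),
    Record("shop", "order_history", "order-1001", date(2020, 8, 15)),
]

inventory = gather(crm, shop)
keep, discard = analyze(inventory, today)  # Step 3 (purge): drop `discard`
protected = protect(keep)
summary = use(protected)
```

In this run the stale marketing lead lands in `discard`, the card number is masked before any further use, and `summary` counts only the records that were kept.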

Enterprises today are creating data at unprecedented rates, making data discovery even more critical to maintaining a firm grasp on your company’s security requirements. Data discovery enables enterprises to adequately assess the full data picture and implement the appropriate security measures to prevent the loss of sensitive data and avoid devastating financial and reputational consequences for the enterprise.

Tags:  Data Protection 101
