
Data Loss Prevention in AWS: The Complete Guide

by Chris Brook on Monday February 5, 2024


Amazon Web Services provides native offerings that organizations can use to protect information in the cloud, and best-of-breed tools like Digital Guardian can extend your DLP coverage further.

Data loss prevention in AWS (Amazon Web Services) is integral to more commercial operations now than ever before. Protecting sensitive information from falling into the wrong hands can make the difference between meeting major milestones and incurring significant costs over compliance failures.

Organizations leveraging the immense variety of tools available to them in cloud suites, such as Amazon Web Services, must not only implement their data loss prevention (DLP) strategies effectively but also adapt them to the distributed ecosystem of the cloud. 

Luckily, AWS offers tools to ease the burden.

You can take advantage of offerings like Amazon Macie, EventBridge, and Security Hub to protect sensitive data in your cloud deployment just as proficiently as you would on your local network. In this guide, we’ll provide an overview of data loss prevention in AWS as well as strategies and tools you can use to bolster your AWS security. 



What is Data Loss Prevention in AWS?

Data loss prevention in Amazon Web Services primarily revolves around Amazon Macie, the platform's proprietary data security service. Macie leverages machine learning (ML) and pattern matching to continuously analyze data in S3 buckets. It also integrates with various remediation tools.

Macie ties into other AWS services like EventBridge and Security Hub to help gather security findings wherever you intend to process them, enriching your DLP workflow with third-party tooling as needed.


How Does Data Loss Prevention in AWS Work?

DLP in AWS is handled first and foremost through the use of Macie. 

Macie uses powerful machine learning, pattern matching (using regular expressions), and rule-based data processing to continually evaluate data on your S3 buckets and automatically discover sensitive data.

You can leverage Macie's analytics features to find anything from sensitive financial information to mission-critical credentials and more. Macie also evaluates the security and access settings of your buckets and generates a finding when a change weakens them, such as a bucket becoming publicly accessible, so the details can be analyzed further.

Users can also define allow lists, which specify text or patterns that Macie should ignore. This ensures Macie reacts only to data considered sensitive in your context instead of generating findings for safe, public information present in your data.
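To make the idea of pattern matching with an allow list concrete, here is a minimal local sketch in Python. It is not how Macie is implemented; the card-number pattern, the allow-listed test value, and the function name are all assumptions chosen for illustration.

```python
import re

# Hypothetical detector: a simple pattern for 16-digit, card-like numbers.
CARD_PATTERN = re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b")

# Hypothetical allow list: known test values that should not raise findings.
ALLOW_LIST = {"4111 1111 1111 1111"}


def find_sensitive(text, pattern=CARD_PATTERN, allow_list=ALLOW_LIST):
    """Return pattern matches in text, skipping allow-listed values."""
    matches = (m.group(0) for m in pattern.finditer(text))
    return [m for m in matches if m not in allow_list]


sample = "Test card 4111 1111 1111 1111, customer card 5500-0000-0000-0004."
print(find_sensitive(sample))
```

Only the second number is reported, because the first one appears in the allow list.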

The entire DLP process, as handled through the use of Macie on the AWS platform, can be condensed into three key disciplines:

  • Monitoring Data: Macie first builds an inventory of your S3 buckets, then continually evaluates their security and access controls. Data types deemed sensitive are also recorded to provide you with a detailed overview of the security landscape your organization's data presents.

Sensitivity scores are used to help convey data risk severity at a glance alongside the current encryption and public access settings for each of your buckets.   

  • Analyzing Findings: Macie’s reporting on its findings includes severity ratings and additional details regarding the data in question. 

Whether potentially sensitive data was found to be accessible by unauthorized users or some other issue concerning bucket security popped up on Macie's radar, it generates a finding for your team to assess. 

A dedicated findings section in Macie simplifies the handling of these reports.

  • Remediation: Both manual and automated remediation through Macie involve using additional tools. EventBridge and Security Hub, both tools on the AWS platform, connect directly to Macie to use its findings. 

EventBridge can transmit Macie's findings to various endpoints, while Security Hub can further refine and streamline the format of Macie's findings to match that of other critical reports within your organization's AWS ecosystem.


How Does Data Loss Prevention in AWS Help with Compliance?

Compliance is certainly made easier by leveraging Macie in AWS. Here's how it measures up to the requirements of some of the regulations organizations must commonly accommodate:

  • HIPAA: Macie makes it much easier to keep tabs on sensitive health information of the kind HIPAA protects, identifying the data at rest and reporting its location.

Macie's integrations with EventBridge and Security Hub also enable you to send notifications on demand if anything suspicious happens to your data stores.

  • GDPR: Apart from obtaining consent to use an individual’s data, you'll need to restrict data collection to only the essential purposes to comply with the General Data Protection Regulation (GDPR). You should also explain how long data can be expected to be stored and allow individuals to request the deletion of personal information at any time.

Macie's ability to pinpoint the exact location of sensitive data can also be useful in this situation.

The precision with which Macie can match such data, even in huge datasets, makes adding new patterns for specific individuals possible. This can greatly simplify deletion requests and reduce the risk of sensitive data being left behind by accident.

DLP in AWS can also accommodate other privacy regulations whose requirements largely mirror those of the GDPR covered above. Macie can detect specific patterns in your data, allowing you to quickly locate items that require deletion.

  • PCI-DSS: To accommodate the three primary requirements of the Payment Card Industry Data Security Standard (PCI-DSS), you’ll need to locate all cardholder data you have on file, eliminate potential vulnerabilities through the deletion of unnecessary information, and document all threat remediation details to send to the financial institutions you work with.

Macie makes all three of these objectives possible either on its own or through its integrations. For example, steps taken to identify and eliminate vulnerabilities within your S3 buckets are effectively documented by default, saving time and hassle.

We'll cover the DLP tools available to you in AWS in greater detail below, in addition to effective implementation strategies and more.


Tools for Data Loss Prevention in AWS

We've already mentioned the three main tools you can expect to use for data loss prevention in AWS above. Let's dive a bit deeper into each of these: 

Macie

Amazon Macie does three things:

  • Identifies data vulnerabilities: Macie leverages pattern-matching and machine-learning approaches to discover at-risk data in your S3 buckets. Monitoring is continuous and starts the moment Macie is enabled.
  • Reports risks: Macie's dashboard presents findings in an organized fashion, offering quick visibility into all data security risks identified in your data stores. Statistics are made available to power more detailed research into your risk profile, and an API can be used for greater convenience.
  • Integrates with remediation tools: Macie makes it possible for you to easily automate remediation processes or signal for manual intervention in real time by integrating with other AWS services (EventBridge and Security Hub), as well as third-party tools like Digital Guardian.
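As a rough illustration of working with findings through the API, the sketch below lists high-severity findings and fetches their details. It assumes a boto3 `macie2` client is passed in (for example, `boto3.client("macie2")`); the function name and the choice to filter on `severity.description` are assumptions for this example, and pagination is left out for brevity.

```python
def high_severity_findings(macie_client, max_ids=50):
    """Fetch details for Macie findings whose severity is High.

    macie_client is assumed to be a boto3 "macie2" client.
    Only the first page of results is retrieved in this sketch.
    """
    listing = macie_client.list_findings(
        findingCriteria={
            "criterion": {"severity.description": {"eq": ["High"]}}
        },
        maxResults=max_ids,
    )
    finding_ids = listing.get("findingIds", [])
    if not finding_ids:
        return []
    # get_findings returns the full finding records for the given IDs.
    details = macie_client.get_findings(findingIds=finding_ids)
    return details.get("findings", [])
```

Passing the client in as a parameter keeps the function easy to exercise with a stub before pointing it at a real account.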

EventBridge

EventBridge facilitates event-driven system designs where applications are only loosely tethered to one another. In the case of supporting DLP workflows in AWS, EventBridge can help move messages from Macie to appropriate endpoints within AWS or your own external systems.

EventBridge works with two types of integrations: pipes and buses. Where pipes are intended to patch a given source over to a single endpoint, buses can handle one-to-many (or even many-to-many) broadcasts. Data can also be altered in EventBridge before transmission.

Security Hub

AWS Security Hub is used to streamline and homogenize the reports and security transmissions generated by all manner of AWS services and many third-party offerings. 

Besides consolidating such security data in a single place, Security Hub also checks your organization's AWS security posture against relevant security standards, like the PCI-DSS.

Your security controls are automatically cross-referenced with established best practices to help keep your company in the clear. Security Hub can transmit critical findings to other systems through EventBridge to help power automated remediation strategies.


Strategies for AWS Data Loss Prevention

Wondering how to put the tools covered above to optimal use? Here are a few fairly common strategies to start with:

Macie to EventBridge

This approach leverages Macie's built-in integration with EventBridge to convey finding events to endpoints you prioritize. 

An example of this strategy in action would be for findings involving exposed private health information to be sent through EventBridge to Amazon Simple Notification Service or a custom Lambda function that then encrypts the S3 objects involved or restricts access to them.
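A Lambda function for this kind of remediation might look like the following sketch. It takes the "restrict access" route by enabling S3 Block Public Access on the affected bucket, which is only one possible response. The helper names are hypothetical, and the event layout (`detail.resourcesAffected.s3Bucket.name`) reflects our understanding of the Macie finding event, so verify it against a real event before relying on it.

```python
def bucket_from_finding(event):
    """Extract the affected bucket name from a Macie finding event."""
    return event["detail"]["resourcesAffected"]["s3Bucket"]["name"]


def restrict_bucket(event, s3_client):
    """Turn on all four S3 Block Public Access settings for the bucket."""
    bucket = bucket_from_finding(event)
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration={
            "BlockPublicAcls": True,
            "IgnorePublicAcls": True,
            "BlockPublicPolicy": True,
            "RestrictPublicBuckets": True,
        },
    )
    return bucket


def lambda_handler(event, context):
    # boto3 is imported here so the helpers above can be tested without it.
    import boto3
    return restrict_bucket(event, boto3.client("s3"))
```

The function's execution role would need permission to change the bucket's public access block settings.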

EventBridge also allows for rules to be implemented that screen findings for qualifying criteria before sending them on their way. This is handled using the EventBridge event schema for findings.
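As a sketch of such a rule, the Python below defines an event pattern that selects only high-severity Macie findings, plus a small local matcher for trying the pattern against sample events. The matcher handles exact-value matching only and is not a full implementation of EventBridge pattern semantics. The rule name and the `create_rule` helper are hypothetical; `events_client` is assumed to be a boto3 `events` client.

```python
import json

# Event pattern: Macie findings whose severity description is "High".
MACIE_HIGH_SEVERITY_PATTERN = {
    "source": ["aws.macie"],
    "detail-type": ["Macie Finding"],
    "detail": {"severity": {"description": ["High"]}},
}


def matches(pattern, event):
    """Local check for exact-value patterns (a subset of EventBridge rules)."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            # Nested pattern: the event must have a matching nested object.
            if not isinstance(event[key], dict):
                return False
            if not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            # Leaf pattern: the event value must be one of the listed values.
            return False
    return True


def create_rule(events_client, name="macie-high-severity"):
    """Register the pattern as an EventBridge rule (name is hypothetical)."""
    return events_client.put_rule(
        Name=name,
        EventPattern=json.dumps(MACIE_HIGH_SEVERITY_PATTERN),
        State="ENABLED",
    )
```

A target, such as an SNS topic or Lambda function, still has to be attached to the rule before matching findings are delivered anywhere.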

Macie to Security Hub

If you prefer leveraging Security Hub's more comprehensive reporting features, the findings that Macie generates can be published to AWS Security Hub as well as EventBridge. This approach lends itself particularly well to systems in which findings originate from more than one region.

Security Hub can aggregate findings data from multiple regions for use in industry-standard security reports. 

When Security Hub is leveraged for processing, findings from across all sources are streamlined into a single format, named the AWS Security Finding Format, or "ASFF." Findings sources, resources affected, and status reports can all be found in this common format.
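Because every finding shares the same shape, a simple report takes little code. The sketch below counts ASFF findings by severity label and by region; it assumes each finding carries the `Severity.Label` field and, where Security Hub has populated it, the `Region` field. The sample records are invented for illustration and omit most ASFF fields.

```python
from collections import Counter


def summarize_asff(findings):
    """Count ASFF findings by severity label and by region."""
    by_severity = Counter(f["Severity"]["Label"] for f in findings)
    # Region may be absent on some records, so fall back to "unknown".
    by_region = Counter(f.get("Region", "unknown") for f in findings)
    return by_severity, by_region


# Invented sample records containing only the fields used above.
sample_findings = [
    {"Title": "Bucket is publicly accessible",
     "Severity": {"Label": "HIGH"}, "Region": "us-east-1"},
    {"Title": "Object contains financial information",
     "Severity": {"Label": "HIGH"}, "Region": "eu-west-1"},
    {"Title": "Default encryption disabled",
     "Severity": {"Label": "LOW"}, "Region": "us-east-1"},
]

severity_counts, region_counts = summarize_asff(sample_findings)
print(dict(severity_counts))
print(dict(region_counts))
```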

Once assessed in context, if a given finding warrants further action, EventBridge can be connected to your Security Hub service to transmit details to additional endpoints.

Third-Party DLP Automation

Macie's integration with EventBridge can be leveraged to transmit findings event data to various HTTP-based endpoints and AWS services. 

Connecting EventBridge to third-party tools and services is as easy as adding their API endpoints as valid targets for qualifying events. If necessary, messages can be transformed to meet the requirements of the chosen endpoints before they are transmitted by EventBridge.

Digital Guardian can be integrated directly with Amazon Macie to unite S3 bucket-derived insights with the rest of your enterprise's monitored endpoints. This makes it possible to combine Macie’s findings with Digital Guardian endpoint events for security over each file's lifecycle.

Besides supporting more complex reporting and compliance use cases, this direct integration makes it possible for remediation to actually be triggered automatically from both your network's endpoints and the actual data store it pulls from.


Best Practices for Data Loss Prevention in AWS

You can get more out of your AWS DLP strategy by adopting the following best practices.

Use Both EventBridge and Security Hub

Although AWS Security Hub offers a wealth of reporting functionality to assist in meeting strict regulatory standards, it cannot power a fully automated remediation process on its own. 

EventBridge literally bridges the gaps between Macie, Security Hub, and the types of tools you'll need to take action when data is in jeopardy.

Leverage Endpoint Security Tools

Endpoint security tools like Digital Guardian Endpoint DLP enhance the data loss prevention functionality found in AWS by extending your security strategy's reach. 

Instead of being limited to monitoring only the data stored in S3 buckets, you can keep tabs on information accessed or viewed via the devices your team members use on the job as well.

Store Macie Findings in S3 Buckets

Macie’s findings are not stored indefinitely in Macie itself; it retains them for 90 days. To guarantee long-term storage of critical findings, you can turn to Macie's EventBridge integration.

EventBridge can be used to automatically send important findings from your Macie instance to an appropriate S3 bucket in AWS. This can help in achieving compliance with regulations that require a certain degree of data persistence beyond Macie's limits.
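One way to wire this up is an EventBridge rule that targets a small Lambda function, which writes each finding to a bucket as JSON. The sketch below shows the core of such a function. The key layout, helper names, and archive bucket are assumptions for this example, and `s3_client` is assumed to be a boto3 `s3` client.

```python
import json


def archive_key(event):
    """Build an object key of the form macie-findings/<date>/<finding-id>.json."""
    day = event["time"][:10]  # EventBridge "time" is an ISO 8601 timestamp
    return f"macie-findings/{day}/{event['detail']['id']}.json"


def archive_finding(event, s3_client, bucket):
    """Write the finding carried by an EventBridge event to S3 as JSON."""
    key = archive_key(event)
    s3_client.put_object(
        Bucket=bucket,
        Key=key,
        Body=json.dumps(event["detail"]).encode("utf-8"),
        ContentType="application/json",
        ServerSideEncryption="AES256",  # encrypt the archived copy at rest
    )
    return key
```

Grouping objects by date keeps the archive easy to browse and lets S3 lifecycle rules expire old findings once the retention period you need has passed.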


Final Thoughts

Macie makes for an excellent choice for protecting your organization's data right at the source. Amazon offers fairly comprehensive S3 monitoring functionality through Macie, but this can be enhanced considerably through the use of third-party offerings. 

Digital Guardian contributes additional monitoring and reporting power to your AWS DLP deployment, extending its reach to devices on your local network and improving your chances of staying compliant with relevant regulations.

For more information on how Digital Guardian can help, speak to one of our experts and schedule a demo today.


Frequently Asked Questions (FAQs)

Does AWS have DLP?

Amazon Web Services provides Amazon Macie, a data security service with a limited data loss prevention (DLP) scope: it scans S3 buckets for sensitive data and flags risky access settings. Macie can connect to other tools and services to support a fully automated DLP strategy.

What is DLP in cloud computing?

Cloud DLP is similar to data loss prevention on a local network of devices. The main difference, besides the location of the systems being protected, is the manner in which data storing and processing resources are actively protected.

Many proprietary clouds offer native, proprietary DLP tools to guarantee granular control over certain operations. 

However, these tools rarely account for resources beyond the bounds of the proprietary network they were built for, and often require additional third-party tools to achieve more comprehensive protection across an organization.

What is DLP in simple terms?

DLP stands for "Data Loss Prevention," a discipline intended to help prevent sensitive data from being stolen or leaked unintentionally.

What is an example of DLP?

DLP can involve restricting access to certain resources on a given network, automatically scrubbing sensitive data from transmitted files or even monitoring individual devices for risky data handling practices in real time.

What are the 3 main objectives being solved by DLP?

Data loss prevention is intended to help organizations achieve regulatory compliance and protect their own interests by preventing the unauthorized use or transmission of personal information, intellectual property, or otherwise restricted information.
 


Chris Brook


Chris Brook is the editor of Digital Guardian’s Data Insider blog. He is a cybersecurity writer with nearly 15 years of experience reporting and writing about information security, attending infosec conferences like Black Hat and RSA, and interviewing hackers and security researchers. Prior to joining Digital Guardian–acquired by Fortra in 2021–he helped launch Threatpost, an independent news site that was a leading source of information about IT and business security for hundreds of thousands of professionals worldwide.
