
The Data Breach (Amazon) Bucket List

by Paul Roberts on Thursday September 7, 2017

The leak of data on U.S. veterans this week is just the latest to be tied back to insecure cloud-based storage. What’s going on? Let’s take a look.

Sensitive, personal information on thousands of Americans – some with top secret security clearances – was left exposed to the prying eyes of just about anyone with an Internet connection and knowledge of the repository of some 9,400 documents.

The cache was stored in an Amazon Simple Storage Service (S3) bucket that lacked any protection – even a password. The data included details about thousands of applicants for jobs at TigerSwan, a North Carolina private consulting firm. Many of the applicants are or were employed by the U.S. Department of Defense or U.S. intelligence agencies.

This incident is just the latest to draw attention to a growing problem: unsecured or inadequately secured cloud-based data repositories. In July, for example, World Wrestling Entertainment (WWE) acknowledged that a misconfigured Amazon S3 bucket was responsible for exposing data on some 3 million WWE fans. Also in July, Verizon was forced to acknowledge that some 14 million of its customers had information exposed by a third-party customer service contractor: the Israeli firm NICE Systems.

That’s just the tip of the iceberg. The firm UpGuard, which discovered and disclosed the Verizon leak, has an entire blog devoted to similar discoveries of exposed cloud data. These discoveries include data on 198 million voters left exposed by Deep Root Analytics, a firm working for the Republican National Committee; data on 2.2 million customers of Dow Jones & Company; data on 1.8 million Chicagoans exposed by Election Systems & Software (ES&S), an Omaha-based voting machine firm; and critical infrastructure data stored by the engineering firm Power Quality Engineering (PQE).

This isn’t a new problem. More than four years ago, the security firm Rapid7 called attention to it with a survey of some 12,000 Amazon S3 buckets that found around 1 in 6 (1,951 of 12,328) were left open and accessible to the public. Within those almost 2,000 publicly exposed S3 buckets, Rapid7 researchers had access to around 126 billion files, many of which contained sensitive information. Two years before the Rapid7 survey, the researcher Robin Wood was already talking about this issue.

So what gives? First: the increase in the cadence of disclosures related to exposed cloud storage is a reflection of the success of cloud storage. In the last six years, almost every organization, from the Fortune 10 on down, has migrated critical applications and data to platforms like Amazon’s EC2 and S3, Microsoft Azure and so on. These platforms offer huge efficiencies and cost savings. They make it very easy to spin up new applications and storage instances, encouraging broader use. Of course, more cloud exposure brings more cloud risk.

At the same time, companies are clearly not learning the lessons of the past. The prevalence of simply unsecured storage assets containing obviously sensitive or regulated data suggests loose practices around these deployments. Why? Third-party providers are often part of the problem. Many of the data breach incidents above, including the Verizon breach, the RNC breach and the recent TigerSwan breach, were the result of actions taken by third-party firms hired for a particular purpose (placement, customer support, data analysis). Simply handing over reams of data to a contractor and assuming they’ll do right by it is a huge leap of faith and – as the above incidents suggest – one that involves considerable risk.

Finally, so as not to place all the blame on the victim, I think it’s fair to say that when so many breaches are linked to the same cloud-based storage service, something about the way that service operates is part of the problem. The firm Detectify did an analysis in July and found that the tools for assigning access permissions, both to S3 storage buckets and to their contents, are awkward and easy to get wrong – especially at scale. When the difference between granting full control over a bucket and granting read-only access is the choice of one drop-down menu item or another, mistakes are going to happen.
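To make the permissions point concrete, here is a minimal sketch of how a bucket's access control list could be checked for grants to the public. It assumes the ACL has already been fetched and is a dictionary shaped like the response of the S3 GetBucketAcl call (a list of "Grants", each with a "Grantee" and a "Permission"). The function names and the three risk labels are hypothetical and exist only for illustration; they are not part of any AWS tool, and this is not the method Detectify used.

```python
# Group URIs that S3 ACLs use for "everyone" and "any AWS account holder".
ALL_USERS = "http://acs.amazonaws.com/groups/global/AllUsers"
AUTHENTICATED_USERS = "http://acs.amazonaws.com/groups/global/AuthenticatedUsers"
PUBLIC_GROUPS = {ALL_USERS, AUTHENTICATED_USERS}

# Permissions that let a grantee change the bucket or its ACL, not just read.
DANGEROUS = {"WRITE", "WRITE_ACP", "FULL_CONTROL"}


def public_grants(acl):
    """Return (group_uri, permission) pairs granted to a public group."""
    found = []
    for grant in acl.get("Grants", []):
        grantee = grant.get("Grantee", {})
        if grantee.get("Type") == "Group" and grantee.get("URI") in PUBLIC_GROUPS:
            found.append((grantee["URI"], grant.get("Permission")))
    return found


def risk_level(acl):
    """Classify an ACL with hypothetical labels: private, public-read, public-write."""
    perms = {perm for _, perm in public_grants(acl)}
    if perms & DANGEROUS:
        return "public-write"
    if perms:
        return "public-read"
    return "private"


# Illustrative ACLs (made-up data): the only difference between the two
# public ones is a single permission value, the "one menu item" problem.
owner_only = {"Grants": [
    {"Grantee": {"Type": "CanonicalUser", "ID": "example-owner-id"},
     "Permission": "FULL_CONTROL"},
]}
read_only = {"Grants": owner_only["Grants"] + [
    {"Grantee": {"Type": "Group", "URI": ALL_USERS}, "Permission": "READ"},
]}
full_control = {"Grants": owner_only["Grants"] + [
    {"Grantee": {"Type": "Group", "URI": ALL_USERS}, "Permission": "FULL_CONTROL"},
]}

for name, acl in [("owner_only", owner_only),
                  ("read_only", read_only),
                  ("full_control", full_control)]:
    print(name, risk_level(acl))
```

Note that even the "public-read" case is enough to cause the kind of leak described above; the sketch only separates it from the worse case in which anyone can also modify or take over the bucket.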

Detectify notes that tools for auditing your company’s use of cloud-based storage, as well as permission grants across an S3 deployment, are lacking. Many companies may well be unaware of what data they have in the cloud, how it is being accessed or why. When that data passes through third-party providers, the job of auditing it gets even harder.
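In the absence of good tooling, even a simple inventory goes a long way. The sketch below assumes a company keeps a hand-maintained record for each bucket: who owns it, whether a third party manages it, whether it is publicly readable and whether it holds sensitive data. The BucketRecord type, its fields, the audit function and the sample bucket names are all hypothetical, chosen only to show the kind of questions an audit should be able to answer.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BucketRecord:
    """One row of a hypothetical, hand-maintained storage inventory."""
    name: str
    owner: Optional[str]   # responsible team or contractor; None if unknown
    third_party: bool      # managed by an outside firm
    public: bool           # readable without credentials
    sensitive: bool        # holds personal or regulated data


def audit(records: List[BucketRecord]) -> Dict[str, List[str]]:
    """Map each bucket name to the list of problems found for it."""
    findings: Dict[str, List[str]] = {}
    for r in records:
        problems = []
        if r.owner is None:
            problems.append("no known owner")
        if r.public and r.sensitive:
            problems.append("sensitive data publicly readable")
        if r.third_party and r.sensitive:
            problems.append("sensitive data held by third party")
        if problems:
            findings[r.name] = problems
    return findings


# Made-up inventory for illustration only.
inventory = [
    BucketRecord("marketing-assets", "web team", False, True, False),
    BucketRecord("applicant-resumes", "recruiting vendor", True, True, True),
    BucketRecord("old-exports", None, False, False, True),
]

for bucket, problems in audit(inventory).items():
    print(bucket, "->", "; ".join(problems))
```

A public bucket of marketing images raises no flags here, while the vendor-managed bucket of resumes raises two. The hard part in practice is not the check itself but keeping the inventory complete and accurate.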

In short, many of these cloud-based assets are secured, in essence, through obscurity, such as complicated URLs. That ‘security’ is illusory at best. White hat security researchers like Troy Hunt and Chris Vickery have made careers of exposing massive online data breaches. It’s safe to assume that cyber criminals are looking for the same data – but won’t be nice enough to tell you about it when they stumble upon your crown jewels in the cloud.

Companies need to be far more deliberate about their use and management of cloud-based storage and applications, including their relationships with third-party providers. Simply shipping off data to a business partner without first verifying its security practices and protocols is a huge risk, as recent events suggest.

Paul Roberts is Editor in Chief and Publisher at The Security Ledger and the Founder of The Security of Things Forum.

Tags: Data Breaches, Cloud Security
