The Art of Stealing Terabytes
How did hackers manage to extract terabytes of data from the network of Sony Pictures without direct, physical access? It may have been easier than you would think.
There are many superlatives to describe the hack of Sony Pictures Entertainment. It has been called the “worst” and “most destructive” hack of all time. It has been likened to a nuclear bomb. It has been called an act of cyber warfare.
But, behind all the hyperbole, the Sony hack is just another hack – albeit a bad one. And like any other cyber crime, there are questions about the ‘whys’ and ‘hows’ of the Sony hack that have yet to be answered to anyone’s satisfaction. Chief among them: how the attackers were able to sneak terabytes¹ of data off Sony’s corporate network without being noticed.
Data theft is a common element of any sophisticated attack. Many malware programs record and transmit keystrokes from infected systems. Other Trojan horse programs are, in fact, designed to siphon data from victim networks, often indiscriminately. Malware like the common Zeus Trojan might search for file types that are known to contain sensitive information, like Word documents, Excel spreadsheets, PDF files and the like.
Stolen data can be exfiltrated – or sent outside the network – by any number of methods. Many data-stealing malware families rely on built-in File Transfer Protocol (FTP) features or HTTP to send files out of the network without attracting attention. The FrameworkPOS malware used against Home Depot actually used DNS (Domain Name System) requests to exfiltrate credit card numbers stolen from that retailer’s point of sale (POS) terminals. The Target hackers were even more clever: they set up a password-protected network share on the Target network to collect stolen credit card numbers from infected point of sale systems before sending them outside the company. According to one analysis, by Dell SecureWorks, three separate types of malware were used to steal, store and then transfer the Target customer data. In other cases, anonymizing software like Tor provides a means to hide both the destination of the data and its makeup.
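To make the DNS channel concrete from the defender's side, here is a minimal sketch of one heuristic a monitoring tool might apply: flag query names whose leftmost label is unusually long or unusually random-looking. The function names, the thresholds (40 characters, 3.5 bits of entropy per character) and the sample queries are all assumptions chosen for illustration. They are not taken from any analysis of FrameworkPOS, and a real detector would need tuning against normal traffic.

```python
import math
from collections import Counter

def shannon_entropy(text: str) -> float:
    """Bits of entropy per character in `text` (0.0 for an empty string)."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def looks_like_dns_tunnel(query: str,
                          max_label_len: int = 40,
                          max_entropy: float = 3.5) -> bool:
    """Flag a query whose leftmost label is very long or looks random.

    Both thresholds are illustrative assumptions, not established cutoffs.
    """
    label = query.split(".")[0]
    if not label:
        return False
    return len(label) > max_label_len or shannon_entropy(label) > max_entropy

# Hypothetical queries: two ordinary lookups and one with an encoded-looking label.
sample_queries = [
    "www.example.com",
    "mail.example.org",
    "mzxw6ytboi4tqnjrgezdgnbvgy3tqojqmfrggzdfmztwq2lk.x.example.net",
]

for name in sample_queries:
    verdict = "suspicious" if looks_like_dns_tunnel(name) else "ok"
    print(f"{verdict:10s} {name}")
```

A check like this is easy to evade with short, low-entropy labels, which is part of why DNS is attractive as a covert channel.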
To date, many of these schemes have been predicated on the assumption that the attack in question will be targeted and the data stolen will be measured in megabytes, or maybe gigabytes – not terabytes. Presumably, as the amount of data stolen grows by orders of magnitude, the options for stealing it narrow. Sure, you can use FTP to transfer a few gigabytes of data to a remote server, but 11 terabytes? That much FTP action – especially over a short period of time – would get noticed. And sending even one terabyte via HTTP, DNS or, god forbid, 140-character tweets will be both time consuming and mind-blowingly noisy.
But the sad truth is that making off with terabytes’ worth of data may be easier than you think.
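A quick back-of-the-envelope calculation shows the scale involved. The sketch below estimates how long the low and high figures reported for the Sony theft (11 TB and 100 TB) would take to move at a sustained rate. The link speeds of 100 Mbps and 1 Gbps are assumptions picked for illustration, and the arithmetic ignores protocol overhead, throttling and competing traffic.

```python
def transfer_days(terabytes: float, megabits_per_second: float) -> float:
    """Days needed to move `terabytes` at a sustained rate, ignoring overhead."""
    bits = terabytes * 1e12 * 8                    # decimal terabytes -> bits
    seconds = bits / (megabits_per_second * 1e6)   # Mbps -> bits per second
    return seconds / 86_400                        # seconds -> days

# 11 TB and 100 TB are the reported estimates; the link speeds are assumed.
for tb in (11, 100):
    for mbps in (100, 1_000):
        days = transfer_days(tb, mbps)
        print(f"{tb:>3} TB at {mbps:>5} Mbps: {days:5.1f} days")
```

Under these assumptions, 11 TB takes roughly ten days at 100 Mbps and about one day at 1 Gbps, while 100 TB takes roughly three months and nine days respectively.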
The consensus among security experts I polled was that the vast, vast majority of enterprise environments aren’t instrumented to look for unusual traffic patterns of any sort – even eye-popping anomalies that will hog network resources and otherwise disrupt service. That doesn’t count the kind of segmented and surreptitious transfers that even moderately sophisticated hacking groups will use to get data off a network.
Asked what methods remote hackers might use to steal as much as 100 terabytes of data, one security pro responded to me by email with “Start transfer. Wait.” “Ask your local IT wonk ‘what was the largest recent transfer outward from your network?’” he continued. “You'll get no answer.”
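Answering that question does not take much. As a minimal sketch, assuming an organization already collects flow records in some form, the snippet below totals outbound bytes per internal host and external destination and reports the largest. The record layout, the addresses (drawn from ranges reserved for documentation) and the byte counts are invented for illustration.

```python
from collections import defaultdict

# Hypothetical flow records: (internal source, external destination, bytes sent).
flows = [
    ("10.0.0.5", "203.0.113.10", 4_000_000),
    ("10.0.0.7", "198.51.100.20", 750_000_000_000),
    ("10.0.0.5", "203.0.113.10", 6_000_000),
    ("10.0.0.7", "198.51.100.20", 500_000_000_000),
]

def outbound_totals(records):
    """Sum bytes per (source, destination) pair, largest total first."""
    totals = defaultdict(int)
    for src, dst, nbytes in records:
        totals[(src, dst)] += nbytes
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

ranked = outbound_totals(flows)
(top_src, top_dst), top_bytes = ranked[0]
print(f"Largest outbound transfer: {top_src} -> {top_dst}, "
      f"{top_bytes / 1e12:.2f} TB")
```

With the sample data, the top entry is 1.25 TB from a single host to a single destination, exactly the kind of figure the quoted security pro suggests most IT teams cannot produce.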
And the embrace of cloud-based platforms and applications makes it even easier to steal more data and to cover your tracks while doing it. This post from the Cloud Security Alliance in October notes one common method that many attackers are using: disguising stolen data as rich media files like videos and audio. After stealing data, the attackers split it into files of identical sizes, then compress and encrypt them (to prevent detection). The files are then uploaded to media sharing sites (think: YouTube, Vimeo, Dropbox, etc.). Sure, the files are large, but from the system administrator’s point of view, uploading a large file to a video sharing site is nothing worth investigating.
One line of argument in attacking the U.S. government’s theory that the government of North Korea carried out the attack via remote software is this issue of the stolen data. How – critics wonder – did they manage to sneak terabytes off the network? And isn’t there evidence that the data that was stolen was transferred at a much faster rate than would be possible without direct, physical access to the network?
That may be a point worth debating. And there is still plenty to debate about the hack of Sony Pictures including (as I see it) who did it. But one thing is true (and depressingly so): the sheer volume of data stolen doesn’t preclude a remote attack using nothing more than software and some creativity to rob Sony of nearly all its data. And that – more than anything – should be a sobering message for everyone.
1. Reports vary as to how many terabytes were stolen. Estimates range from 11 TB to 100 or more TB.