Meltdown and Spectre

The China Syndrome, revisited…
 

Blog post 2018-01-12 by Ralph Moonen, Technical Director at Secura

By now, the dust has settled enough to be able to see the battlefield a little more clearly. In front of us lie the maimed corpses of the main CPU architectures, ravaged by Meltdown and Spectre. Frantic OS patching is being carried out by emergency teams but the scars and wounds of this one will be visible for years to come.

Side channel attacks
I can safely say that the situation is very bad, but there are still some unanswered questions. Are there cases where you can get away with not patching? Will the patches be sufficient? Let's start with some background on side channel attacks.

Side channel attacks are well known and have been used to break all kinds of security measures. They rely on physical characteristics of systems changing, depending on what they're doing. You can measure and sometimes manipulate these characteristics, and by analyzing them you can learn things that the systems should be keeping secret. For instance, by measuring the power consumption of a chip, it is sometimes possible to extract the encryption key it is processing at that time. Or, by measuring the radio frequency emanations of a CPU, you can statistically deduce the plaintext of a communication. In the current case, it is a timing attack. Measure the time it takes to retrieve a byte from memory and you can tell if it was in the cache, or if it was retrieved from actual RAM. How can we tell the value of the byte from this? Well, we can't, not directly. But we can indirectly. CPUs use a trick to speed things up: when they encounter a branch in the executed code, they do not wait for the branch condition to be evaluated. Instead they speculatively execute ahead, and only decide which results to keep after evaluating the branch condition.

How does the technology behind it work?
Unfortunately, security checks are only enforced on the results that are actually kept. But that is too late, since the speculated instructions have already executed, and the value they touched is already stored in cache. Measuring the time to retrieve that value tells us if that value is in cache or not. Now we can manipulate this process: read a forbidden memory location, and if that byte has a value X, then also read a non-forbidden memory location. Running the code will fail, but if you subsequently read that other non-forbidden location again, and time that read, you will know if the forbidden byte is X (because the read was very fast, from the cache) or not (because it was slow). Obviously, this takes repeated iterations to determine each value, a bit like the way time-based blind SQL injection attacks work. But computers are fast, and with some optimization, it is trivial to work out each byte's value. For a very readable and detailed write-up of these techniques, I refer to Bert Hubert's piece.
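To make the guessing loop concrete, here is a toy simulation in Python. It is not an exploit and touches no real cache: the ToyMachine class, the latency numbers and the threshold are all made up for illustration. It only models the logic described above: flush, let the failing code run, then time a read of the non-forbidden location.

```python
import random

# Assumed, illustrative latencies in nanoseconds. Real values differ per CPU.
HIT_NS = 10
MISS_NS = 200
THRESHOLD_NS = 100


class ToyMachine:
    """Hypothetical model of a CPU with one probe location and a secret."""

    def __init__(self, secret: bytes):
        self._secret = secret        # stands in for forbidden memory
        self._probe_cached = False   # is the non-forbidden location in cache?

    def flush_probe(self) -> None:
        """Evict the probe location from the simulated cache."""
        self._probe_cached = False

    def run_gadget(self, offset: int, guess: int) -> None:
        """Model the failing code: compare the forbidden byte with the guess
        and, on a match, read the probe location (which caches it).
        Nothing is returned: architecturally the access fails."""
        if self._secret[offset] == guess:
            self._probe_cached = True

    def time_probe_read(self) -> float:
        """Return a simulated latency for reading the probe location."""
        base = HIT_NS if self._probe_cached else MISS_NS
        self._probe_cached = True    # any read pulls the location into cache
        return base + random.uniform(0, 20)   # a little simulated jitter


def leak_byte(machine: ToyMachine, offset: int):
    """Try every value X until the timed read comes back fast."""
    for guess in range(256):
        machine.flush_probe()
        machine.run_gadget(offset, guess)
        if machine.time_probe_read() < THRESHOLD_NS:
            return guess
    return None


def leak(machine: ToyMachine, length: int) -> bytes:
    """Recover `length` bytes of the secret, one byte at a time."""
    return bytes(leak_byte(machine, i) for i in range(length))


if __name__ == "__main__":
    print(leak(ToyMachine(b"hunter2"), 7))
```

The attacker code never reads the secret directly; it only sees latencies. Published attacks avoid the 256 guesses per byte by using one probe location per possible byte value, which is the kind of optimization that makes this fast in practice.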

To patch or not to patch?
 
So, as for the question, can you afford to not patch? Exploitation of these bugs relies on being able to execute code on the target machine. But JavaScript in a browser window is also code execution, and there are so many scenarios for an attacker to attain code execution that we can answer that question pretty easily with a resounding 'no'. And doubly so in shared (cloud) environments! In fact, it is for exactly these kinds of attacks that we advise against running any kind of security critical applications in the cloud. So patching is necessary in pretty much all circumstances. But two side notes are relevant. First, there is a performance hit (the cache exists for efficiency reasons, and removing the use of cache also removes efficiency). Second, Microsoft has introduced a new mechanism to determine if the anti-virus software plays well with the patches: a new registry key signals the AV vendor's compatibility with the patches. No registry key? Then Windows Update will not offer the patches. Read up on this issue here: https://doublepulsar.com/important-information-about-microsoft-meltdown-cpu-security-fixes-antivirus-vendors-and-you-a852ba0292ec.
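If you want to check for that registry key yourself, something like the Python sketch below should do. The key path and value name are what I recall from Microsoft's advisory on the January 2018 updates, so treat them as assumptions and verify them against Microsoft's own documentation. The function name and the injectable reader are mine, added so the logic can be tried on a non-Windows machine.

```python
import sys

# Assumed from Microsoft's advisory; verify before relying on it.
KEY_PATH = r"SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat"
VALUE_NAME = "cadca5fe-87d3-4b96-b7fb-a231484277cc"


def _read_from_registry(key_path, value_name):
    """Read a value from HKEY_LOCAL_MACHINE using the standard winreg module."""
    import winreg  # only available on Windows
    access = winreg.KEY_READ | winreg.KEY_WOW64_64KEY
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path, 0, access) as key:
        data, _value_type = winreg.QueryValueEx(key, value_name)
    return data


def av_compat_flag_set(read_value=None):
    """Return True if the flag is present and zero, False if it is missing
    or has another value, and None when not running on Windows."""
    if read_value is None:
        if sys.platform != "win32":
            return None
        read_value = _read_from_registry
    try:
        data = read_value(KEY_PATH, VALUE_NAME)
    except FileNotFoundError:
        return False   # key or value does not exist
    return data == 0


if __name__ == "__main__":
    print("AV compatibility flag set:", av_compat_flag_set())
```

A result of False means your AV vendor has not (yet) signalled compatibility, so talk to the vendor rather than setting the key by hand without knowing what your AV does.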

How about the future?
Will these patches be sufficient? I do not believe they will be fully sufficient in the long run, because you simply cannot fully patch a hardware microcode vulnerability in the OS layer. There will always remain avenues of attack. Also, this is just one side channel. There will probably be others that also relate to the speculative execution of if/then branches, other than memory cache-timing attacks. The code cache comes to mind…

Another mitigation is that if you don’t know at what location in memory the secrets are stored, you will have a hard time finding them. So randomizing memory maps is also a measure that helps against Spectre and Meltdown, but only in a limited way: passwords and encryption keys are recognizable, regardless of location. Any address space layout randomization (ASLR) is therefore only of limited use: just read it all and pick out the interesting bits.
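To illustrate what "pick out the interesting bits" can look like, here is a small Python sketch that scans a blob of bytes for two kinds of candidates: PEM private key headers, and windows with high Shannon entropy (raw key material tends to look random). The function names, the window size and the threshold are my own choices for the example, not a tuned tool.

```python
import math
import re
from collections import Counter

# Matches headers such as "-----BEGIN RSA PRIVATE KEY-----".
PEM_MARKER = re.compile(rb"-----BEGIN (?:[A-Z]+ )?PRIVATE KEY-----")


def shannon_entropy(chunk: bytes) -> float:
    """Entropy of the byte distribution in `chunk`, in bits per byte."""
    if not chunk:
        return 0.0
    total = len(chunk)
    counts = Counter(chunk)
    return -sum(n / total * math.log2(n / total) for n in counts.values())


def find_candidates(dump: bytes, window: int = 64, threshold: float = 5.0):
    """Return (marker_offsets, high_entropy_offsets) for a memory dump.

    High-entropy offsets are the starts of non-overlapping windows whose
    entropy is at least `threshold` (the maximum for 64 bytes is 6.0)."""
    markers = [m.start() for m in PEM_MARKER.finditer(dump)]
    high_entropy = [
        offset
        for offset in range(0, len(dump) - window + 1, window)
        if shannon_entropy(dump[offset:offset + window]) >= threshold
    ]
    return markers, high_entropy


if __name__ == "__main__":
    import os
    # A fake "memory dump": zeros, 64 random bytes, zeros, a PEM header.
    fake = bytes(4096) + os.urandom(64) + bytes(960) + b"-----BEGIN PRIVATE KEY-----"
    print(find_candidates(fake))
```

Where the secret sits in the dump does not matter to either check, which is exactly why randomizing the layout only buys you a little.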

RowHammer, Spectre, and Meltdown
Finally, there are also persistent rumors of other hardware-based attacks, similar to the RowHammer attack. RowHammer cannot be mitigated in software, and allows indirect writing of protected memory. So, a combination of the two would be quite a feat: use Meltdown/Spectre to find secrets in memory, while defeating ASLR. Then use RowHammer 2.0 (speculative, but realistic) to write those locations and change the behavior of systems. It is not too far-fetched that such a combination (HammerDown, if you so wish) could lead to direct remote root/admin access.

What should you do?
It is clear we’ll be stuck with these problems for a long time to come. First: make sure your AV doesn’t conflict with the patches. Then, apply the patches, but be prepared for a significant performance hit. Remember, this attack is silent, does not leave traces, and is very hard to detect using Intrusion Detection and Anti-Virus software (since the code to attack can have many forms).
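After patching, it is worth confirming that the mitigations are actually active. On Linux, recent kernels report this per vulnerability under /sys/devices/system/cpu/vulnerabilities. The Python sketch below just reads those files; the function name is mine, and on an older or unpatched kernel the directory may simply not exist, in which case you get an empty result.

```python
from pathlib import Path

# Reported by recent Linux kernels; absent on older ones and on other OSes.
DEFAULT_DIR = Path("/sys/devices/system/cpu/vulnerabilities")


def mitigation_status(directory: Path = DEFAULT_DIR) -> dict:
    """Map each reported vulnerability name to the kernel's status line."""
    if not directory.is_dir():
        return {}
    status = {}
    for entry in sorted(directory.iterdir()):
        try:
            status[entry.name] = entry.read_text().strip()
        except OSError:
            status[entry.name] = "unreadable"
    return status


if __name__ == "__main__":
    report = mitigation_status()
    if not report:
        print("No vulnerability report found (old kernel or not Linux?)")
    for name, line in report.items():
        print(f"{name}: {line}")
```

A line starting with "Vulnerable" for meltdown or one of the spectre entries means you still have work to do on that machine.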

In the long run, we will see new CPU designs that do not have this flaw, but for now we are stuck with a less than optimal situation that cannot be handled on the application layer.
