Grubbing Secure Boot the Wrong Way: CVE-2020-10713

Today, researchers at Eclypsium disclosed a buffer overflow vulnerability in GRUB2, CVE-2020-10713, affectionately termed “BootHole.” It basically results in a total pwn of Secure Boot in systems using GRUB, which is a lot of them: all Linux distros, a bunch of Windows machines, and more. Additionally, the mitigation process is a certified hot mess, so even the proposed solution is kind of grubbing salt in the wound.

Why is it cool?: Secure Boot, as the name suggests, is designed to verify that all the firmware required to start up a computer is trusted. A complete defeat of Secure Boot, therefore, means the computer is essentially helpless against malicious tampering from the first blinky light of its digital dawn after the power button is pressed. Such boot bamboozelry facilitates furtive persistence useful for ransomware (see Lockbit’s variant employing boot record hijinks), Zeus-like keyloggers, cryptominers, espionage activity (see Rockboot by APT41), etc., because the code operates before the OS (where security tools usually reside) is up and running.

Kelly’s spicy take: We might see more damage from people attempting the mitigation (more on revocations later) rather than attackers leveraging this in dastardly digital crimes.

Digging deeper: The problem essentially lies in GRUB’s inadequate error handling. So, what even is GRUB? It’s a bootloader designed to load and kick off any OS on any hardware. But let’s back up and take a look at a highly simplified version of how a Linux system with Secure Boot boots up:

  1. Firmware loads the smol first-stage bootloader binary (known as a “shim”), which contains a trusted certificate
  2. Shim loads the GRUB binary (another bootloader) and validates it with the certificate
  3. GRUB loads any required configurations, located in grub.cfg (Chekhov’s config file in this tale), which point to where the kernel image can be loaded
  4. GRUB validates the kernel via keys stored in the firmware’s trust database (db and dbx, the authorized and forbidden signature databases, respectively)
  5. GRUB hands over control to the kernel
  6. Kernel boots the OS
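
To make the chain above a bit more concrete, here is a toy Python sketch of steps 1 through 5. It is a mental model only, not how firmware actually works: real Secure Boot checks cryptographic signatures and certificates, whereas this sketch assumes a plain SHA-256 allowlist (`db`) and denylist (`dbx`). The names `digest`, `is_trusted`, and `boot` are made up for illustration. The thing to notice is that grub.cfg sails through without any check at all.

```python
import hashlib


def digest(blob):
    """SHA-256 hex digest, standing in for a real signature check."""
    return hashlib.sha256(blob).hexdigest()


def is_trusted(blob, db, dbx):
    """Trusted only if allowlisted in db AND not denylisted in dbx."""
    h = digest(blob)
    return h in db and h not in dbx


def boot(shim, grub, grub_cfg, kernel, db, dbx):
    """Walk the simplified boot chain and return a log of what happened."""
    log = []
    for name, blob in (("shim", shim), ("grub", grub)):
        if not is_trusted(blob, db, dbx):
            log.append(f"{name}: REJECTED")
            return log
        log.append(f"{name}: verified")

    # Assumption mirrored from the text: grub_cfg is never verified,
    # so its contents are deliberately ignored by this toy model.
    log.append("grub.cfg: loaded (unverified)")

    if not is_trusted(kernel, db, dbx):
        log.append("kernel: REJECTED")
        return log
    log.append("kernel: verified")
    log.append("boot: ok")
    return log
```

Swapping in a tampered `grub` makes the chain stop early, while swapping in a tampered `grub_cfg` changes nothing in the log, which is exactly the gap this vulnerability lives in.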

This vulnerability resides in step #3. GRUB uses a language parser (generated via flex and bison) to read the config file. If the text in the config file is too large, the flex engine says “no thank you!” with the expectation that the processing function will exit or be halted. Quite unfortunately, GRUB’s implementation does not fulfill that expectation. Instead, GRUB is like, “Oh, a fatal error indicating that the string is too big for the buffer? Cool, cool, cool, I’ll copy it straight into the buffer so we can proceed with executing the function!”

As a result, you can put massive strings into grub.cfg (which isn’t signed or verified) that the parser will happily copy into memory, leading to a buffer overflow. From there, attackers can write anything they want to system memory without any constraints (gaining what’s known as a write-what-where primitive). 
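
If the "fatal error that isn't actually fatal" pattern sounds abstract, here is a toy Python simulation of it. This is not GRUB's code (GRUB is written in C, and Python won't let you smash real memory); it assumes a made-up 16-byte token buffer sitting right in front of 8 bytes of "something important" inside one `bytearray`. All the names (`copy_token`, `fatal_error_abort`, `fatal_error_returns`, `make_memory`) are hypothetical. The only difference between the safe and unsafe runs is whether the error handler stops execution or merely complains and returns.

```python
BUF_SIZE = 16  # hypothetical size of the parser's token buffer
ERRORS = []    # where the "just log it" handler records its complaints


class FatalError(Exception):
    """Raised by the handler that actually halts processing."""


def make_memory():
    """Token buffer followed by 8 bytes standing in for adjacent data."""
    return bytearray(b"\x00" * BUF_SIZE + b"FUNCPTR!")


def fatal_error_abort(msg):
    """What the generated parser expects: this never returns."""
    raise FatalError(msg)


def fatal_error_returns(msg):
    """The flawed pattern: note the error, then return to the caller."""
    ERRORS.append(msg)


def copy_token(memory, token, fatal_error):
    """Copy a token into the buffer, trusting fatal_error to stop us."""
    if len(token) > BUF_SIZE:
        fatal_error("token too large for buffer")
        # The code below assumes we never get here with an oversized token.
    memory[:len(token)] = token  # oversized tokens spill past BUF_SIZE
```

With `fatal_error_abort`, an oversized token raises and the adjacent bytes stay intact. With `fatal_error_returns`, the same token gets copied anyway and overwrites whatever lives after the buffer.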

Boot and EFI land relies on memory being in fixed locations, so it doesn’t possess ASLR or other fancy exploitation mitigations that exist in OS space — which is a relief for attackers. This total control over the system means the attacker can insert their own bootloader, allowing them to hijack the boot process and maintain control every time the system starts.

Okay, but: Exploiting this vulnerability requires root / admin access to modify the grub.cfg file located in the EFI System Partition, which means the attacker must first gain a foothold on the system and escalate privileges (physical access also works). The vuln only helps with persistence across system reboots, so it’s unnecessary (and perilously noisy) for attackers to employ this if they already have root on a system that never reboots. It’s also preposterously unlikely that any attacker will spontaneously write on-the-fly real mode shellcode that will perform boot injection and OS loading. If they do, they probably deserve the win.

And yet: This will become incredibly bad news if enterprising criminals incorporate this vulnerability into their nefarious bots as part of the standard “be hacker, do crimes” pipeline of bootkit creation -> licensing the bootkit to a bot author -> deploying or selling the bootkit-armed bots (like in a botnet). This pipeline will not pop out pwnage overnight, so the question becomes whether mitigations can be successfully rolled out before criminals can scale this attack.

What’s the impact?: A bunch of Windows machines will be affected, but I’ll be focusing on Linux land. Every Linux distro using Secure Boot with the standard UEFI certificate is affected, since all signed versions of GRUB are vulnerable. The infosec community will tell you that Secure Boot has been broken for 10 years, and yet nobody cared — but the reality is that a non-trivial number of organizations rely on it to protect more sensitive systems.

If you’re using Linux in the cloud, you potentially aren’t impacted, depending on your cloud provider. Google Cloud Platform’s Shielded VMs, which use Secure Boot, are now the default for the Google Compute Engine (GCE) service and are also used as the underlying infrastructure for Google Kubernetes Engine (GKE), Cloud SQL, and more. Elsewhere in the digital heavens, Azure cloud instances don’t support Secure Boot, and AWS doesn’t seem to support it, either, so they seem to be unaffected by this vuln.

Is there a patch?: Well yes, but actually no. The first part of the mitigation process is a logistical challenge; the subsequent part is Kafkaesque. The first part requires coordination among Linux distros using GRUB2 (which is all of them), relevant open-source projects, vendors (like blinky box security solution peddlers), and Microsoft. The issue first must be fixed in GRUB2, then the Linux distros, relevant open-source maintainers, and vendors need to update their bootloaders, installers, and shims. These new shims need to be signed by Microsoft, who, as it turns out, is the designated signer of third-party UEFI certificates.

On the sysadmin side, you’ll need to update your Linux systems to the latest distro versions — both in your base images and in your deployed systems (don’t forget about your backup and disaster recovery stuff). But the next phase of mitigation descends into a nightmare.

The plan is for Microsoft to release an updated denylist (the UEFI revocation list located in the aforementioned dbx database) that will block the old, vulnerable shims that can load unpatched versions of GRUB susceptible to this vulnerability. The problem is that updating the denylist before you update the bootloader or OS leaves you with a bricked or borked system that can no longer boot, which breaks workflows. Thus, for now, it’ll be on ops to manually apply the updated denylist given that it’s far too risky to push out updates automatically. Please send good vibes to your ops team when that day comes.
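
The ordering problem is easier to see as a pre-flight check. The Python sketch below is a toy under loud assumptions: it pretends the denylist is just a set of SHA-256 hashes of whole files, whereas real dbx entries are Authenticode hashes and certificates, and it is not a substitute for your distro's own guidance. The function names are invented for illustration. The idea is simply: before applying a new denylist, confirm that nothing you still boot from is on it.

```python
import hashlib


def sha256_hex(blob):
    """Hex SHA-256 of a blob (toy stand-in for a real dbx entry)."""
    return hashlib.sha256(blob).hexdigest()


def blocked_by_denylist(installed, new_dbx):
    """Return names of installed boot binaries the new denylist would block.

    installed: dict mapping a label (e.g. "shim") to the binary's bytes
    new_dbx:   set of hex digests in the pending denylist update
    """
    return sorted(
        name for name, blob in installed.items()
        if sha256_hex(blob) in new_dbx
    )


def safe_to_apply(installed, new_dbx):
    """Only safe if none of the currently installed binaries get revoked."""
    return not blocked_by_denylist(installed, new_dbx)
```

If `safe_to_apply` comes back false, update the bootloader first and re-check, rather than finding out at the next reboot.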

The bottom line: If or when criminals operationalize this in an automated fashion, then you probably need to press the panic button (if you haven’t yet mitigated it at that point). For now, try to apply updates once they’re available, but ops folks should put on their metaphorical safety goggles for the revocation process to avoid borking systems. You can also do some manual threat hunting to look for changes in the config files if you’re the paranoid type, but it’s not strictly necessary.
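
For the paranoid types, the manual threat hunting can be as simple as hashing your boot config files and comparing against a known-good baseline. Here is a minimal Python sketch of that idea. It assumes you supply the paths yourself (where grub.cfg lives varies by distro, so none are hardcoded here), and the helper names `snapshot` and `changed_files` are made up for this example. It is a starting point, not a detection product.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each path to its SHA-256 hex digest, or None if it is missing."""
    result = {}
    for p in paths:
        path = Path(p)
        if path.is_file():
            result[str(path)] = hashlib.sha256(path.read_bytes()).hexdigest()
        else:
            result[str(path)] = None
    return result


def changed_files(baseline, current):
    """Return paths whose digest differs from (or is absent in) the baseline."""
    return sorted(p for p in current if baseline.get(p) != current[p])
```

Take a `snapshot` right after a trusted install or update, stash it somewhere an attacker with root on the box can't rewrite, and diff later snapshots against it with `changed_files`. Reading the EFI System Partition usually requires elevated privileges, so expect to run this as root.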

For Capsule8 customers, you can enable our detection that checks for attempts to write to the boot partition, in addition to detection of users escalating privileges (which is a necessary preliminary step for exploiting this vuln).

The Capsule8 Labs team conducts offensive and defensive research to understand the threat landscape for modern infrastructure and to continuously improve Capsule8’s attack coverage.