Threat Protection Appliances Are as Valuable to Security as Your Toaster

(This post originally appeared on DevOps.com)

Nothing in the IT security community is as widely deployed and universally reviled as Anti-Virus. But threat detection appliances, including intrusion prevention appliances, application firewalls and advanced threat protection appliances, should be almost as reviled. These appliances are nearly as useless as they are toxic. They do a horrible job of finding problems and, by crippling organizations with false alerts, ultimately create more costs for security organizations than they save.

Protection is Impractical

In terms of security, detection appliances have never been effective, since the network makes a lousy location for any kind of security hub from the outset. Few network protection devices can even prevent attacks, much less respond to them, and most appliances, even those with rudimentary prevention capabilities, are simply used for detection. In theory, an Intrusion Prevention device can drop a connection before letting the bad stuff through, but in practice that seldom happens.

In most enterprises, security detection appliances sit off to the side, only capable of looking at a copy of network traffic as opposed to the actual network traffic itself. Organizations choose this less-than-ideal setup to avoid the network latency issues that inevitably occur when an in-line appliance, itself a single point of failure, gets flooded or hits a bug.

That’s not to say that a network appliance can’t respond to an attack in progress in any way. If an appliance determines that a particular network connection is malicious, and there is complete confidence in the appliance’s judgement, then an organization can have the device signal something that is in-line, such as a router or firewall. That device can then take action by killing the connection or blacklisting the offending IP address.

The problem with that scenario is that there often isn’t enough context at the network layer. If a network appliance tries to make fast decisions, it’s going to produce an extraordinarily high number of false positives, making its judgement suspect. Whenever there’s any doubt about an appliance’s judgement, it shouldn’t be trusted to decide whether to take action. Action in such circumstances could affect legitimate connections, which could have catastrophic consequences for the entire business.
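To make that trade-off concrete, here is a minimal sketch of the gating logic an organization might put between an out-of-band appliance and an in-line router or firewall. Everything in it is hypothetical: the `Verdict` type, the `decide_action` function and the `BLOCK_THRESHOLD` value are assumptions made for illustration, not any vendor's API.

```python
from dataclasses import dataclass

# Hypothetical threshold: only act automatically when the appliance is
# effectively certain. The value is an assumption chosen for illustration.
BLOCK_THRESHOLD = 0.99


@dataclass(frozen=True)
class Verdict:
    """A hypothetical appliance verdict about one network connection."""
    source_ip: str
    malicious: bool
    confidence: float  # 0.0 (no idea) to 1.0 (certain)


def decide_action(verdict: Verdict, threshold: float = BLOCK_THRESHOLD) -> str:
    """Return what the in-line device (router or firewall) should be told."""
    if not verdict.malicious:
        return "allow"
    if verdict.confidence >= threshold:
        # High confidence: ask the in-line device to kill the connection
        # or blacklist the source address.
        return "block"
    # Any doubt: raise an alert for a human instead of touching traffic.
    return "alert"


if __name__ == "__main__":
    for v in (
        Verdict("203.0.113.7", malicious=True, confidence=0.999),
        Verdict("198.51.100.4", malicious=True, confidence=0.60),
        Verdict("192.0.2.10", malicious=False, confidence=0.95),
    ):
        print(v.source_ip, decide_action(v))
```

The higher the threshold is set to protect legitimate connections, the more verdicts fall into the "alert" bucket, which is exactly where the alert volume discussed later comes from.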

Appliances: What’s Really Happening

While old-school devices (e.g., traditional Intrusion Prevention Devices) sacrificed accuracy for speed, today’s more sophisticated appliances do a better job of processing available data, which gives vastly superior results with far fewer false positives.

The typical “next gen” security detection appliance (such as an “advanced threat detection” appliance) tries to give as good an answer as possible about whether the connection represents an attack by simulating the end destination of the traffic to predict whether anything malicious will happen. This is done by running a copy of the traffic into a specialized virtual machine, and then monitoring the simulated outcome. If something malicious is found at the destination in the virtual machine, then there’s reasonably high confidence that the connection represents an attack. The goal of this approach is to limit false positives to the point where the organization can respond to every event. Some organizations may even be willing to automate that response, for example by killing the connection if a suspected attack is detected in time.

While this class of device tends to reduce noise, it doesn’t eliminate “false negatives,” which allow attacks to sail through without being noticed. In fact, some believe these devices make the false negative problem of appliances even worse.

If an attack works against an actual end user, yet not in the virtual environment, it could easily be overlooked by an appliance. Imagine a company buys an appliance and configures it to test network traffic by sending it to a virtual machine running Windows 10, but the attack carried in that traffic only runs its proper course on Windows 8. In that case, if the network traffic’s destination is a user’s desktop that is running Windows 8, the attack can easily be successful, since the appliance won’t find the problem (because it’s running Windows 10).
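A toy model shows why the mismatch matters. The sketch below is not real malware analysis; the `os_specific_payload`, `sandbox_verdict` and `is_false_negative` functions, and the OS labels, are all hypothetical names invented for illustration. A "payload" is reduced to a function that misbehaves only on the OS it was written for.

```python
from typing import Callable

# A payload is modelled as a function from an OS name to True when it
# misbehaves on that OS. This is a deliberately simplistic assumption.
Payload = Callable[[str], bool]

SANDBOX_OS = "Windows 10"            # image the appliance detonates traffic in
VICTIM_OS = "older Windows release"  # what the real desktop is running


def os_specific_payload(target_os: str) -> Payload:
    """Build a toy payload that only misbehaves on `target_os`."""
    def run(os_name: str) -> bool:
        return os_name == target_os
    return run


def sandbox_verdict(payload: Payload, sandbox_os: str) -> str:
    """Run the payload against the sandbox image and report what was seen."""
    return "malicious" if payload(sandbox_os) else "clean"


def is_false_negative(payload: Payload, sandbox_os: str, victim_os: str) -> bool:
    """True when the sandbox sees nothing but the real target is hit."""
    return sandbox_verdict(payload, sandbox_os) == "clean" and payload(victim_os)


if __name__ == "__main__":
    payload = os_specific_payload(VICTIM_OS)
    print("sandbox verdict:", sandbox_verdict(payload, SANDBOX_OS))
    print("false negative:", is_false_negative(payload, SANDBOX_OS, VICTIM_OS))
```

In this model the only way to close the gap is for the sandbox image to match every machine it protects, which is the scaling problem the rest of this post keeps running into.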

This class of problem was first described for network intrusion detection by Ptacek and Newsham in 1998. While “next gen” technologies are capable of more accurately modeling the systems they’re trying to protect, this problem will never go away when protecting from the network. Even when companies attempt to force all systems onto a single “gold” image, exceptions tend to quickly come out of the woodwork. For example, users within development and technical organizations often find themselves in need of software that isn’t on the gold image, or they may need to run older software for testing purposes. Also, the gold image approach tends to only work for desktops.

Making things worse, the “black hat” community can get their hands on the same appliances, and reverse engineer the virtualization technology and detection techniques they use. Then, they figure out how to detect whether they’re running on such a device. Once they’ve done that, their real-life attacks will go unnoticed on the appliance, while still reaching the intended target.

In the “advanced threat detection” appliance space, this problem is prevalent. A dirty secret in the appliance sector is that the majority of detections from these devices come from machines that have already been infected and are beaconing out to external command-and-control infrastructure. If, instead, these appliances had detected the actual inbound threat, the attack might have been prevented.

Alert Fatigue: Poor Detection Quality

Drowning in alerts is perhaps the single biggest problem IT Security organizations face. Even after all the raw data coming from around the network goes through a best-of-breed correlation and analysis engine, there is still an overflowing sea of information. These besieged engines generate so many alerts that organizations can’t possibly, let alone cost-effectively, provide proper review for every priority alert they would like to investigate. As it turns out, the vast majority of alerts are likely to be false positives anyway. Even the best talent can easily miss real incidents when poring over so much superfluous data.
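Some back-of-the-envelope arithmetic shows why false positives dominate even when a detector looks good on paper. The sketch below is only an illustration: the `alert_precision` function is a hypothetical helper, and every number fed into it (event volume, attack rate, detection rate, false positive rate) is an assumption picked to make the arithmetic easy, not a measurement.

```python
def alert_precision(events: int, attack_rate: float,
                    detection_rate: float, false_positive_rate: float) -> float:
    """Fraction of raised alerts that correspond to real attacks."""
    attacks = events * attack_rate
    benign = events - attacks
    true_alerts = attacks * detection_rate        # real attacks that alert
    false_alerts = benign * false_positive_rate   # benign events that alert
    total_alerts = true_alerts + false_alerts
    return true_alerts / total_alerts if total_alerts else 0.0


if __name__ == "__main__":
    # Assumed inputs: one million events, 1 in 10,000 is an attack,
    # 90% of attacks are caught, 1% of benign events raise an alert.
    precision = alert_precision(
        events=1_000_000,
        attack_rate=0.0001,
        detection_rate=0.90,
        false_positive_rate=0.01,
    )
    print(f"share of alerts that are real: {precision:.2%}")
```

Under those assumed numbers, roughly 90 real alerts are buried among about 10,000 false ones, so fewer than one alert in a hundred is worth an analyst's time. Because attacks are so rare relative to benign traffic, even a small false positive rate swamps the real detections.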

Imagine the proverbial needle in the haystack, but multiply the number of haystacks so that the entire state of Kansas is covered in hay. Now you can begin to imagine the scope of the problem. Sure, you might find an occasional needle or two – especially with some ingenuity. But there’s virtually no chance that you could tractably look for them all even if you wanted to. That aside, your vigilance is likely to wane for at least a haystack or two, which is all it takes for one devastating needle to puncture your organization.

Security appliances – particularly more traditional devices – represent the single biggest contributor to alert fatigue by far. When you’re limiting your investigation to network traffic and trying to get work done as quickly as possible, you don’t have enough context to make good decisions. So, if you want to err on the side of caution, keep in mind that appliances will give you plenty of reason for caution, while uncovering few actual problems.

If we are serious about eliminating the problem of alert fatigue, we must move beyond appliances and begin managing all of our detection at the machine level. On a machine we can see what’s happening on the file system, what’s happening in memory, what’s happening in the OS, and even what’s happening in the application (for common applications). That’s far more telemetry with which to isolate signals and ignore noise (assuming we use all that data wisely).

Many of the “next-gen” approaches live in a poor middle ground; they try to simulate the machine, but generally can’t afford to create an accurate or effective simulation. If an organization is trying to protect a thousand servers, how many appliances would be needed to thoroughly simulate every one of those machines and each bit of traffic that reaches them? The answer is about a thousand appliances, which is cost prohibitive. Most organizations prefer one appliance to protect hundreds or even thousands of machines, which is clearly insufficient.

The Future Will Be Worse

The standards community is finally embracing the “less is more” philosophy as it prepares to launch TLS 1.3. This new protocol will ensure there aren’t straightforward back doors to subvert end-to-end encryption. Right now, most security appliances rely on those back doors as a means to provide detection (they need to be able to see the decrypted traffic). The new standard is going to require IT organizations to man-in-the-middle every hop if they want their security appliances to work. That will add massive latency and expense, so most people won’t do it, and appliances will become even less reliable as a result.

Additionally, the rise of containerization and micro-services will exacerbate the situation further, since container-to-container traffic that never leaves a single machine is well beyond the reach of any appliance.

Security appliances may seem better than Anti-Virus, because they occasionally provide some value and don’t add as much attack surface. On the other hand, Anti-Virus software doesn’t drown organizations in an ocean of spurious reports. Besides, at least AV now has credible alternatives, unlike appliances. Security appliances are not worth the silicon they’re printed on and should be cursed in the same breath as Anti-Virus.