Resilience as a Core Component of Successful Security


Closing out the 3rd Annual Cloud Native Security Summit (CNSS), hosted by Capsule8 along with Open Raven and Snowflake, was the “#realtalk on Resilience” session moderated by Kelly Shortridge.

The panel featured Bea Hughes, Security Engineer at PagerDuty, Prima Virani, Search Security Engineer at Segment, and Rob Duhart Jr., Head of Federated Security at Google. They each offered in-depth insights into what cyber resilience means and the role it plays in the modern enterprise.

What Is Cyber Resilience?

Cyber resilience is the ability to cope with, respond to, and recover from cyber incidents, both those you expect and those you don’t. Resilience in general is about managing change, and that includes business change, the only true constant in business. In cybersecurity, it also covers compromise due to entropy (the general failure of systems over time), whether that failure stems from the constant development of technology or from a specific threat.

Boiled down to its simplest terms, however, resilience means ensuring systems don’t break and, when they do, ensuring engineers know about it as quickly as possible and users feel as little impact as possible.

Security incidents are inevitable. We know they are going to happen, so it’s not a matter of if, but when. Resilience is about being prepared for when that inevitability happens. Do you have a plan? Have you practiced that plan? Have your systems been evaluated and built to the degree that they can recover when the inevitable occurs? Resilience is about responding to a disruption, and how quickly you can recover to a steady, safe state.

Resilience is often equated with a silver-bullet tool: a solution that you can buy and implement to protect your systems. In reality, it’s neither a tool nor something you can buy. Resilience is built into a team and an organization as a mindset that helps address challenges before they become overwhelming. While the right tooling is essential, resilience takes time, focus, and energy, and you can’t buy any of those things.

Resilience as a Security Mindset

Resilience is a mindset that everyone in an organization needs to adopt for it to have a positive impact on security efforts. It shapes how teams architect both what they release into production and the systems they use to observe production. Of course, resilience also means different things to different people, so there’s no one right way to go about achieving it.

Resilience is not new, and it was not created by the security industry. It’s a key part of any successful team, and when emphasized as part of the culture it can help security teams learn and improve. In traditional industries such as aerospace, defense, or healthcare, resilience is a core tenet, and the cybersecurity industry can learn from them. It’s important to recognize that the tooling used in the security industry is not always designed for business resilience. Imagine an organization with a 99.7% SLO for uptime. One hiccup at the security level, and that company cannot function at the level it needs to in order to be profitable. The key question needs to be “how do we build security systems that are functional and that enable organizations to respond and be resilient while ensuring the uptime needed?”
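
To put the 99.7% figure in perspective, it helps to translate an uptime SLO into the amount of downtime it actually permits. The sketch below is only an illustration: the function name and the 30-day window are assumptions made for this example, not something the panel specified.

```python
def allowed_downtime_minutes(slo_percent: float, window_days: int = 30) -> float:
    """Return the minutes of downtime an uptime SLO permits over a window.

    The 30-day default window is an assumption for illustration only.
    """
    if not 0 <= slo_percent <= 100:
        raise ValueError("slo_percent must be between 0 and 100")
    total_minutes = window_days * 24 * 60
    # The downtime budget is the share of the window NOT covered by the SLO.
    return total_minutes * (100 - slo_percent) / 100


budget = allowed_downtime_minutes(99.7)
print(f"A 99.7% SLO allows about {budget:.1f} minutes of downtime per 30 days")
```

Under these assumptions the budget works out to roughly 130 minutes a month, so a single security tool that takes production offline for an afternoon could use up the whole allowance on its own.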

In the age of containers and ephemeral infrastructure, the definition of resilience continues to evolve. Rather than investing in an unbreakable monolithic application, teams are looking at how to keep things fast, scalable, and reliable by building function-based workloads. Resilience, in this case, isn’t about longevity, but about speed and recovery. If you launch a thousand independent containers and three of them break, the system can be designed to absorb the disruption. If, however, you are operating thousands of containers that are chained together by step functions, one failure can lead to dozens or hundreds more failures, and the entire step function can fail as a result. In this situation, the path to resilience comes from visibility and monitoring.
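
The difference between the two cases can be sketched with some simple arithmetic. The example below is a rough model and not a description of any particular platform: it assumes every container fails independently at the same rate (three in a thousand, borrowed from the example above), and it treats a chain as succeeding only if every step succeeds. The function names and the 100-step chain length are hypothetical.

```python
def expected_pool_failures(failure_rate: float, containers: int) -> float:
    """Expected number of broken containers in a pool of independent workers."""
    return failure_rate * containers


def chain_success_probability(failure_rate: float, steps: int) -> float:
    """Probability that a chain succeeds when every step must succeed.

    Assumes steps fail independently, which real systems may not satisfy.
    """
    return (1 - failure_rate) ** steps


FAILURE_RATE = 3 / 1000  # three broken containers per thousand (illustrative)

# Independent pool: a handful of failures, and the rest keep serving traffic.
print(expected_pool_failures(FAILURE_RATE, 1000))

# Chained workload: the same per-step rate compounds across 100 steps,
# leaving roughly a 74% chance that the whole chain completes.
print(round(chain_success_probability(FAILURE_RATE, 100), 2))
```

Even with an identical per-container failure rate, the chained design fails end to end about one time in four under these assumptions, which is why knowing quickly which step broke matters so much.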

In a lot of cases, resilience and security are tightly intertwined. Monitoring and visibility give us the ability both to detect and respond to unexpected failures and to proactively observe anomalies in general. This is crucial in ensuring that a cloud native system thrives and that no failures go unchecked.

 

“The best laid plans won’t mean anything if you can’t recover from ransomware.” – Rob Duhart, Head of Federated Security, Google

 

Recommendations for Cyber Resilience

One of the main takeaways emphasized by the security leaders in the #realtalk panel is that no one is hired to do purely academic security work. A security team’s role is to facilitate the business, and that means textbook security products that get in the way of the healthy operation of a business won’t always be accepted with open arms. Your job as a security leader is to enable the business to do what it needs to do with the least amount of risk possible. Things won’t always be perfect, and there are risk calculations to be made. Resilience plays a central role in making them.

Your success as a security professional will often come down to how many alternatives you can suggest in any given situation. One of the biggest mistakes newcomers to the field make is to offer a single solution and become dejected when that solution is turned down as non-viable. Resilience is about thinking through how best to modify a solution to match the needs of the business as it really operates.

 

“Curiosity will take you farther than knowledge ever will.” – Prima Virani, Search Security Engineer, Segment 

 

This flexibility is the core of resilience and key to success for a security professional. You cannot rely on fixed patterns; you have to modify and evolve your approach along with the infrastructure of the systems you’re working with and the environment in which you operate.

Resilience doesn’t come from being strong, but from being flexible. All the tools in your toolbox mean nothing if you don’t have a backup plan for how to use them when something unexpected happens.

 

“Resilience is being able to go with change and stand against it without being blown over.” – Bea Hughes, Security Engineer at PagerDuty

 

In all of this, it’s also important to remember that users are a measure of success for security, not the enemy. Two things are paramount: empowering, enabling, and equipping users to live safely in an unsafe world, and ensuring your systems are resilient in a world where resilience is a challenge.

As part of our ongoing series on the Cloud-Native Security Summit, stay tuned for Part Two on this topic – “Hacking Failure to Build Resilience.”