Capsule8 successfully wrapped up yet another Cloud Native Security Summit recently. This year’s exclusive virtual event, hosted by Capsule8 with partners Open Raven and Snowflake, focused on Lessons in Cyber Resilience. Pre-Covid, this would have been an in-person event with lots of networking and social interaction opportunities, and that experience is difficult to replicate online. Nevertheless, the enthusiasm guests showed in learning about and sharing trends in cloud-native security, and the confluence of thoughts, albeit virtual, was a reminder of how resilient we have been as a society. On a similar note, cyber resilience is a shared responsibility and calls for the cyber community to stand in solidarity.
The series of panel discussions and fireside chats were led by information security experts and practitioners who are driving forces in cyber resilience from Unqork, PagerDuty, ActiveCampaign, Cloudflare, Rain Capital, 4Securitas, Very Good Security, Security Fanatics, Betterment, Snowflake, Segment, Google, HackerOne, YL Ventures, and Color, to name just a few.
CNSS panels and discussions this year provided insightful commentary and debate on how strong compliance programs, detection, visibility, and collaboration between security and operations are needed to build resilient organizations. Here are some key takeaways from the 2021 Cloud Native Security Summit (#CNSS21).
Takeaway #1 – What is Cyber Resilience in a Cloud-Native World?
“Incidents are inevitable…resilience is, how prepared are you for when that inevitability happens?” summarized Rob Duhart, Head of Federated Security at Google. This modern-day equivalent of the Benjamin Franklin aphorism, “If you fail to plan, you are planning to fail,” drives home the importance of implementing a strategy in advance of those potential worst-case scenarios. It is essential that organizations evaluate their systems, have detection and incident response plans, test those plans, and build security systems in a way that lets them recover to a steady state when the inevitable happens.
“Monitoring is resilience,” remarked Prima Virani, Senior SIRT engineer at Segment. “Monitoring and visibility gives us the ability to both detect when failures happen unexpectedly and detect when anomalies happen in general.”
The ephemerality of cloud-native infrastructures makes increased levels of visibility crucial. Containers and serverless functions have extremely short lifespans, so if something goes wrong, you need to be able to detect and react to it almost immediately.
Developers are increasingly using serverless functions for orchestration use cases. In a typical serverless example, Lambdas are chained together by step functions. If one Lambda fails and that failure is not handled, the step fails and the entire step function exits. It’s like a string of holiday lights: when one goes out, they all go out. “They all go out” isn’t ideal in a production environment, to say the least, so security teams need the visibility to detect an incident as soon as a failure or anomaly happens in production.
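To make the holiday-lights analogy concrete, here is a minimal sketch in plain Python. It does not use any cloud SDK; `run_chain`, `ChainResult`, and the three step functions are hypothetical names invented for illustration. It shows a chain that stops at the first failing step and reports that failure to a monitoring hook right away.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Optional, Tuple

# A step is a (name, function) pair; both are illustrative assumptions.
Step = Tuple[str, Callable[[Any], Any]]


@dataclass
class ChainResult:
    completed: List[str]        # steps that finished before any failure
    failed_step: Optional[str]  # name of the step that failed, if any
    output: Any                 # final payload, or None if the chain stopped


def run_chain(steps: List[Step], payload: Any,
              on_failure: Callable[[str, Exception], None]) -> ChainResult:
    """Run steps in order; stop at the first failure and report it."""
    completed: List[str] = []
    for name, fn in steps:
        try:
            payload = fn(payload)
        except Exception as exc:
            # Visibility: surface the failure as soon as it happens.
            on_failure(name, exc)
            return ChainResult(completed, name, None)
        completed.append(name)
    return ChainResult(completed, None, payload)


def validate(order: dict) -> dict:
    return {**order, "valid": True}


def charge(order: dict) -> dict:
    raise RuntimeError("payment provider timeout")


def ship(order: dict) -> dict:
    return {**order, "shipped": True}


alerts: List[str] = []
result = run_chain(
    [("validate", validate), ("charge", charge), ("ship", ship)],
    {"order_id": 1},
    on_failure=lambda step, exc: alerts.append(f"{step} failed: {exc}"),
)
print(result.failed_step, result.completed, alerts)
```

In this sketch the `ship` step never runs because `charge` fails first, and the failure hook is the only signal that anything went wrong, which is the visibility gap the panelists described.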
Takeaway #2 – Achieve Visibility without Compromising Performance
Chaim Mazal, Head of Information Security at ActiveCampaign, noted that we need to start with having “appropriate logging, appropriate tooling, appropriate dashboarding, and ensuring that the security team is creating a wide breadth of visibility, and you know what the day-to-day activities are for the majority of your engineering environment.”
However, as an organization scales and its customer base grows, the visibility gained from additional tooling and dashboards is of little benefit if the extra data traffic impedes performance. Security must be engineered in close coordination with DevOps and SRE teams to achieve this balance. This is a collaborative process. Security and operations teams should work together to identify security tools that not only provide visibility but are also lightweight and scalable enough to ensure the availability of production assets. Chaim echoed this collaboration, noting that “through a process of iteration, we came to an overwhelming conclusion that was beneficial for our DevOps organization, and for security, and for our overall engineering org as a whole.”
Takeaway #3 – New (Cloud-Native) Stack Requires a Shift in the Detection Paradigm
While some security challenges remain the same regardless of whether infrastructure is primarily on-premises or in a cloud environment, the cloud-native tech stack brings new detection challenges. As we noted above, visibility is required at a more granular level (with containers and serverless functions), and the ephemeral nature of cloud workloads makes that more challenging. Other concerns such as performance and actionability remain the same, for the most part, regardless of the environment.
Sounil Yu, author, YL Ventures, noted, “Cloud is programmatic by nature. So when it comes to detection and response, the skill sets that you need to be able to interact with, it changes as well.”
The cloud-native tech stack allows for more automated detection and response, so it warrants a decentralized approach toward detection and alerting. The cloud is decentralized by design, and application developers are used to building distributed workloads for speed and scale. Sounil continued that it was “just a question of how we adopt those (decentralized) practices for how we do detection in the cloud? And to the degree that we can drive towards a more distributed model…we offload a lot of the speed and storage concerns just because now it’s been so decentralized.”
Some organizations adopt a SOC-less model where instead of centralizing, you detect attacks and send alerts in a distributed model closer to the source of data; however, for these distributed pipelines to work more efficiently, alerting needs to be more intelligent. This calls for actionable alerting and providing rich context so that the alerts get routed to the right teams.
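As a rough illustration of what routing alerts with context can look like, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `Alert` fields, the `OWNERS` mapping, the team names, and the `route` function are hypothetical and do not come from any panelist or product.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass(frozen=True)
class Alert:
    rule: str       # detection rule that fired (hypothetical name)
    service: str    # workload where it fired
    severity: str   # e.g. "low", "medium", "high"
    context: Dict[str, str] = field(default_factory=dict)  # details for the responder


# Hypothetical ownership map: which on-call team owns which service.
OWNERS: Dict[str, str] = {
    "payments-api": "payments-oncall",
    "auth-service": "identity-oncall",
}
FALLBACK_TEAM = "security-triage"


def route(alert: Alert, owners: Dict[str, str] = OWNERS) -> dict:
    """Send the alert to the team that owns the service, with its context attached."""
    team = owners.get(alert.service, FALLBACK_TEAM)
    return {
        "team": team,
        "summary": f"[{alert.severity}] {alert.rule} on {alert.service}",
        "context": dict(alert.context),
    }


routed = route(Alert(
    rule="unexpected-shell",
    service="payments-api",
    severity="high",
    context={"container_id": "abc123", "user": "www-data"},
))
print(routed["team"], routed["summary"])
```

The design choice worth noting is the fallback team: an alert for a service with no known owner still lands with someone rather than disappearing.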
Takeaway #4 – Alerting Needs to Be Actionable
One of the benefits of being cloud native is the rich logging capabilities that come with it. More data is certainly better for forensics and investigations; however, turning on a lot of logging capabilities can impact performance and lead to a lot of noise. As Kathy Wang, CISO at Very Good Security, pointed out, “there’s an over-alerting situation where we don’t necessarily know what to alert on, so we just alert on everything that looks even remotely suspicious…alerting with fidelity is really, really important.” Kathy added, “Nobody wants to be alerted for things that are not actionable. Security teams need to be very, very careful that if we’re going to throw out alerts, they better be actionable.”
Making alerting more actionable also requires a DevOps approach to threat detection, where real-time detection and response works more like a continuous pipeline. Derek Chamorro, Security Architect at Cloudflare, pointed out that “we encourage our developers to allow them to contribute. That’s the only way we can actually kind of ensure that everybody’s becoming like a security steward on the events that they want to create or things that are actionable.”
Sounil Yu also referred to an article by Ryan McGeehan called ‘Lessons Learned in Detection Engineering’ that makes the point that if you’re going to create an alert, you should also be the one who acts on that alert.
“So if you create a whole bunch of alerts that wakes you up at 2:00 in the morning, well, you’re the one who’s going to wake up at 2:00 in the morning to address those,” Sounil continued. “So it creates the right type of incentives to ensure that you will find those and make sure that whatever you stick into the pipeline is actually clean and whatever comes out is actionable.”
Omer Singer, Head of Cybersecurity Strategy at Snowflake, pointed out that while threat detection in the cloud brings new challenges, it also brings new opportunities. We can use the strengths of cloud and DevOps to create effective detections with fewer false positives and false negatives. He added that “we’re seeing cybersecurity adopt best practices from fields like data analytics and from software engineering.”
Takeaway #5 – Have a Cloud-Native Detection Strategy – Don’t Look for the Needle in the Haystack
Creating high-fidelity alerts requires reevaluating your organization’s detection strategy for a cloud tech stack. Otherwise, you’ll end up collecting data opportunistically and looking for the proverbial threat needle in the alert haystack. Some organizations look to frameworks such as MITRE ATT&CK and NIST CSF to guide their detection strategy, but as our panelists noted, that isn’t enough.
Kathy Wang noted that “The MITRE ATT&CK framework is one guide, one framework. I don’t think there’s any one framework out there that meets all of our requirements, and all of our needs. I think we have to be able to pick and choose, and prioritize what’s the biggest impact would be if we made the change.”
Kathy called for a more risk-based approach based on an organization’s environment and business. “Understand what are the top risks that we face as a company in terms of customer data, understanding your data classification, and then by prioritizing accordingly, you can then start to measure, from quarter over quarter, what you are doing to raise the bar on security.”
Other organizations embrace a risk- and threat-driven detection strategy instead of a coverage-based one to prioritize detections and generate high-fidelity alerts. Risk ranking, risk management, and ease of exploitation are used to prioritize detections across assets. This approach also helps to strike the critical balance between data and performance that was a theme throughout the entire event.
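One simple way to picture risk-driven prioritization is a small scoring sketch in Python. The scoring formula (asset risk multiplied by ease of exploitation, each on a 1 to 5 scale), the `Detection` type, and the sample detections are all assumptions made for illustration; none of the panelists prescribed a specific formula.

```python
from typing import List, NamedTuple


class Detection(NamedTuple):
    name: str
    asset_risk: int    # 1 (low-value asset) to 5 (crown jewels); assumed scale
    exploit_ease: int  # 1 (hard to exploit) to 5 (trivial); assumed scale


def score(d: Detection) -> int:
    """Assumed scoring: higher asset risk and easier exploitation rank first."""
    return d.asset_risk * d.exploit_ease


def prioritize(detections: List[Detection]) -> List[Detection]:
    """Order detections from highest to lowest score."""
    return sorted(detections, key=score, reverse=True)


backlog = [
    Detection("dev sandbox port scan", asset_risk=2, exploit_ease=5),
    Detection("customer DB credential misuse", asset_risk=5, exploit_ease=4),
    Detection("internal wiki brute force", asset_risk=3, exploit_ease=2),
]

for d in prioritize(backlog):
    print(score(d), d.name)
```

A real program would likely weigh more inputs, such as data classification and business impact, but the idea is the same: build the detections that matter most first instead of chasing coverage.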
Takeaway #6 – Take a Risk-Based and Continuous Evaluation Approach for Compliance
Moving to the cloud or using cloud-native technologies can sometimes give organizations a false sense of security because they rely on the cloud provider’s inherent compliance practices. Like security, compliance is a shared responsibility between the cloud provider and the customer. If an organization’s compliance strategy is to defer to the cloud provider and add an indemnity clause to the contract, alarm bells should go off.
Al Faiella, Director of Cybersecurity at Unqork, suggests taking a two-fold approach where you evaluate compliance requirements not just based on what the law requires you to do but also in terms of real risks to your business. This approach, according to Al, will yield better results. This strategy enables organizations to be more precise and allows compliance to be engineered and implemented in an impactful way.
With the increased use of automation and infrastructure as code in a cloud-native world, a minor misconfiguration can be replicated across everything built from it. It is important that your tech stack is constantly tested, evaluated, and further hardened. As Al put it, “…just reevaluate it. I think since technology is moving incredibly fast, work that you may have done last year could already be stale and making sure that as we continue to rely more and more on automation, to be able to build, deploy, update, maintain infrastructure that we’re doing that thoughtfully.”
Takeaway #7 – Improving Cyber Resilience is Building a Cyber Resilience Culture
Reiterating the need to build a cyber resilience culture, Nick Espinosa, Chief Security Fanatic at Security Fanatics, noted that “there’s a lot of things that you can do to improve your cyber resilience and all, but first and foremost, the biggest problem we have in cybersecurity isn’t the threat detection capability or the technology we deploy, it’s the culture and it’s the humans.”
“Security is everyone’s problem,” commented Naomi Buckwalter, Director of Information Security and IT, echoing Nick’s sentiment. She noted how important it is to build an organizational culture where everyone sees security as their responsibility. Security is not just the security team’s problem. Collaboration across security, DevOps and engineering teams helps build relationships, build trust, and get things done.
Will Gregorian, CISO at Color, alluded to “evaluating and profiling what your operating environment looks like before you actually go about doing the resilience conversation.” The process of identifying and assessing the value of key assets before having the resilience conversation provides a common denominator to talk about security pain points across different teams, get buy-in and come up with an actionable plan.
To wrap up the learnings from CNSS 2021: security is a team sport, and it calls for stronger collaboration between security and DevOps teams. According to Fernando Montenegro, principal analyst at 451 Research, this is trending in a positive direction. In a survey conducted by 451 Research, 43% of DevOps teams and security teams said that they are working more closely together. Re-emphasizing the need for collaboration, Fernando remarked, “The way you’re going to be resilient about environments is if you’re able to understand how the other teams are working, how they expect systems to work together and then test how those conditions are going to play out and be ready for them.”