Cyber Resilience vs. Cybersecurity: Planning to Get Hit

I watched a Fortune 500 healthcare company lose $18 million in a ransomware attack. Their security tools were excellent. They had EDR, next-gen firewalls, SIEM, the works. Their prevention controls passed every audit. What they didn't have was a tested incident response plan, current backups they could actually restore, or executives who knew what decisions to make at 3 AM when systems went dark.

The attack itself wasn't sophisticated. The recovery was a disaster because they'd spent years optimizing for prevention and almost no time preparing for what happens when prevention fails.

This is the gap between cybersecurity and cyber resilience. Cybersecurity tries to keep you from getting hit. Cyber resilience assumes you will get hit and determines whether that hit becomes a manageable incident or a company-ending catastrophe.

Prevention Is Not a Strategy

Every CISO builds defenses. We layer controls, harden systems, train users, monitor networks. This work matters. I'm not arguing against it.

But I've seen too many organizations treat prevention as if it's a sufficient strategy. They measure success by the number of attacks they blocked, not by their ability to survive the one that gets through. The board sees a clean audit report and assumes the risk is managed.

The problem is mathematical. Attackers only need to succeed once. Defenders need to succeed every single time, forever. Those odds don't work in your favor, especially when you're dealing with nation-state actors, ransomware-as-a-service operations, or just the reality of human error in complex systems.
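
To make that arithmetic concrete, here is a minimal sketch. It assumes every attack attempt succeeds independently with the same small probability, which is a simplification, and the 0.1% figure is purely illustrative rather than measured data.

```python
def breach_probability(p_single: float, attempts: int) -> float:
    """Probability that at least one of `attempts` independent attacks succeeds."""
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    # The defender has to win every attempt: (1 - p) ** attempts.
    # The attacker only has to win once, which is the complement.
    return 1.0 - (1.0 - p_single) ** attempts


# Illustrative assumption: each attempt has a 0.1% chance of getting through.
for attempts in (10, 100, 1000, 10000):
    print(f"{attempts:>6} attempts -> {breach_probability(0.001, attempts):.1%}")
```

Under those assumptions, a thousand attempts already put the chance of at least one success at roughly 63%, even though the defender stops 99.9% of individual attempts.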

In regulated industries like healthcare and defense, this gap is particularly dangerous. A breach doesn't just mean downtime—it means HIPAA violations, loss of CUI, contract terminations, regulatory penalties. The organizations that survive these events intact are the ones that planned for resilience, not just security.

What Actually Breaks Down During an Incident

The pattern I see across incident responses is consistent: technical security isn't usually the failure point during the crisis. The breakdowns happen in three areas.

First, nobody knows who's in charge. Is it the CISO? The CEO? Legal? Communications? I've been in war rooms where executives argued about decision authority while systems stayed encrypted. You can't figure out your command structure during the emergency.

Second, the backups don't work the way everyone assumed. They exist, and they're tested periodically, but nobody has actually attempted a full restore of the entire ERP system or validated that the backup architecture can handle a simultaneous loss of primary systems. Or the backups themselves are compromised because attackers had persistent access for weeks before pulling the trigger.

Third, communication falls apart. Internal teams don't know what they're allowed to say. Customers hear about the breach on social media before the company reaches out. Regulators get incomplete or contradictory information because nobody established reporting protocols in advance.

These are resilience failures, not security failures. And they're completely predictable if you look for them before an incident occurs.

Cyber Resilience as an Executive Discipline

Cyber resilience isn't about better firewalls. It's about business continuity, crisis leadership, and organizational decision-making under pressure. This is why it belongs at the executive level, not buried in IT.

The questions that define resilience are business questions: Which systems must stay operational? What's our tolerance for downtime before we lose customers or violate contracts? Who has authority to make spend decisions during an incident? How do we communicate with stakeholders when we don't have complete information?

I frame this for boards and C-suites as a test: if your primary data center went offline right now and stayed offline for 72 hours, what would happen? Not technically—everyone knows systems would go down. What would happen to revenue, to customers, to regulatory obligations, to reputation? Who would make which decisions? What information would they need? Where would that information come from if your normal monitoring tools are down?

Most organizations can't answer these questions with confidence. That's the resilience gap.

The organizations that answer well typically have a few things in common. They've done tabletop exercises that actually tested decision-making, not just technical response. They've identified critical business functions and mapped dependencies. They've designated crisis leadership roles and practiced using them. They measure resilience metrics, not just security metrics.

Resilience Metrics That Actually Matter

Security metrics focus on prevention: time to patch, phishing test results, vulnerability counts. These matter, but they don't tell you anything about resilience.

Resilience metrics measure your ability to survive and recover. Start with the Recovery Time Objective (RTO, the maximum downtime you can tolerate) and Recovery Point Objective (RPO, the maximum window of data loss you can tolerate) for critical systems, and whether you've actually tested those numbers. Add the time from detection to containment in the last tabletop exercise, the percentage of staff who know their role in the incident response plan, and the age of the last successful full-system restore test.

One metric I track closely: time from incident declaration to first executive decision. In resilient organizations, this is under 30 minutes because the decision frameworks already exist. In organizations that aren't prepared, I've seen this stretch to 8 or 12 hours while people argue about process.
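
As a rough illustration of how a restore test turns into those numbers, the sketch below compares measured recovery against agreed targets. The record layout, field names, and the sample ERP figures are all hypothetical; adapt them to whatever your test reports actually capture.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class RestoreTest:
    """One full-system restore test. All field names are hypothetical."""
    system: str
    outage_start: datetime       # when the simulated outage began
    service_restored: datetime   # when the system was usable again
    last_good_backup: datetime   # timestamp of the backup that was restored
    rto_target: timedelta        # agreed maximum downtime
    rpo_target: timedelta        # agreed maximum data-loss window


def evaluate(test: RestoreTest, today: datetime) -> dict:
    """Compare what the test actually achieved against the stated targets."""
    actual_rto = test.service_restored - test.outage_start
    actual_rpo = test.outage_start - test.last_good_backup
    return {
        "system": test.system,
        "actual_rto": actual_rto,
        "rto_met": actual_rto <= test.rto_target,
        "actual_rpo": actual_rpo,
        "rpo_met": actual_rpo <= test.rpo_target,
        "days_since_test": (today - test.service_restored).days,
    }


# Invented example: restore took 30 hours against a 24-hour target.
erp_test = RestoreTest(
    system="erp",
    outage_start=datetime(2024, 3, 2, 8, 0),
    service_restored=datetime(2024, 3, 3, 14, 0),
    last_good_backup=datetime(2024, 3, 1, 23, 0),
    rto_target=timedelta(hours=24),
    rpo_target=timedelta(hours=12),
)
report = evaluate(erp_test, today=datetime(2024, 6, 1))
print(report)
```

In this invented case the RPO holds (nine hours of data loss against a twelve-hour target) but the RTO does not, which is exactly the kind of gap a paper target hides until someone runs the restore.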

Building an Incident Response Capability That Works

Most organizations have an incident response plan. It's usually a document that lives in SharePoint and gets reviewed annually. That's not the same as having an incident response capability.

A capability means you can actually execute under pressure. This requires four things: clarity about roles, practiced procedures, accessible tools and data, and decision authority.

Clarity about roles means everyone knows who does what when an incident is declared. Not just the security team—legal, communications, HR, finance, operations. I've seen incident responses stall because nobody knew whether legal or the CISO had final say on whether to pay a ransom. That's a question you answer in advance and document clearly.

Practiced procedures means you've actually walked through your playbooks, not just read them. Tabletop exercises are valuable, but they need to be realistic. Don't just run through a scripted scenario where everyone nods along. Inject real ambiguity: incomplete information, conflicting data, time pressure, communication breakdowns. See what breaks.

Accessible tools and data means your incident response resources are available even when your primary systems aren't. Your IR plan shouldn't live only on your internal wiki. Contact lists shouldn't only exist in your email system. Backup credentials shouldn't only be in the password manager that's part of the compromised environment. This seems obvious, but I've seen it fail repeatedly.

Decision authority means your executives know what they're empowered to decide without a committee meeting. Can the CISO authorize emergency spending? Can the communications director release a statement without three layers of approval? Can operations shut down a production line if it's connected to a compromised system? You establish these authorities before the crisis, not during it.

The Ransomware Decision Framework

Ransomware forces a particularly difficult decision: do you pay? The answer depends on factors you should evaluate before you're in the situation.

I help organizations build a decision framework in advance. What's the financial impact of extended downtime versus the ransom demand? Do we have offline backups we've actually tested? What are the regulatory implications of payment? What's the legal exposure? What's the reputation risk of paying versus not paying? Are we dealing with a sanctioned entity where payment might be illegal?

You can't make this decision well at 2 AM under pressure from an attacker's countdown timer. But you can establish the framework, identify the decision-makers, and clarify the information you'd need to evaluate the situation.

Organizations with this framework in place make faster, better decisions. Organizations without it often make choices they regret—either paying when they had viable alternatives, or refusing to pay and discovering their backups don't actually work.
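
One way to capture such a framework in advance is as a structured list of inputs that must be answered before anyone decides. The sketch below is my own illustration, not a standard model: the field names are hypothetical, and it deliberately stops short of recommending whether to pay. It only reports what is still unknown and summarizes the facts for the designated decision-makers.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class RansomwareInputs:
    """Facts the decision-makers need. None means 'not yet answered'."""
    daily_downtime_cost: Optional[float] = None      # estimated loss per day offline
    estimated_recovery_days: Optional[float] = None  # based on tested restores
    ransom_demand: Optional[float] = None            # known only once an attack occurs
    offline_backups_tested: Optional[bool] = None
    payee_possibly_sanctioned: Optional[bool] = None  # payment might be illegal
    counsel_and_regulatory_review_done: Optional[bool] = None


def open_questions(inputs: RansomwareInputs) -> List[str]:
    """Names of the inputs nobody has answered yet."""
    return [f.name for f in fields(inputs) if getattr(inputs, f.name) is None]


def briefing(inputs: RansomwareInputs) -> dict:
    """Summarize the facts; the decision itself stays with people."""
    missing = open_questions(inputs)
    if missing:
        return {"ready_to_decide": False, "missing": missing}
    return {
        "ready_to_decide": True,
        "missing": [],
        "downtime_exposure": inputs.daily_downtime_cost * inputs.estimated_recovery_days,
        "ransom_demand": inputs.ransom_demand,
        "tested_offline_backups": inputs.offline_backups_tested,
        "escalate_to_counsel": (
            inputs.payee_possibly_sanctioned
            or not inputs.counsel_and_regulatory_review_done
        ),
    }


# Before any incident, every field is unanswered; that list is the homework.
print(open_questions(RansomwareInputs()))
```

The useful part is the empty state: running it today shows which questions your organization cannot yet answer, long before an attacker's countdown timer starts.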

Business Continuity and Disaster Recovery: Not the Same Thing

Business continuity planning and disaster recovery are related but distinct disciplines, and both are essential to cyber resilience.

Disaster recovery is technical: how do we restore systems and data? Business continuity is operational: how do we keep critical business functions running when systems are down?

I've seen organizations with excellent DR plans that had no idea how to actually run their business during a recovery. Customer service couldn't access account information. Manufacturing couldn't process orders. Billing couldn't generate invoices. They could restore systems in 48 hours, but they couldn't operate for 48 hours without those systems.

Business continuity requires understanding your critical business functions and their dependencies. What has to keep working? What can pause for a day, a week, a month? What are the workarounds when primary systems are unavailable?

For a healthcare provider, this might mean: we can delay billing, but we can't delay patient care. What's the paper-based process for prescription orders if the EHR is down? For a defense contractor, this might mean: missing delivery deadlines on active contracts can trigger cure notices, so which production lines are critical, and what's the manual process if the manufacturing execution system (MES) is offline?

The gap I see most often is in dependencies. Teams understand their own systems, but not what those systems depend on. Your ERP depends on your database, which depends on your SAN, which depends on your network, which depends on your directory service, which depends on your authentication system. A failure at any point cascades. Business continuity planning maps these dependencies and identifies the single points of failure.
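
The dependency chain above can be sketched as a small graph walk. This is a simplified illustration: the map below mirrors the ERP example, "customer_portal" is an invented second business function, and the model ignores redundancy, so every dependency counts as a potential point of failure.

```python
# Hypothetical dependency map: each entry lists what a system directly relies on.
DEPENDS_ON = {
    "erp": ["database"],
    "customer_portal": ["network"],  # invented second function for illustration
    "database": ["san"],
    "san": ["network"],
    "network": ["directory"],
    "directory": ["authentication"],
    "authentication": [],
}
CRITICAL_FUNCTIONS = ["erp", "customer_portal"]


def all_dependencies(system, graph):
    """Return everything `system` relies on, directly or indirectly."""
    seen = set()
    stack = list(graph.get(system, []))
    while stack:
        dep = stack.pop()
        if dep not in seen:  # the seen set also protects against cycles
            seen.add(dep)
            stack.extend(graph.get(dep, []))
    return seen


def failure_impact(graph, critical):
    """Map each component to the critical functions that stop if it fails."""
    impact = {}
    for function in critical:
        for component in all_dependencies(function, graph) | {function}:
            impact.setdefault(component, set()).add(function)
    return impact


impact = failure_impact(DEPENDS_ON, CRITICAL_FUNCTIONS)
# Components whose failure takes down every critical function at once.
shared = sorted(c for c, hit in impact.items() if hit == set(CRITICAL_FUNCTIONS))
print(shared)  # ['authentication', 'directory', 'network']
```

Even in this toy map, the components that take down everything are the ones nobody thinks of as business systems, which is usually where the real mapping exercise gets uncomfortable.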

Testing Is Where Theory Meets Reality

You don't know if your business continuity plan works until you test it. Not a tabletop discussion—an actual test where you simulate loss of systems and try to execute the workarounds.

I recommend starting small: pick one critical business function, simulate loss of its primary system, and try to keep the function running using your documented continuity procedures. You'll discover gaps immediately. The documentation is out of date. The backup process owner left the company. The workaround assumes access to a system that would also be down in a real incident.

Fix those gaps and test again. Then expand scope. Eventually you're testing responses to scenarios like "all systems in this facility are unavailable" or "we've lost access to cloud services for an extended period."

These tests are uncomfortable. They expose problems. But the problems exist whether you test or not. Better to find them during a controlled test than during an actual incident.

Crisis Leadership and Communication

The leadership challenges during a cyber incident are different from normal operations. Information is incomplete. Decisions can't wait for consensus. Stakeholders need updates even when you don't have answers. And everything happens faster than is comfortable.

The leaders who handle this well share some common practices. They're comfortable making decisions with incomplete information. They communicate clearly about what they know, what they don't know, and when they'll provide updates. They empower their teams to act without waiting for permission on every detail. And they've practiced these behaviors before the crisis.

In my experience working with executives on cyber crisis preparedness, the biggest adjustment is the communication pace. In normal operations, you can take time to gather data, build consensus, craft messaging. In a cyber crisis, you're providing hourly updates to the board, fielding constant questions from customers and regulators, and making decisions based on partial information.

This is particularly challenging for leaders who came up through technical ranks. The instinct is to wait until you fully understand the technical details. But stakeholders need to hear from you before the technical investigation is complete. You learn to communicate clearly about uncertainty: "Here's what we know as of now, here's what we're still investigating, here's what we're doing to protect customers, here's when we'll update you again."

Stakeholder Communication During an Incident

Different stakeholders need different information at different times. Your board needs strategic updates and decision items. Your customers need to know how they're impacted and what actions they should take. Regulators need factual information about what happened and what you're doing about it. Employees need to know how to do their jobs and what they should tell customers.

I've seen organizations compound a security incident with a communication failure. Customers learn about a breach from the news before the company reaches out. Employees give conflicting information because they weren't briefed. Regulators receive incomplete initial reports and then get frustrated when details change.

The organizations that handle this well have pre-drafted communication templates for different scenarios and different audiences. They have a designated spokesperson for each stakeholder group. They establish a communication rhythm—updates every 4 hours to the board, daily updates to customers, whatever makes sense—and they stick to it even when there's no new information to share.
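
A communication rhythm like that is easy to write down ahead of time. The sketch below turns the example cadence (board every 4 hours, customers daily) into a list of due times from the moment an incident is declared. The audience names, intervals, and declaration time are placeholders, not recommendations.

```python
from datetime import datetime, timedelta

# Example cadence taken from the text; adjust to whatever makes sense for you.
CADENCE = {
    "board": timedelta(hours=4),
    "customers": timedelta(days=1),
}


def update_schedule(declared_at, horizon, cadence):
    """List (due_time, audience) pairs for every update within the horizon."""
    schedule = []
    end = declared_at + horizon
    for audience, interval in cadence.items():
        due = declared_at + interval
        while due <= end:
            schedule.append((due, audience))
            due += interval
    return sorted(schedule)


declared = datetime(2024, 5, 6, 2, 0)  # hypothetical 2 AM declaration
first_day = update_schedule(declared, timedelta(hours=24), CADENCE)
for due, audience in first_day:
    print(due.strftime("%a %H:%M"), audience)
```

Each slot stays on the list whether or not there is news, which matches the point above: the update goes out on schedule even if all it says is that nothing has changed.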

For regulated industries, stakeholder communication has legal and compliance implications. HIPAA breach notification has specific timing and content requirements. Incidents at organizations subject to CMMC may require reporting to the DoD. Your communication plan needs to account for these obligations, and your legal team needs to be involved from the start.

The Role of the CISO in Building Resilience

As a CISO, my job isn't just to prevent breaches. It's to ensure the organization can survive and recover when prevention fails. That's a broader mandate than traditional security leadership, and it requires different skills and different relationships.

Building cyber resilience means partnering with business units to understand critical functions and dependencies. It means working with finance on the business case for resilience investments. It means educating the board on the difference between security metrics and resilience capabilities. It means collaborating with communications on crisis messaging. It means running exercises that test leadership decision-making, not just technical response.

This is why many organizations are finding value in virtual CISO (vCISO) arrangements that bring strategic security leadership without the full-time executive overhead. A skilled vCISO can build the resilience framework, establish the processes, train the teams, and provide crisis leadership when needed. For organizations that aren't ready for a full-time CISO or that need deep expertise in specific areas like regulatory compliance or incident response, this model works well.

The key questions any security leader should be asking, whether full-time CISO or vCISO, are business questions, not just technical ones. As I discuss in "Questions Every CISO Should Be Asking the CEO," the conversation needs to center on risk tolerance, business impact, and strategic priorities. What level of residual risk is acceptable? What's the business impact of downtime for critical systems? What resilience capabilities justify investment?

Translating Resilience for Non-Technical Executives

One of the hardest parts of building cyber resilience is communicating the need to executives who don't have a technical background. They understand prevention conceptually—keep the bad guys out. Resilience is a harder sell because it requires investment in capabilities you hope to never use.

I frame it in business terms. Prevention is like building security into your facility—locks, cameras, alarms. Resilience is like having insurance, backup generators, and a continuity plan so your business keeps running even if the building floods. Both matter. But the security system doesn't help you much if you can't operate during the recovery.

The business case for resilience is about protecting revenue, avoiding regulatory penalties, preserving customer trust, and maintaining operational capability. These are outcomes executives understand. When you can quantify the cost of downtime, the regulatory penalties for breach notification failures, and the customer churn from extended service disruptions, the case becomes clear.

For leaders who want to better understand these concepts, I wrote "Cybersecurity for Non-Technical Leaders" to bridge this gap. Resilience isn't a technical problem. It's a business continuity problem that happens to involve technology.

Regulatory and Compliance Implications

For organizations in regulated industries, cyber resilience isn't optional. Regulatory frameworks increasingly expect it.

HIPAA's Security Rule requires not just protection of ePHI, but also contingency planning—data backup, disaster recovery, emergency mode operations, testing and revision procedures. You can't be HIPAA compliant with prevention controls alone. The regulation explicitly requires resilience capabilities.

CMMC includes incident response and recovery requirements. You need documented and practiced procedures for incident handling, response, and recovery. The assessment looks at whether you can actually execute these procedures, not just whether the documentation exists.

Even frameworks that focus heavily on prevention, like NIST SP 800-171, include recovery requirements. You need to establish and maintain system backups, you need to protect the confidentiality of backup CUI at storage locations, and you need to be able to restore systems.

What I tell clients is this: you can't audit your way to resilience, but you will get audited on your resilience capabilities. Compliance frameworks give you a baseline, but they don't guarantee you can actually survive an incident. Use them as a minimum bar, not a target state.

The organizations that handle this well treat compliance requirements as a starting point and build resilience capabilities that go beyond checkbox compliance. They test their backup and recovery procedures more frequently than required. They run tabletop exercises that involve business leaders, not just IT staff. They measure resilience outcomes, not just compliance activities.

Starting to Build Cyber Resilience

If your organization has invested heavily in prevention but not in resilience, where do you start?

First, get honest about your current state. Can you restore critical systems from backup? When did you last test this? Do you have an incident response plan that's actually usable during a crisis? Have your executives practiced making decisions under incident conditions? Do you know your critical business functions and their dependencies?

Most organizations will find gaps. That's not failure—it's information. You can't fix problems you haven't identified.

Second, prioritize based on business impact. You can't build perfect resilience overnight, so focus on the scenarios that would hurt most. For many organizations, this is ransomware. For healthcare providers, it might be loss of EHR access. For defense contractors, it might be CUI exposure. Identify the two or three scenarios that pose the greatest business risk and build resilience capabilities around those first.

Third, test your assumptions. Don't assume your backups work—restore them. Don't assume your team knows the incident response procedures—walk through them. Don't assume executives can make crisis decisions effectively—run a tabletop. Testing will surface problems, and that's exactly what you want while you still have time to fix them.

Fourth, establish the governance structure for crisis decision-making. Who's in charge during an incident? What authority do they have? What decisions require escalation? What's the communication cadence with the board? These questions shouldn't be addressed for the first time during an active incident.

Fifth, make resilience a continuous discipline, not a project. Your business changes, your systems change, your threats change. What worked last year might not work now. Schedule regular testing, update procedures based on what you learn, train new staff on their roles, and keep executives engaged with realistic scenarios.

The organizations that do this well don't treat resilience as a separate initiative from security. It's integrated into how they think about risk. Every new system deployment includes recovery planning. Every change to critical infrastructure triggers an update to business continuity procedures. Resilience becomes part of the culture, not a compliance exercise.

What Success Looks Like

Cyber resilience doesn't mean you never get breached. It means that when you do get breached, you respond effectively, you maintain critical operations, you communicate clearly, and you recover quickly.

I've worked with organizations that handled incidents remarkably well. They detected the intrusion quickly because they had good monitoring and their team knew what to look for. They contained it effectively because their incident response procedures were practiced and their team had clear authority to act. They maintained operations because they had tested business continuity procedures and backup systems they could actually use. They communicated transparently because they had prepared messaging frameworks and designated spokespeople. They recovered quickly because they had current, tested backups and documented recovery procedures.

The breach still happened. But it was a contained incident, not a catastrophe. They disclosed it appropriately, they supported affected stakeholders, they learned from it, and they moved forward. That's what resilience looks like in practice.

Compare that to organizations where resilience was an afterthought. They discover breaches weeks or months after attackers gained access because detection capabilities were minimal. They struggle to contain the incident because nobody's clear on who has authority to shut down systems. They can't maintain operations because there are no tested workarounds for critical systems. They fumble communication because they're making it up as they go. Recovery drags on for weeks or months because backups are incomplete or compromised.

Same threat, vastly different outcomes. The difference is cyber resilience.

For executives and boards, this is the strategic question that matters: when prevention fails—not if, but when—will your organization survive intact? Can you maintain operations, fulfill obligations to customers and regulators, make sound decisions under pressure, and recover quickly? If you're not confident in the answer, you have a resilience gap. And that gap is probably the biggest cyber risk you're carrying, whether or not it shows up on your risk register.

The time to build resilience is before you need it. The organization to learn from isn't the one that never gets breached—such an organization doesn't exist. It's the one that gets breached and survives, learns, and emerges stronger. That's what cyber resilience enables.