In March 2026, a cyberattack shut down Stryker's global operations, delaying surgeries, knocking cardiac monitoring systems offline, and leaving 50,000+ employees unable to work. Weeks later, the company still had no restoration timeline, despite actively rebuilding from backups.
That same month, the AWS us-east-1 outage took down 1,000+ companies for 15 hours, not because data was lost, but because the infrastructure layer that connects everything collapsed, and no one had a plan for that.
The truth? The DR industry built a $20B market solving the wrong problem: your backup vendor is protecting the wrong 30%.
Open their coverage docs:
Databases ✓
Storage buckets ✓
Server images ✓
What they skip:
VPC configurations ✗
Security groups and IAM roles ✗
Load balancers and DNS ✗
Infrastructure glue (SQS, EventBridge, API Gateway) ✗
AI infrastructure (Bedrock, Vertex AI) ✗
Restore the database without the VPC and load balancer config and you have data with no way to reach it. Forty years of DR engineering focused on data loss — which almost never causes outages. Infrastructure collapse causes nearly all of them.
CSPM fixed security. Now CRPM fixes DR.
Before CSPM, security was aspirational. Teams said, "we follow best practices," checked the box, and moved on. After CSPM, it became measurable, to the point where a team could say: "94% of our S3 buckets have appropriate access controls. Critical findings down 67%."
CSPM gave security a number. DR still doesn't have one. Enter Firefly.
Cloud Resilience Posture Management (CRPM) applies the same model to recovery readiness: continuous scanning, quantified posture scoring, and automated enforcement.

Six capabilities, one posture score: the key pillars of CRPM
The foundation of CRPM is six interconnected capabilities that together produce a single, defensible posture score you can count on:
1. Unified inventory
You can't protect what you don't know exists. In environments where developers provision infrastructure through API calls and CLI one-liners, untracked resources are the rule. Real-time discovery across every cloud, account, and region is the foundation, and everything else depends on it being complete.
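To make the inventory gap concrete, here is a minimal sketch that compares what discovery finds against what IaC tracks. The resource names and the `find_untracked` helper are hypothetical, and in a real environment both sets would come from cloud provider discovery and your IaC state rather than hard-coded values.

```python
def find_untracked(discovered: set[str], tracked_in_iac: set[str]) -> set[str]:
    """Return resources that exist in the cloud but are absent from IaC."""
    return discovered - tracked_in_iac


# Hypothetical identifiers, used only for illustration.
discovered = {"vpc-main", "sg-web", "queue-orders", "bucket-logs"}
tracked_in_iac = {"vpc-main", "sg-web"}

untracked = find_untracked(discovered, tracked_in_iac)
print(sorted(untracked))
```

Anything in `untracked` is a resource that no recovery plan knows about yet.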
2. Continuous backup validation
Backup policies get documented and then drift. Resources get created outside standard workflows, configurations get invalidated, and procedures never get tested. Automated, continuous verification that policies are actually applied and functioning (not just written down somewhere) is what separates a DR plan from a DR capability.
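As a rough illustration of the difference between a documented policy and a verified one, the sketch below classifies each resource by whether a policy is attached and whether the latest backup is recent enough. The `Resource` fields, status labels, and sample data are assumptions made for this example, not a description of any vendor's checks.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class Resource:
    name: str
    policy_max_age_hours: Optional[int]  # None means no backup policy attached
    last_backup: Optional[datetime]      # None means never backed up


def validate(resource: Resource, now: datetime) -> str:
    """Classify a resource's backup status at time `now`."""
    if resource.policy_max_age_hours is None:
        return "no_policy"
    if resource.last_backup is None:
        return "never_backed_up"
    age = now - resource.last_backup
    if age > timedelta(hours=resource.policy_max_age_hours):
        return "stale"
    return "ok"


# Fixed timestamp so the example is reproducible.
now = datetime(2026, 3, 10, 12, 0, tzinfo=timezone.utc)
resources = [
    Resource("orders-db", 24, now - timedelta(hours=6)),
    Resource("vpc-config", 24, now - timedelta(hours=72)),
    Resource("new-bucket", None, None),
]
report = {r.name: validate(r, now) for r in resources}
print(report)
```

Running a check like this on a schedule, rather than once at audit time, is what makes the verification continuous.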
3. Resilience scoring
A 0–100 posture score showing exactly what's protected versus exposed. It turns "we take DR seriously" into "94% coverage across production workloads with a verified 23-minute average RTO." One is a posture. The other is a number you can defend to leadership, auditors, and insurers.
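The article does not spell out how the score is calculated, so the following is only one plausible way to get a 0-100 number: a weighted share of protected resources, where the weight reflects how critical each resource is. The field names and weights are hypothetical.

```python
def posture_score(resources: list[dict]) -> int:
    """Weighted percentage of protected resources, rounded to 0-100."""
    total = sum(r["weight"] for r in resources)
    if total == 0:
        return 0  # nothing to score
    protected = sum(r["weight"] for r in resources if r["protected"])
    return round(100 * protected / total)


# Hypothetical inventory: weight is an assumed criticality ranking.
inventory = [
    {"name": "orders-db", "weight": 3, "protected": True},
    {"name": "api-gateway", "weight": 2, "protected": True},
    {"name": "dns-zone", "weight": 1, "protected": False},
]
score = posture_score(inventory)
print(score)  # 83
```

Weighting matters here: an unprotected DNS zone and an unprotected scratch bucket should not move the score by the same amount.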
4. Automated policy enforcement
When deviations occur (like an unprotected S3 bucket, a database without multi-AZ, or a single-region dependency), CRPM triggers corrective workflows or routes alerts to the right team. Manual enforcement breaks down when dozens of teams are shipping hundreds of infrastructure changes daily.
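A simplified sketch of that detect-then-route loop, using the three deviations named above. The rule names, the resource fields, and the choice of which deviations are safe to auto-remediate are all assumptions for illustration.

```python
# Assumption: only this deviation is safe to fix without a human in the loop.
AUTO_REMEDIABLE = {"unprotected_bucket"}


def find_deviations(resource: dict) -> list[str]:
    """Return the resilience rules this resource violates."""
    deviations = []
    if resource["type"] == "s3_bucket" and not resource.get("backup_enabled", False):
        deviations.append("unprotected_bucket")
    if resource["type"] == "database" and not resource.get("multi_az", False):
        deviations.append("no_multi_az")
    if len(resource.get("regions", [])) < 2:
        deviations.append("single_region")
    return deviations


def route(resource: dict) -> list[tuple[str, str, str]]:
    """Map each deviation to (action, owning team, deviation)."""
    actions = []
    for deviation in find_deviations(resource):
        action = "remediate" if deviation in AUTO_REMEDIABLE else "alert"
        actions.append((action, resource["owner"], deviation))
    return actions


db = {"type": "database", "owner": "payments", "multi_az": False,
      "regions": ["us-east-1"]}
print(route(db))
```

The point of the routing step is that every finding lands with an owner or a workflow, instead of in a shared dashboard nobody watches.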
5. Drift detection and freshness monitoring
A backup taken against a stale infrastructure definition is false confidence. If production has diverged from your recovery blueprint, your RTO numbers are no longer based in reality. Continuous sync between live state and IaC keeps the definition current and the backup valid.
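Drift is easiest to see as a diff between two views of the same infrastructure. This sketch assumes both the IaC definition and the live state have already been flattened into dictionaries keyed by resource name; the keys, values, and drift labels are hypothetical.

```python
def detect_drift(iac: dict, live: dict) -> dict:
    """Compare declared (IaC) state to live state, keyed by resource name."""
    drift = {}
    for key in sorted(set(iac) | set(live)):
        if key not in live:
            drift[key] = ("missing_in_live", iac[key], None)
        elif key not in iac:
            drift[key] = ("unmanaged", None, live[key])
        elif iac[key] != live[key]:
            drift[key] = ("changed", iac[key], live[key])
    return drift


# Hypothetical states: someone opened a port and created a queue by hand.
iac_state = {"sg-web": {"ports": [443]}, "lb-main": {"targets": 3}}
live_state = {"sg-web": {"ports": [443, 8080]}, "lb-main": {"targets": 3},
              "queue-orders": {"fifo": True}}

for name, finding in detect_drift(iac_state, live_state).items():
    print(name, finding[0])
```

Redeploying from `iac_state` as written would restore a security group that no longer matches production and would omit the queue entirely.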
6. Shift-left resilience in CI/CD
The most effective resilience control is the one that runs before deployment. Integrating policy checks into the pipeline blocks non-compliant infrastructure before it ever reaches production — making resilience a property of the build process, not an afterthought in the runbook.
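In pipeline terms, a shift-left check is a step that inspects the planned changes and returns a non-zero exit code when a rule is violated. The sketch below assumes the plan has already been parsed into a list of dictionaries; the `tier` and `backup_policy` fields and the single rule are hypothetical.

```python
def gate(planned: list[dict]) -> int:
    """Return a process exit code: 0 allows the deploy, 1 blocks it."""
    violations = [
        r["name"]
        for r in planned
        if r["tier"] == "production" and not r.get("backup_policy")
    ]
    for name in violations:
        print(f"BLOCKED: {name} has no backup policy")
    return 1 if violations else 0


# Hypothetical planned changes for one pull request.
planned = [
    {"name": "orders-db", "tier": "production", "backup_policy": "daily"},
    {"name": "reports-db", "tier": "production"},
    {"name": "scratch-db", "tier": "dev"},
]
exit_code = gate(planned)
print(exit_code)
```

A CI job would pass this value to `sys.exit`, so the build fails before the unprotected resource is ever created.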
Recovery is just redeployment, but only if your IaC is current
When infrastructure is version-controlled, recovery becomes redeployment. CRPM continuously codifies live cloud state so when a region fails, you're deploying current infrastructure — not reconstructing it under pressure.
Only 1–5% of organizations have implemented infrastructure-level recovery, according to Gartner data. What’s even more concerning?
The 2026 State of IaC Report shows that:
- 17% of teams have no formal DR plan at all. Of those that do, only 37% could restore operations within 4 hours if a primary region failed today.
- The other 63% of teams with a plan find out what it is actually worth during an incident.
The AI race is your DR strategy's biggest threat
Cloud providers are shipping faster than ever. Velocity and stability trade off, and every engineer knows it. Faster releases mean less testing, more edge cases, and more complex dependencies. October's cascading failures? They're what happens when infrastructure complexity outpaces recovery readiness, and that gap is widening.
Outages happen because DNS breaks, IAM gets misconfigured, or a load balancer has no healthy targets in the failover region. Almost never because of data loss.
For decades, the DR industry built the wrong solution. CRPM is the correction: continuous, measurable, enforceable resilience built on the same model that transformed cloud security.
If you don't have a posture score, you don't have a true DR strategy.
See how Firefly's CRPM works →
