
On Monday, October 20, 2025, a global hyperscaler experienced a major incident that disrupted many internet services for hours, with recovery progressing throughout the day.¹² It was a reminder that even world-class platforms have bad days, and that continuity plans must account for real dependencies across identity, DNS, networking, and third-party APIs.³ This piece is the practical follow-on to our article "When the cloud goes dark: what every IT leader should have ready before the next outage". It is written for CIOs and CTOs who now need a concrete plan to reduce risk without inflating operating cost or complexity.
Expectation setting: Upsun’s multicloud story is about smart initial region choice, portability, and tested business continuity and disaster recovery. Our value is in making restoration predictable and repeatable.
If you lead platform, infrastructure, or application operations and you must brief your board on a credible multicloud strategy, this guide gives you:
Analyst guidance continues to emphasise distributed cloud, portability, and digital sovereignty for infrastructure and operations (I&O) leaders.⁴ Uptime Institute’s research shows overall outage trends improving, yet complex IT and networking issues still account for a significant share of impactful incidents.⁵⁶ You cannot eliminate outages, but you can reduce correlated risk and shorten restoration with disciplined preparation.⁵⁶
Multicloud is a strategy for choice and portability, not a promise of seamless failover. Treat it as an enabler for disaster recovery, sovereignty, and negotiating position.⁴ The operating principle is simple: accept a non-zero RTO for severe region events, then engineer for fast detection, clean restoration, and consistent governance.
Outcome by Day 30: a tested restoration path for one Tier 1 service, with artefacts that any on-call leader can run.
Outcome by Day 60: repeatable playbooks for two more services, policy-as-code guardrails, and a shared observability vocabulary.
Outcome by Day 90: one-button restore pipeline from a clean Git checkout, quarterly drill cadence, and a board-ready report.
Executive reporting: Track RTO, RPO, dependency count, change failure rate, and drill results each quarter. IBM’s 2025 data puts the global average cost of a breach at USD 4.44 million, reinforcing why disciplined resilience work matters when incidents overlap.⁹
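To make the quarterly reporting concrete, here is a minimal Python sketch of how achieved RTO and RPO could be derived from drill timestamps and rolled up for a board report. The `DrillRecord` fields and the `quarterly_summary` helper are hypothetical names chosen for illustration; they are not part of Upsun or any other product.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class DrillRecord:
    """One restoration drill. All field names are illustrative assumptions."""
    service: str
    outage_declared: datetime    # when the incident (or drill) was declared
    service_restored: datetime   # when the service passed health checks again
    last_good_backup: datetime   # timestamp of the data that was restored
    rto_target: timedelta
    rpo_target: timedelta

    @property
    def achieved_rto(self) -> timedelta:
        # Time from declaration to verified restoration.
        return self.service_restored - self.outage_declared

    @property
    def achieved_rpo(self) -> timedelta:
        # Data-loss window: how far the restored data lags the outage.
        return self.outage_declared - self.last_good_backup

    def passed(self) -> bool:
        # A drill passes only if both targets were met.
        return (self.achieved_rto <= self.rto_target
                and self.achieved_rpo <= self.rpo_target)


def quarterly_summary(drills: list[DrillRecord]) -> dict:
    """Aggregate drill results into figures suitable for a quarterly report."""
    if not drills:
        return {"drills": 0, "pass_rate": None, "worst_rto": None, "worst_rpo": None}
    return {
        "drills": len(drills),
        "pass_rate": sum(d.passed() for d in drills) / len(drills),
        "worst_rto": max(d.achieved_rto for d in drills),
        "worst_rpo": max(d.achieved_rpo for d in drills),
    }
```

Reporting the worst achieved RTO and RPO alongside the pass rate keeps the focus on the slowest restoration rather than on an average that can hide it.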
Upsun is a multicloud application platform that helps you standardise delivery and make restoration predictable. It is not an automated cross-region failover system. Instead, it gives teams the building blocks to execute BCP and DR with confidence.
Use a single YAML to define services, routes, policies, and scaling. Commit it alongside your code so environments can be rebuilt from a clean checkout. Read the Upsun overview and docs.
2) Create automatic preview environments per branch
Spin up production-like environments for each branch to rehearse restoration steps, validate feature flags, and exercise dependency changes safely. Explore developer resources.
Use instant data cloning to build representative test datasets while protecting sensitive information. This turns drills from theory into practice.
Define dependencies once and let the platform manage start order, health checks, routing, and scale consistently across supported providers. This reduces snowflake runbooks during stressful moments.
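As an illustration of what "start order" means in practice, the sketch below derives start-up batches from a declared dependency graph using Python's standard `graphlib` module. It is a generic topological sort under assumed service names, not a description of how Upsun implements orchestration internally.

```python
from graphlib import TopologicalSorter

# Hypothetical service graph: each key depends on the services in its set.
DEPENDENCIES = {
    "app": {"database", "cache"},
    "worker": {"database", "queue"},
    "database": set(),
    "cache": set(),
    "queue": set(),
}


def start_order(dependencies: dict[str, set[str]]) -> list[list[str]]:
    """Return batches of services in start-up order.

    Every service in a batch can start in parallel once all services in the
    earlier batches are running.
    """
    sorter = TopologicalSorter(dependencies)
    sorter.prepare()  # raises graphlib.CycleError on circular dependencies
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # sorted for a stable, readable order
        batches.append(ready)
        sorter.done(*ready)
    return batches
```

Declaring the graph once and deriving the order from it is what removes the hand-maintained, per-service runbook steps that tend to drift.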
Centralise metrics, traces, and logs so the same dashboards apply in primary and restoration targets. This shortens detection and decision time during incidents.
Use one control plane to view utilisation and forecast spend across clouds. This improves governance without forcing you to stitch reports.
What this means for an IaaS region outage: if the provider region that hosts your Upsun region suffers a severe incident, you would initiate a documented restoration into a different data centre, subject to provider conditions. Expect downtime while that restoration runs. Your Upsun config, preview environments, data cloning, and orchestration make the process predictable.
Automated failover across regions or providers is complex and expensive. Many enterprises adopt a non-zero RTO with tested restores that fit risk tolerance and budget. This aligns with current analyst emphasis on distributed cloud and portability.⁴
Financial discipline: Tie restoration work to avoided incident exposure and regulatory outcomes, not vanity metrics.
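One simple way to express avoided incident exposure is expected annual downtime cost before and after the restoration work, net of what the programme costs. The Python sketch below shows that arithmetic; the function names and every input are assumptions you would replace with your own figures.

```python
def annual_exposure(incidents_per_year: float,
                    hours_down_per_incident: float,
                    cost_per_hour: float) -> float:
    """Expected yearly downtime cost for one service (a simple linear model)."""
    return incidents_per_year * hours_down_per_incident * cost_per_hour


def avoided_exposure(incidents_per_year: float,
                     cost_per_hour: float,
                     rto_before_hours: float,
                     rto_after_hours: float,
                     programme_cost: float) -> float:
    """Net yearly benefit of shortening restoration time.

    Positive values mean the resilience work pays for itself within the year.
    """
    before = annual_exposure(incidents_per_year, rto_before_hours, cost_per_hour)
    after = annual_exposure(incidents_per_year, rto_after_hours, cost_per_hour)
    return before - after - programme_cost
```

For example, with purely illustrative inputs of one severe incident every two years, 20,000 per hour of downtime, and a restoration time cut from 12 hours to 4, `avoided_exposure(0.5, 20_000, 12, 4, 50_000)` evaluates to 30,000 net of a 50,000 programme cost. The model ignores regulatory penalties and reputational effects, which would need to be added separately.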
Track and present these five core metrics quarterly: RTO, RPO, dependency count, change failure rate, and drill results.
Uptime Institute’s research notes that while frequency and severity have improved in recent years, impactful incidents still occur and can ripple across providers.⁵⁶ Your metrics show how you shorten restoration and contain impact. NIST’s guidance remains a practical scaffold for exercises and playbooks.⁷⁸
Bottom line: start narrow, automate relentlessly, and make restoration a routine muscle. Upsun gives you a clear, Git-driven way to define environments, rehearse changes, and restore with confidence when the cloud has a bad day.

