With one supplier weakness able to trigger widespread operational disruption, how can firms boost resilience and manage risk?
In January, US telecoms giant Verizon suffered a massive outage impacting millions of users. Described by internet monitoring company Cisco ThousandEyes as one of the most significant connectivity interruptions in recent history, the incident was widespread, severe and ongoing, leaving some mobile users’ phones stuck in SOS mode.
Verizon soon confirmed that the outage was not the result of a cyberattack, instead blaming the incident on a “software issue”. Software-driven outages of this kind are set to become a growing problem as telecoms networks become increasingly interconnected and virtualised.
In Verizon’s case, the impact has been significant, damaging its reputation and resulting in a fall in its share price. The mobile provider soon tried to soften the blow by promising users a $20 refund.
The incident comes at a time when the impact of outages increasingly cascades through the supply chain. The CrowdStrike outage of 2024 is just one example. Other more recent incidents have stemmed from issues at infrastructure providers such as Amazon Web Services and Cloudflare.
With one supplier weakness able to trigger widespread operational disruption, firms now need to re-examine the way they approach resilience and manage risk.
Evolving Risk
Technologies such as 5G are changing the game for telecoms firms, with network virtualisation playing a key role. But this means companies are increasingly relying on vast ecosystems of suppliers, cloud services, software vendors and managed infrastructure partners.
Telecommunications networks have become “significantly more complex” over the past decade, making outages “a very real risk”, says Mike Hellers, product development manager at London Internet Exchange. “Every additional dependency creates another potential point of failure, meaning outages are increasingly driven by issues beyond the direct control of the telecom provider itself,” he adds.
Beyond being simply technical failures, outages are now becoming failures of ecosystem governance. Yet ultimately, the customer impact reflects on the provider.
“If one critical dependency fails, misconfigures a service, ships a bad update or cannot recover quickly, the customer experiences it as a telecom outage — regardless of whose logo is on the root cause report,” says Mayur Upadhyaya, CEO of APIContext.
Third-Party Threats
The Verizon outage and others like it show how modern telecom resilience is only as strong as the weakest supplier in the chain.
Cloud services themselves depend on physical infrastructure, data centres, network routes and power systems. And while AI and automation are helping operators manage larger, more dynamic software-defined networks, they also introduce new challenges.
Increased speed is both a blessing and a curse. “Automated systems can make decisions or implement changes at a speed that leaves little time for human intervention, meaning configuration errors or failures can propagate across interconnected environments much faster than they could previously,” says Hellers.
AI and automation can make things worse because they “compress the time between cause and consequence”, agrees APIContext’s Upadhyaya. He points out that automation can improve resilience, but it can also “propagate failure faster” when governance, observability and ownership are weak. “The question is not simply whether a provider’s own systems are resilient, but whether it can see, test and govern the resilience of the entire chain of services required to deliver connectivity.”
Visibility
Adding to the issue is the fact that infrastructure providers often lack full visibility across interconnected vendor environments.
“Most organizations still lack real-time visibility and clear accountability across their dependencies, and they’re relying on governance models that weren’t designed for this level of complexity,” says Dana Simberkoff, chief risk, privacy and information security officer at AvePoint.
Set against this backdrop, APIContext’s Upadhyaya thinks annual supplier reviews are “becoming dangerously out of date”.
The main problem is that they were designed for a world where third-party risk changed slowly, he says.
That world no longer exists, because telecom infrastructure now changes continuously: “Routing decisions, cloud dependencies, API integrations, automated remediation, AI-assisted operations and vendor-hosted control planes can all shift risk in real time,” points out Upadhyaya. “A supplier that looked acceptable during a yearly review may become a material operational dependency weeks later.”
Regulatory Focus
At the same time, the industry is seeing growing regulatory focus on operational resilience and supply chain accountability, reflecting the fact that digital infrastructure has become critical national infrastructure, supporting everything from financial services to healthcare.
AvePoint’s Simberkoff points out that the Network and Information Systems 2 Directive (NIS2) and the EU Digital Operational Resilience Act (DORA) both make it clear that organizations are accountable for the resilience of their third-party dependencies. “And both require continuous oversight, stronger supplier accountability and integrated incident response,” she says.
Regulations such as these also force a move away from perimeter-based thinking, towards resilience of the entire ecosystem.
With this in mind, Giles Adams, CEO of VQ Communications, believes resilience is “more about structural control”.
Communication is “mission-critical”, and infrastructure decisions “matter more than ever”, he tells IO.
Reducing Complexity
One practical way organizations can strengthen resilience is by reducing unnecessary complexity wherever possible. More direct interconnection between networks, cloud providers and digital services reduces the number of intermediary points where failures can occur and gives operators greater control over end-to-end connectivity, says London Internet Exchange’s Hellers. “It also enables faster communication between the organizations directly involved when issues arise, helping to identify and resolve problems more quickly.”
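The effect of removing intermediaries can be illustrated with some rough arithmetic. The sketch below assumes a simple series model in which a connection works only if every hop in the path works, and in which hops fail independently. The 99.9% availability figure, the hop counts and the function name `chain_availability` are hypothetical and do not come from any provider mentioned here.

```python
from math import prod


def chain_availability(hop_availabilities):
    """Availability of a path whose hops must all work (series model).

    Assumes hop failures are independent, which real networks only approximate.
    """
    return prod(hop_availabilities)


# Hypothetical figures: every hop is up 99.9% of the time.
via_intermediaries = chain_availability([0.999] * 5)   # five hops in the path
direct_interconnect = chain_availability([0.999] * 3)  # two intermediaries removed

HOURS_PER_YEAR = 24 * 365
for label, availability in [
    ("via intermediaries", via_intermediaries),
    ("direct interconnect", direct_interconnect),
]:
    downtime_hours = (1 - availability) * HOURS_PER_YEAR
    print(f"{label}: {availability:.4%} available, ~{downtime_hours:.1f} h/year down")
```

Under these assumptions, each hop removed from the path raises end-to-end availability, because the path has one fewer component that can fail. Real dependencies are rarely independent, so the numbers are indicative only.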
Adams describes how the firm’s customers are increasingly adopting alternative approaches such as multi-site deployments to avoid single points of failure; geographic distribution of conferencing infrastructure; redundant conferencing nodes for failover capability; and hybrid models, combining cloud and private infrastructure.
Alongside technical resilience, Hellers believes transparency remains “fundamental”. Strong governance frameworks such as ISO 27001, continuous supplier assurance and open communication “all play an important role in building trust across increasingly interconnected ecosystems”, he says.
As telecoms networks become more complex with multiple technologies at play, the risk of outages caused by a third-party dependency continues to surge. In an age of critical dependencies such as these, experts emphasise the importance of the ecosystem working together to manage issues and fix them quickly.
Ultimately, resilience is no longer measured solely by how robust a single network is, says Hellers. “It is now about how effectively organizations work together across the wider digital ecosystem to anticipate, withstand and recover from disruption.”
Expand Your Knowledge
Podcast: Phishing for Trouble S02 E03: Supply Chain Dominoes: Why Their Risk Is Now Your Risk
Blog: Cyber-Risk Management Is Fragmented: Here’s How to Fix It
Blog: Everything You Need To Know About the Cyber Resilience Act