Disaster Recovery testing, or DR testing, has been a cornerstone of business continuity for more than three decades. While the need to validate recoverability has never been greater, the way DR testing is done hasn’t kept pace with new technology or the evolving risk landscape.
It’s an inconvenient truth that the simulated scenarios of DR tests no longer reflect real-world threats. Whether they choose to acknowledge it or not – most IT professionals already know it.
Rubrik’s EMEA CTO Harpinder Singh Powar recently discussed the role of DR testing at Predatar’s annual user summit. He said:
“The value of DR testing has dramatically diminished, and for many organisations the practice has become little more than a tick-box exercise.”
DR testing has the potential to once again become a powerful tool for business continuity. And what’s more, it has a big role to play in the fight against cybercrime. DR testing must evolve. And here’s the exciting part – the evolution is already underway. New approaches to DR testing will help organisations rise from the metaphorical flames of any disaster – and even help to avoid them.
What is Disaster Recovery testing (aka DR testing)?
Disaster Recovery testing is the process of validating an organisation’s disaster recovery plan (DRP) to ensure that IT systems, data, applications, and infrastructure can be effectively restored after a disaster or disruption.
Most organisations execute DR tests on a quarterly or annual basis. During these tests, specific elements of the DRP are tested, for example failover mechanisms or backup restores.
Why does Disaster Recovery testing need to evolve?
Resource challenges:
IT systems are getting bigger and more complex by the day. At the same time, there is an ongoing global shortage of skilled technical people. DR testing is already time-consuming and resource intensive. This is only getting worse with more edge devices, Internet of Things 4.0 (IoT 4.0), and big data models for AI.
Under-resourced IT teams are struggling to keep up with basic scheduled DR testing, let alone expand the scope to reflect the new data landscape.
The threat landscape:
As the name suggests, Disaster Recovery testing is all about how an organisation will respond in a disaster. It’s always wise to plan for the worst-case scenario, and historically the worst-case scenario was something like a fire or flood taking out your data centre. Following 9/11, terror attacks became a very real concern too.
Fast forward 20 years. Today the biggest threat is a very different beast. Where once the odds of a ‘disaster’ striking were perhaps 1 in a million, now it’s closer to 1 in 50. The big threat is cyber attacks.
Where ‘traditional’ disasters have tended to be indiscriminate and hit suddenly, cyber attacks are often super-targeted, and are executed over an extended period. They silently spread across networks to cause maximum disruption. Disaster Recovery wasn’t built to deal with this new type of scenario.
How is Disaster Recovery testing changing?
1. Continuous DR testing
Few people would disagree that increasing the frequency of testing is a good thing to do. But cost, complexity and resource limitations mean that most organisations only run DR tests periodically – typically on a quarterly or annual basis. What’s more, these tests only check a very small subset of the data the organisation stores (less than 1% on average).
DR testing is a perfect use case for automation. Organisations that deploy automated DR testing workflows can run continuous recovery tests, 24/7 – with no additional burden on busy IT teams, and no disruption to day-to-day IT systems and operations.
This new approach to testing means that organisations can validate the recoverability of all of their data every few weeks. The most critical systems can be checked every few days.
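To make the idea concrete, here is a minimal sketch of how an automated workflow might decide which workloads are due for a recovery test. The tier names, the intervals (3 days for critical systems, 21 days for everything else) and the `Workload` structure are all assumptions chosen to mirror the cadence described above; they are not taken from any particular product.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical test intervals per tier; tune these to your own recovery objectives.
TEST_INTERVAL_DAYS = {"critical": 3, "standard": 21}


@dataclass
class Workload:
    name: str
    tier: str           # "critical" or "standard"
    last_tested: date   # date of the last completed recovery test


def due_for_test(workloads, today):
    """Return workloads whose test interval has elapsed, most overdue first."""
    def days_overdue(w):
        interval = timedelta(days=TEST_INTERVAL_DAYS[w.tier])
        return (today - (w.last_tested + interval)).days

    due = [w for w in workloads if days_overdue(w) >= 0]
    return sorted(due, key=days_overdue, reverse=True)


workloads = [
    Workload("payments-db", "critical", date(2024, 1, 8)),
    Workload("file-share", "standard", date(2024, 1, 1)),
    Workload("hr-archive", "standard", date(2023, 12, 1)),
]
queue = due_for_test(workloads, today=date(2024, 1, 12))
print([w.name for w in queue])  # ['hr-archive', 'payments-db']
```

A scheduler could call `due_for_test` every night and hand the resulting queue to whatever tooling performs the restores, so that testing runs continuously without anyone having to plan each cycle by hand.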
2. AI-powered DR testing
Artificial intelligence is changing the world, and it’s got a significant role to play in the future of DR testing. AI is already being put to work in many organisations to identify data with the highest likelihood of recovery failure. These potential ‘problem’ workloads can then be prioritised for testing – boosting the chances of finding and fixing issues. This approach will ultimately increase the efficacy of recoveries. AI can also be used to detect signs of a cyberattack by spotting tell-tale patterns of nefarious behaviour in your data. This will enable IT and security teams to act early – before the issue escalates into a crisis.
The third and final application of AI for DR testing we want to highlight is AI-generated scenarios. By understanding the complex data patterns of real-world disaster scenarios, and how the responses play out, AI will be able to test drive DR plans against realistic scenarios and automatically optimise the response for maximum success.
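The first of these applications, prioritising the workloads most likely to fail, can be illustrated with a simple scoring function. The sketch below is a hand-written heuristic standing in for a trained model: the input fields, the 90-day staleness cap and the 70/30 weighting are illustrative assumptions, not a tuned or vendor-supplied algorithm.

```python
from dataclasses import dataclass


@dataclass
class BackupStats:
    name: str
    failed_jobs_30d: int        # backup jobs that failed in the last 30 days
    total_jobs_30d: int         # backup jobs attempted in the last 30 days
    days_since_last_test: int   # age of the last recovery test


def risk_score(s, max_age_days=90):
    """Blend backup failure rate and test staleness into a score from 0 to 1.

    The 70/30 weighting is an illustrative assumption, not a tuned model.
    A workload with no backup jobs at all is treated as maximum risk.
    """
    failure_rate = s.failed_jobs_30d / s.total_jobs_30d if s.total_jobs_30d else 1.0
    staleness = min(s.days_since_last_test / max_age_days, 1.0)
    return 0.7 * failure_rate + 0.3 * staleness


def prioritise(stats):
    """Order workloads so the riskiest are tested first."""
    return sorted(stats, key=risk_score, reverse=True)


stats = [
    BackupStats("web-frontend", 0, 30, 10),
    BackupStats("erp-db", 6, 30, 45),
    BackupStats("legacy-nas", 0, 0, 200),
]
ranked = [s.name for s in prioritise(stats)]
print(ranked)  # ['legacy-nas', 'erp-db', 'web-frontend']
```

In practice a model would learn these weights from historical recovery outcomes, but the shape of the output is the same: a ranked list that tells the testing workflow where to look first.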
3. DR testing with integrated security tools
In most modern cyber attacks, malicious files are present within the victim’s IT network for weeks – sometimes months – before the attack is activated. Traditional DR methods won’t detect this dormant malware. As a result, a DR test might produce a successful result for recoverability of an infected workload, even though the data could become encrypted and rendered useless as part of a cyber attack.
It’s an eye-opening fact that Predatar has uncovered hidden malware in more than 70% of its customer environments within just a few weeks of deployment. In most cases the malware had been present for several months, and had the potential to cause significant disruption if left undetected.
By integrating cyber security tools such as Endpoint Detection & Response (EDR) and Extended Detection & Response (XDR) into DR testing procedures, organisations can validate the cleanliness of their data and remove malware before it can cause damage.
What’s more, by integrating DR testing with SIEM and SOC platforms, DR testing can become more responsive to the real-world threats that cyber security teams are managing every day.
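The key change in logic is that a recovery test should only pass when the restore succeeds and the restored data scans clean. The sketch below shows that rule under stated assumptions: `restore` and `scan` are hypothetical callables standing in for a real restore job and a real EDR/XDR scan, and the byte-pattern check in `fake_scan` is a toy placeholder rather than a real detection method.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class RecoveryTestResult:
    workload: str
    restored: bool
    findings: List[str]

    @property
    def passed(self):
        # A restore that succeeds but contains malware is NOT a pass.
        return self.restored and not self.findings


def run_recovery_test(workload, restore: Callable[[str], dict],
                      scan: Callable[[dict], List[str]]):
    """Restore a workload into an isolated environment, then scan it."""
    try:
        restored_files = restore(workload)
    except Exception:
        # Any restore error counts as a failed recovery in this sketch.
        return RecoveryTestResult(workload, restored=False, findings=[])
    return RecoveryTestResult(workload, True, scan(restored_files))


# Stand-ins for a real restore job and a real EDR/XDR scan (both hypothetical).
def fake_restore(workload):
    return {"report.docx": b"quarterly numbers", "update.exe": b"EVIL-PAYLOAD"}


def fake_scan(files):
    return [name for name, data in files.items() if b"EVIL" in data]


result = run_recovery_test("finance-share", fake_restore, fake_scan)
print(result.restored, result.passed, result.findings)
# True False ['update.exe']
```

Here the workload restores successfully, which a traditional DR test would record as a pass, yet the combined result is a failure because the scan flags a file. The `findings` list is also the natural payload to forward to a SIEM or SOC platform.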
4. DR testing as a proactive threat detection weapon
We’ve just highlighted how a new generation of DR testing capabilities will uncover hidden threats and vulnerabilities within stored data. In some cases, the DR test will be the first alert of a potential issue within an organisation.
Integration with SOC and SIEM platforms means not only that IT teams can receive intelligence from security teams, but also that they can provide intelligence to security teams. DR testing has the potential to be an early warning system for impending cyber attacks. In the new world of DR testing, backups are elevated from a reactive insurance policy to a proactive threat intelligence tool.
5. Joined-up DR testing
Today, DR tests are often compartmentalised, with tests executed on a system-by-system basis. In a real-world scenario, bringing back one system at a time is far from optimal. Your business’s most critical applications may have dependencies across multiple systems. By using unified recovery environments and recovery orchestration applications, businesses can build and test recovery plans to restore data from different systems in an optimised sequence. This will enable them to get the most vital systems up and running faster. By minimising operational downtime, IT teams can reduce the impact of a cyber incident or other data loss event.
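One way to picture an "optimised sequence" is as a dependency graph that is resolved into recovery waves, where everything in a wave can be restored in parallel. The sketch below uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later); the system names and their dependencies are hypothetical examples.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each system lists the systems it needs first.
dependencies = {
    "active-directory": set(),
    "dns": set(),
    "database": {"active-directory", "dns"},
    "app-server": {"database"},
    "web-portal": {"app-server", "dns"},
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()  # raises graphlib.CycleError if the dependencies are circular

waves = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # systems whose dependencies are all restored
    waves.append(ready)
    sorter.done(*ready)                 # mark the whole wave as recovered

for number, wave in enumerate(waves, start=1):
    print(f"Wave {number}: {', '.join(wave)}")
# Wave 1: active-directory, dns
# Wave 2: database
# Wave 3: app-server
# Wave 4: web-portal
```

Testing a plan like this end to end, rather than one system at a time, is what reveals missing or circular dependencies before a real incident does.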
6. DR testing for compliance
Efficiency and cyber resilience are compelling drivers for change. But it’s regulations that are really accelerating the innovation and adoption of new DR testing practices. Operational resilience regulations and frameworks are being introduced or tightened around the world – FISMA, DORA, HIPAA, the PRA’s rules and NIS2, to name a few. Not to mention more stringent requirements from cyber insurers. The need to provide evidence of recoverability is rapidly becoming essential.
As you evolve your DR testing processes and toolsets, be sure to evaluate your reporting capabilities too. In the new world of DR testing, spreadsheets and hand-cranked reports will be a thing of the past. Most modern applications include easy-to-use, configurable dashboards and reporting features. These tools are designed specifically to boost visibility, save time and provide the evidence that regulators and auditors need.
In conclusion
Disaster Recovery testing needs to evolve to meet the operational resilience challenges facing organisations today. Automation, Artificial Intelligence and integration with security applications will provide the biggest wins. The future of DR testing is closer than you think. Predatar’s Recovery Assurance platform is a practical way to get started with AI-powered, automated recovery testing and malware scanning for backups and snapshots.
Find out more about the world’s most innovative Recovery Assurance platform at www.predatar.com or book a demo now.