The Role of Analytics in Disaster Recovery

This is part 1 of a multi-part series on the evolution of analytics in disaster recovery.

It may seem odd to discuss the role of analytics in the field of disaster recovery. These disciplines appear to have little in common. Wikipedia describes Disaster Recovery (DR) as a set of policies and procedures to enable the recovery or continuation of vital technology infrastructure and systems following a natural or human-induced disaster. Analytics is described as the discovery and communication of meaningful patterns in data.

In this series I'll discuss how analytics will improve resilience, lower risk and enhance business continuity. I'll explore how analytic DR services could come to market, which parties stand to benefit most, and some of the challenges that lie ahead. Part 1 discusses how analytics will enhance disaster recovery in the near term, and then presents a vision in which analytics and automation are combined to improve risk management.

The evolution of DR closely follows the development of IT, providing methods, products, and services to recover systems within required time frames and levels of data currency. From the early 1980s until about five years ago, disaster recovery focused mainly on the backup and recovery of physical computer systems. Because physical systems had to be recovered to a like environment, vendors aggregated clients with similar IT environments to provide shared DR services. These services made DR more affordable for many companies. This model of recovering physical systems worked well when acceptable downtime for most IT systems could be measured with a calendar.

Today, this is no longer true. Over 90% of all new applications are being developed for the Cloud. Cloud infrastructure, application characteristics and data structures differ from those of physical environments. Cloud workloads are deployed in virtual environments, often spread across geographic boundaries. Many companies use combinations of private and public (hybrid) Clouds to run their applications. Cloud resources are dynamically added and removed based on capacity demand. And forget that calendar; downtime tolerance for most Cloud systems is minimal, measured with either a clock or a stopwatch.

By capturing and analyzing metadata stored in the Cloud stack, companies will be able to gain deep insight into data protection and disaster recovery. Analytics can be applied across the IaaS and PaaS layers to help companies better understand DR functions such as backup, replication, DR testing, and system recovery. It should be noted that some tools used in physical DR setups already capture data that can be analyzed to gain insight into discrete functions, e.g. the success rate of data backups. Cloud analytics goes further: it will allow companies to gather information across the whole spectrum of data protection and DR functions, showing how well DR is working and how Cloud resources can be optimized. Analytic data and algorithms will then be used to recommend how DR processes can be improved to produce better outcomes.
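To make the idea of looking across DR functions a little more concrete, here is a minimal Python sketch. It assumes job metadata has already been collected from the Cloud stack as simple records; the `DrJobRecord` fields, the function names, and `summarize_by_function` are all hypothetical and chosen only for illustration, not taken from any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class DrJobRecord:
    """One hypothetical metadata record for a completed DR job."""
    function: str            # e.g. "backup", "replication", "dr_test", "recovery"
    succeeded: bool
    duration_minutes: float


def summarize_by_function(records):
    """Return job count, success rate and average duration per DR function."""
    totals = defaultdict(lambda: {"jobs": 0, "successes": 0, "minutes": 0.0})
    for record in records:
        entry = totals[record.function]
        entry["jobs"] += 1
        entry["successes"] += int(record.succeeded)
        entry["minutes"] += record.duration_minutes
    return {
        function: {
            "jobs": entry["jobs"],
            "success_rate": entry["successes"] / entry["jobs"],
            "avg_minutes": entry["minutes"] / entry["jobs"],
        }
        for function, entry in totals.items()
    }


if __name__ == "__main__":
    sample = [
        DrJobRecord("backup", True, 30.0),
        DrJobRecord("backup", False, 50.0),
        DrJobRecord("replication", True, 10.0),
    ]
    for function, stats in summarize_by_function(sample).items():
        print(function, stats)
```

A real service would draw on far richer metadata, but even this simple roll-up puts backup, replication, testing and recovery side by side instead of reporting on each in isolation.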

DR analytics will benefit companies and vendors alike. DRaaS vendors will use analytics to optimize DR capacity and costs across Cloud infrastructure. Metadata can be mined across customer segments to produce useful benchmark data, helping customers improve DR and business continuity (BC) management.

The first wave of analytic implementations will be used to help companies improve data protection, monitor compliance, enhance DR testing, and design affordable resilience for critical IT systems. Analytics will also be used to help optimize DR Cloud capacity, costs, performance, and resource allocation.

But the use of analytics will not stop there. Cloud automation, inter-Cloud operability, IoT, and predictive analytics will be combined to usher in a new era that may change how DR is performed today. I define this new era as predictive risk management. Predictive analytics will examine a variety of threat and risk data (in real time) and determine whether critical Cloud workloads are exposed to unacceptable levels of risk. These analytic models will be combined with Cloud automation to move workloads out of harm's way. This model of resilience will change how companies manage risk and how DRaaS vendors provide service. In future posts I will discuss how this model might evolve and some of the challenges involved in bringing predictive risk services to market.
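As a rough illustration of predictive risk management, the following Python sketch scores each region from a list of threat probabilities and proposes moving any workload whose region exceeds its risk tolerance. Everything here is an assumption made for the example: the `Workload` fields, the region names, the treatment of threats as independent events, and the rule of moving to the lowest-risk region. A production model would be considerably more sophisticated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """A hypothetical Cloud workload with a per-workload risk tolerance."""
    name: str
    region: str
    max_acceptable_risk: float  # probability between 0.0 and 1.0


def combined_risk(threat_probabilities):
    """Probability that at least one threat occurs, assuming independence."""
    no_event = 1.0
    for probability in threat_probabilities:
        no_event *= 1.0 - probability
    return 1.0 - no_event


def plan_relocations(workloads, threats_by_region):
    """Map workload name -> target region for workloads over their tolerance.

    A workload is only moved if the lowest-risk region is within its
    tolerance; otherwise it is left out of the plan for manual review.
    """
    region_risk = {
        region: combined_risk(probabilities)
        for region, probabilities in threats_by_region.items()
    }
    if not region_risk:
        return {}
    safest = min(region_risk, key=region_risk.get)
    plan = {}
    for workload in workloads:
        current = region_risk.get(workload.region, 0.0)
        if current <= workload.max_acceptable_risk:
            continue  # exposure is acceptable, leave the workload in place
        if safest != workload.region and region_risk[safest] <= workload.max_acceptable_risk:
            plan[workload.name] = safest
    return plan


if __name__ == "__main__":
    threats = {"east": [0.5, 0.2], "west": [0.05]}
    workloads = [Workload("billing", "east", 0.1), Workload("batch", "west", 0.1)]
    print(plan_relocations(workloads, threats))
```

In practice the output of such a plan would be handed to Cloud automation tooling, which is where inter-Cloud operability becomes essential.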

Disaster recovery techniques and technologies have evolved greatly over the past 30 years. Analytics in DR and the rise of Cloud computing will bring significant benefits, helping companies design truly resilient systems and optimize DR functions in ways never before possible.