
Disaster Recovery: Beyond Just Backups

Todd Welfelt : April 24, 2024

Compliance GRC

When discussing disaster recovery (DR) within organizations, the most common sentiment I hear is “we back up all of our stuff, so we’re good.”

While backups are a crucial component, they form just one part of a comprehensive Business Continuity and Disaster Recovery (BCDR) strategy.

Effective BCDR plans encompass not only data preservation but also ensure that critical processes and operations can continue during and after a disaster.

Broadening the Scope Beyond Incident Response

Incident Response (IR) plans are critical as they serve as a playbook for addressing security incidents that can range from minor inconveniences like the loss of a laptop to major crises such as a full-scale ransomware attack.

While IR plans typically focus on immediate responses, DR planning plays a pivotal role during the recovery phase, which is often simplified to mere data restoration from backups.

Executive teams, and especially members of a Security Incident Response Team (SIRT) or similar group within your organization, need to understand all of the options available for service restoration and recovery so they can make good decisions for the overall needs of the business.

Lessons from the Front Lines: The Maersk Incident

In 2017, the Danish shipping giant Maersk was hit by the NotPetya cyberattack, which crippled its infrastructure. Despite having 574 offices across 130 countries and a well-defined data replication plan for Disaster Recovery, nearly all systems were impacted and taken offline, leaving no clean replica to recover from.

It was only because of a serendipitous power outage in Ghana that a single domain controller was offline and spared from the attack. Its data was immediately hand-carried by air to Maersk's recovery team in the UK so the company infrastructure could be rebuilt.

The Maersk situation illustrates the necessity of having diverse and redundant DR strategies.

Elements of a Robust Disaster Recovery Program

A well-rounded DR program includes:

A comprehensive inventory: Document all critical systems, infrastructure, and data.
Prioritized recovery actions: Establish a hierarchy of recovery tasks that aligns with business priorities.
Integration with third-party and cloud solutions: Ensure that sensitive data managed by external vendors is also protected.
Clear recovery objectives: Define specific Recovery Time Objectives (RTO) and Recovery Point Objectives (RPO).
Business impact analysis: Assess the potential impacts on critical systems to prioritize recovery strategies.
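
As a rough illustration of how the inventory, business impact ranking, and recovery objectives above can feed a prioritized recovery sequence, here is a minimal Python sketch. The System fields, the example systems, and every RTO and RPO value are hypothetical and are not taken from any real environment. The sketch assumes each system lists the systems it depends on, and it restores dependencies first, then orders by business impact and RTO.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class System:
    name: str
    business_impact: int      # 1 = most severe impact (hypothetical BIA ranking)
    rto: timedelta            # Recovery Time Objective
    rpo: timedelta            # Recovery Point Objective
    depends_on: tuple = ()    # names of systems that must be restored first


def recovery_order(systems):
    """Return system names with dependencies first, then by impact and RTO."""
    by_name = {s.name: s for s in systems}
    ordered, in_progress = [], set()

    def visit(system):
        if system.name in ordered:
            return
        if system.name in in_progress:
            raise ValueError(f"circular dependency at {system.name}")
        in_progress.add(system.name)
        for dependency in sorted(system.depends_on):
            visit(by_name[dependency])
        in_progress.remove(system.name)
        ordered.append(system.name)

    # Highest business impact first; the tighter RTO breaks ties.
    for system in sorted(systems, key=lambda s: (s.business_impact, s.rto)):
        visit(system)
    return ordered


# Hypothetical inventory, for illustration only.
INVENTORY = [
    System("intranet", 3, timedelta(hours=72), timedelta(hours=24), ("identity",)),
    System("erp", 1, timedelta(hours=8), timedelta(hours=1), ("identity", "database")),
    System("database", 2, timedelta(hours=6), timedelta(hours=1), ("identity",)),
    System("identity", 1, timedelta(hours=4), timedelta(hours=1)),
]

print(recovery_order(INVENTORY))
```

Even in this small example, the database is restored ahead of the higher-impact ERP system because ERP cannot run without it. That is the kind of dependency a documented inventory is meant to surface before an incident.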

Plan for Multiple Scenarios and Timeframes

Defining a Recovery Point Objective (RPO), the maximum amount of data loss the business can accept, limits how much sensitive data can be lost in an incident. A single RPO is not enough, though: plan for multiple data retention scenarios and needs.

Some cyber-attacks happen rapidly with less than 24 hours between initial access and detonation of malware.

Other attacks begin with an initial compromise that goes undetected. The attacker quietly gathers information and builds in access methods (persistence). That access is then sold to another attacking organization, which exploits these systems and performs the visible portion of the cyber-attack. By then, the vulnerabilities and hidden access methods may have been on computers for well past the typical 30-day retention period, so every available backup may already contain them.

Recovering from this kind of attack may require more than a simple restore from backup. Additional compensating controls or cleaning processes may be needed to prevent follow-on attacks.
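
To make the difference between the fast and slow scenarios concrete, here is a small Python sketch that looks for the newest backup taken before an estimated initial compromise. The function name, the daily schedule, and the dates are assumptions for illustration. In practice the compromise time comes from the incident investigation and is often uncertain.

```python
from datetime import datetime, timedelta


def last_clean_backup(backup_times, estimated_compromise):
    """Return the newest backup taken before the compromise, or None if none exists."""
    clean = [taken for taken in backup_times if taken < estimated_compromise]
    return max(clean, default=None)


# Hypothetical schedule: one backup per day, 30 days retained.
now = datetime(2024, 4, 24)
backups = [now - timedelta(days=age) for age in range(1, 31)]

# Fast attack: initial access 20 hours before detection.
fast = last_clean_backup(backups, now - timedelta(hours=20))

# Slow attack: initial access 45 days before detection.
slow = last_clean_backup(backups, now - timedelta(days=45))

print(fast)  # the most recent backup still predates the compromise
print(slow)  # None: every retained backup was taken after the compromise
```

When the result is None, a restore alone brings the attacker's persistence back with it, which is where the additional cleaning and compensating controls come in.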

Ultimately, backup frequency, retention, and distribution needs must be weighed against the likelihood and impact of a security event and the cost needed to mitigate these attacks. Having separate RTO, RPO, and Retention settings for different systems and data can help reduce overall cost for Disaster Recovery options while streamlining responses to an incident.
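
One way to reason about this trade-off is to write the settings down per tier and compare them. The Python sketch below is only an illustration: the tier names and all values are hypothetical. It treats the backup interval as the worst-case data loss, and it uses the number of retained copies as a rough stand-in for storage cost.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BackupPolicy:
    tier: str
    rpo: timedelta              # maximum acceptable data loss
    backup_interval: timedelta  # time between backups
    retention: timedelta        # how long each backup is kept

    def meets_rpo(self):
        # Worst case, the incident hits just before the next backup runs,
        # so the interval is the most data that can be lost.
        return self.backup_interval <= self.rpo

    def retained_copies(self):
        # Rough cost proxy: how many backups exist at any one time.
        return self.retention // self.backup_interval


# Hypothetical tiers, for illustration only.
POLICIES = [
    BackupPolicy("critical", timedelta(hours=4), timedelta(hours=1), timedelta(days=90)),
    BackupPolicy("standard", timedelta(hours=12), timedelta(hours=24), timedelta(days=30)),
    BackupPolicy("low", timedelta(days=7), timedelta(days=7), timedelta(days=365)),
]

summary = {p.tier: (p.meets_rpo(), p.retained_copies()) for p in POLICIES}
for tier, (ok, copies) in summary.items():
    print(f"{tier}: meets RPO={ok}, retained copies={copies}")
```

In this made-up example the standard tier misses its RPO because it only backs up once a day, while the low tier keeps a full year of history with far fewer copies than the critical tier.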

Test and Validate Your DR Plan

A DR plan must not only exist on paper; it must also be regularly tested and validated.

Perform routine DR exercises: Conduct both scheduled and unscheduled DR tests to ensure functionality and effectiveness.
Continuously improve: Regularly update the DR plan to address new threats, technological advancements, or changes in business processes.
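
One simple check that can be part of a DR exercise is confirming that restored files match what was backed up. The Python sketch below assumes a manifest of SHA-256 checksums was recorded at backup time. The manifest format, the file names, and the restore_matches helper are hypothetical, and a real test would also cover application-level functionality.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large restores do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(manifest, restore_root):
    """Return the relative paths that are missing or differ from the manifest."""
    failures = []
    for relative_path, expected in sorted(manifest.items()):
        restored = Path(restore_root) / relative_path
        if not restored.is_file() or sha256_of(restored) != expected:
            failures.append(relative_path)
    return failures


# Simulated exercise: one file restored correctly, one never restored.
with tempfile.TemporaryDirectory() as restore_root:
    (Path(restore_root) / "orders.csv").write_bytes(b"id,total\n1,10\n")
    manifest = {
        "orders.csv": hashlib.sha256(b"id,total\n1,10\n").hexdigest(),
        "customers.csv": hashlib.sha256(b"id,name\n").hexdigest(),
    }
    failures = restore_matches(manifest, restore_root)

print(failures)  # ['customers.csv']
```

A failed check like this is exactly the kind of finding that should feed back into the next revision of the plan.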

Address Modern Challenges: SaaS and Cloud Dependencies

With the increasing reliance on Software as a Service (SaaS) and cloud platforms, DR planning must extend into these services.

Ask yourself:

How is this data managed as part of your Disaster Recovery Plan?
Is there a backup service for this data?
Do you rely on the service provider to safeguard the data?
Have you validated the effectiveness of those controls and the results of testing by that organization?

Reputable SaaS vendors should be more than happy to provide you with the results of their most recent Disaster Recovery testing to validate the effectiveness of their internal controls.

Prepare for Extended Disruptions

Organizations tend to trust the big names that provide cloud hosting and SaaS solutions, but proper preparation for Disaster Recovery needs to account for a possible extended outage associated with a provider.

Ask yourself:

What kind of communication requirements have you included in your contract?
When do they notify you about a potential security incident?
What level of ongoing communication is expected? Once a day? More frequently?
Is there a method of data backup using a process your organization controls directly?

Conclusion

Effective disaster recovery is an integral part of any incident response policy and requires meticulous planning and execution. Organizations should regularly review and revise their DR strategies to align with evolving business needs and technological landscapes. Engaging non-technical members of executive committees in these discussions ensures a broader understanding and commitment to the DR plans, enhancing the organization's overall resilience.

If your organization needs cybersecurity guidance for creating, implementing, or revising a Disaster Recovery plan, contact PacketWatch today.

Todd Welfelt has an Information Technology career spanning more than 25 years.

Todd has turned his extensive experience with hands-on management and maintenance of computer systems into practical assessment and implementation of security tools to meet the needs of compliance frameworks, as well as provide real-world risk reduction.
