Running DRP Tabletop Exercises


Like many other service providers, our team has had disaster recovery plans (DRPs) for many years. Over time, however, we as a group identified several problems with them, such as poor assumptions, incomplete documentation, and inconsistent testing. Revising our DRPs to use a new, common template in 2022 helped somewhat, but not enough.

We wanted to level-set our assumptions, dependencies, and expectations so that everyone would operate with a common set of assumptions, write and test their DRPs similarly, and identify all service dependencies up and down the stack. We also wanted to improve our understanding of how we'd actually implement our DRPs in the event of a disaster, and to identify and remediate any gaps or blind spots, including incorrect assumptions and missing documentation.

We ran tabletop exercises for three of our services' plans over two days: our VMware service, our primary web hosting service, and our on-campus data center. In each exercise, we started with a brief problem description (such as "An attacker has..." or "You were informed that...") and set the team loose to identify and remediate the problem within the constraints of the DRP.

We measured success both qualitatively and quantitatively, and we improved from the first session to the second. Participants provided useful feedback in the post-event survey. We identified several gaps and made plans to fill them. We also identified ways to improve both our DRPs and how we run future exercises.

Category

Current trends/topics/projects

Areas of Focus

  • Knowledge
  • Michigan Technology Community

Objectives

Attendees will have a greater understanding of what a disaster recovery plan (DRP) is and what it contains, as well as why it's important to write, test, and maintain service-specific DRPs and their associated documentation. They will have a better idea of how to write and test their own DRPs, and of what not to do when writing and testing them. Indirectly, they may also learn about change management and facilitation.

Collaborators

Josh Simon, LSA Technology Services