Normally I write about IT security or identity and access management.
Today I’ll take a break from that and talk about disasters and disaster recovery. Unfortunately, from first-hand experience.
As some of you know, Hitachi ID Systems is headquartered in Calgary and we’ve recently had some very serious flooding here. Calgary is a pretty dry place situated at the confluence of two small rivers – the Bow and Elbow. When we get really heavy rains (always in June), it’s not unusual for a few basements to get wet, but what we just experienced is something else entirely. In a city of just over a million people, 100,000 were evacuated from their homes. Water levels in both rivers rose by several meters. Many square kilometers of the city were inundated. The damage is estimated at five billion dollars.
Calgary did not even get the worst of it. There is a nearby town called High River (yes, the irony in that does not escape anyone…) where all 13,000 residents were evacuated and most are still unable to return — some homes there are still completely submerged, about 10 days later.
Dealing with this has been quite the learning experience. It certainly puts into perspective things we see in the news about Hurricane Sandy that recently hit the East Coast, the Fukushima disaster in Japan, Hurricane Katrina, etc. To be clear, what we suffered here was minuscule in comparison to those disasters – but seeing something like this first-hand is certainly eye-opening.
First, the good: the evacuation of 10% of the city’s citizens took place over just 6 hours, in the most calm and orderly fashion imaginable. A laudable combination of responsible, effective government with clear-headed and compliant citizenry. I can just imagine that such an evacuation order, had it taken place in other parts of the world, might not have gone over as well as it did here.
Next, the bad: unimaginable damage throughout the city. Areas that are nowhere near either river and maybe 5m above it got flooded. For safety, power was cut to 20 neighbourhoods and much of it remained off for 5-7 days. Our office lost power for the full 7 days, being situated in one of the worst-hit areas.
Once the water started to recede, something really cool started to happen. Citizens descended on the affected areas by the thousands, to help with clean-up. One day, the mayor called for 600 volunteers at our football stadium. Thousands turned up. The number and energy of volunteers have been so great that the municipality could no longer help orchestrate their efforts, and instead started giving guidelines on what to do and where. Other cool stuff: effective use of social media to keep everyone apprised of road closures, flooding, cleanup processes, power cuts and recovery and more. This is one coordinated city!
We’ve had more than our share of volunteers helping to restore access to our offices too, both employees and contractors responsible for our elevator, electrical system, site security, etc. Thanks everyone!
We’re all very glad of our mayor Nenshi too. While Toronto deals with allegations that its mayor smokes crack with Somali drug dealers in low-income housing, and Montreal and Laval have each replaced mayors twice in the past year or so due to corruption allegations and charges, we have a solid guy working hard, keeping everyone up to date and keeping the recovery moving along smoothly.
So how did we do in maintaining service during this disaster? Our web site, e-mail and other essential services were knocked off-line for about half a day. We brought those up before we could even get back to our building. After about a day and a half, we brought up more services by moving some of our core servers to a co-location site and got all of our Calgary staff to work from home. Everyone was getting in on the disaster recovery, including our hosting data center partner, who got us operational over the weekend.
In short, not too bad. I hope to never have to do this again, but I also know that we learned lots and will undoubtedly do even better next time.
And living through this sure gives me new appreciation for the need for geo-diversity of core services. The software we make does that: for example, our Privileged Access Manager customers routinely deploy servers on different continents and ensure that each server contains a full set of data, so that no single-site disaster would interrupt their access to privileged accounts at other locations. That’s a great sales pitch, but man, it sure feels more concrete when you have to live with the loss of a major data center yourself.