How to Handle Managed Services During a Major Outage

As a managed service provider, it is imperative that we staff our Support Centre optimally. Having enough staff to answer calls and deal with issues quickly is necessary to ensure continued customer satisfaction. The Erlang C formula is used to determine the number of agents or customer service representatives needed to staff a call centre for an expected call volume. What it doesn’t do is tell you how many agents you’ll need when you have an incident like the one we had yesterday (Wednesday, Sept 17th), when it appears that the Google DNS servers were the victim of some sort of DoS attack.
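To make the staffing point concrete, here is a minimal Python sketch of the standard Erlang C calculation. The function names, the 80%-answered-within-20-seconds target and the call volumes in the usage lines are illustrative assumptions, not figures from our Support Centre or from the incident.

```python
import math


def erlang_c(agents: int, traffic: float) -> float:
    """Probability that a caller has to wait, given offered traffic in Erlangs."""
    if agents <= traffic:
        return 1.0  # more work than agents: the queue never drains
    # Build up the Erlang B blocking probability iteratively (avoids huge factorials).
    b = 1.0
    for k in range(1, agents + 1):
        b = traffic * b / (k + traffic * b)
    # Convert Erlang B to Erlang C.
    return agents * b / (agents - traffic * (1.0 - b))


def service_level(agents: int, calls_per_hour: float,
                  aht_seconds: float, target_seconds: float) -> float:
    """Fraction of calls answered within target_seconds."""
    traffic = calls_per_hour * aht_seconds / 3600.0
    if agents <= traffic:
        return 0.0  # overloaded: treat the target as missed
    p_wait = erlang_c(agents, traffic)
    return 1.0 - p_wait * math.exp(-(agents - traffic) * target_seconds / aht_seconds)


def agents_needed(calls_per_hour: float, aht_seconds: float,
                  target_seconds: float = 20.0, target_level: float = 0.8) -> int:
    """Smallest number of agents that meets the service-level target."""
    traffic = calls_per_hour * aht_seconds / 3600.0
    n = max(1, math.floor(traffic) + 1)  # need more agents than Erlangs
    while service_level(n, calls_per_hour, aht_seconds, target_seconds) < target_level:
        n += 1
    return n


# Hypothetical numbers: a normal hour versus an outage with ten times the calls.
normal = agents_needed(calls_per_hour=60, aht_seconds=180)
surge = agents_needed(calls_per_hour=600, aht_seconds=180)
print("agents for a normal hour:", normal)
print("agents for the surge:", surge)
print("service level if normal staffing faces the surge:",
      service_level(normal, 600, 180, 20))
```

Under these assumed numbers, the team sized for a normal hour is far smaller than the offered load during the surge, which is exactly the situation where callers pile up in the queue.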

One of our customers was hit hard by this attack, and their 300+ locations were essentially down. When the calls start coming in, there are not enough agents (based on the Erlang C formula) to handle them effectively, and callers begin to fill up the call queues. How do you handle this?

We implemented a process called “Code Red”, and yesterday was the first time we used it. It was a huge success: not only did it ensure a positive customer experience, it also brought together departments that do not normally interact.

The Code Red process is simple and works like this. Once the Support Centre realizes there is a major outage, it sends an email/IM to the entire company that simply states “Code Red”. All staff (Executives, Management, Sales, Administration, Accounting, Professional Services, Technology) report to the Support Centre, are assigned a phone and start taking calls. This allows the technical staff to continue to work on issues that can be worked on, while the rest of the team handles calls, explaining to the affected users that there is a major outage and that we are working on it. Each call is logged and entered into a master ticket.

It was awesome to see it in action. As we all took our seats, one of our network professionals wrote instructions on the big whiteboard that included the master ticket number, what information to collect and what to tell the customer.

At the same time, another of our network professionals recorded a custom greeting in the affected call queue, explaining to customers that there is a major outage and that we are working to get it resolved. This step alone dramatically cut down the number of calls we had to handle, as most customers just want to know what’s going on and be assured that someone is working on it.

I was extremely pleased with all of our staff. They handled the situation quickly and effectively. Everyone got to experience the Support Centre and got an appreciation of the kind of work these guys and gals are exposed to every day, and the support staff were appreciative of the support they received from the rest of the company. And to top it all off, the customer, while not happy that they were down in the first place, received a level of support that I can confidently say is second to none.