Yesterday’s email about incident management led me to another interesting, related podcast. “Reducing On-Call Engineer Burnout with a Volunteer Management Infrastructure” from TopEndDevs.com is a discussion with Brian Scanlon from Intercom. It dives deep into how their on-call system works.
Intercom prefers to use a single 24/7 on-call engineer for the week. They can do this because 1) they pay really well for the inconvenience, 2) they have very consistent reporting and recovery infrastructure across their whole stack and 3) they do it on a volunteer basis.
Brian also talks about a couple of underpinning design decisions that enabled them to take this approach. Using a consistent architecture and deliberately limiting their technology stack gives their support staff a clearer idea of what to expect, even if they are not experts in the whole stack.
If you want to read more about Intercom’s support approach you can read Brian’s blog here.
What I love about this approach is that it’s totally customer-centric, but not at the cost of the employee relationship. The support process is productised so that those supporting the platform know what to do, can commit to doing it, and get rewarded handsomely for it.
So, why not productise your on-call service before you have a service? When you’re designing your product, you can also design how it should be supported and architect your service accordingly. Make your promise to the customer, early.
Richard Bown is a writer and freelance software engineer. He is the author of HUMAN SOFTWARE, a novel where small-town folk go up against AI and heartless corporate profiteering. Find out more and buy at humansoftwarebook.com.
Thanks for reading this post. If you want to support my work please consider buying my book for yourself or someone you know!