You guys remember Elaine, right? Six months ago (Morale is Low, Workload is Up) I dug into some ticketing system data to try to discover what was going on in our team and how our seemingly ever-increasing workload was being distributed among staff. We unfortunately came to some alarming conclusions and, hopefully, mitigated them. I checked in with her recently to see how she was doing. Her response?
Let’s dig back in and see what we find. Here’s what I’m hoping we see:
- Elaine’s percentage of our break/fix work drops below 20%. She was recently promoted to our tier-2 staff, and her skill set should be dedicated more toward proactive work and operational maintenance.
- George and Susan have come into their own, and between the two of them they are managing at least 60% of the ticket queue. They’re our front-line tier-1 staff, so I would expect the majority of break/fix work to go to them, with tickets escalated to other staff as necessary.
- Our total ticket volume drops a bit. I don’t think we’re going to get back to our “baseline” from 2016 but hopefully August was an outlier.
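For reference, the break/fix share per tech is easy to compute straight from a ticket export. Here’s a minimal sketch using only the standard library; the rows below are illustrative stand-ins for a real CSV export, and the field names are assumptions:

```python
from collections import Counter

# Illustrative ticket rows; in practice these would come from a
# ticketing-system CSV export with assignee and ticket-type fields.
tickets = [
    {"assignee": "Elaine", "type": "break/fix"},
    {"assignee": "George", "type": "break/fix"},
    {"assignee": "Susan",  "type": "break/fix"},
    {"assignee": "George", "type": "break/fix"},
    {"assignee": "Elaine", "type": "maintenance"},
]

# Count only the reactive break/fix tickets, then convert to percentages.
breakfix = [t["assignee"] for t in tickets if t["type"] == "break/fix"]
counts = Counter(breakfix)
total = len(breakfix)
share = {tech: round(100 * n / total, 1) for tech, n in counts.items()}
print(share)  # → {'Elaine': 25.0, 'George': 50.0, 'Susan': 25.0}
```

Swap in the real export and the same few lines answer the “is Elaine under 20%?” question directly.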
That’s not great, but not entirely unexpected. We did take over support for another 300 users in August, so I would expect workload to increase, which it has: it has roughly doubled. It is troubling, though, because theoretically we are standardizing that department’s technology infrastructure, which should lead to a decline in reactive break/fix work. Standardization should generate returns in reduced workload. If we fold them into our existing centralized and automated processes, the same labor hours we are already spending will just go that much further. If we don’t, all we have done is add more work that, while tactically different, is strategically identical. This is really a race against time: we need to standardize management of this department before our lack of operational capacity catches up with us and causes deeper systemic failures, pushing us so far down the “reactive” side of operations that we can’t climb back up. We’re in the Danger Zone.
Looking over our ticket load per team member, things start to look bleaker. Susan and George are definitely helping out, but the only two months in which Elaine’s ticket counts were close to theirs were when she was out of the office on a much-needed extended vacation. Elaine is still owning more of the team’s work than she should, especially now that she’s nominally in a tier-2 position. Let’s also remember that in August, when responsibility for those additional 300 users moved to my team along with Susan and George, we lost two more employees in the transition. That works out to a 20% reduction in manpower, and that counts our manager as a technical asset (which is debatable). If you look at just the reduction in line staff, it’s even higher. This is starting to look like a recipe for failure.
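The per-member monthly view is just a pivot over the same export. A minimal stdlib sketch, assuming each ticket row carries a month and an assignee (the rows here are illustrative, not real data):

```python
from collections import defaultdict

# Illustrative (month, assignee) pairs; a real export would have many more rows.
rows = [
    ("2016-12", "Elaine"), ("2016-12", "George"), ("2016-12", "Susan"),
    ("2017-01", "George"), ("2017-01", "Susan"), ("2017-01", "George"),
]

# Pivot: month -> {assignee: ticket count}
pivot = defaultdict(lambda: defaultdict(int))
for month, who in rows:
    pivot[month][who] += 1

for month in sorted(pivot):
    print(month, dict(pivot[month]))
```

Dumping that pivot month by month is what makes dips like a vacation period jump out at a glance.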
Other than the dip in December and January, when Elaine was on vacation, things look more or less the same. Here’s another view of just the tier-1 (George, Frank, and Susan) and tier-2 (Elaine and Kramer) staff:
I think this graph speaks for itself. Elaine and Susan are by far doing the bulk of the reactive break/fix work. This has serious consequences. There is substantial proactive automation work that only Elaine has the skills to do. The more of that work is delayed to resolve break/fix issues, the more reactive we become, and the harder it is to do the proactive work that prevents things from breaking in the first place. You can see how quickly this can spiral out of control. We’re past the Danger Zone at this point. To extend the Top Gun metaphor: we are about to stall (“No! Goose! Noo! Oh no!”). The list of options available to me, lowly technical lead that I am, is getting shorter. It’s getting real short.
In summation: Things are getting worse and there’s no reason to expect that to change.
- Since August 2016 we have lost three positions (30% reduction in workforce). Since I started in November 2014 we have seen a loss of eight positions (50% reduction in workforce).
- Our break/fix workload has effectively doubled.
- We have had a change in leadership and a refocusing of priorities on service delivery over proactive operational maintenance, which makes sense because the customers are starting to feel the friction. Of course, with limited operational capacity, putting off PM for too long is starting to get risky.
- We have an incredibly uneven distribution of our break/fix work.
- Our standardization efforts for our new department are obviously failing.
- It seems less likely every day that we will be able to climb back up the reactive side of the slope we are on with such limited resources and so little operational capacity.
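As a sanity check on the workforce numbers above, the headcounts they imply can be backed out from the stated losses and percentages (these baselines are derived from the figures in this post, not pulled from an actual roster):

```python
# Stated figures: 3 positions lost since Aug 2016 (~30% of the team),
# 8 positions lost since Nov 2014 (~50% of the team).
lost_since_aug_2016, pct_since_aug_2016 = 3, 0.30
lost_since_nov_2014, pct_since_nov_2014 = 8, 0.50

# Implied team size at each baseline: lost / fraction_lost.
baseline_2016 = lost_since_aug_2016 / pct_since_aug_2016
baseline_2014 = lost_since_nov_2014 / pct_since_nov_2014
print(round(baseline_2016), round(baseline_2014))  # → 10 16
```

In other words, a team of roughly sixteen in late 2014 is down to about seven line positions doing the work, which is the whole problem in two numbers.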
Until next time, stay frosty.