
Morale, Workload and Tickets – A Follow-Up

You guys remember Elaine, right? Six months ago (Morale Is Low, Workload Is Up) I dug into some ticketing system data to figure out what was going on in our team and how our seemingly ever-increasing workload was being distributed among staff. We unfortunately came to some alarming conclusions and, hopefully, mitigated them. I checked in with her recently to see how she was doing. Her response?

Things are just great. Just greatttt…

Let’s dig back in and see what we find. Here’s what I’m hoping we see:

  • Elaine’s percentage of our break/fix work drops below 20%. She was recently promoted to our tier-2 staff and her skill set should be dedicated more towards proactive work and operational maintenance.
  • George and Susan have come into their own and between the two of them are managing at least 60% of the ticket queue. They’re our front-line tier-1 staff, so I would expect the majority of the break/fix work to go to them, with tickets escalated to other staff as necessary.
  • Our total ticket volume drops a bit. I don’t think we’re going to get back to our “baseline” from 2016 but hopefully August was an outlier.

 

Well, shit. That’s not what I was hoping to see.

That’s not great but not entirely unexpected. We did take over support for another 300 users in August, so I expected workload to increase, and it has, by roughly double. It is still troubling, though, because we are theoretically standardizing that department’s technology infrastructure, which should lead to a decline in reactive break/fix work. Standardization should generate returns in reduced workload: if we fold them into our existing centralized and automated processes, the same labor hours we are already spending will just go that much further. If we don’t do that, all we have done is add more work that, while tactically different, is strategically identical. This is really a race against time. We need to standardize management of this department before our lack of operational capacity catches up with us and causes deeper systemic failures, pushing us so far down the “reactive” side of operations that we can’t climb back up. We’re in the Danger Zone.

 

This is starting to look bad.

Looking over our ticket load per team member, things are starting to look bleaker. Susan and George are definitely helping out, but the only two months where Elaine’s ticket counts were close to theirs were when she was out of the office for a much needed extended vacation. Elaine is still owning more of the team’s work than she should, especially now that she’s nominally in a tier-2 position. Let’s also remember that in August, when responsibility for those additional 300 users moved to my team along with Susan and George, we lost two more employees in the transition. That works out to a 20% reduction in manpower, and that counts our manager as a technical asset (which is debatable). If you look at just the reduction in line staff it is even higher. This is starting to look like a recipe for failure.

 

Yep. Confirmed stage 2 dumpster fire.

Other than the dip in December and January when Elaine was on vacation, things look more or less the same. Here’s another view of just the tier-1 (George, Frank and Susan) and tier-2 (Elaine and Kramer) staff:

Maybe upgrade this to a stage 3 dumpster fire?

I think this graph speaks for itself. Elaine and Susan are by far doing the bulk of the reactive break/fix work. This has serious consequences. There is substantial proactive automation work that only Elaine has the skills to do. The more of that work gets delayed to resolve break/fix issues, the more reactive we become and the harder it is to do the proactive work that prevents things from breaking in the first place. You can see how quickly this can spiral out of control. We’re past the Danger Zone at this point. To extend the Top Gun metaphor, we are about to stall (“No! Goose! Noo! Oh no!”). The list of options that I have, lowly technical lead that I am, is getting shorter. It’s getting real short.

In summation: Things are getting worse and there’s no reason to expect that to change.

  • Since August 2016 we have lost three positions (30% reduction in workforce). Since I started in November 2014 we have seen a loss of eight positions (50% reduction in workforce).
  • Our break/fix workload has effectively doubled.
  • We have had a change in leadership and a refocusing of priorities on service delivery over proactive operational maintenance, which makes sense because the customers are starting to feel the friction. Of course, with limited operational capacity, putting off preventive maintenance for too long is starting to get risky.
  • We have an incredibly uneven distribution of our break/fix work.
  • Our standardization efforts for our new department are obviously failing.
  • It seems less likely every day that we are going to be able to climb back up the reactive side of the slope we are on with such limited resources and little operational capacity.

Until next time, stay frosty.

 

Morale Is Low, Workload Is Up

Earlier this month, I came back from lunch and I could tell something was off. One of my team members, let’s call her Elaine, who is by far the most upbeat, relentlessly optimistic and quickest to laugh off any of our daily trials and tribulations, was silent, hurriedly moving around and uncharacteristically short with customers and coworkers. Maybe she was just having a bad day, I wondered, as I made a mental note to keep tabs on her for the week and see if she bounced back to her normal self. When her attitude didn’t change after a few days, I was really worried.

Time to earn my team lead stripes, so I took her aside and asked her what was up. I could hear the steam venting as she started with, “I’m just so f*****g busy”. I decided to shut up and listen as she continued. There was a lot to unpack: she was under pressure to redesign our imaging process to incorporate a new department that got rolled under us, she was handling the majority of our largely bungled Office 365 Exchange Online post-migration support and she was still crushing tickets on the help desk with the best of them. The straw that broke the camel’s back: spending a day cleaning up her cubicle, which was full of surplus equipment, because someone commented that our messy work area looked unprofessional…  “I don’t have time for unimportant s**t like that right now!” she said as she continued furiously cleaning.

The first thing I did was ask her what the high-priority task of the afternoon was and figure out how to move it somewhere else. Next I recommended that she finish her cleaning, take off early and then take tomorrow off. When someone is that worked up, myself included, a great place to start is generally to get some distance between you and whatever is stressing you out until you decompress a bit.

Next I started looking through our ticket system to see if I could get some supporting information about her workload that I could take to our manager.
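
For the curious, the analysis itself was nothing fancy. Here’s a rough sketch of the kind of grouping involved, not my actual queries: it assumes a CSV export from the ticket system with hypothetical CreatedDate and AssignedTo columns, which is not our real schema.

    # Rough sketch: tickets per technician per month from a CSV export.
    # The file path and column names (CreatedDate, AssignedTo) are placeholders.
    $tickets = Import-Csv -Path 'C:\Temp\ticket-export.csv'

    $tickets |
        Group-Object { (Get-Date $_.CreatedDate).ToString('yyyy-MM') } |
        Sort-Object Name |
        ForEach-Object {
            $month = $_.Name
            $total = $_.Count
            $_.Group | Group-Object AssignedTo | ForEach-Object {
                [pscustomobject]@{
                    Month      = $month
                    Technician = $_.Name
                    Tickets    = $_.Count
                    Share      = '{0:P0}' -f ($_.Count / $total)  # slice of that month's tickets
                }
            }
        } | Format-Table -AutoSize

Dump that into a spreadsheet, make a couple of charts, and you get the graphs below.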

Huh. Not a great trend.

That’s an interesting uptick that just so happens to coincide with us taking over support responsibilities for the previously mentioned department. We did bring their team of four people over but only managed to retain two in the process. Our workload increased substantially too, since we not only had to maintain the same service level but now have the additional challenge of performing discovery, taking over administration and standardizing their systems (I have talked about balancing consolidation projects and workload before). Thanks to a scheduling conflict, we had the bad luck of having to run our Office 365 migration at the same time. Bottom line: we increased our workload by a not insignificant amount and lost two people. Not a great start.

I wonder how our new guys (George and Susan) are doing? Let’s take a look at the ticket distribution, shall we?

Huh. Also not a great trend.

Back in December 2016, it looks like Elaine started taking on more and more of the team’s tickets. August of 2017 was clearly a rough month for the team as we started eating through all that additional workload, but noticeably that workload was not being distributed evenly.

Here is another view that I think really underlines the point.

Yeah. That sucks for Elaine.

As far back as a year ago Elaine was handling about 25% of our tickets, and since then her share has increased to close to 50%. What makes this worse is that not only has the absolute quantity of tickets in August more than doubled compared to the average of the 11 preceding months, but the relative percentage of her contribution has doubled as well. This is bad and I should have noticed a long time ago.

Elaine and I had a little chat about this situation and here’s what I distilled out of it:

  • “If I don’t take the tickets they won’t get done”
  • “I’m the one that learns new stuff as it comes along so then I’m the one that ends up supporting it”
  • “There’s too many user requests for me to get my project work done quickly”

Service Delivery and Business Processes. A foe beyond any technical lead.

This is where my power as a technical lead ends. It takes a manager or possibly even an executive to address these issues but I can do my best to advocate for my team.

The first issue is actually simple. Elaine needs to stop taking it upon herself to own the majority of the tickets. If the tickets aren’t in the queue then no one else will have the opportunity to take them. If the tickets linger, that’s not Elaine’s problem, that’s a service delivery problem for a manager to solve.

The second issue is a little harder since it is fundamentally about the ability of staff to learn as they go, be self-motivated and be OK with just jumping into a technology without any real guidance or training. Round after round of budget cuts has decimated our training budget and increased our tempo to the point where cross-training and knowledge sharing are incredibly difficult. I routinely hear, “I don’t know anything about X. I never had any training on X. How am I supposed to fix X!” from team members and, as sympathetic as I am about how crappy a situation that is, there is nothing I can do about it. The days of being an “IT guy” who can work down The Big Blue Runbook of Troubleshooting are over. Every day something new that you have never seen before is broken and you just have to figure it out.

Elaine is right though: she is punching way above her weight, and the result is that she owns more and more of the support burden as technology changes and as our team fails to evenly adopt that change. A manager could request some targeted training or maybe some force augmentation from another agency or contracting services. Neither is a particularly likely outcome given our budget, unfortunately.

The last one is a perennial struggle of the sysadmin: your boss judges your efficacy by your ability to complete projects, while your users (and thus your boss’ peers via the chain of command) judge your efficacy by your responsiveness to service requests. These two standards are in direct competition. This is such a common and complicated problem that there is a fantastic book about it: Time Management for System Administrators.

The majority of the suggestions to help alleviate this problem require management buy-in, and most of them our shop doesn’t have: an easy-to-use ticket system with notification features, a policy stating that tickets are the method of requesting support in all but the most exigent of circumstances, a true triage system, a rotating interrupt-blocker position and so on. The best I can do here is to recommend that Elaine develop some time management skills, work on healthy coping habits (exercise, walking, taking breaks, etc.) and do regular one-on-one sessions with our manager so she has a venue for discussing these frustrations privately; at least if they cannot be solved they can be acknowledged.

I brought a sanitized version of this to our team manager and we made some substantial progress. He reminded me that George and Susan have only been on our team for a month and that it will take some time for them to come up to speed before they can really start eating through the ticket queue. He also told Elaine that, while her tenacity in the ticket queue is admirable, she needs to stop taking so many tickets so the other guys have a chance. If they linger, well, we can cross that bridge when we come to it.

The best we can do is wait and see. It’ll be interesting to see what happens as George and Susan adjust to our team and how well the strategy of leaving tickets unowned to encourage team members to grab them works out.

Until next time, stay frosty.

 

Salary, Expectations and Automation

It has been an interesting few months. We have had a few unexpected projects pop up and I have ended up owning most of them. This has left me feeling pretty beaten down and a little demoralized. I don’t like missing deadlines and I don’t like constantly switching from one task to the next without ever making headway. It’s not my preferred way to work.

One thing I continually try to remind myself of is that I should use the team. I don’t have to own everything, nor should I, so I started creating tickets on behalf of my users (we don’t have a policy requiring tickets) and just dumping them into our generic queue so someone else could pick them up.

Guess what happened? They sat there. Now there are a few reasons why things played out this way (see this post), but you can imagine this was not the result I was hoping for. I was hoping my tier-2 folks would have jumped in and grabbed some of these requests (the last one is sketched out below to give a sense of what it involves):

  • Review the GPOs applied to a particular group of servers and modify them to accommodate a new service account
  • Review some NTFS permissions and restructure them to be more granular
  • Create a new IIS site along with the corresponding certificate and coordinate with our AD team to get the appropriate DNS records put in place
  • Help one of our dev teams re-platform / upgrade a COTS application
  • Re-configure IIS on a particular site to support HTTPS.
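
To give a sense of scale, here is roughly what that last item boils down to; a minimal sketch, assuming the WebAdministration module, with the site name, host header and certificate thumbprint as made-up placeholders:

    # Hypothetical sketch of the HTTPS ticket: add an HTTPS binding to an
    # existing IIS site and attach a certificate from the machine store.
    Import-Module WebAdministration

    $siteName   = 'LegacyPortal'                   # placeholder site name
    $hostHeader = 'portal.contoso.internal'        # placeholder host header
    $thumbprint = '<certificate thumbprint here>'  # placeholder thumbprint

    # Add the HTTPS binding to the site
    New-WebBinding -Name $siteName -Protocol https -Port 443 -HostHeader $hostHeader

    # Attach the certificate to port 443
    Get-Item -Path "Cert:\LocalMachine\My\$thumbprint" |
        New-Item -Path 'IIS:\SslBindings\0.0.0.0!443'

Nothing exotic, but it does assume a working comfort with PowerShell and IIS.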

Part of the reason we have so much work right now is that we are assuming responsibility for a department that previously had its own internal IT staff (Yay! Budget cuts!). Not everyone was happy about giving up “their IT guys”, so during our team meetings we started reviewing work in the queue that was not getting moved along.

A bunch of these unloved tickets were “mine”, that is to say, they were originally requests that came directly to me, which I then created tickets for in the hope of bumping them back into the queue. This should sound familiar. The consensus, though, was that it was “my work” and that I was not being diligent enough in keeping track of the ticket queue.

Please bear in mind for the next two paragraphs that we have a small, 12-person team. It is not difficult for us to get a hold of another team member.

I’ll unpack the latter idea first. In a sense, I agree. I could do a better job of watching the queue, but only because I was not watching it at all. My perception was that, as someone who is nominally at the top of our support tier, the help desk watches the queue, catches interrupts from customers and then escalates stuff if they need assistance. I was thinking my tickets should come from my team and not directly from the queue.

The former idea I’m a little less sympathetic to. It’s not “my work”, it’s the team’s work, right? And here is where those sour grapes start to ferment… that list of tickets up there does not seem like “tier-3 work” to me. It seems like junior sysadmin work. If that is not the case then I have to ask: what are those guys doing instead? If that’s not work that tier-1/tier-2 handle, then what is?

In the end, of course, I took the tickets and did the work, which put me even further behind on some of my projects.

I have puzzled over our ticket system, support process and team dynamics quite a bit (see here, here and here) and there are a lot of different threads one could pull on, but a new explanation came to mind after this exercise: maybe our tier-2 guys are not doing this work because they can’t? Maybe they just don’t have the skills to do these kinds of things, and maybe it’s not realistic to expect people to have that level of skill, independence and work ethic for what we pay them? I hate this idea. I hate it because if that’s truly the case there is very little I can do to fix it. I don’t control our training budget, assign team priorities or have any ability to negotiate graduated raises matched with a corresponding training plan. I don’t do employee evaluations, I cannot put someone on an improvement plan and I certainly cannot let an employee go. But I really don’t like this idea because it feels like I’m crapping on my team. I don’t like it because it makes me feel guilty.

But are our salaries and expectations unrealistic?

  • Help Desk Staff (Tier-1) – $44k – $50k per year
  • Junior Sysadmins (Tier-2) – $58k – $68k per year
  • Sysadmins (Tier-3) – $68k – $78k per year

It’s a pretty normal “white collar” setup: salaried, no overtime eligibility, with health insurance and a 401k with a decent employer match. We can’t really do flexible work schedules or work-from-home but we do have a pretty generous paid leave policy. However – this is Alaska, where everything is as expensive as the scenery is beautiful. A one bedroom rental will run you at least $1200 a month plus utilities which can easily be a few hundred dollars in the winter depending on your heating source. Gasoline is on average a dollar more per gallon than whatever it is currently in the Lower 48. Childcare is about $1100 a month per kiddo for kids under three. Your standard “American dream” three bedroom, two bath house will cost you around $380,000. All things being equal, it is about 25% more expensive to live here than your average American city so when you think about these wages knock a quarter of them off to adjust for cost of living.

Knock 25% off and that tier-1 range is effectively $33k–$38k, with tier-3 topping out around $58k. Those wages don’t look so hot anymore, huh? Maybe there is a reason (other than our state’s current recession) that most IT positions in my organization take at least six months to fill. The talent pool is shallow and not that many people are interested in wading in.

We all have our strengths and weaknesses. I suspect our team is much like others, with a spectrum of talent, but I think the cracks are beginning to show… as positions are cut, work is not being evenly distributed and fewer and fewer team members are taking on more and more of the work. I suspect that’s because these team members have the skills to eat that workload with automation. They can use PowerShell to do account provisioning instead of clicking through Active Directory Users and Computers. They can use SCCM to install Visio instead of RDPing to each computer and pressing Next-Next-Finish. A high-performing team member would realize that the only way they could do that much work is to learn some automation skills. What would a low-performing team member do instead? I’m not sure. But maybe, just maybe, as we put increasing pressure on our tier-1 and tier-2 staff to “up their skills” and to “eat the work”, we are not being realistic.
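
For context, the kind of automation I’m talking about is not rocket science. Here’s a minimal sketch of CSV-driven account provisioning; the CSV columns, domain, OU and group names are made-up placeholders, not our real environment:

    # Hypothetical sketch: bulk account provisioning from a CSV instead of
    # clicking through Active Directory Users and Computers for each new hire.
    Import-Module ActiveDirectory

    $newHires = Import-Csv -Path 'C:\Temp\new-hires.csv'   # FirstName, LastName, Username, Department

    foreach ($hire in $newHires) {
        New-ADUser -Name "$($hire.FirstName) $($hire.LastName)" `
                   -SamAccountName $hire.Username `
                   -UserPrincipalName "$($hire.Username)@contoso.internal" `
                   -Department $hire.Department `
                   -Path 'OU=Staff,DC=contoso,DC=internal' `
                   -AccountPassword (Read-Host -AsSecureString "Initial password for $($hire.Username)") `
                   -Enabled $true

        # Baseline group membership so the usual GPOs and shares apply
        Add-ADGroupMember -Identity 'All-Staff' -Members $hire.Username
    }

An afternoon spent learning that pattern pays for itself the first time a department onboards a dozen people at once.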

Would you expect someone making $44k – $51k a year in cost-of-living-adjusted wages to be an SCCM wizard? Or pick up PowerShell?

Are we asking too much of our staff? What would you expect someone paid these wages to be able to do? Like all my posts – I have no answers, only questions, but hopefully I’m asking the right ones.

Until next time, stay frosty!

Prometheus and Sisyphus: A Modern Myth of Developers and Sysadmins

I am going to be upfront with you. You are about to read a long and meandering post that will seem almost a little too whiny at times as I talk some crap about our developers and their burdens (applications). I like our dev teams and I like to think I work really well with their leads, so think of this post as a bit of satirical sibling rivalry; underneath the hyperbole and good-natured teasing there might be a small, “little-t” truth.

That truth is that operations, whether it’s the database administrator, the network team, the sysadmins or the help desk, always, always, always gets the short straw, and that is because collectively we own “the forest” in which the developers tend their “trees”.

I have a lot to say about the oft-repeated sysadmin myth of “how misunderstood sysadmins are” and how they just seem to get stepped on all the time and so on and so on. I am not a big fan of the “special snowflake sysadmin syndrome”, and I am especially not a fan of it when it is used as an excuse to be rude or unprofessional. That being said, I think it is worth stating that even I know I am half full of crap when I say sysadmins always get the short straw.

OK, disclaimers are all done! Let’s tell some stories!

 

DevOps – That means I get Local Admin right?

My organization is quite granular and each of our departments more or less maintains its own development team supporting its own mission-specific applications, along with either a developer who essentially fulfills an operations role or a single operations guy doing support solely for that department. The central ops team maintains things like the LAN, Active Directory, the virtualization platform and so on. If the powers that be wanted a new application for their department, the developers would request the required virtual machines, the ops team would spin up a dozen VMs off of a template, join them to AD, give the developers local admin and off we go.
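
The tail end of that assembly line looks something like the sketch below. This is illustrative only: the domain, OU and group names are placeholders, and the real process was driven off VM templates rather than run by hand.

    # Hypothetical sketch of the hand-off, with placeholder domain/OU/group names.
    # Step 1: join the freshly cloned VM to the domain (reboots the machine).
    $cred = Get-Credential -Message 'Domain join account'
    Add-Computer -DomainName 'contoso.internal' `
                 -OUPath 'OU=DeptApps,DC=contoso,DC=internal' `
                 -Credential $cred -Restart

    # Step 2, after the reboot: local admin for the department's dev team,
    # and off we go.
    Add-LocalGroupMember -Group 'Administrators' -Member 'CONTOSO\Dept-Developers'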

Much like Bob Belcher, all the ops guys could do was “complain the whole time”.

 

This arrangement led to some amazing things that break in ways that are too awesome to truly describe:

  • We have an in-house application that uses SharePoint as a front-end, calls some custom web services tied to a database or two that auto-populates an Excel spreadsheet that is used for timekeeping. Everyone else just fills out the spreadsheet.
  • We have another SharePoint integrated application, used ironically enough for compliance training, that passes your Active Directory credentials in plaintext through two or three servers all hosting different web services.
  • Our deployment process is essentially to copy everything off your workstation onto the IIS servers.
  • Our revision control is: E:\WWW\Site, E:\WWW\Site (Copy), E:\WWW-Site-Dev McDeveloper
  • We have an application that manages account on-boarding, a process that is already automated by our Active Directory team. Naturally they conflict.
  • We had, at one point, four or five different backup systems, all of which used BackupExec for some insane reason, three of which backed up the same data.
  • We managed to break a production IIS server by restoring a copy of the test database.
  • And then there’s Jenga: Enterprise Edition…

 

Jenga: Enterprise Edition – Not so fun when it needs four nines of uptime.

A satirical (but only just) rendering of one of our applications’ design, a pattern I call “The Spider Web”

What you are looking at is my humorous attempt to scribble out a satirical sketch of one of our line-of-business applications, which managed to actually turn out pretty accurate. The Jenga application is so named because all the pieces are interconnected in ways that turn the prospect of upgrading any of it into the project of upgrading all of it. Ready? ’Ere we go!

It’s built around a core written in a language that we have not had any on-staff expertise in for the better part of ten years. In order to provide the functionality the business needed as the application aged, the developers wrote new “modules” in other languages that essentially just call APIs or exposed services and then bolted them on. The database is relatively small, around 6 TB, but almost 90% of it is static read-only data that we cannot separate out, which drastically reduces what our DBA and I can do in terms of recovery, backup, replication and performance optimization. There are no truly separate development or testing environments, so we use snapshot copies to expose what appear to be “atomic” copies of the production data (which contains PII!) on two or three other servers so our developers can validate application operations against it. We used to do this with manual fricking database restores, which was god damned expensive in terms of time and storage. There are no fewer than eight database servers involved, but the application cannot be distributed or set up in some kind of multi-master deployment with convergence, so staff at remote sites suffer abysmal performance if anything resembling contention happens on their shared last-mile connections. The “service accounts” are literally user accounts that the developers use to RDP to the servers, start the application’s GUI, and then enable the application’s various services by interacting with the above-mentioned GUI (any hiccup in the RDP session and *poof* there goes that service). The public-facing web server directly queries the production database. The internally consumed pieces of the application and the externally consumed pieces are co-mingled, meaning an outage anywhere is an outage everywhere. It also means we cannot segment the application into public-facing and internal pieces. The client requires a hard-coded drive map to run, since application upgrades are handled internally with copy jobs that essentially replace all the local .DLLs on a workstation when new ones are detected. And last but not least, it runs on an EOL version of MSSQL.

Whew. That was a lot. Sorry about that. Despite the fact that a whole department pretty much lives or dies by this application’s continued functionality, our devs have not made much progress in re-architecting and modernizing it. This really is not their fault, but it does not change the fact that my team has an increasingly hard time keeping this thing running in a satisfactory manner.

 

Operations: The Digital Custodian Team.

In the middle of a brainstorming session where we were trying to figure out how to move Jenga to a new virtualization infrastructure, all on a weekend when I will be traveling, in order to squeeze the outage into the only period in the next two months that was not going to be unduly disruptive, I began to feel like my team was getting screwed. They have more developers supporting this application than we have in our whole operations team, and it is on us to figure out how to move Jenga without losing any blocks or having any lengthy service windows? What are those guys actually working on over there? Why am I trying to figure out which missing .DLL from .NET 1.0 needs to be imported onto the new IIS 8.5 web servers so some obscure service that no one really understands runs in a supported environment? Why does operations own the life-cycle management? Why aren’t the developers updating and re-writing code to reflect the underlying environmental and API changes each time a new server OS is released with a new set of libraries? Why are our business expectations for application reliability so wildly out of sync with what the architecture can actually deliver? Just what in the hell is going on here!

Honestly. I don’t know but it sucks. It sucks for the customers, it sucks for the devs but mostly I feel like it sucks for my team because we have to support four other line-of-business applications. We own the forest right? So when a particular tree catches on fire they call us to figure out what to do. No one mentions that we probably should expect trees wrapped in paraffin wax and then doused in diesel fuel to catch on fire. When we point out that tending trees in this manner probably won’t deliver the best results if you want something other than a bonfire we get met with a vague shrug.

Is this how it works? Your team of rockstar, “creative-type”, code-poets whip up some kind of amazing business application, celebrate and then hand it off to operations where we have to figure out how to keep it alive as the platform and code base age into senility for the next 20 years? I mean who owns the on-call phone for all these applications… hint: it’s not the dev team.

I understand that sometimes messes happen… just why does it feel like we are the only ones cleaning it up?

 

You’re not my Supervisor! Organizational Structure and Silos!

Bureaucratium ad infinitum.

 

At first blush I was going to blame my favorite patsy, Process Improvement and the insipid industry around it, for this current state of affairs, but after some thought I think the real answer here is something much simpler: the dev team and my team don’t work for the same person. Not even close. If we play a little game of “trace the organizational chart”, we have five layers of management before we reach a position with direct reports that eventually lead to both teams. Each one of those layers is a person – with their own concerns, motivations, proclivities and spin they put on any given situation. The developers and operations team (“dudes that work”), more or less, agree that the design of the Jenga application is Not a Good Thing (TM). But as each team gets told to move in a certain direction by each layer of management, our efforts and goals diverge. No amount of fuzzy-wuzzy DevOps or new-fangled Agile Standup Kanban Continuous Integration Gamification Buzzword Compliant bullshit is ever going to change that. Nothing makes “enemies” out of friends faster than two (or three or four) managers maneuvering for leverage and dragging their teams along with them. I cannot help but wonder what our culture would be like if the lead devs sat right next to me and we established project teams out of our combined pool of developer and operations talent as individual departments put forth work. What would things be like if our developers were not chained to some stupid line-of-business application from the late ’80s, toiling away to polish a turd and implement feature requests like some kind of modern Promethean myth? What would things be like if our operations team was not constantly trying to figure out how to make old crap run while our budgets and staff are whittled away, snatching victory from defeat time and time again only to watch the cycle of mistakes repeat itself again and again like some kind of Sisyphean dystopia with cubicles? What if we could sit down together and, I dunno… fix things?

Sorry there are no great conclusions or flashes of prophetic insight here, I am just as uninformed as the rest of the masses, but I cannot help but think, maybe, maybe we have too many chefs in the kitchen arguing about the menu. But then again, what do I know? I’m just the custodian.

Until next time, stay frosty.

One Year of Solitude: My Learning Experience as a Lead

It has been a little over a year since I stepped into a role as a technical lead and I thought this might be a good time to reflect on some of the lessons I have learned as I transition from being focused entirely on technical problems to trying to understand how those technical pieces fit into a larger picture.

 

Tech is easy. People are hard. And I have no idea how to deal with them.

It is hard to overstate this. People are really, really difficult to deal with compared to technology, and I have so much to learn about this piece of the sysadmin craft. I do not necessarily mean people are difficult in the sense that they are oppositional or hard to work with (although often they are), just that team dynamics are very complicated and the people composing your team have a huge spread in terms of experience, skills, motivations, personalities and goals. These underlying “attributes” are not static either; they change based on the day, the mood and the project, which makes identifying them, understanding them and planning around them even harder. Awareness of this underlying milieu composing your team members, and thus your team, is paramount to your project’s success.

All I can say is that I have just begun to develop an awareness of these “attributes” and am just getting the basics of recognizing different communication styles (person- and instance-dependent). I can just begin to tell whose motivations align with mine and whose do not. In hunting we call this the difference between “looking” and “seeing”. It takes a lot of practice to truly “see”, especially if, like me, you are not that socially adept.

My homework in this category is to build an RPG-like “character sheet” for each team member, myself included,  and think about what their “attributes” are and where those attributes are strengths and where they can be weaknesses.

 

Everyone will hate you. Not really. But kinda yes.

One of the hardest parts of being a team lead is that you are now “in charge” of technical projects with a project team made up of many different members who are not within your direct “chain of command” (at least this is how it works in my world). This means you own the responsibility for the project, but any authority you have is granted to you by a manager somewhere higher up the byzantine ladder of bureaucracy. Nominally, this authority allows you to assign and direct work directly related to the project, but in practice this authority is entirely discretionary. You can ask team member A to work on item Z, but it is really up to her and her direct supervisor whether that is what she is going to do. In the hierarchical, authority-based culture and process-driven business world most of us work in, this means you need to be exceedingly careful about whose toes you step on. Authority on paper is one thing, authority in practice is entirely another.

 

Mo’ People, Mo’ Problems

My handful of projects have thus far been composed of team members who kind of fall into these rough archetypes.

A portion of the team will be hesitant to take up the project and the work you are asking them to do since you are not, strictly speaking, their supervisor. They will passively help the project along, and frequently you will be required to meet directly with them and/or their supervisor to make sure they are “cleared” for the work you assigned them and to make sure they feel OK about doing it. These guys want to be helpful but they don’t want to work beyond what their supervisor has designated. Get them “cleared” and make sure they feel safe doing the work and you have made a lot of progress.

Another portion of the team will be outright hostile. Either their goals or motivations do not align with the project, or even worse, their supervisor’s goals or motivations do not align with the project but someone higher up leaned on them and so they are playing along. This is tough. The best you can hope for here is to move these folks from actively resisting to passively resisting. They might be “dead weight” but at least they aren’t actively trying to slow things down any more. I don’t have much of a working strategy here – an appeal to authority is rarely effective. Authority does not want to be bothered by your little squabbles, and arguably it has already failed, because the chain of command can make someone play along but it cannot make them play nice. I try to tailor my communication style to whatever I am picking up from these team members (see the poorly named Dealing with People You Can’t Stand), do my best to include them (trying to end-run them makes things ten times worse) and inoculate the team against their toxicity. I am a fan of saying I deal with problems and not complaints, because problems can actually be solved, but a lot of times these folks just want to complain. Give them a soap box so they can get it out of their system and you can move on and get work done, but don’t let them stand on it for too long.

Another group will be unengaged. These poor souls were probably assigned to the project because their supervisor had to put someone on it. A lot of times the project will be outside their normal technical area of operations, the project will only marginally affect them, or both. They will passively assist where they can. The best strategy I have found here is to be concise, do your best not to waste their time, and use their experience and knowledge of the surrounding business processes and people as much as you can. These guys can generate some great ideas or see problems that you would never otherwise see. You just have to find a way to engage them.

The last group will be actively engaged and strongly motivated to see the project succeed. These folks will be doing the heavy lifting and 90% of the actual technical work required to accomplish the project. You have to be careful not to let these guys lean too hard on the other team members out of frustration, and you have to not overly rely on them or burn them out; otherwise you will be really screwed, since they are the only people truly putting in the nuts-and-bolts work required for the project’s success.

A quick aside, if you do not have enough people in this last group the project is doomed to failure. There is no way a project composed mostly of people actively resisting its stated goals will succeed, at least not under my junior leadership.

Dysfunctional? Yes. But all teams are dysfunctional in certain ways and at certain times. Understanding and adapting to the nature of your team’s dysfunction lets you mitigate it and maybe, just maybe, help move it towards a healthier place.

Until next time, good luck!

Budget Cuts and Consolidation: Taking it to the Danger Zone

For those of you that do not know, Alaska is kind of like the 3rd world of the United States in that we have a semi-exploitative love/hate economic relationship with a single industry . . . petroleum. Why does this matter? It matters because two years ago oil was $120 a barrel and now it is floating between $40 and $50. For those of us in public service, or in private support services that contract with government and municipal agencies, it means our budgets just shrank by 60%. The Legislature is currently struggling to balance a budget that runs an annual 3.5 to 4 billion dollar deficit, a pretty difficult task if your only revenue stream is oil.

Regardless of where you work and who you work for in Alaska, this means “the times, they are a-changin'”. As budgets shrink, so do resources: staff, time, support services, training opportunities, travel, equipment refreshes and so on. Belts tighten but we still have to eat. One way to make the food go further is to consolidate. In IT, especially in these days of Everything-as-a-Service, there is more and more momentum in the business to go to centralized, standardized and consolidated service delivery (ITIL buzzword detected! +5 points).

In the last few years, I have been involved in a few of these types of projects. I am here to share a couple of observations.

 

 

Consolidation, Workload and Ops Capacity

 

Above you should find a fairly straightforward management-esque graph with made-up numbers and metrics. Workload is how much stuff you actually have to get done. This is deceptive because Workload breaks down into many different types of work: projects, break/fix, work that requires immediate action, and work that can be scheduled. But for the sake of this general 40,000 ft view, it can just be deemed the work that you and your team do.

Operational Capacity is simply you and your team’s ability to actually do that work. Again, this is deceptive because, depending on your team’s skills, personalities, culture, organizational support and morale, their Operational Capacity can look different even if the total amount of work they do in a given time stays constant. But whatever, management-esque talk can be vague.

Consolidation projects can be all over the map as well: combining disparate systems that have the same business function, eliminating duplicate systems and/or services, centralizing services, or even something as disruptive as combining business units and teams. Consolidation projects generally require standardization as a prerequisite; how else would you consolidate? The technical piece here is generally the smallest: People, Process, Technology, right?

And from that technical standpoint, especially one from a team somewhere along that Workload vs. Operational Capacity timeline, consolidation and standardization look very, very different.

Standardization brings no appreciable long-term Workload increase or reduction. What it does bring is increased capture of business value for the work you already perform: with wider use of the same Process and Technology, a given unit of work goes further. For example, if it takes 10 hours to patch 200 workstations, it may only take 10.2 hours to patch 2,000 workstations.

Consolidation brings a long-term Workload increase with a corresponding increase in Operational Capacity, due to the addition of new resources or the re-allocation of existing ones (that’s the dotted orange line on the graph). For example, with widespread adoption of the same Process and Technology, you can take the 10 hours my team spends on patching workstations and combine it with the 10 hours another team spends on patching workstations. You just bought yourself some Operational Capacity, either in terms of having twice as many people deal with the patching, or because it turns out it only takes 10 hours to patch both teams’ workstations and you freed up 10 hours’ worth of labor that can go to something else. There is still more work than before, but that increased Workload is more than offset by increased Operational Capacity.

Both standardization and consolidation projects increase short-term Workload while the project is ongoing (see Spring of ’15 in the graph). They are often triggered by external events: mergers, management decisions, or simply proactive planning in a time of shrinking budgets. In this example the trigger is a reduction in staff, which obviously reduces the team’s Operational Capacity. The ability to remain proactive at both the strategic and the tactical level is reduced. In fact, we are just barely able to get work done. BUT we have (or had) enough surplus capacity to continue to remain proactive even while taking on more projects, hopefully projects that will either reduce our Workload or increase our Operational Capacity or both, because things are thin right now.

Boom! Things get worse. Workload increases a few months later. Maybe another position was cut, maybe an unanticipated project or requirement from on high came down to your team. Now you are in, wait for it… THE DANGER ZONE! You cannot get all the work done inside the required time frame with what you have. This is a bad, bad, bad place to be for too long. You have to put projects on hold, put maintenance on hold or let the ticket queues grow. Your team works harder and longer, and burns out. A steady hand, a calm demeanor and a bit of healthy irreverence are really important here. Your team needs to pick its projects very, very carefully since you are no longer in a position to complete them all. The ones you do complete damn well better lower Workload significantly, increase your Operational Capacity or, hopefully, do both. Mistakes here cost a lot more than they did a year ago.

The problem here is that technical staff do not generally prioritize their projects. Their business leaders do. And in times when budgets are evaporating, priorities seem to settle around a single thing: cost savings. This makes obvious sense, but the danger is that there is no reason the project with the most significant cost savings will also happen to be the project that helps your team decrease its Workload and/or increase its Operational Capacity. I am not saying it won’t happen, just that there is no guarantee that it will. So your team is falling apart, you just completed a project that saves the whole business rap-star dollars’ worth of money, and you have not done anything to move your team out of THE DANGER ZONE.

In summation, projects that increase your Operational Capacity and/or reduce your Workload have significant long-term savings in terms of more efficient allocation of resources, but the projects that will get priority are those with immediate short-term savings in terms of dollars and cents.

Then a critical team member finds better work. Then it’s over. No more projects with cost savings, no more projects at all. All that maintenance that was put off, all the business leaders that tolerated the “temporary” increase in response time for ticket resolution, all the “I really should verify our backups via simulated recovery” kind of tasks – all those salmon come home to spawn. Your team is in full-blown reactive mode. They spend all their time putting out fires. You are just surviving.

Moral of the story? If you go to THE DANGER ZONE, don’t stay too long and make sure you have a plan to get your team out.