Developer Productivity is Not a Helpful Label

As software developers, we love to label things. In fact, we believe that labelling things well is vital. This approach works for things we need to define and share in software – variables, classes, components, database tables, etc. They need to be well named so that they convey meaning to our future selves and to others when we share and recall ideas about our code and architecture.

However, software engineers often don’t know when to stop labelling. Properly abstract, nebulous ideas – an absence of an idea – shouldn’t be called anything at all. “Developer Productivity” is, I believe, among those things that are actively unhelpful to label.

There are two extreme ways we can view “Developer Productivity” as a concept:

It’s a useful term for understanding how a software development organisation’s component parts can work more effectively and efficiently together to create higher-quality software.
It’s an unhelpful series of measures that try to solve an imaginary problem, causing division and confusion and encouraging flawed thinking by treating people as perfect cogs in a perfect machine.

We can take a position anywhere along that continuum between helpful and unhelpful.

The industry is rightly sceptical and unsure about “Developer Productivity” as a concept. Even the excellent and well-considered paper The SPACE of Developer Productivity (Nicole Forsgren et al.) opens with:

Developer productivity has been studied extensively. Unfortunately, after decades of research and practical development experience, knowing how to measure productivity or even define developer productivity has remained elusive, while myths about the topic are common. Far too often teams or managers attempt to measure developer productivity with simple metrics, attempting to capture it all with “one metric that matters.”

And so it goes on, also from this paper:

[Figure: SPACE metrics example – Satisfaction & Well-being, Performance, Activity, Communication & Collaboration, Efficiency & Flow]

It shows that caution is required to interpret any measure effectively.

While there are some truly hard measures we can use (e.g. the number of hops between teams before a ticket gets solved is a good one), for my liking there are too many based on opinion polls (e.g. how satisfied SREs are with the IM process) and too many non-deterministic measures such as “number of issues caught by the monitoring systems”.
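As a rough illustration of why the hop count qualifies as a hard measure, here is a minimal sketch of how it could be derived from a ticket's assignment history. The `Assignment` record and its fields are assumptions made for this example, not the schema of any particular ticketing system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    """One entry in a ticket's assignment history (hypothetical schema)."""
    ticket_id: str
    team: str


def count_hops(history: list[Assignment]) -> dict[str, int]:
    """Count team-to-team handoffs per ticket.

    `history` is assumed to be in chronological order. A hop is counted
    each time a ticket moves to a team other than the one holding it.
    """
    hops: dict[str, int] = {}
    last_team: dict[str, str] = {}
    for entry in history:
        previous = last_team.get(entry.ticket_id)
        if previous is None:
            hops[entry.ticket_id] = 0  # first assignment is not a hop
        elif previous != entry.team:
            hops[entry.ticket_id] += 1
        last_team[entry.ticket_id] = entry.team
    return hops


history = [
    Assignment("T-1", "support"),
    Assignment("T-1", "platform"),
    Assignment("T-1", "platform"),  # re-assignment within a team is not a hop
    Assignment("T-1", "payments"),
    Assignment("T-2", "support"),
]
print(count_hops(history))  # {'T-1': 2, 'T-2': 0}
```

The same history always yields the same number, with no opinion involved, which is what separates this kind of measure from a satisfaction poll.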

2nd Degree System Stability Metrics

In my opinion, if you must measure developer productivity – and I think a certain amount of it can be helpful – then focus on actual second-degree data. By second-degree data I mean the first derivative of a change, e.g. an incident that was raised because the performance of the website was deemed to be an issue for users. By that point, some analysis has already been done to determine whether the signal is spurious or an actual problem. If you use not the raw data itself but a side-effect of the raw data (e.g. an incident being raised in the ticketing system), then you’re more likely to iron out false positives and get an accurate picture of what’s going on.
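To make the distinction concrete, here is a minimal sketch that reports both the raw alert count and the number of distinct incidents raised from those alerts, per week and per system. The `Alert` record, its `incident_id` field and the sample data are all assumptions made for this example; real monitoring and ticketing tools will expose this differently.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Alert:
    """A raw monitoring alert (hypothetical schema)."""
    fired_on: date
    system: str
    incident_id: Optional[str]  # None if nobody judged it worth an incident


def weekly_counts(alerts: list[Alert]) -> dict[tuple, tuple[int, int]]:
    """Map (ISO year, ISO week, system) to (raw alerts, distinct incidents)."""
    raw: Counter = Counter()
    incident_ids: defaultdict = defaultdict(set)
    for alert in alerts:
        iso = alert.fired_on.isocalendar()
        key = (iso[0], iso[1], alert.system)
        raw[key] += 1
        if alert.incident_id is not None:
            incident_ids[key].add(alert.incident_id)
    return {key: (raw[key], len(incident_ids[key])) for key in raw}


alerts = [
    Alert(date(2024, 1, 1), "website", "INC-1"),
    Alert(date(2024, 1, 1), "website", "INC-1"),  # same incident, second alert
    Alert(date(2024, 1, 2), "website", None),     # judged spurious
    Alert(date(2024, 1, 3), "website", None),     # judged spurious
    Alert(date(2024, 1, 9), "website", "INC-2"),
]
print(weekly_counts(alerts))
# {(2024, 1, 'website'): (4, 1), (2024, 2, 'website'): (1, 1)}
```

In this made-up data the first week shows four raw alerts but only one incident. The second number is the one that already has a human judgement baked into it.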

I would apply this method to all the data you want to collect and report on under the banner of “Developer Productivity”.

Finally, I’m not sure how useful the term “Developer Productivity” is when it comes to incident reporting anyway. Surely this is edging into “System Stability” territory? Developers aren’t responsible for what happens with the software they create. Users and systems stress the software and produce outcomes which may or may not be intended by the developers.

Let us focus instead on system stability and metrics that map derived effects of changes to those systems. Let us remove the focus from developers.
