Na na na na na na na na Bug Slot!

<Sing title to the melody of the Batman theme song>

At the company I just left we had a weekly bugslot to stay on top of defects: 2 hours every week during which all developers fixed bugs at the same time. It wasn’t a perfect solution, but it solved a number of pressing problems:

  • Discouragingly high number of bugs – accumulated over a long time
  • No overview / documentation / communication of what was fixed, what was deployed, and what was left to fester
    • We had no bugtracker. (The developers never agreed on which one to use.)
    • The same bugs appeared again and again, their symptoms often fixed by different people, so that no one recognized patterns
  • No order given in which to fix bugs
    • The decision whether and what to fix fell to each individual dev. And as bugfixing took time away from the sprint and the team commitment…
  • About 30% of all developers were new hires, not yet familiar with the systems
    • The systems weren’t documented, so the new devs had no chance to fix bugs on their own. It was left to the “old” ones, who groaned under the additional workload

The situation in my new workplace is similar enough that I suggested a bugslot as well. It has a good chance of doing the trick again.

How exactly does a bugslot work?

  • 2 hours each week for fixing (non-urgent) bugs
    • At the same time every week
    • All developers at the same time
    • Pick a weekday and time when most developers will be there. Lunchtime’s holy, though 😉
  • Before: Order the bugs. What needs fixing first is first in line
    • Can be done by the product owner, a stakeholder such as customer support, …
    • Publish the list somewhere highly visible
  • Kick-off meeting: Assemble everyone in one room for the developers to pick bugs from the list
    • Opportunity to exchange information about where to look and words of caution
    • Developers can easily pair, especially new hires
    • Only one or two bugs per dev. Whoever finishes early comes back to the list to pick another one.
    • If a high-prio bug doesn’t get picked, note down why for further inspection
  • Right after bugslot: Collect feedback on the bugs
    • Fixed? What version? When will it go live? (This part is much easier if you deploy regularly.)
    • If it wasn’t fixed, why not? Too big for bug slot? Time ran out? What new information can you add?
    • Collect in the bug tracker, via mail, Google form, paper, …
  • Inform all interested stakeholders about what’s fixed, when it will be live and what happens with the unfixed bugs
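To make the kick-off rules concrete, here is a minimal sketch of the ordered list and the "one or two bugs per dev" rule. It is only an illustration: the names `Bug`, `BugList`, `rank` and `MAX_OPEN_BUGS_PER_DEV` are made up for this example, and a real team would more likely keep this in a bug tracker or on a whiteboard.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumption: "one or two bugs per dev" is modelled as at most 2 unfinished bugs.
MAX_OPEN_BUGS_PER_DEV = 2


@dataclass
class Bug:
    title: str
    rank: int                        # position in the ordered list, 1 = fix first
    picked_by: Optional[str] = None  # developer who took the bug, if any
    done: bool = False               # set once the developer reports back


class BugList:
    def __init__(self, bugs: List[Bug]) -> None:
        # What needs fixing first is first in line.
        self.bugs = sorted(bugs, key=lambda b: b.rank)

    def pick(self, dev: str, title: str) -> bool:
        """Assign an unpicked bug to dev unless dev already holds the maximum."""
        open_bugs = [b for b in self.bugs if b.picked_by == dev and not b.done]
        if len(open_bugs) >= MAX_OPEN_BUGS_PER_DEV:
            return False
        for bug in self.bugs:
            if bug.title == title and bug.picked_by is None:
                bug.picked_by = dev
                return True
        return False

    def unpicked(self) -> List[Bug]:
        """Bugs nobody took, in priority order, so you can note down why."""
        return [b for b in self.bugs if b.picked_by is None]
```

A developer who finishes early marks the bug as `done` and can then call `pick` again. Whatever `unpicked()` returns after the kick-off is the list of bugs worth a second look.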

Chaotic environments with few existing processes will benefit the most from such a structured approach.

What are the advantages?

  • Fewer interruptions and context switches for the developers! Put off everyone until bugslot.
  • Every developer can consult everyone else
  • Great for pairing and knowledge transfer
  • Stakeholders can be sure that bugs will be addressed (though not necessarily fixed) in time. Regularity is often the key to progress
  • Clear communication gives everyone a feeling of control and reliability
  • Strict timeboxing minimizes the risk of bugs eating up time and makes iterations easier to plan
  • Even if a bug doesn’t get fixed, you’ll know more afterwards than before (like a mini spike)

Any disadvantages?

  • From a “block of time” perspective it is an additional meeting
  • Whoever narrowly misses the slot has to wait a week
  • Individual developers sometimes avoid taking part. In my experience you’re still better off than without a bugslot

We can’t possibly delay for a week. We’ve got time-critical bugs!

So did we. And we addressed these urgent bugs within 24h. In my new workplace 24 hours would be much too long, so we’re currently working on a new approach, something akin to caretaker of the week.

But the majority of bugs are not time-critical and can wait for a week or two.

2 hours doesn’t sound like much…

It’s a starting point. Inspect and adapt after a few weeks.

You might be surprised how much you can achieve if everyone pitches in and is available at the same time. Write a story for bugs that are too big for the 2h-slot. They become part of the regular sprint if they are valuable enough.

What if it takes more than 2 hours to fix a bug? Won’t it spill over and take time from the sprint?

Ah, the magic of timeboxing: After 2 hours everyone stops, finished or not. If you’re not finished, you don’t commit your code. Sounds simple, is hard to do. You document everything you found out and your estimate of how much work is left and afterwards get back to your regular work.

If anyone knows before the slot that a bug will take longer than 2h, don’t schedule it for the slot! Make it a story and add it to the backlog.

(In practice it’s a judgement call: Just another 30 minutes and the nasty bug is gone? You’ll usually invest the time. But that’s 1 or 2 developers, not everyone.)

It’s stupid to fix a bug in slots of 2 hours with a week in between! I’ll forget everything I already knew.

You only work once on a bug during bugslot. If it’s not fixed afterwards, it becomes a story or is dropped. (Exception: The bug wasn’t the first one you worked on in that slot, so you ran out of time, but a full 2h would be enough. Those usually re-appear in the next bugslot.)

What about quality?

Testing is part of the 2h-slot. (Unit-tests to the rescue!)

If a bug fix is high risk, don’t schedule it for the slot! Make it a story and add it to the backlog.
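The scheduling rules from the answers above (urgent bugs don’t wait, big or risky bugs become stories, the rest goes into the slot) fit into one small decision function. This is only a sketch under assumptions: `BugReport`, `estimated_hours`, `high_risk`, `urgent` and the returned labels are hypothetical names, and the estimate is whatever rough guess someone can give before the slot.

```python
from dataclasses import dataclass

SLOT_HOURS = 2.0  # length of the weekly bugslot


@dataclass
class BugReport:
    title: str
    estimated_hours: float   # rough guess made before the slot (assumed field)
    high_risk: bool = False  # a fix that could break a lot
    urgent: bool = False     # time-critical, cannot wait for the next slot


def triage(bug: BugReport) -> str:
    """Return where a bug should go: 'urgent', 'story' or 'bugslot'."""
    if bug.urgent:
        # Time-critical bugs are handled outside the slot.
        return "urgent"
    if bug.high_risk or bug.estimated_hours > SLOT_HOURS:
        # Too big or too risky: make it a story and add it to the backlog.
        return "story"
    return "bugslot"
```

For example, `triage(BugReport("typo in footer", 0.5))` returns `"bugslot"`, while the same bug with `high_risk=True` returns `"story"`.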

Is it perfect? Will it solve ALL the problems?

Nope.

  • Not everything will get fixed. But then you can decide whether to invest the necessary time or not.
  • Sometimes people forget to report and someone needs to chase them down.

The bugslot works quite well for the time invested. I’m surprised I’ve never heard anyone describe a similar practice.

The bugslot over time

At my last employer the bugslot changed over time:

  • Beginning: Piled up bugs. Lots got fixed but still many left over after slots.
  • After half a year: All easy bugs are fixed. The ratio of harder bugs (-> stories) increases.
  • After a year: Bugslot turned into “Bugs- and Small-Stuff-Slot” – It’s not only for bugs anymore but for all kinds of small tweaks that are too small to warrant the overhead of a story, estimation, planning, …
    • In the UX team we used the same slot-process to collect “small stuff” and work on it in batch.

Do you know an alternative approach to get on top of many bugs with many new hires? Or have you got questions about this one?