This happened a while ago, but I have only now found time to blog about it. During preparation of Firefox 4.0Beta11, I found a critical accessibility issue while the second build candidate was already in testing. The issue, documented in Mozilla bug 631160, was a regression caused by a combination of bugs 570710 and 630001. The first introduced the problem, the second uncovered it.
The nightly build that beta 11 was based on arrived late in my day, when I had already gone offline, so I didn’t see it until the build candidates for beta 11 had already been started. When I came to my desk early the next morning and downloaded the nightly build, I immediately noticed the problem on sites like the Google homepage, where NVDA would suddenly no longer see the search textfield. On other sites, images, separators and other parts were also missing.
Had we shipped Beta11 with this bug, we would have left users with a largely unusable beta release and would have lost valuable feedback.
My first action item was to talk to Alexander Surkov, the accessibility module owner, about this problem. He then filed the bug since he immediately found what was wrong.
An hour later, he gave me a patch to try. I started a local build based off of the current code base, with the patch applied, and within 90 minutes, was able to confirm that this patch fixed the bug.
I then ran that local build, whose build configuration was very close to what comes out of Tinderbox for releases and nightlies, for the remainder of that day to make sure the patch didn’t introduce any negative side effects. Also, the patch had to get proper reviews and approvals.
In parallel, I wrote an e-mail to the release drivers and QA mailing lists explaining the problem and its severity, and asked for permission to take this patch on the beta11 release branch and respin beta11 with a third build candidate. Luckily, this was a very contained fix that didn’t invalidate any of the other QA testing that had already gone on. I assured juanb of this as part of the process. In addition, the patch had unit tests that now properly covered this area of the code, so the likelihood of this regression being reintroduced is now minimal.
Fortunately, we could also take another obvious fix, a crash fix, as a ride-along. Without it, the beta would have given us misleading crash data about a crash that had already been fixed.
After the release driver crew had evaluated my proposal, I got approval to land the fixes on the release branch for 4.0Beta11.
I checked in the code myself and pushed to the Mercurial repository, in effect taking responsibility for keeping the tree green or breaking things.
Well, as you all have seen over the past weeks, the tree didn’t break, and you all got Beta 11 early the week after, with NVDA able to read Google perfectly.
As a post-mortem, I then explained how this bug had initially slipped past me.
All in all, this was a very good team effort, from finding the problem and analyzing it to making sure we could deliver the fix to users at the lowest possible risk. It was a good experience taking charge and driving this forward, working with people from different teams such as a11y, QA and release engineering to get the fixes landed in an orderly manner, without having to re-test everything that had already been done. The delay was minimal, but the gain was extremely high!