First time driver: A critical fix for Firefox 4.0Beta11

This happened a while ago, but I only now found time to blog about it. During preparation of Firefox 4.0Beta11, I found a critical accessibility issue while the second build candidate was already in testing. The issue, documented in Mozilla bug 631160, was a regression caused by a combination of bugs 570710 and 630001. The first introduced the problem, the second uncovered it.

The nightly build that beta 11 was based on arrived late in my day, so I had already gone off-line and didn’t see the problem until build candidates for beta 11 had already been started. When I came to my desk early the next morning and downloaded the nightly build, I immediately noticed the problem on sites like the Google homepage, where NVDA suddenly no longer saw the search textfield. On other sites, images, separators and other parts were also missing.

Had we shipped Beta11 with this bug, we would have left users with a largely unusable beta release and would have lost valuable feedback.

My first action item was to talk to Alexander Surkov, the accessibility module owner, about this problem. He then filed the bug since he immediately found what was wrong.

An hour later, he gave me a patch to try. I started a local build based off of the current code base, with the patch applied, and within 90 minutes, was able to confirm that this patch fixed the bug.

I then ran that local build, whose build configuration was very close to what comes out of Tinderbox for releases and nightlies, for the remainder of that day to make sure the patch didn’t introduce any negative side effects. Also, the patch had to get proper reviews and approvals.

In parallel, I wrote an e-mail to the release drivers and QA mailing lists explaining the problem and its severity, and asked for permission to take this patch on the beta11 release branch and respin beta11 with a third build candidate. Luckily, this was a very contained fix that didn’t invalidate any of the other QA testing that had already gone on. I assured juanb of this as part of the process. In addition, the patch came with unit tests that now properly cover this area of the code, so the likelihood of this regression being reintroduced is now minimal.

Fortunately, we could also take another obvious fix, a crash fix, as a ride-along. Without it, we would have received misleading crash data for a crash that was already fixed.

After the release driver crew had evaluated my proposal, I got approval to land the fixes on the release branch for 4.0Beta11.

I checked in the code myself and pushed to the Mercurial repository, in effect taking responsibility for keeping the tree green or breaking things.

Well, as you all have seen over the past weeks, the tree didn’t break, and you all got Beta 11 early the week after, with NVDA able to read Google perfectly.

As a post-mortem, I then explained how it happened that this bug slipped past me initially.

All in all, this was a very good team effort: From finding the problem, analyzing it and then making sure we could deliver the fix to users at the lowest possible risk, it was a good experience taking charge and driving this forward, working with people from different teams such as a11y, QA, release engineering etc. to get the fixes landed in an orderly manner and without having to re-test everything that had been done already. The delay was minimal, but the gain was extremely high!

Starting an Accessible Name refactor, need your help in testing!

For those of you following the Firefox/Gecko platform development, or for those interested in helping out, this is a call for participation. If you’re not afraid to get your hands dirty a bit and would like to help the Firefox accessibility team, now would be a good time to get involved!

The problem: The code that calculates the names for any created accessibles has grown over time and become largely unmaintainable. New features such as support for the aria-label property require code duplication for HTML and XUL, and in general the code has many stylistic un-niceties.

So, our team has started a series of code cleanups and refactorings to get the code into better shape and make it more maintainable.

As with any refactor, the result should be identical in output with what we started out from. However, as those of you familiar with software development know, the risk of regressions is there and should not be discounted.

While we do have test cases for many of these instances already, there may still be cases we’ve missed. So any help we get from the community will help make sure that the refactor goes smoothly, but also help fill in any possible gaps in our testcases.

So, how can you help? By downloading and installing the latest nightly builds of Firefox 3.1 for Windows or Linux, and testing the heck out of them. Use your favorite screen reader, use your familiar web sites, use it for day-to-day surfing. The pages where you’re most likely to find differences, if any, are those you visit frequently, where you know what the output should be.

If you find something that is different from what you know, you can download the last Windows or last Linux build before the refactor, unzip it into a separate folder, and compare your findings using that build.

If you find differences you did not expect to find, you have two main choices that will get the developer team’s attention:

  1. File a bug in Bugzilla:
    • Component: Disability Access APIs
    • Version: Trunk
    • Platform: PC or whichever you use
    • OS: Windows or Linux, depending on where you found the bug.
  2. Post a message on the mozilla.dev.accessibility newsgroup (Google Groups mirror).

In any case, your bug report should contain the URL of the page you are experiencing the difference with, the expected output of the element, and the output you’re now getting. Also say whether the element is a graphic, link, heading, form field etc., and mention what screen reader you’re using. If posting to the newsgroups, it will also help to mention the operating system.
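To make sure none of these details gets forgotten, the checklist above can be turned into a small template. The following is only a sketch in Python: the class name, the field names and the sample values are hypothetical and not part of Bugzilla or any Mozilla tool.

```python
from dataclasses import dataclass


@dataclass
class NameDifferenceReport:
    """Hypothetical container for the details a report should include."""
    url: str               # page where the difference shows up
    element_type: str      # e.g. "graphic", "link", "heading", "form field"
    expected_output: str   # what the screen reader announced before
    actual_output: str     # what it announces now
    screen_reader: str     # which screen reader you are using
    operating_system: str  # especially helpful when posting to the newsgroup

    def to_text(self) -> str:
        """Render the report as plain text for a bug comment or newsgroup post."""
        lines = [
            f"URL: {self.url}",
            f"Element type: {self.element_type}",
            f"Expected output: {self.expected_output}",
            f"Actual output: {self.actual_output}",
            f"Screen reader: {self.screen_reader}",
            f"Operating system: {self.operating_system}",
        ]
        return "\n".join(lines)


# Made-up sample values, purely for illustration.
report = NameDifferenceReport(
    url="http://example.com/search",
    element_type="form field",
    expected_output="Search, edit",
    actual_output="edit",
    screen_reader="NVDA",
    operating_system="Windows",
)
print(report.to_text())
```

The resulting text can be pasted into a bug comment or a newsgroup post as is.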

How to update the nightly builds to pick up latest code changes: The built-in Check for Updates feature, if invoked from a nightly build, will always grab an update to the latest nightly build and install it for you. So, you only need to download and go through the installation process once. You can then check for updates daily and get the latest code that way.

The first build to see changes will be the October 11 build, build ID 1.9b2pre/20081011.
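If you want to check whether a given nightly already contains the refactor, you can compare the date in its build ID against the one above. Here is a minimal sketch in Python, assuming the ID is shaped like the one quoted here, with a YYYYMMDD date after the slash. The function names are made up for illustration, and real build IDs may carry extra trailing digits, which the sketch ignores.

```python
from datetime import date

# First nightly with the refactor, taken from the build ID quoted above.
FIRST_REFACTOR_BUILD = date(2008, 10, 11)


def build_date(build_id: str) -> date:
    """Extract the date from an ID shaped like '1.9b2pre/20081011'.

    Assumes the part after the last '/' starts with YYYYMMDD; any
    trailing digits are ignored.
    """
    stamp = build_id.rsplit("/", 1)[-1][:8]
    if len(stamp) != 8 or not stamp.isdigit():
        raise ValueError(f"unrecognized build ID: {build_id!r}")
    return date(int(stamp[:4]), int(stamp[4:6]), int(stamp[6:8]))


def includes_refactor(build_id: str) -> bool:
    """Return True if the build is dated on or after the first refactor build."""
    return build_date(build_id) >= FIRST_REFACTOR_BUILD
```

For example, includes_refactor("1.9b2pre/20081010") is false, so that build would be a candidate for the "before" comparison.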

Working with different profiles: If you don’t want to put your regular profile into the hands of the Firefox nightly builds, you can start Firefox.exe or the ./firefox executable with the -p option to bring up Profile Manager. You can then create a new profile and start Firefox with that profile. That way, your Firefox 3.0.x profile won’t be touched by the 3.1 nightlies if you don’t want it to be. I’ve found, however, that the nightlies are very stable already, and I often flip back and forth between 3.0.x and 3.1 builds on the same profile without problems. The one thing that most certainly would happen is that some extensions may not work in 3.1 yet.
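If you start nightlies often, the Profile Manager launch can be wrapped in a small script. This is only a sketch in Python: the helper names are hypothetical, the executable names are the ones mentioned above, and the option is passed exactly as written above (-p). Adjust the path to wherever you unpacked the nightly.

```python
import subprocess
import sys
from typing import List


def default_firefox_executable() -> str:
    """Pick the executable name mentioned above for the current platform."""
    return "firefox.exe" if sys.platform.startswith("win") else "./firefox"


def profile_manager_command(executable: str) -> List[str]:
    """Build the command line that brings up Profile Manager."""
    # "-p" is the option as given in the text above.
    return [executable, "-p"]


def launch_profile_manager(executable: str) -> int:
    """Start Firefox with Profile Manager and return its exit code."""
    return subprocess.run(profile_manager_command(executable)).returncode
```

Calling launch_profile_manager(default_firefox_executable()) from the folder containing the nightly build should bring up Profile Manager, where you can create the separate profile.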

I’d like to thank all of you in advance who decide to participate in this effort and help everyone who relies on Firefox accessibility by testing out the code refactor. You can make a real difference because we obviously can’t test all of the web pages out there, and yours may just be the one we might miss out on.

Progress on automated testing for the accessibility module

Today, I checked in two changes that allow the unit tests we’ve developed for the accessibility module so far to run on what we call a staging server. A staging server is a server that simulates production conditions, but isn’t the live thing just yet. It allows us to test new features in build, testing, web sites etc., in close-to-real-life conditions before finally pushing them to production.

Obviously, getting these tests running on the production tinderboxes, so we immediately see when we break something, is the next step. But before that can be done, we need to find a solution for bug 441974. Basically, the tests pass when each test file is run stand-alone, but some of them fail randomly when all files are run in one big batch. I made some good connections at the Mozilla summit last week, though, and as soon as we get these passing we’ll start running those tests. They’ll then run along with the many other unit tests we have for Firefox and the Mozilla platform.

I’d like to thank our intern Lukas Blakk and a bunch of other members of the QA and build teams for helping me get these configs for buildbot right!