Review of the WebVisum Firefox extension

Today, a post announcing the WebVisum Firefox extension was posted to the newsgroup. The things talked about in this post and on the WebVisum homepage almost sound too good to be true. Among the features are:

  • Ability to tag graphics, form fields, links, and other page elements. While some or all of these features have been available in some screen readers already, this feature is unique in that it works across platforms. It also sends the data back to the WebVisum web service so other members of the community can benefit from the labels someone provided.
  • Optical Character Recognition (OCR) to try and identify those images that absolutely won’t tell us through their SRC what they’re all about.
  • Visual page enhancements such as a high-contrast profile.
  • Suppression of automatic page refreshes or Flash content
  • And most astonishingly: CAPTCHA solving!

A few days ago, I was approached by the WebVisum development team if I would consider beta testing their extension. So, I had a bit of a head start with this tool, and I was very surprised when I started testing some of the features.

The tests

From my main screen reader, I already knew the capability to label graphics or HTML form elements that have missing alt text or labels. Instead of using those techniques, I applied a few labels to the main navigation images on the CakeWalk homepage using WebVisum. After labelling the graphics from my Windows computer, I fired up my Linux box, installed the extension there and surfed to Orca immediately picked up the labels I had given the graphics and used them as the link text.

I then went ahead and labelled the Search combobox on the German Heise Newsticker site. Again, after visiting the page from the other computer, the label for the combobox was read aloud.

And then I actually tried a CAPTCHA. I chose as my first target since I know they also offer an audio CAPTCHA. Of course this is not a 100% satisfactory solution because deaf-blind people are still left dead in the water with this, but it gave me a good reference to compare the results. I went into the new account creation process on digg, and when it came to the CAPTCHA, I let WebVisum do its magic. Within less than 30 seconds, I got a result back, placed on my clipboard by the extension, ready to paste in. I compared it to what the audio CAPTCHA told me, and the results matched!

I repeated this step two more times because I had first chosen a user name that was already taken, and then goofed up something else in the form, and each time, the result was correct. Totally stunning!

I tried the same on Technorati who also offer an audio CAPTCHA, and got the same results: The CAPTCHA was correctly resolved.

As my third target, I chose MozillaZine, who, despite a couple of attempts on my part, still do not offer an audio CAPTCHA for registration or sending a reply to a forum without being logged in. Without this fall-back mechanism, this is a real-world scenario that visually impaired people are being faced with on an almost daily basis. And I’ll be darned, it worked out! I could register with the MozillaZine forums without any sighted assistance.

The conclusion

There are actually a couple of conclusions, concerns and questions that this extension raises.

The educational aspect

So here we are, having been trying to educate web developers all over the world to use W3C accessibility authoring guidelines, comply with section 508 and what not, and now an accessibility comes along that allows for labelling controls, providing alternative text for graphics, and even share this with the community. So did we do all this educational endeavor invain?

The answer can only be a firm and resolute: “No, we didn’t!” While this extension allows to correct for obvious mistakes like a missing alt attribute on an image, it cannot correct all the requirements there are to meet for section 508 compliance. And it should not! On the contrary: All mistakes one has to correct should be counted against a ranking on a “Wall of shame” kind of statistic that depicts the sites requiring the most corrections. Similarly to the Firefox “Report a broken website feature”, that in Firefox 3.0 also has a “Disability Access” component that allows to report an inaccessible web site, this data should be used to advertise for better accessibility in a future relaunch of that particular site.

Furthermore, there are so many websites that are part of the so-called web 2.0 that are not publically-owned or from a big company, but which are just as compelling to participate. These can usually either not be bothered or cannot financially make it to be 100% sec 508 compliant. Having the possibility to enhance these pages will make the web 2.0 a much more compelling place than it already is in the future.

The CAPTCHA solver

This is probably the most controversial feature. The fact alone that WebVisum is able to solve the CAPTCHAs will probably send shivers up and down the spine of many web developers, website administrators, blog owners etc. that have to fight spam every day. The fact that WebVisum can do it probably means that spambots will sooner or later also be able to do it. Even worse, some could argue that the WebVisum service may be abused by spammers to get CAPTCHA resolution for free.

The WebVisum developers assured me that they’ll make sure that only real people will be able to use their service. Furthermore, the number of CAPTCHAs that can be solved per day per site is limited.

While it is correct to advertise for alternatives to visual CAPTCHAs, the reality is that audio CAPTCHAs, which are the most common alternative, do not allow every person to use them. I already mentioned deaf-blind surfers. But also people who have a hearing impairment and have difficulty deciphering the distorted audio have trouble with this alternative. The CAPTCHA resolution feature allows to solve the problems of these people and also anyone who has trouble reading or hearing the text who is not visually impaired.

Also, this allows access to those private sites and blogs that are under no pressure government- or image-wise to implement an audio CAPTCHA. It definitely lowers the barrier for participation in the web 2.0 world!

Aside from all that, CAPTCHAs only offer a false sense of security. There are much more effective ways of fighting spam than imposing these things upon everybody. My blog, for example, has no CAPTCHA entry for commenting, and still my spam fighting measures have kept this blog clean for as long as it has been in existence. But the sad reality is that CAPTCHAs are an “evil” we currently have to cope with, and WebVisum certainly helps a lot in circumventing these artificial barriers.

My hope is that the WebVisum folks manage to keep their user base spambot-free and that there won’t be any other way to abuse the feature for unsolicited activities.

A few wishes for the future

I see for this extension the potential to become much more than “just” a web helper for the visually impaired. For example, I can imagine this being enhanced to allow hearing-enabled people to provide a textual transcription of an audio clip for deaf surfers, sighted people giving a textual description of not just an image, but a video clip or the like, and other similar cross-impairment possibilities. After all, any hearing-enabled blind could provide such textual transcription of an audio clip for a sighted deaf person.

Aside from this larger-scale vision of mine, a few more basic features such as an undo feature that allows to revoke a server-submitted enhancement will hopefully make its way into near-future versions of the extension.

So: To be able to make up your mind for yourself, go check out the website and extension at

Show Comments