What is WAI-ARIA, what does it do for me, and what not?

On March 20, 2014, the W3C finally published the WAI-ARIA standard version 1.0. After many years of development, refinement and testing, it is now a web standard.

But I am asked again and again: What is it exactly? What can it do for me as a web developer? And what can it not do?

I often find that people make assumptions about WAI-ARIA that are not correct, and using it based on such wrong assumptions often leads to sites that are less accessible than if ARIA had not been used at all.

In addition, Jared W Smith of WebAIM just yesterday wrote a very good blog post titled Accessibility Lipstick on a Usability Pig, highlighting another related problem: Even though a website may suck usability-wise, pouring WAI-ARIA sugar on it somehow forces it into compliance, but it still sucks big time.

So with all these combined, and after receiving encouragement from Jason Kiss on Twitter, I decided to write this post about what WAI-ARIA is, what it can do for you as a web developer, and what it cannot do. Or rather: When should you use it, and more importantly, when not.

I realize such articles have been written before, and these facts have also all been stressed time and again in talks by various good people in the field of web accessibility. But it is such an important topic that it is finally time for this blog to have such an all-encompassing article as well.

So without further ado, let’s jump in!

What is it?

WAI-ARIA stands for “Web Accessibility Initiative – Accessible Rich Internet Applications”. It is a set of attributes that enhance the semantics of a web site or web application so that assistive technologies, such as screen readers for the blind, can make sense of certain things that are not native to HTML. The information exposed can range from something as simple as telling a screen reader that activating a link or button just showed or hid more items, to widgets as complex as whole menu systems or hierarchical tree views.

This is achieved by applying role and state attributes to HTML 4.01 or later markup. These attributes have no bearing on layout or browser functionality, but they provide additional information for assistive technologies.

One cornerstone of WAI-ARIA is the role attribute. It tells the browser to tell the assistive technology that the HTML element used is not actually what the element name suggests, but something else. A div element, for example, may be the container for a list of auto-complete items, in which case a role of “listbox” would be appropriate. Likewise, another div that is a child of that container, and which contains a single option item, should then get a role of “option”. Two divs, but through the roles, totally different meaning. The roles are modeled after commonly used desktop application counterparts.
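
As a minimal sketch of that example (the element IDs and the suggestion text are made up for illustration), the markup might look something like this:

    <!-- Two plain divs, given "listbox" and "option" roles so that assistive
         technologies treat them as a list of auto-complete choices
         (IDs and text are made up for illustration) -->
    <div role="listbox" id="autocomplete-results">
      <div role="option" id="result-1">First suggestion</div>
      <div role="option" id="result-2">Second suggestion</div>
    </div>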

Document landmark roles are an exception: they don’t change the actual meaning of the element in question, but provide information about this particular place in a document. You can read more about landmarks in my WAI-ARIA tip #4. Also, if you’re using HTML5, there are equivalent elements you might want to use instead.
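
For example (a sketch only; the label text is made up), a navigation area could be marked up either of these two ways and expose the same landmark to screen readers:

    <!-- A landmark role on a generic container (label text is illustrative) -->
    <div role="navigation" aria-label="Site menu">
      …
    </div>

    <!-- …or the equivalent HTML5 element -->
    <nav aria-label="Site menu">
      …
    </nav>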

The second cornerstone are WAI-ARIA states and properties. They define the state of certain native or WAI-ARIA elements, such as whether something is collapsed or expanded, whether a form element is required, whether something has a popup menu attached to it, or the like. These are often dynamic, change their values throughout the lifecycle of a web application, and are usually manipulated via JavaScript.
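
A simple illustration (a sketch only; the IDs are made up) is a disclosure button whose aria-expanded state is kept up to date from JavaScript:

    <!-- IDs are made up for illustration -->
    <button type="button" id="details-toggle" aria-expanded="false" aria-controls="details">
      Show details
    </button>
    <div id="details" hidden>…more content…</div>

    <script>
      var toggle = document.getElementById('details-toggle');
      var details = document.getElementById('details');

      toggle.addEventListener('click', function () {
        // Flip the state and report it to assistive technologies via aria-expanded
        var expanded = toggle.getAttribute('aria-expanded') === 'true';
        toggle.setAttribute('aria-expanded', String(!expanded));
        details.hidden = expanded;
      });
    </script>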

What is it not?

WAI-ARIA is not intended to influence browser behavior. Unlike a real button element, for example, a div which you pour the role of “button” onto does not give you keyboard focusability, an automatic click handler when Space or Enter is pressed on it, or the other properties that are native to a button. The browser itself does not know that a div with a role of “button” is a button; only its accessibility API portion does.

As a consequence, this means that you absolutely have to implement keyboard navigation, focusability and other behavioral patterns known from desktop applications yourself. A good example can be found in my Advanced ARIA tip about tabs, where I clearly spell out the need to add the expected keyboard behavior.
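
To make the div-as-button example concrete, here is roughly what you have to add yourself (a sketch only, with a made-up “Save” action and ID; a real button element gives you all of this for free):

    <!-- ID and label are made up for illustration -->
    <div role="button" tabindex="0" id="fake-button">Save</div>

    <script>
      var fakeButton = document.getElementById('fake-button');

      function activate() {
        // Whatever the button is supposed to do goes here (placeholder action)
        console.log('Saved');
      }

      // A native button reacts to clicks, Enter and Space out of the box;
      // for the div, every one of these has to be re-implemented by hand.
      fakeButton.addEventListener('click', activate);
      fakeButton.addEventListener('keydown', function (event) {
        if (event.key === 'Enter' || event.key === ' ') {
          event.preventDefault();
          activate();
        }
      });
    </script>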

When should I not use it?

Yes, that’s correct, this section comes first! Because the first rule of using WAI-ARIA is: don’t use it unless you absolutely have to! The less WAI-ARIA you have, and the more you can count on using native HTML widgets, the better! There are some more rules to follow; you can check them out here.

I already mentioned the example of buttons versus clickable divs and spans with a role of “button”. This theme continues throughout native roles vs. ARIA roles, and it also extends to states and properties. An HTML5 required attribute comes with automatic validation by the browser, which you have to implement manually if you use only aria-required. Likewise, HTML5 form validation, driven by the pattern attribute or an appropriate input type, gives you entry verification by the browser, and the browser keeps track of the invalid state for you. All of these things have to be done manually if you use aria-invalid only. A full example of the different techniques for form validation can be found in a blog post I wrote after giving a talk on the subject in 2011.
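
As a quick sketch (the field names and the pattern are made up for illustration), the native version needs no ARIA at all:

    <form>
      <!-- The browser enforces required and pattern itself and exposes the
           resulting invalid state to assistive technologies
           (field names and pattern are illustrative) -->
      <label for="zip">ZIP code</label>
      <input type="text" id="zip" name="zip" required pattern="[0-9]{5}">

      <label for="mail">E-mail</label>
      <input type="email" id="mail" name="mail" required>

      <button type="submit">Send</button>
    </form>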

Fortunately, this message seems to finally take hold even with big companies. For example, the newest version of Google Maps is using button elements where they used to use clickable divs and spans. Thanks to Chris Heilmann for finding this and pointing it out during an accessibility panel at Edge Conf (link to YouTube video) in March 2014!

Here’s a quick list of widget roles that have equivalents in HTML where the HTML element should be preferred whenever possible:

| WAI-ARIA role | Native element | Notes |
| --- | --- | --- |
| button | button | Use type="button" if it should not act as a submit button. |
| checkbox | input type="checkbox" | |
| radiogroup and radio | fieldset/legend and input type="radio" | The fieldset is the container, the legend is the prompt the radio buttons are the answer for, and the inputs with type "radio" are the actual radio buttons. |
| combobox | select size="1" | The only exception is if you need to create a very rich compound widget. But even then, combobox is a real mess which warrants its own blog post. |
| listbox | select with a size greater than 1 | The only exception is if you create a rich auto-complete widget. |
| option | option | As children of select elements, or of elements with the combobox or listbox role. |
| list | ul or ol | Not to be confused with listbox! list is a non-interactive list such as an unordered or ordered list. Those should always be preferred. Screen readers generally also support nesting them and get the level right automatically. |
| spinbutton | input type="number" | If the browser supports it. |
| link | a with an href attribute | Should, in my humble opinion, never ever ever be used in an HTML document! |
| form | form | Nobody I know from the accessibility community can actually explain to me why this one is even in the spec. I suspect it has to do primarily with SVG, and maybe EPUB. |

The reason all of these role-to-native-element mappings are in there at all is that WAI-ARIA can also be applied to other markup languages such as EPUB 3 and SVG 2. Also, some elements such as spin buttons are new in HTML5; because WAI-ARIA was originally meant to complement HTML 4.01 and XHTML 1.x, and HTML5 was developed in parallel, roles, states and properties were bound to overlap with HTML5 features, which got more clearly defined behaviors in browsers.

Likewise, you should prefer states such as disabled and required over the WAI-ARIA equivalents aria-disabled and aria-required if you’re writing HTML5. If you’re still writing HTML 4.01, this rule does not apply. If you’re specifically targeting HTML5, though, there is not really a need for the aria-required, aria-disabled and aria-invalid states, since the browsers take care of those for you. And yes, I know that I am in disagreement with some other accessibility experts, who advise using both the HTML5 and WAI-ARIA attributes in parallel. The problem with that, in my opinion, is that it is extra work to keep the WAI-ARIA attributes in sync. Especially with the aria-invalid attribute, this means that you still have to add some JavaScript that responds to the browser’s HTML5 form validation state.
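
If you do follow the “use both in parallel” advice anyway, the glue code looks roughly like this (a sketch only, reusing the made-up e-mail field from the form example above):

    <script>
      // The #mail field is the made-up example field from the earlier sketch
      var mailField = document.getElementById('mail');

      mailField.addEventListener('input', function () {
        // Mirror the browser's own constraint validation result into aria-invalid
        mailField.setAttribute('aria-invalid', String(!mailField.validity.valid));
      });
    </script>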

Why this duplication? Why can’t the browser just also react to the WAI-ARIA attributes as it does to the HTML ones?

This is a question I get a lot. The simple answer is: because WAI-ARIA was never meant to change browser behavior, only to expose extra information to assistive technologies. The more complex answer is: WAI-ARIA can be applied to XHTML 1.0 and 1.1, HTML 4.01, and HTML5. The HTML5 attributes, on the other hand, only work in HTML5-capable browsers, including mobile ones, but there they give you all kinds of extra features defined in the standard for these attributes. If the WAI-ARIA attributes were suddenly made to influence actual browser behavior, the level of inconsistency would be enormous. To keep WAI-ARIA clean and focused on a single purpose, it was therefore decided never to let WAI-ARIA attributes influence browser behavior.

When should I use it?

Whenever you create a widget that is not native to your host language, e.g. HTML. Examples of such widgets include:

  • tree with treeitem children: When creating something the user should navigate in a hierarchical fashion. Examples can be seen in the folder tree in Windows Explorer or the list of topics and books in the Windows HTML Help viewer.
  • menu with menuitem, menuitemcheckbox, and menuitemradio children: A real menu system as can be seen in Windows or OS X apps.
  • grid or treegrid, with row, columnheader, rowheader, and gridcell children: An editable grid similar to a spreadsheet. Not to be confused with data tables, which use native HTML table, tr, th, and td elements with proper scope and other attributes.
  • toolbar container with actual buttons or other interactive elements as children (see the markup sketch after this list).
  • dialog as container for a message or input prompt that needs to be acknowledged or dismissed in a modal fashion, i.e. no other part of the web page or app can be interacted with at the moment. The role alertdialog is related, although I recommend not using it since its support is inconsistent.
  • application, in which you control every single focus and keyboard interaction. For a detailed discussion, see my article on the use of the application role.
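
Here is a minimal sketch of the toolbar case from the list above (the button labels are made up; the matching arrow key handling follows further below):

    <div role="toolbar" aria-label="Text formatting">
      <!-- The container only gets the role; the children are real buttons,
           so each of them is keyboard-operable by itself (labels are illustrative) -->
      <button type="button">Bold</button>
      <button type="button">Italic</button>
      <button type="button">Underline</button>
    </div>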

In all of the above cases, it is your responsibility to:

  1. Familiarize yourself with the expected user interaction with both mouse and keyboard.
  2. Make sure all elements can be reached via the Tab key, and that focus is always visible. Use tabindex with a value of 0 to put elements into the tab order at their natural position within the document.
  3. Make sure that, inside compound widgets, not every item is a separate tab stop. For example, in a tool bar, items should be navigated via the left and right arrow keys, Tab should jump straight to the next container (tool bar or other compound widget), and Escape should return to the main application area. Likewise, a tab list should be navigated with the left and right arrow keys, too, and Tab itself should jump into the tab panel that is active. A rough sketch of this “roving tabindex” technique follows after this list.
  4. Manage focus appropriately for the type of widget you are in. For example, when opening a dialog (role “dialog”), it is your responsibility to set keyboard focus to it or its first focusable child element, and to manage modality, meaning you must make sure Tab does not wander outside the confines of the dialog as long as it is open. Escape should generally cancel, and a dedicated OK/Accept/Save button should close the dialog and apply the changes. Make sure that once you’re done with a dialog, you return focus to a defined position in your application/document, like the button that opened the dialog in the first place, or keyboard and screen reader users might get lost.
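
As promised, here is a rough sketch of that roving tabindex technique, applied to the toolbar markup shown earlier (assumptions: the toolbar contains only buttons; Home/End handling and other niceties are omitted for brevity):

    <script>
      // Assumes the toolbar markup from the earlier sketch is on the page
      var toolbar = document.querySelector('[role="toolbar"]');
      var buttons = Array.prototype.slice.call(toolbar.querySelectorAll('button'));

      // Only one button is in the tab order at any time; start with the first one.
      buttons.forEach(function (button, index) {
        button.setAttribute('tabindex', index === 0 ? '0' : '-1');
      });

      toolbar.addEventListener('keydown', function (event) {
        if (event.key !== 'ArrowLeft' && event.key !== 'ArrowRight') {
          return;
        }
        var current = buttons.indexOf(document.activeElement);
        if (current === -1) {
          return;
        }
        // Move the single tab stop to the next or previous button and focus it
        var next = event.key === 'ArrowRight' ? current + 1 : current - 1;
        next = (next + buttons.length) % buttons.length;
        buttons[current].setAttribute('tabindex', '-1');
        buttons[next].setAttribute('tabindex', '0');
        buttons[next].focus();
        event.preventDefault();
      });
    </script>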

If you do not adhere to the common interaction patterns associated with some of these roles, your WAI-ARIA sugar might very quickly turn into sour milk for users, because they get frustrated when their expected keyboard interaction patterns don’t work. I strongly recommend studying the WAI-ARIA 1.0 Authoring Practices and keeping them handy for reference, because they provide a comprehensive list of attributes and roles associated with one another, as well as more tips on many of the things I just mentioned. Another very good resource is the aforementioned Using WAI-ARIA in HTML document, which provides an extensive technical, yet easy to understand, reference to best practices on how to apply WAI-ARIA code, and also when not to do it.

Closing remarks

I realize that to someone just getting started with web accessibility, these topics may seem daunting. However, please remember that, like learning HTML, CSS and JavaScript, learning the intricacies of web accessibility means you’re learning a new skill. The more often you use these techniques, the more they become second nature. So don’t be discouraged if you at first feel overwhelmed! Don’t be shy to ask questions, or pull in people for testing! The accessibility community is generally a very friendly and helpful bunch.

One thing that may also help with motivation, and it’s thankfully being mentioned more and more often: accessibility and usability go hand in hand. The more you improve usability, the further it gets you in terms of accessibility, too. And first and foremost, it’s about people! Not about some WCAG technique, not about a law that needs to be fulfilled, but about people actually using your web site or web application. Fulfilling legal requirements and WCAG techniques then comes naturally.

So: Make WAI-ARIA one of the tools in your web development arsenal, but take its first rule to heart: don’t use it unless you absolutely have to. Get the usability right for keyboard users, make sure stuff is properly visible to everyone even when they’re standing outside with the sun shining down on their shiny mobile phones or tablets (thanks to Eric Eggert for that tip!), and use WAI-ARIA where necessary to provide the extra semantics screen readers need to also make sense of the interactions. With people in mind first, you should be good to go!

Happy coding!

Easy ARIA Tip #7: Use “listbox” and “option” roles when constructing AutoComplete lists

One question that comes up quite frequently is which roles to use for an auto-complete widget, or more precisely, for the container and the individual auto-complete items. Here’s my take on it. Let’s assume the following rough scenario (note that the auto-complete you have developed may not work in exactly the same way, but probably in a similar one):

Say your auto-complete consists of a textbox or textarea that runs some auto-complete logic as the user types. When auto-complete results appear, the following happens:

  1. The results are being collected and added to a list.
  2. The container gets all the items and is then popped into existence.
  3. The user can now either continue typing or press DownArrow to go into the list of items.
  4. Enter or Tab selects the current item, and focus is returned to the text field.

Note: If your widget does not support keyboard navigation yet, go back and add that first. Without it, you’re leaving a considerable number of users without the benefits you want to provide, and that does not only apply to screen reader users.

The question now is: which roles should the container and the individual items get from WAI-ARIA? Some think it’s a list, others think it’s a menu with menu items. There may be more cases, but those are probably the two most common ones.

My advice: use the listbox role for the container, and option for the individual auto-complete items the user can choose. The roles menubar, menu, and menuitem, plus the related menuitemcheckbox and menuitemradio roles, should be reserved for real menu bar, drop-down menu, or context menu scenarios. But why, you may ask?
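
In markup, that recommendation boils down to something like this (a rough sketch; the IDs and suggestions are made up, and the wiring to your auto-complete logic is left out):

    <!-- IDs and city names are made up for illustration -->
    <input type="text" id="city" autocomplete="off">
    <div role="listbox" id="city-suggestions">
      <div role="option" id="city-suggestion-1">Hamburg</div>
      <div role="option" id="city-suggestion-2">Hannover</div>
    </div>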

The short version: Menus on Windows are a hell of a mess, and that’s historically rooted in the chaos that is the Win32 API. Take my word for it and stay out of that mess and the debugging hell that may come with it.

The long version: Windows has always had a so-called menu mode. That mode is in effect once a menu bar, a drop-down menu, or a context menu becomes active. This has been the case since at least the Windows 3.1/3.11 days, possibly even longer. To communicate the menu mode state to screen readers, Windows, or more precisely, Microsoft Active Accessibility, uses four events:

  1. SystemMenuStart: A menu bar just became active.
  2. SystemMenuPopupStart: If a SystemMenuStart event had been fired before, a drop-down menu just became active. If a SystemMenuStart event had not been fired before, a context menu just became active. If another SystemMenuPopupStart preceded this one, a sub menu just opened.
  3. SystemMenuPopupEnd: The popup just closed. Menu mode returns to either the previous Popup in the stack (closing of a sub menu), the menu bar, or falls out of menu mode completely.
  4. SystemMenuEnd: A menu bar just closed.

These events have to arrive in this exact order. Screen readers like JAWS or Window-Eyes rely heavily on the event order being correct, and they ignore everything that happens outside the menus once menu mode is active. Even NVDA, although it has no menu mode as strict as that of other “older” screen readers, relies on the SystemMenuStart and SystemMenuPopupStart events to recognize when a menu gained focus, because opening a menu does not automatically focus any item by default. An exception is JAWS, which auto-selects the first item it can once it detects a context or start menu opening.

You can possibly imagine what happens if the events get out of order, or are not all fired in a complete cycle. Those screen readers that rely on the order get confused, stay in menu mode even when the menus have all closed, and so on.

So, when a web developer uses one of the menu roles, they set this whole mechanism in motion, too. Because it is assumed that a menu system like that of a Windows desktop app is being implemented, browsers that implement WAI-ARIA also have to send these events to communicate the state of a menu, drop-down menu, context menu, or sub menu.

So, what happens in the case of our auto-complete example if you were to use the role menu on the container, and menuitem on the individual items? Let’s go back to our sequence from the beginning of the post:

  1. The user is focused in the text field and types something.
  2. Your widget detects that it has something to auto-complete, populates the list of items, applies role menuitem to each, and role menu to the container, and pops it up.
  3. This causes a SystemMenuPopupStart event to be fired.

The consequences of this event are rather devastating to the user. Because you just popped up the list of items, you didn’t even set focus to one of its items yet. So technically and visually, focus is still in your text field, the cursor is blinking away merrily.

But for a screen reader user, the context just changed completely. Because of the SystemMenuPopupStart event that got fired, screen readers now have to assume that focus went to a menu, and that just no item is selected yet. Worse, in the case of JAWS, the first item may even get selected automatically, producing potentially undesired side effects!

Moreover, the user may continue typing, even use the left and right arrow keys to check their spelling, but the screen reader will no longer read this to them, because their screen reader thinks it’s in menu mode and ignores all happenings outside the “menu”. And one last thing: Because you technically didn’t set focus to your list of auto-complete items, there is no easy way to dismiss that menu any more.

On the other hand, if you use listbox and option roles as I suggested, none of these problems occur. The list will be displayed, but because it doesn’t get focus yet, it doesn’t disturb the interaction with the text field. When focus gets into the list of items, by means of DownArrow, the transition will be clearly communicated, and when it is transitioning back to the text field, even when the list remains open, that will be recognized properly, too.

So even if, to sighted web developers, this looks visually similar to a context menu or a popup menu or whatever you may want to call it, from a user interaction point of view it is much more like a list than a menu. A menu system should really be confined to an actual menu system, like the one you see in Google Docs. The side effects of the menu-related roles on Windows are just too severe for scenarios like auto-completes. And the reason for that lies in over 20 years of Windows legacy.

Some final notes: You can spice up your widget by letting the user know that auto-complete results are available via a bit of text that gets automatically spoken if you add it to a text element that is moved outside the viewport, but has an aria-live="polite" attribute applied to it. In addition, you can set aria-expanded="true" if you just popped up the list, and aria-expanded="false" if it is not there, both applied to your input or textarea element. And the showing and hiding of the auto-complete list should be done via display:none; or visibility:hidden; and their counterparts, or it will appear somewhere in the user’s virtual buffer and cause confusion.
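
Putting those final notes together, a sketch might look like this (assumptions: the same made-up IDs as in the earlier sketch, and a hypothetical announceResults function that your widget calls whenever its list of suggestions changes):

    <!-- Visually hidden live region that announces the availability of results -->
    <div id="autocomplete-status" aria-live="polite"
         style="position: absolute; left: -10000px;"></div>

    <script>
      // These IDs come from the made-up markup sketch shown earlier
      var cityInput = document.getElementById('city');
      var citySuggestions = document.getElementById('city-suggestions');
      var statusRegion = document.getElementById('autocomplete-status');

      // Hypothetical hook: call this whenever your auto-complete logic has new results
      function announceResults(count) {
        var hasResults = count > 0;
        // Show or hide the list via display so it doesn't linger in the virtual buffer
        citySuggestions.style.display = hasResults ? 'block' : 'none';
        // Reflect the popup state on the input itself
        cityInput.setAttribute('aria-expanded', String(hasResults));
        // Let screen reader users know results are available
        statusRegion.textContent = hasResults ? count + ' suggestions available' : '';
      }
    </script>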

A great example of all of this can be seen in the Tweet composition ContentEditable on twitter.com.

I also sent a proposal for an addition to the Protocols and Formats Working Group at the W3C, because the example in the WAI-ARIA Authoring Practices for an auto-complete doesn’t cover most advanced scenarios, like the one on Twitter and others I’ve come across over time. I hope the powers that be follow my reasoning and make explicit recommendations regarding which roles should and shouldn’t be used for auto-completes!