Peter Krautzberger · on the web

Being unhappy with accessible SVGs: text elements

This started out last fall as a quick observation on screenreader behaviors and turned into some meandering thoughts about "linearizable" SVGs.

The other day (month/season/year), I found myself thinking about a text role in ARIA. Not exactly original but at least I was thinking about it in the context of SVG. So technically I was thinking about a role for the (woefully unmaintained) Graphics ARIA Module where I have a hunch a role for text (say, role=graphics-text) could be useful. The simplest reason is probably handwriting in its various forms, whether it's signatures, calligraphy, graffiti, or other heavily stylized writing. Generally speaking, this kind of content is not (and often cannot be) represented as SVG text elements; instead it usually appears as SVG paths, rendering a reproduction of text (handwritten or otherwise). But many other forms of stylized text tend to be realized as paths in SVG as well.

It's still text though.

It's used as text, it looks like text, its text alternative will match. So, duck test passed?

One obvious objection is: it's only text if you use the SVG text element. I think that's a weak argument. SVG's text element is so limited in terms of design that even text that could be using the text element is often turned into paths. There's a reason why every vector graphics editor has a function to replace text with paths. While the SVG spec is finally seeing some activity again, its current charter is focused on maintenance and interop, not new features, so this is not going to improve anytime soon. Besides, ARIA is about expanding developer options to allow solutions that host languages cannot otherwise provide.

A slightly stronger argument might be that text is somehow special. It provides certain fundamental affordances that nothing else can match. I don't disagree but, again, ARIA is for when the correct™ way is not available. And ARIA roles are only ever a (weak) signal, a promise from developer to user; it's the developer's responsibility to make good on that promise.

The graphics-aria module only has 3 roles to offer: graphics-document, graphics-object, and graphics-symbol. I'd say none of these fit so I find myself thinking graphics-text might be reasonable in SVG.
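To make the idea a bit more concrete, here is a minimal sketch of what such markup could look like, wrapped in a few lines of Python so it can be parsed and checked. Everything specific in it is an assumption: graphics-text is the hypothetical role from above (it is not defined in ARIA or in the Graphics ARIA Module), and the path data, the name "Jane Doe", and the helper labelled_roles are placeholders made up for illustration.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"

# ASSUMPTION: "graphics-text" is a hypothetical role; it is not defined in ARIA
# or the Graphics ARIA Module. The path data and the name are placeholders.
SIGNATURE_SVG = f"""
<svg xmlns="{SVG_NS}" viewBox="0 0 200 60">
  <g role="graphics-text" aria-label="Jane Doe">
    <path d="M10 40 C 30 10, 50 10, 70 40" fill="none" stroke="black"/>
    <path d="M80 40 C 100 10, 120 10, 140 40" fill="none" stroke="black"/>
  </g>
</svg>
"""


def labelled_roles(svg_source):
    """Return (role, aria-label) pairs for every element that carries both."""
    root = ET.fromstring(svg_source.strip())
    return [
        (el.get("role"), el.get("aria-label"))
        for el in root.iter()
        if el.get("role") and el.get("aria-label")
    ]


print(labelled_roles(SIGNATURE_SVG))
```

The point of the sketch is only that the paths stay paths; the role and the label together would carry the promise that this group is, for all practical purposes, text.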

Testing text

Inevitably this made me wonder whether it's actually a good idea (for accessibility) to use text elements in SVGs. What actually happens when you use the text element, and what happens when things turn a little more complex?

To start off, I took the MDN example for SVG text for a spin, testing different screenreaders in different modes (with default settings). A rough list of results:

So despite the possibly questionable SVG-AAM mappings, things aren't terrible but also not great. Firefox sticks out negatively in that I couldn't find a way to step through the text elements individually (i.e. basic exploration).

What bugs me is that the SVG context is never surfaced. The addition of "image" and "graphic" is identical to role/element img; that seems incorrect here. Adding "image" to text element content seems plain wrong (whether once or multiple times). Treating the SVG like a group of some sort and indicating entering/leaving is relevant, but keeping its internal semantics intact is equally so.

Only VoiceOver on Mac surfaces the nature of the text elements by default. While it probably shows my limitations as a screenreader user that I'm not sure whether other screenreaders surface this information some other way, I doubt many people will consider looking when it's announced as image.

I suppose not announcing text element roles is consistent with the view that "real" text is special (or at least it matches the SVG-AAM's mapping to paragraphs). It's the default so it needs no additional information/noise. Still, in the context of an SVG, I wonder if this assumption holds up.

VO announcing text selectability points to the core interaction model of text: you can select it (e.g. for copy&paste). If that's what text is about, then couldn't a graphics-text role be used for similar functionality (e.g. selectability of the accessible name)? Probably a dangerous idea. Move along, nothing to see.

Rambling a bit

When I attended my first W3C event (too) many moons ago, I was amazed by a demo exploring the classic Ghostscript tiger as SVG, deeply annotated with non-visual information. (I want to say it was done by Janina Sajka using Presto-era Opera but I'm not sure I would've known at the time.) This stuck with me. It helped me when we incorporated ChromeVox's equation support into MathJax later on, and even more when Volker further expanded MathJax's non-visual rendering. You can also see that influence in my abuse of trees for content and my more recent experiment with a granularity walker.

If you stay as academic as those, you can find a lot of interesting stuff like this one from the MIT Visualization Group. But in recent years I've lost faith in this kind of thing. It's too academic to ever make a significant difference in the real world. There never seems to be enough understanding of the web's grain to help move things forward to the benefit of the wider (accessibility) community. Neither are these academic tools becoming robust solutions for data journalism (or whatever), nor are they helpful in identifying cow paths (and gaps) in the accessibility infrastructure of the web.

In the general web development world we instead get regular (yearly) posts where someone explains how title and desc elements work to set the accessible name and description. And how to use role=img to make an SVG behave like img. That makes me a bit sad. At most, you get a post that linearizes something like a simple graph chart. Like, there's this 10-year-old talk by Léonie Watson and Chaals McCathie Nevile from when that was still fairly new.
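For reference, the basic pattern those posts describe looks roughly like the sketch below, again wrapped in Python so the markup can be parsed. The chart, its title and desc wording, and the helper name_and_description are made up for illustration. The lookup is deliberately naive: it only reads direct-child title and desc elements and ignores aria-label, aria-labelledby and everything else a browser's accessible name computation takes into account.

```python
import xml.etree.ElementTree as ET

NS = {"svg": "http://www.w3.org/2000/svg"}

# Placeholder chart; the title and desc wording is made up for illustration.
CHART_SVG = """
<svg xmlns="http://www.w3.org/2000/svg" role="img" viewBox="0 0 100 50">
  <title>Quarterly sales</title>
  <desc>Bar chart with three bars, rising from left to right.</desc>
  <rect x="5" y="30" width="20" height="20"/>
  <rect x="40" y="20" width="20" height="30"/>
  <rect x="75" y="10" width="20" height="40"/>
</svg>
"""


def name_and_description(svg_source):
    """Naive lookup: direct-child title gives the name, desc the description."""
    root = ET.fromstring(svg_source.strip())
    name = root.findtext("svg:title", default="", namespaces=NS).strip()
    description = root.findtext("svg:desc", default="", namespaces=NS).strip()
    return root.get("role"), name, description


print(name_and_description(CHART_SVG))
```

With role=img on the root, the three rect elements are not meant to be explored individually; the whole graphic collapses into one name and one description, which is exactly why this pattern only gets you so far.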

That's basically still where we are - unless you build role=application-style solutions like the MIT stuff (or client-side MathJax). Take this article from a11y-collective last year. Don't get me wrong: it's good stuff! But just. so. basic. (Obligatory shout-out to one of my old favorites: Heather Migliorisi's 2016 post on CSS Tricks.)

I feel there's a massive divide and we're not working to bridge it. Complex data visualization is one sliver of graphical documents, no matter how important it may be. Nothing good will come from focusing on just its use cases. Take something like this odd cloth-a-person SVG on wikicommons or this quadrant chart from the mermaidjs docs - I doubt any data viz ARIA module will help those much simpler (yet not simple) use cases.

A slightly more real world example

Here's a common pattern I see at work. Many graphics in scientific writing contain diagrammatic content and that content is often annotated by text (in particular, labeled). For example, Figure 2 in this paper.

Usually, these graphics come in a form that allows for fairly simple linearization of their text alternatives. These are not flowcharts with complex chains and loops; these are not complex data visualizations.

Here's a simplified example. (The embedded raster image is by Mamoru800, CC BY-SA 4.0, Wikimedia source).

Left: Average Human specimen (~1.7m)
Middle: Average Giant Calamar specimen (~9.5m)
Right: Largest known Giant Calamar specimen (12m)

The SVG structure in this contrived example is as follows: first, an image tag linking to a raster graphic; the raster graphic shows a size comparison of giant squids and humans. Next, 3 well positioned text elements annotating 3 areas of the graphic, namely the diagrammatic depiction of a human being as well as the 2 diagrammatic depictions of giant squids (one average and one large specimen).
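The structure just described, and the kind of linearization I have in mind, can be sketched as follows. The three labels are the ones from the example; the file name squid-size-comparison.png, the aria-label wording, the coordinates, and the helper linearize are assumptions made up for illustration, not the actual markup of the embedded graphic. The walk simply follows document order, which is what an AT would be expected to do with such a flat structure.

```python
import xml.etree.ElementTree as ET

# ASSUMPTION: href, aria-label wording and all coordinates are placeholders.
SQUID_SVG = """
<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 600 300">
  <image href="squid-size-comparison.png" width="600" height="300"
         aria-label="Size comparison of giant squids and a human"/>
  <text x="20" y="280">Left: Average Human specimen (~1.7m)</text>
  <text x="200" y="280">Middle: Average Giant Calamar specimen (~9.5m)</text>
  <text x="420" y="280">Right: Largest known Giant Calamar specimen (12m)</text>
</svg>
"""


def linearize(svg_source):
    """Collect text alternatives in document order: image labels, then text."""
    root = ET.fromstring(svg_source.strip())
    parts = []
    for el in root.iter():
        tag = el.tag.split("}")[-1]  # drop the "{namespace}" prefix
        if tag == "image" and el.get("aria-label"):
            parts.append(el.get("aria-label"))
        elif tag == "text":
            # Join nested text (e.g. tspan children) and normalize whitespace.
            parts.append(" ".join("".join(el.itertext()).split()))
    return parts


for line in linearize(SQUID_SVG):
    print(line)
```

Read top to bottom, the output is already a passable text alternative for the whole graphic: first what the picture shows, then the three labels from left to right.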

It's contrived insofar as text is rarely done as text elements and I've chosen an image tag to reduce complexity (and also because I have an old issue open on axe-core about a false positive for image tags with aria-label). Nevertheless this is representative of a significant chunk of content I come across daily and, biased as I am by my own experience, I feel the world would be better if this had a better solution than sticking a long-description somewhere.

Coming back to the example, here's another round of notes from testing this.

Overall, these results matched my expectations after seeing the behavior for text elements.

It's not terribly bad and yet full of gaps in AT behavior, I'd say. While using text elements is rarely an option, I suspect it's not too relevant here (but I don't want to sit on this another season). The SVG-AAM mappings seem to work and it looks like the ATs are falling short here. Very likely because this pattern is not encountered much. Possibly also because graphics-aria offers too little. Chicken and egg (and resources).

Obligatory quote from Adrian

So please test on your own, using this post as a template — not the final word.

Basic

This is just one sliver of SVG content and yet improving these "linearizable" SVG structures seems like a fairly small improvement that would provide building blocks (e.g. improving inline svg announcements, text element announcements, and maybe even adding a role) that help here as well as in other scenarios. And they also don't strike me as high risk for hindering further improvements.