Abstract

This document outlines various accessibility-related user needs, requirements and scenarios for real-time communication (RTC). These user needs should drive accessibility requirements in various related specifications and in the overall architecture that enables RTC. The document first introduces a definition of RTC as used throughout and outlines how RTC accessibility can support the needs of people with disabilities. It then defines the term 'user needs' and lists a range of these user needs and their related requirements. Following that, some quality-related scenarios are outlined. Finally, a data table maps the user needs contained in this document to related use case requirements found in other technical specifications.

This document is explicitly not a collection of baseline requirements. Note also that some of the requirements may be implemented at a system or platform level, and some may be authoring requirements.

Introduction

What is real-time communication (RTC)?

Real-time communication (RTC) is an evolution beyond the traditional client-to-server data exchange model, resulting in real-time peer-to-peer audio, video and data exchange directly between supported user agents. This enables real-time applications for video, text and audio calls, text chat, file exchange, screen sharing and gaming, all without the need for browser plug-ins. While RTC applications are enabled in the main by specifications like WebRTC, WebRTC is not the sole specification with responsibility for enabling accessible real-time communication applications. The use cases and requirements are broad, as outlined for example in IETF RFC 7478, 'Web Real-Time Communication Use Cases and Requirements'. [[ietf-rtc]] [[webrtc]]

Real-time communication and accessibility

RTC has the potential to allow improved accessibility features that will support a broad range of user needs for people with a wide range of disabilities. These needs can be met through improved audio and video quality, audio routing, captioning, improved live transcription, transfer of alternate formats such as sign language, text messaging and chat, real-time user support and status polling.

RTC accessibility is enabled by a combination of technologies and specifications such as those from the Media Working Group, Web and Networks Interest Group, Second Screen, and Web Audio Working Group, as well as AGWG and ARIA. The Accessible Platform Architectures Working Group (APA) hopes this document will inform how these groups meet various responsibilities for enabling accessible RTC, as well as updating related use cases in various groups. For examples, view the current work on WebRTC Next Version Use Cases First Public Working Draft. [[webrtc-use-cases]]

User needs definition

This document outlines various accessibility-related user needs for RTC. The term 'user needs' in this document relates to what people with various disabilities need in order to use RTC applications successfully. These needs may relate to having particular supports in an application, being able to complete tasks or access other functions. These user needs should drive accessibility requirements for RTC and its related architecture.

User needs are presented here with their related requirements; some in a range of scenarios (which can be thought of as similar to user stories).

User needs and requirements

The following outlines a range of user needs and requirements. The user needs have also been compared to existing use cases for real-time text (RTT) such as the IETF 'Framework for Real-Time Text over IP Using the Session Initiation Protocol (SIP)' RFC 5194 and the European Procurement Standard EN 301 549. [[rtt-sip]] [[EN301-549]]

Window anchoring and pinning

Not all atomic items are necessarily pinned next to other atomic elements, but some may be dependent, related or updated synchronously. For example, multiple atomic data points may be destined for an 80-character braille display that has been sectioned to display 4 atomic items in up to 19 cells each (leaving at least one blank cell for spacing).

Here the term atomic relates to small pieces of data. For the purposes of accessibility conformance testing, the definitions and use of the terms 'atomic' and 'atomic rules' may also be useful. [[applicability-atomic]] [[rule-types]]
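The arithmetic in the braille display example can be sketched as follows. This is a minimal illustration in Python, not part of any specification: the function name `layout_line`, the sample status strings, the truncation of over-long items and the one-character-per-cell mapping are all assumptions made for the sketch.

```python
DISPLAY_CELLS = 80                          # cells on the display, from the example
SECTIONS = 4                                # atomic items shown side by side
SECTION_WIDTH = DISPLAY_CELLS // SECTIONS   # 20 cells per section
ITEM_WIDTH = SECTION_WIDTH - 1              # 19 cells of content, 1 blank spacer


def layout_line(items):
    """Place up to 4 atomic items into fixed sections of one 80-cell line.

    Assumption: each character occupies one cell, and items longer than
    19 characters are truncated (a real application might scroll instead).
    """
    if len(items) > SECTIONS:
        raise ValueError("at most %d atomic items fit on one line" % SECTIONS)
    # Pad the list so that unused sections are rendered as blank cells.
    padded = list(items) + [""] * (SECTIONS - len(items))
    sections = []
    for text in padded:
        content = text[:ITEM_WIDTH].ljust(ITEM_WIDTH)
        sections.append(content + " ")      # trailing blank cell for spacing
    return "".join(sections)


line = layout_line(["Alice: speaking", "Muted: no", "3 participants", "Captions: on"])
print(len(line))
```

Because every section has a fixed width, an update to one atomic item rewrites only its own 20 cells and leaves the neighbouring items where the reader expects to find them.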

Pause 'on record' captioning in RTC

Accessibility user preferences and profiles

Incoming calls and caller ID

Operations for acting on incoming calls, finding out who the caller is and connecting relay services should not require complicated sequences of user actions.

Routing and communication channel control

Audio description in live conferencing

Moving beyond mono in this context is also important, as the stereo spread allows audio descriptions to be sound staged. Applications should also inherit customization settings from the user's operating system.

Quality synchronisation and playback

Simultaneous voice, text & signing

This user need may also indicate that WebRTC applications need to support 'Total conversation' services as defined by ITU. These are combinations of voice, video, and RTT in the same real-time session. [[total-conversation]]

Emergency calls: Support for Real-Time Text (RTT)

Text and Video relay services (VRS)

Successfully connecting to video or text relay services should not require a complicated sequence of user actions.

Distinguishing sent and received text with RTT

Call participants and status

Captioning support

Assistance for users with cognitive disabilities

Personalized symbol sets for users with cognitive disabilities

This relates to cognitive accessibility requirements. For related work at W3C see the 'Personalization Semantics Content Module 1.0' and 'Media Queries Level 5'. [[personalization]] [[media-queries]]

Internet relay chat (IRC) style interfaces

Some braille users will also prefer the RTT model. However, braille users who want text displayed in standard contracted braille might be better served in the same way as users relying on TTS engines: by buffering the data to be transmitted until an end-of-line character is reached.
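The buffering behaviour described above can be sketched as follows. This is a minimal illustration in Python under stated assumptions: the class name `LineBuffer` and its `feed` and `flush` methods are hypothetical and do not come from any RTT specification or library.

```python
from typing import List


class LineBuffer:
    """Hold incoming characters until an end-of-line character is reached."""

    def __init__(self) -> None:
        self._pending = ""

    def feed(self, chunk: str) -> List[str]:
        """Add characters and return any lines that are now complete."""
        self._pending += chunk
        # Everything before the last "\n" is complete; the remainder stays buffered.
        *complete, self._pending = self._pending.split("\n")
        # Tolerate "\r\n" line endings by dropping the trailing carriage return.
        return [line.rstrip("\r") for line in complete]

    def flush(self) -> str:
        """Return and clear unfinished text, for example when the session ends."""
        remainder, self._pending = self._pending, ""
        return remainder


buffer = LineBuffer()
for chunk in ["Hel", "lo\nWor", "ld\n"]:
    for line in buffer.feed(chunk):
        print(line)  # a whole line is ready to transmit or present
```

Releasing whole lines lets a braille translator or TTS engine process complete words and phrases, at the cost of the character-by-character immediacy that the RTT model provides.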

Relationship between RTC and XR Accessibility

There are potential real-time communication application issues that may only apply in immersive environments or augmented reality contexts.

For example, if an RTC application is also an XR application then relevant XR accessibility requirements should be addressed as well. [[xaur]]

Quality of service scenarios

Deaf users: Video resolution and frame rates

Scenario: A deaf user watching a signed broadcast needs a high-quality frame rate to maintain legibility and clarity in order to understand what is being signed.

EN 301 549, Section 6, recommends that WebRTC applications support a frame rate of at least 20 frames per second (FPS). More details can be found in the accessible procurement standard for ICT products and services, EN 301 549 (PDF), and in ITU-T Series H Supplement 1, "Sign language and lip-reading real-time conversation using low bit-rate video communication".
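As a rough illustration of the 20 FPS threshold, the sketch below estimates the delivered frame rate from frame arrival timestamps and compares it with the recommended minimum. It is written in Python under stated assumptions: the function names and the sample timestamps are hypothetical, and a real application would take its figures from whatever statistics its media stack reports.

```python
MIN_FPS = 20  # minimum frame rate recommended by EN 301 549, Section 6


def measured_fps(timestamps_ms):
    """Estimate frames per second from frame arrival times in milliseconds."""
    if len(timestamps_ms) < 2:
        return 0.0
    duration_ms = timestamps_ms[-1] - timestamps_ms[0]
    if duration_ms <= 0:
        return 0.0
    # N timestamps span N - 1 frame intervals.
    return (len(timestamps_ms) - 1) * 1000 / duration_ms


def meets_sign_language_minimum(timestamps_ms):
    """True when the measured frame rate reaches the 20 FPS minimum."""
    return measured_fps(timestamps_ms) >= MIN_FPS


# One frame every 50 ms corresponds to 20 FPS; one every 100 ms to 10 FPS.
smooth = [i * 50 for i in range(21)]
choppy = [i * 100 for i in range(11)]
print(meets_sign_language_minimum(smooth), meets_sign_language_minimum(choppy))
```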

Audio frequency bandwidth

Scenario: A hard of hearing user needs better stereo sound to have a quality experience in work calls or meetings with friends or family. Transmission aspects, such as the decibel range for audio, need to be of high quality. For calls, industry allows higher audio resolution, but still mostly in mono.

EN 301 549, Section 6, recommends that WebRTC-enabled conferencing and communication applications be able to encode and decode communication with a frequency range that has an upper limit of at least 7 kHz. More details can be found in the accessible procurement standard for ICT products and services, EN 301 549 (PDF).
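The 7 kHz upper limit has a direct consequence for the audio sampling rate: by the Nyquist criterion, a sampled signal can only represent frequencies up to half its sampling rate. The sketch below, in Python, applies that rule; the function names are hypothetical. Meeting this check is a necessary condition only, since a codec's own filtering may reduce the usable bandwidth further.

```python
REQUIRED_UPPER_LIMIT_HZ = 7_000  # upper frequency limit from EN 301 549, Section 6


def max_representable_frequency_hz(sample_rate_hz):
    """Nyquist limit: half the sampling rate."""
    return sample_rate_hz / 2


def can_reach_upper_limit(sample_rate_hz, required_hz=REQUIRED_UPPER_LIMIT_HZ):
    """True when the sampling rate is high enough to carry the required band."""
    return max_representable_frequency_hz(sample_rate_hz) >= required_hz


# 8 kHz sampling (narrowband telephony) tops out at 4 kHz and falls short;
# 16 kHz sampling (wideband) tops out at 8 kHz and is sufficient.
for rate in (8_000, 16_000, 48_000):
    print(rate, can_reach_upper_limit(rate))
```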

Quality requirements for video

Scenario: A hard of hearing user needs better stereo sound so they can have a quality experience when watching HD video or having an HD meeting with friends or family. Similarly for video quality, transmission aspects such as frames per second need to be of high quality.

A hard of hearing user often combines their perception of speech from audio with their perception of lip movement and other visual cues to create an overall understanding of speech. For the visual parts, the requirements on video are the same as those expressed in 'Deaf users: Video resolution and frame rates' for the perception of sign language, because lip movements are also part of sign language and are as rapid and as detailed as its other parts.

For the audio parts, as noted under 'Audio frequency bandwidth', EN 301 549, Section 6, recommends that WebRTC-enabled conferencing and communication applications be able to encode and decode communication with a frequency range that has an upper limit of at least 7 kHz. More details can be found in the accessible procurement standard for ICT products and services, EN 301 549 (PDF).

WebRTC lets applications prioritise bandwidth dedicated to audio / video / data streams; there is also some experimental work in signalling these needs to the network layer as well as support for prioritising frame rate over resolution in case of congestion. [[webrtc-priority]]

Change Log

The following is a list of new user needs and requirements since the publication of the previous working draft:

The following is a list of updated requirements to existing user needs:

The following are other changes in this document:

This document has been updated based on document feedback, discussion and Research Questions Task Force consensus.

Acknowledgments

Participants of the APA Working Group active in the development of this document:

Previously Active Participants, Commenters, and Other Contributors

Enabling funders

This work is supported by the EC-funded WAI-Guide Project.