This document outlines accessibility-related user needs, requirements and scenarios for real-time communication (RTC). These user needs should drive accessibility requirements in related specifications and in the overall architecture that enables RTC. The document first defines RTC as used throughout and outlines how RTC accessibility can support the needs of people with disabilities. It then defines the term user needs and lists a range of these user needs with their related requirements. Some quality-related scenarios follow, and finally a data table maps the user needs in this document to related use case requirements found in other technical specifications.
This document is most explicitly not a collection of baseline requirements. It is also important to note that some of the requirements may be implemented at a system or platform level, and some may be authoring requirements.
Real-time communication (RTC) is an evolution beyond the traditional data exchange model of client to server resulting in real-time peer to peer audio, video and data exchange directly between supported user agents. This allows instantaneous applications for video and audio calls, text chat, file exchange, screen sharing and gaming, all without the need for browser plug-ins. While Real-time communication (RTC) applications are enabled in the main by specifications like WebRTC, WebRTC is not the sole specification with responsibility to enable accessible real-time communication applications. The use cases and requirements are broad - for example as outlined in the IETF RFC 7478 'Web Real-Time Communication Use Cases and Requirements' document. [[ietf-rtc]]
RTC has the potential to allow improved accessibility features that will support a broad range of user needs for people with a wide range of disabilities. These needs can be met through improved audio and video quality, audio routing, captioning, improved live transcription, transfer of alternate formats such as sign-language, text-messaging and chat, real time user support and status polling.
Accessible RTC is enabled by a combination of technologies and specifications such as those from the Media Working Group, Web and Networks IG, Second Screen, and Web Audio Working Group as well as AGWG and ARIA. APA hopes this document will inform how these groups meet various responsibilities for enabling accessible RTC, as well as updating related use cases in various groups. For example, view the current work on WebRTC Next Version Use Cases First Public Working Draft. [[webrtc-use-cases]]
User needs relate to what a particular user 'needs' from an application or platform, to perform a task or achieve a particular goal. This is in the context of the user's ability and preferences. This document outlines various accessibility related user needs for Accessible RTC. These user needs should drive accessibility requirements for Accessible RTC and its related architecture.
User needs are presented here with their related requirements; some in a range of scenarios (which can be thought of as similar to user stories). User needs and requirements are being actively reviewed by RQTF/APA.
The following outlines a range of user needs and requirements. The user needs have also been compared to existing use cases for Real-time text (RTT) such as the IETF Framework for Real-time text over IP Using the IETF Session Initiation Protocol RFC 5194 and the European Procurement Standard EN 301 549. [[rtt-sip]] [[EN301-549]]
Not all atomic items are necessarily pinned next to other atomic items, but some may be dependent, related or updated synchronously. For example, multiple atomic data points may be destined for an 80 character braille display that has been sectioned to display 4 atomic items in up to 19 cells each (leaving at least one blank cell after each item for spacing).
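The sectioning arithmetic in this example can be sketched as follows. This is a minimal illustration only: the function name `layoutBrailleLine` and the sample items are hypothetical, and it assumes one character occupies one braille cell (uncontracted output). An 80 cell line is divided into 4 sections of 20 cells, of which 19 carry content and 1 is left blank.

```typescript
// Hypothetical helper, not part of any specification.
// Assumes one character maps to one braille cell.
function padRight(text: string, width: number): string {
  let out = text;
  while (out.length < width) {
    out += " ";
  }
  return out;
}

function layoutBrailleLine(
  items: string[],
  totalCells: number = 80,
  sectionCount: number = 4,
): string {
  const sectionWidth = Math.floor(totalCells / sectionCount); // 20 cells
  const contentWidth = sectionWidth - 1; // 19 cells of content, 1 blank spacer
  let line = "";
  for (let i = 0; i < sectionCount; i++) {
    const item = i < items.length ? items[i] : "";
    // Truncate long items, then pad so every section has the same width.
    line += padRight(item.slice(0, contentWidth), sectionWidth);
  }
  return line;
}

// Example: four independent status items share one 80 cell line.
const statusLine = layoutBrailleLine([
  "Caller: Ana",
  "Muted",
  "3 unread messages",
  "Call time 04:12",
]);
```

Because each item owns a fixed section, one item can be refreshed without shifting the position of the others on the display.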
Here the term atomic relates to small pieces of data. For the purposes of accessibility conformance testing, the definitions and use of the terms 'atomic' and 'atomic rules' may also be useful. [[applicability-atomic]] [[rule-types]]
To successfully receive a relay call, using a mobile app, the deaf user needs to toggle between the app and the handset. If the incoming call is a relay call, the user must immediately activate the app, otherwise the call may be lost. In the UK, the party calling the deaf user provides a prefix to connect to the relay center. If the calling party does not use a prefix for the relay call, then the call will not have text relay assistance. In the US for example, with captioned phones the situation is different as users have access to captioning on the go as the calling party does not need to use a prefix.
Moving beyond mono in this context is also important, as the stereo spread allows audio descriptions to be sound staged. Applications should also inherit customization settings from the user's operating system.
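As an illustration of sound staging, the sketch below computes left and right channel gains for placing a mono source, such as an audio description track, at a position in the stereo field. It uses the common equal-power panning law; the function name `equalPowerPan` and the `StereoGains` type are hypothetical, and a real application would more likely delegate this to the platform's audio API.

```typescript
// Hypothetical helper illustrating equal-power panning of a mono source.
interface StereoGains {
  left: number;
  right: number;
}

// pan ranges from -1 (fully left) through 0 (center) to 1 (fully right).
function equalPowerPan(pan: number): StereoGains {
  const clamped = Math.max(-1, Math.min(1, pan));
  // Map [-1, 1] onto a quarter circle so that left^2 + right^2 stays at 1.
  const angle = ((clamped + 1) / 2) * (Math.PI / 2);
  return { left: Math.cos(angle), right: Math.sin(angle) };
}

// Example: stage the audio description towards the left, away from dialogue.
const descriptionGains = equalPowerPan(-0.6);
```

Keeping the summed power constant means the description does not appear to get louder or quieter as the user moves it across the stereo field.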
This user need may also indicate necessary support for 'Total conversation' services as defined by ITU in WebRTC applications. These are combinations of voice, video, and real-time text (RTT) in the same real-time session. [[total-conversation]]
Regardless of the media involved, encryption support should be available for all main and alternate communication channels.
This relates to cognitive accessibility requirements. For related work at W3C see the 'Personalization Semantics Content Module 1.0' and 'Media Queries Level 5'. [[personalization]] [[media-queries]]
Some braille users will also prefer the RTT model. However, braille users desiring text displayed with standard contracted braille might better be served in the manner users relying on text to speech (TTS) engines are served, by buffering the data to be transmitted until an end of line character is reached.
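The buffering approach described here can be sketched as below. This is an illustrative assumption rather than a prescribed design: the class name `LineBufferedText` is hypothetical, and a line feed (`"\n"`) is assumed to be the end of line character, which may differ in a given RTT transport.

```typescript
// Hypothetical buffer that holds incoming real-time text until a line is
// complete. Assumes "\n" marks the end of a line.
class LineBufferedText {
  private pending: string = "";

  // Accepts one incoming chunk and returns any lines it completes.
  push(chunk: string): string[] {
    this.pending += chunk;
    const parts = this.pending.split("\n");
    // The last part is the unfinished line (possibly empty); keep holding it.
    this.pending = parts.pop() || "";
    return parts;
  }

  // Releases whatever is still buffered, for example when the session ends.
  flush(): string {
    const rest = this.pending;
    this.pending = "";
    return rest;
  }
}

// Example: characters arrive a few at a time, but only whole lines are
// handed on for contracted braille translation or text to speech.
const incoming = new LineBufferedText();
const afterFirstChunk = incoming.push("Hel"); // nothing complete yet
const afterSecondChunk = incoming.push("lo\nWor"); // one complete line
```

Releasing whole lines gives a braille translator or TTS engine enough context to process complete words and phrases, at the cost of some of the immediacy that character-by-character RTT provides.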
There are potential real-time communication application issues that may only apply in immersive environments or augmented reality contexts.
For example, if an RTC application is also an XR application then relevant XR accessibility requirements should be addressed as well. [[xaur]]
Scenario: A deaf user watching a signed broadcast needs a high-quality frame rate to maintain legibility and clarity in order to understand what is being signed.
EN 301 549 Section 6 recommends that WebRTC applications support a frame rate of at least 20 frames per second (FPS). More details can be found in the Accessible Procurement standard for ICT products and services EN 301 549 (PDF).
Scenario: A hard of hearing user needs better stereo sound to have a quality experience in work calls or meetings with friends or family. Transmission aspects, such as decibel range for audio, need to be of high quality. For calls, industry allows higher audio resolution but still mostly in mono only.
Scenario: A hard of hearing user needs better stereo sound so they can have a quality experience in watching HD video or having an HD meeting with friends or family. Transmission aspects, such as frames per second for video quality, need to be of high quality.
EN 301 549 Section 6 recommends that, for WebRTC enabled conferencing and communication, the application shall be able to encode and decode communication with a frequency range with an upper limit of at least 7 kHz. More details can be found in the Accessible Procurement standard for ICT products and services EN 301 549 (PDF).
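As a rough illustration of checking a received video stream against the 20 FPS figure above, the sketch below estimates an average frame rate from a list of frame presentation timestamps. The function names and the constant `MIN_SIGNING_FPS` are hypothetical, and how the timestamps are collected is deliberately left out.

```typescript
// Hypothetical check against the 20 FPS figure quoted from EN 301 549.
const MIN_SIGNING_FPS = 20;

// Estimates the average frame rate from frame timestamps in milliseconds.
function averageFps(timestampsMs: number[]): number {
  if (timestampsMs.length < 2) {
    return 0; // not enough frames to measure an interval
  }
  const elapsedMs = timestampsMs[timestampsMs.length - 1] - timestampsMs[0];
  if (elapsedMs <= 0) {
    return 0;
  }
  // N timestamps span N - 1 frame intervals.
  return ((timestampsMs.length - 1) * 1000) / elapsedMs;
}

function meetsSigningFrameRate(timestampsMs: number[]): boolean {
  return averageFps(timestampsMs) >= MIN_SIGNING_FPS;
}

// Example: frames arriving every 40 ms average 25 FPS, above the threshold.
const smoothEnough = meetsSigningFrameRate([0, 40, 80, 120, 160]);
```

An average hides short stalls, so an application concerned with sign language legibility may also want to watch the longest gap between frames; that refinement is not shown here.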
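The 7 kHz upper limit has a direct consequence for audio sampling rates: by the Nyquist limit, a sampled signal can only represent frequencies up to half its sampling rate. The sketch below applies that rule. The names used are hypothetical, and the check is a necessary condition only, since a particular codec may pass a narrower band than its sampling rate would allow.

```typescript
// Upper frequency limit quoted above from EN 301 549, in hertz.
const MIN_UPPER_FREQUENCY_HZ = 7000;

// Highest frequency a sampled signal can represent (Nyquist limit).
function nyquistLimitHz(sampleRateHz: number): number {
  return sampleRateHz / 2;
}

// Hypothetical check: can audio at this sampling rate reach 7 kHz at all?
function canReachRequiredBandwidth(sampleRateHz: number): boolean {
  return nyquistLimitHz(sampleRateHz) >= MIN_UPPER_FREQUENCY_HZ;
}

// Example: 8 kHz narrowband telephony audio tops out at 4 kHz and falls
// short, while 16 kHz sampling reaches 8 kHz and clears the limit.
const narrowbandOk = canReachRequiredBandwidth(8000);
const widebandOk = canReachRequiredBandwidth(16000);
```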
WebRTC lets applications prioritise bandwidth dedicated to audio / video / data streams; there is also some experimental work in signalling these needs to the network layer as well as support for prioritising frame rate over resolution in case of congestion. [[webrtc-priority]]
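The sketch below illustrates how an application might express such preferences for a sign language video stream. It operates on plain objects whose field names follow the `priority` encoding parameter and the `degradationPreference` send parameter described in WebRTC-related specifications; the interfaces ending in `Like` and the function name are hypothetical stand-ins so the example is self-contained. In a browser, the object would typically be obtained from `RTCRtpSender.getParameters()` and passed back through `RTCRtpSender.setParameters()`, and support for these fields may vary between user agents.

```typescript
// Local stand-ins for the relevant parts of the WebRTC parameter objects.
type PriorityLevel = "very-low" | "low" | "medium" | "high";
type DegradationPreference =
  | "maintain-framerate"
  | "maintain-resolution"
  | "balanced";

interface EncodingLike {
  priority?: PriorityLevel;
}

interface SendParametersLike {
  encodings: EncodingLike[];
  degradationPreference?: DegradationPreference;
}

// Hypothetical helper: favor this stream when bandwidth is shared, and ask
// for resolution rather than frame rate to be reduced under congestion.
function preferFrameRateForSigning(
  params: SendParametersLike,
): SendParametersLike {
  for (let i = 0; i < params.encodings.length; i++) {
    params.encodings[i].priority = "high";
  }
  params.degradationPreference = "maintain-framerate";
  return params;
}

// Example with a single encoding, as for a simple one-to-one call.
const signingParams = preferFrameRateForSigning({ encodings: [{}] });
```

Preferring frame rate suits signing, where motion carries meaning; a stream carrying shared slides or small text might reasonably make the opposite choice.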
The following table maps some of the user needs and requirements presented in this document with other related specifications such as those defined in RFC 5194 - Framework for Real-time text over IP Using SIP and EN 301 549 - the EU Procurement Standard. [[rtt-sip]] [[EN301-549]]
User need | Related specs or groups | Mapping to RFC 5194 - Framework for Real-time text over IP Using SIP | Mapping to EN 301 549 - EU procurement standard |
---|---|---|---|
Incoming calls | WCAG/AGWG, ARIA. | Similar to 6.2.4.2 Alerting - RFC 5194/ pre-session set up with RTT 6.2.1 | Maps to 6.2.2.2: Programmatically determinable send and receive direction |
Accessible call setup | WCAG/AGWG, ARIA. | Under 'General Requirements for Text over IP (ToIP) ' | No Mapping |
Routing | Media Working Group, Web and Networks IG, Second Screen. Audio Device Client Proposal may fulfil this need and allow complex routing and management of multiple audio input and output devices. | No Mapping | No Mapping |
Dynamic audio description values | Media Working Group, Web and Networks IG, Second Screen. | No Mapping | No Mapping |
Audio-subtitling/spoken subtitles | Media Working Group, Web and Networks IG, Second Screen. | No Mapping | No Mapping |
Communications control | Media Working Group, Web and Networks IG. Second Screen API may fulfil this user need. HTML5 support exists and the streams need to be separable. Could be managed via a status bar. | Similar to R26 in 5.2.4. Presentation and User Requirements. | Maps to 6.2.1.2: Concurrent voice and text |
Text communication data channel | Media Working Group | Similar to R26 in RFC 5194 5.2.4. Presentation and User Requirements. NOTE: Very similar user requirement to 'Audio Routing and Communication channel control' | No Mapping |
Control relative volume and panning position for multiple audio | Web Audio Working Group. Multichannel may be covered by the Web Audio group space, and in audio device proposal, with some WebRTC requirements. | No Mapping | Maps to 6.2.1.2: Concurrent voice and text NOTE: Very similar user requirement to 'Audio Routing and Communication channel control' |
Support for Real-time text | WebRTC | Similar to R26 in RFC 5194 5.2.4. Presentation and User Requirements. NOTE: Very similar user requirement to 'Audio Routing and Communication channel control' | No Mapping |
Simultaneous voice, text & signing | Could be partially enabled via RTT in WebRTC. | Relates to RFC 5194 - under R2-R8 | No Mapping |
Support for video relay services (VRS) and remote interpretation (VRI) | May relate to interoperability with third-party services. | Relates to RFC 5194 - under R21-R23 | No Mapping |
Distinguishing sent and received text | May relate to interoperability with third-party services. This is not WebRTC specific and may be a user interface accessibility issue. | Relates to RFC 5194 - under R16 - but this does NOT fully address our use case requirement. | Maps to 6.2.2.1: Visually distinguishable display |
Warning and recovery of lost data | This is not WebRTC specific and may be a user interface accessibility issue. | Relates to RFC 5194 - under R14-R15 | No Mapping |
Quality of video resolution and frame rate | No Mapping | No Mapping | EN 301 549 Section 6 recommends that WebRTC applications support a frame rate of at least 20 frames per second. |
Assistance for older users or users with cognitive disabilities | Needs further clarification/review; may be an accessible user interface or personalization issue. | Relates to RFC 5194 - Transport Requirements/Text over IP (ToIP) and Relay Services. | No Mapping |
Identify caller | WCAG/AGWG, ARIA. Needs further clarification/review; may be an accessible user interface issue. Identity may be handled by the browser via Identity for WebRTC 1.0. | Similar to R27 in RFC 5194 5.2.4. Presentation and User Requirements | Maps to 6.3 Caller ID |
Live transcription and captioning | Browser APIs needed to implement this are available; needs better integration with third-party services (e.g. for sign language translation). Possibly covered by general requirements for Text over IP (ToIP) contained in RFC 5194. | Covered under 5.2.3 (transcoding service requirements). Referring to relay services that provide conversion from speech to text, or text to speech, to enable communication. | No Mapping |
The following is a list of new user needs and requirements in this document:
The following is a list of updated requirements to existing user needs:
The following are other changes in this document:
This user need may also indicate necessary support for 'Total conversation' services as defined by ITU in WebRTC applications. These are combinations of voice, video, and real-time text (RTT) in the same real-time session. [[total-conversation]]
This document has been updated based on document feedback, discussion and Research Questions Task Force consensus.
This work is supported by the EC-funded WAI-Guide Project.