Various approaches have been employed over many years to distinguish human users of web sites from robots. The traditional CAPTCHA approach asking users to identify obscured text in an image remains common, but other approaches have emerged. All interactive approaches require users to perform a task believed to be relatively easy for humans but difficult for robots. Unfortunately the very nature of the interactive task inherently excludes many people with disabilities, resulting in a denial of service to these users. Research findings also indicate that many popular CAPTCHA techniques are no longer particularly effective or secure, further complicating the challenge of providing services secured from robotic intrusion yet accessible to people with disabilities. This document examines a number of approaches that allow systems to test for human users and the extent to which these approaches adequately accommodate people with disabilities, including recent noninteractive and tokenized approaches. We have grouped these approaches by two category classifications: Stand-Alone Approaches that can be deployed on a web host without engaging the services of unrelated third parties and Multi-Party Approaches that engage the services of an unrelated third party.
Both large and small web sites which provide interactive services have long sought to limit their services only to human users. They seek to avoid exposing their collected data and content publishing services to ever more cleverly articulated web robots. Whether the service be travel and event ticketing, email, blogging, or calendaring services, social media services, or some combination of these and many more, experience has demonstrated that even authenticated login provides inadequate protection from malicious actors. Such sites still need to know their interacting user is a human individual, and not a software robot. Arguably the industry's need for reliable Turing testing is only growing more critical.
An early (and still widespread) solution relies on the use of graphical representations of text in registration or comment areas of a web site. The site attempts to verify that the user is in fact a human by requiring the user to complete a task referred to as a "Completely Automated Public Turing Test, to Tell Computers and Humans Apart," or CAPTCHA. The assumption is that humans find this task relatively easy, while robots find it nearly impossible to perform.
The CAPTCHA was initially developed by researchers at Carnegie Mellon University and has been primarily associated with a technique whereby an individual identifies a distorted set of characters in a bit-mapped image, then enters those characters into a web form. This approach is widely familiar to users of the web, though the term CAPTCHA is generally recognized only by web professionals.
In recent times the types of CAPTCHA that appear on web sites and mobile apps have changed significantly. Since our concern here is the accessibility of systems that seek to distinguish human users from their robotic impersonators, the term “CAPTCHA” is used in this document generically to refer to all approaches which are specifically designed to differentiate a human from a computer, including fully noninteractive approaches.
It will surprise no one that we applaud the recent emergence of noninteractive approaches, because functional noninteractive approaches pose no accessibility challenge to users. Unfortunately, some current noninteractive approaches come at the price of exposing to the host's analysis engine much data about the individual user that the user might prefer to keep confidential. We are further heartened by the even more recent development of tokenized approaches that promise trustable Turing testing requiring only minimal interaction with users.
While online users continue broadly to report finding traditional CAPTCHAs frustrating to complete, it is generally assumed that an interactive CAPTCHA can be resolved within a few incorrect attempts. The point of distinction for people with disabilities is that a CAPTCHA not only separates computers from humans, but also often prevents people with disabilities from performing the requested procedure. For example, asking users who are blind, visually impaired or dyslexic to identify textual characters in a distorted graphic is asking them to perform a task they are intrinsically least able to accomplish. Similarly, asking users who are deaf or hard of hearing to identify and transcribe in writing the content of an audio CAPTCHA is asking them to perform a task they’re intrinsically least likely to accomplish. Furthermore, traditional CAPTCHAs have generally presumed that all web users can read and transcribe English-based words and characters, thus making the test inaccessible to a large number of non-English speaking web users worldwide.
While accessibility best practices require, and assistive technologies expect, substantive graphical images to be authored with text equivalents, alternative text in CAPTCHA images would clearly be self-defeating. CAPTCHAs are, consequently, allowed under the W3C's Web Content Accessibility Guidelines (WCAG) provided that "text alternatives that identify and describe the purpose of the non-text content are provided, and alternative forms of CAPTCHA using output modes for different types of sensory perception are provided to accommodate different disabilities."
It is important to understand the limitation of the WCAG CAPTCHA exemption. It applies only to the content of the CAPTCHA. WCAG still requires that alternative text identify the graphical object as a CAPTCHA. Conformance with all other WCAG guidelines also remains critical for web accessibility.
The rationale for this highly specific exemption in WCAG is simple. A CAPTCHA without an accessible and usable alternative makes it impossible for users with certain disabilities to create accounts, write comments, or make purchases on such sites. In essence, such CAPTCHAs fail to properly recognize users with disabilities as human, obstructing their participation in contemporary society. Such issues also extend to situational disabilities whereby a user may not be able to effectively view a traditional CAPTCHA on a mobile device due to the small screen size, or hear an audio-based CAPTCHA in a noisy environment.
Malicious activity on the web has only grown over the years to comprise an alarmingly high percentage of all Internet traffic. While we would certainly not suggest the web's woes arise from sloppy or ill-considered CAPTCHA implementations, we do suggest current conditions only reinforce the importance of well considered and closely monitored security and privacy strategies consistent with appropriate user support that includes people with disabilities. Getting CAPTCHA right needs to be part of the solution.
It is important to acknowledge that using a CAPTCHA as a security solution is becoming increasingly ineffective. Current CAPTCHA methods that rely primarily on traditional image-based approaches, logic problems, or audio CAPTCHA alternatives can be largely cracked using both complex and simple computer algorithms. Research suggests that as character-based CAPTCHAs become increasingly vulnerable to defeat by advancing optical character recognition technologies, more severe distortion of the characters is introduced to resist these attacks. However, such enhanced distortion techniques also make it progressively less feasible even for humans who are well endowed with sensory and cognitive capacity to solve CAPTCHA challenges reliably, ultimately making character-based CAPTCHAs impracticable [[captcha-ocr]].
Pattern-matching algorithms can achieve an even higher success rate of cracking CAPTCHAs in some instances, as demonstrated in CAPTCHA Security: A Case Study [[captcha-security]] and HMM-based Attacks on Google’s reCAPTCHA with Continuous Visual and Audio Symbols [[recaptcha-attacks]]. While efforts are being made to strengthen traditional CAPTCHA security, more robust security solutions risk reducing the typical user’s ability to understand the CAPTCHA that needs to be resolved, e.g., Defeating line-noise CAPTCHAs with multiple quadratic snakes [[defeat-line-noise]]. A recent study at the University of Maryland has demonstrated a 90% success rate in cracking Google's audio reCAPTCHA using Google's own speech recognition service. Indeed, as noted below, Google's reCAPTCHA V2 service has recently begun declining to actually provide the audio CAPTCHA alternative clearly proffered onscreen.
In fact, it is arguable that online services which offer the content developer a ready solution for distinguishing human users from robots may well be helping defeat that very function. For example, Google's reCAPTCHA proclaims: "Hundreds of millions of CAPTCHAs are solved by people every day. reCAPTCHA makes positive use of this human effort by channeling the time spent solving CAPTCHAs into annotating images and building machine learning datasets. This in turn helps improve maps and solve hard AI problems." It is legitimate to consider whether this statement also describes a classic vicious cycle which is helping defeat the effectiveness of visual and auditory CAPTCHA deployments.
It is therefore highly recommended that the purpose and effectiveness of any deployed solution be carefully considered and evaluated across multiple browser and operating system environments before adoption, and then closely monitored for effective performance. Alternative security methods, such as two-step or multi-device verification, along with emerging protocols for identifying human users with high reliability should also be carefully considered in preference to traditional image-based CAPTCHA methods for both security and accessibility reasons.
Many techniques are available to web sites to discourage or eliminate fraudulent activities such as inappropriate account creation. Several of them may be as effective as the visual verification technique while being more accessible to people with disabilities. Others may be overlaid as an accommodation for the purposes of accessibility. We group our review by interactive and non-interactive categories.
The traditional character-based CAPTCHA, as previously discussed, is largely inaccessible and insecure. It focuses on the presentation of letters or words presented in an image and designed to be difficult for robots to identify. The user is then asked to enter the CAPTCHA information into a form.
The use of a traditional CAPTCHA is obviously problematic for people who are blind, as the screen readers they rely on to use web content cannot process the image, thus preventing them from uncovering the information required by the form. Because the characters embedded in a CAPTCHA are often distorted or have other characters in close proximity to each other in order to foil technological solution by robots, they are also very difficult for users with other visual disabilities. This common CAPTCHA technique is also less reliably solved by users with cognitive and learning disabilities, see The Effect of CAPTCHA on User Experience among Users with and without Learning Disabilities [[captcha-ld]]. Because they’re intentionally distorted to foil robots, they also foil users who are more easily confused by surrealistic images or who do not possess sufficiently acute vision to “see” beyond the presented distortion and uncover the text the site requires in order to proceed.
While some sites have begun providing CAPTCHAs utilizing languages other than English, an assumption that all web users can understand and reproduce English predominates. Clearly, this is not the case. Arabic or Thai speakers, for example, should not be assumed to possess a proficiency with the ISO 8859-1 character set [[iso-8859-1]], let alone have a keyboard that can easily produce those characters in the CAPTCHA's form field. Research has demonstrated how CAPTCHAs based on written English impose a significant barrier to many on the web; see Effects of Text Rotation, String Length, and Letter Format on Text-based CAPTCHA Robustness [[captcha-robustness]].
To re-frame the problem, text is easy to manipulate, which is good for assistive technologies, but just as good for robots. One logical solution to this problem is to offer another non-textual method of using the same content. To achieve this, audio is played that contains a series of characters, words, or phrases being read out which the user then needs to enter into a form. As with visual CAPTCHA however, robots are also capable of recognizing spoken content—as Amazon’s Alexa and Android’s Google Assistant, among other spoken dialog systems, have so ably demonstrated. Consequently, the characters, words, or phrases the user is to uncover and transcribe in the form are also distorted in an audio CAPTCHA and are usually played over a sonic environment of obfuscating sounds.
The industry recognized this problem early. CNet reported in Spam-bot tests flunk the blind [[newscom]] that “Hotmail’s sound output, which is itself distorted to avoid the same programmatic abuse, was unintelligible to all four test subjects, all of whom had good hearing.”
Not only can the sound output, which is itself distorted to avoid the same programmatic abuse, render the CAPTCHA difficult to hear; there can also be confusion in understanding whether a number is to be entered as a numerical value or as a word, e.g., ‘7’ or ‘seven’. Often the audio CAPTCHA user will hear sounds which seem to be words or numerical values that should be entered, but turn out to be just background noise.
Sound is also intrinsically temporal, but the import of this unavoidable fact is too often underappreciated, perhaps because the world we live in as seen through the eyes is also temporal. Unlike the real world seen through the eyes, however, the traditional CAPTCHA is a still image that can be stared at until comprehension dawns. Sound has no analog to the visual still image.
Whenever any portion of an audio CAPTCHA is not understood, at least some part of the CAPTCHA must be replayed, usually several times. Currently, few audio CAPTCHAs provide an easily invoked and reliable replay feature, let alone an independent volume control or a pause, rewind, and fast-forward feature. Consequently, an entirely new audio CAPTCHA is often played should any part of one audio CAPTCHA prove difficult to understand.
Some audio CAPTCHAs tacitly admit this failure by offering a link allowing the user to download the audio CAPTCHA, typically as an MP3 file. The implicit assumption is that the user will use a favorite audio player (which does provide independent volume control and pause, play, rewind, and fast-forward capabilities) to play the audio CAPTCHA MP3 file again and again until comprehension dawns, perhaps pausing and rewinding the playback and perhaps writing down on the side the text destined for the web form. Clearly this is very inconvenient and subject to web site timeouts. It also illustrates why simply providing an audio CAPTCHA alternative to the traditional visual CAPTCHA does not provide equivalent access to the user.
Furthermore, just as not all web users should be presumed proficient with English in visual CAPTCHA, they should not be presumed capable of understanding and transcribing aural English in an audio CAPTCHA. Unfortunately, non-English audio CAPTCHAs appear to be very rare indeed. We are aware of only one multilingual CAPTCHA solution provider with support for a significant number of the world's languages.
Users who are deaf-blind, don’t have or use a sound card, find themselves in noisy environments, or don’t have required sound plugins properly configured and functioning, are thus also prevented from proceeding. Furthermore, relatively few audio CAPTCHAs properly support all the various browsers and operating systems in use today. Similarly, users of browsers which do not support easy direction of sound output to a particular audio device, or to all available audio devices on the system, are also hampered.
Although auditory forms of CAPTCHA that present distorted speech create recognition difficulties for screen reader users, the accuracy with which such users can complete the CAPTCHA tasks is increased if the user interface is carefully designed to prevent screen reader audio and CAPTCHA audio from being intermixed. This can be achieved by implementing functions for controlling the audio that do not require the user to move focus away from the text response field; see Evaluating existing audio CAPTCHAs and an interface optimized for non-visual use [[eval-audio]].
Experiments with a combined auditory and visual CAPTCHA requiring users to identify well known objects by recognizing either images or sounds, suggest that this technique is highly usable by screen reader users. However, its security-related properties remain to be explored, as mentioned in Towards a universally usable human interaction proof: evaluation of task completion strategies [[task-completion]].
The goal of visual verification is to separate human from machine. One reasonable way to do this is to test for logic. Simple mathematical or word puzzles, trivia, spatial tasks, or similar logic tests may raise the bar for robots, at least to the point where deploying them elsewhere becomes more attractive.
The use of logic puzzles as a CAPTCHA technique, however, introduces substantial barriers to access for people with language, learning or cognitive disabilities. An individual living with dyscalculia will understandably find even simple arithmetic puzzles challenging. A blind individual will be unable to identify the hammer from among graphical depictions of common tools. When puzzles are used, therefore, it is advisable to support a variety of puzzles so that someone unable to solve a given puzzle can obtain a different kind of puzzle when requesting another challenge.
Any development of CAPTCHA challenges in this direction should be accompanied by thorough usability research involving people with a variety of language, learning, and cognitive disabilities, as such an approach remains largely unexplored in practice and in the research literature. It should also be noted that answers may need to be handled flexibly, if they require free-form text. Also, a system would have to maintain a vast number of questions, or shift them around programmatically, in order to keep spiders from capturing them all for use by web robots. Puzzle-based CAPTCHA challenges are also readily subject to defeat by human operators engaged in crowd-sourcing activity on behalf of attackers.
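The flexible handling of free-form answers mentioned above can be sketched as follows. This is a minimal illustration in Python, not a recommended implementation: the `NUMBER_WORDS` table, the function names, and the normalization rules are all assumptions made for this example, and a real deployment would need to cover far more spellings and languages.

```python
import re

# Hypothetical mapping for illustration; real coverage would be much broader.
NUMBER_WORDS = {
    "zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
    "ten": "10",
}


def normalize_answer(raw: str) -> str:
    """Reduce a free-form answer to a canonical form for comparison."""
    text = raw.strip().lower()
    # Drop punctuation, then split on any run of whitespace.
    text = re.sub(r"[^\w\s]", "", text)
    words = text.split()
    # Treat spelled-out numbers and digits as the same answer.
    return " ".join(NUMBER_WORDS.get(word, word) for word in words)


def answer_matches(raw: str, accepted: set[str]) -> bool:
    """Compare a response against every accepted answer after normalization."""
    return normalize_answer(raw) in {normalize_answer(a) for a in accepted}
```

With this sketch, a user who types "Seven." and a user who types "7" are both accepted for a puzzle whose answer is seven, which removes one source of needless failure without making the puzzle itself any easier to solve.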
There are a number of CAPTCHA techniques based on the identification of still images. These can include requiring the user to identify whether an image shows a man or a woman, or whether an image is human-shaped or avatar-shaped, among other comparison approaches, such as CAPTCHAStar! A novel CAPTCHA based on interactive shape discovery [[captchastar]], FaceCAPTCHA: a CAPTCHA that identifies the gender of face images unrecognized by existing gender classifiers [[facecaptcha]], and Social and egocentric image classification for scientific and privacy applications [[social-classification]].
While alternative audio comparison CAPTCHAs could be provided, such as using similar or different sounds for comparison, the reliance on visual comparison alone would make these techniques difficult, if not impossible, for people with vision-related disabilities.
A 3D representation of letters and numbers can make it more difficult for OCR software to identify them, in turn increasing the security of the CAPTCHA, described in On the security of text-based 3D CAPTCHAs [[3d-captcha-security]]. However, this solution raises similar accessibility issues to traditional CAPTCHAs.
This process is based on the movement of interactive elements such as a slider or the completion of a basic video game as a CAPTCHA, like Game-based image semantic CAPTCHA on handset devices [[game-captcha]]. The benefits include removal of language barriers, and the removal of CAPTCHA frustration due to the intuitiveness of the associated task and the enjoyment of playing video games.
Importantly, the implementation of this CAPTCHA would need to support multiple input interfaces as different devices may lack some input methods such as a keyboard or touchscreen. Another potential issue is that screen reader support for interface elements may unintentionally provide a backdoor for the CAPTCHA to be bypassed by allowing a bot to play the game.
Biometric identifiers have become a very popular authentication mechanism, especially on mobile platforms which routinely now provide the requisite hardware. Some physical characteristic of the user, such as a fingerprint or a facial profile, is first acquired and then recognized to verify the individual’s identity. This process effectively limits the ability of web robots to create a large number of false identities.
However, biometric authentication mechanisms also need to be carefully designed to avoid introducing accessibility barriers. Individuals who lack the biological characteristics required by a particular authentication method, e.g., fingers, or who are unable to perform the enrollment procedures, e.g., senior citizens whose fingerprints can no longer be reliably sensed due to aging, are effectively precluded from using a fingerprint biometric. This can result in denial of access to certain users with disabilities and explains why reliance on a single biometric identifier is insufficient to satisfy public sector procurement standards in the European Union (EN 301 549, section 5.3 [[en-301-549]]) and regulations under Section 508 of the Rehabilitation Act and Section 255 of the Communications Act (36 CFR 1194, Appendix C, section 403 [[36-cfr-1194]]) in the United States.
For this reason, biometric identification systems should be designed to allow users to choose among multiple and unrelated biometric identifiers. With that sole caveat, properly designed biometric identification systems are particularly attractive in situations where it is necessary to identify a particular human user. Their reliability is high, the cognitive load placed on the user low, and they are particularly difficult to foil. They have not yet been rendered suitable, however, in circumstances when it is necessary to preserve the user’s anonymity (i.e., the task is verifying that the user is human, without providing identifying information).
Users of free accounts very rarely need full and immediate access to a site’s resources. For example, users who are searching for concert tickets may need to conduct only three searches a day, and new email users may only need to send the same notification of their new address to their friends. Sites may create policies that limit the frequency of interaction explicitly (that is, by disabling an account for the rest of the day) or implicitly (by slowing the response times incrementally). Creating limits for new users can be an effective means of making high-value sites unattractive targets to robots.
Drawbacks to this approach include the need to perform sufficient testing and data collection to determine useful limits that will serve human users yet frustrate robots. It requires site designers to look at statistics of normal and exceptional users, and determine whether clear demarcation exists between them.
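The explicit and implicit limits described above can be sketched as follows. This is a hedged illustration only: the class name, the in-memory counter, and the default values (three actions per day, half a second of added delay per prior action) are assumptions chosen for the example, and useful limits would have to come from the testing and data collection the text calls for.

```python
from dataclasses import dataclass, field


@dataclass
class NewAccountLimiter:
    daily_limit: int = 3       # assumed explicit cap on actions per day
    delay_step: float = 0.5    # assumed extra seconds of delay per prior action
    _counts: dict = field(default_factory=dict)

    def request(self, account_id: str, day: str):
        """Return (allowed, delay_seconds) for one action by an account."""
        key = (account_id, day)
        used = self._counts.get(key, 0)
        if used >= self.daily_limit:
            # Explicit limit: refuse further actions for the rest of the day.
            return False, 0.0
        self._counts[key] = used + 1
        # Implicit limit: each earlier action today slows the next response.
        return True, used * self.delay_step
```

A caller would apply the returned delay before responding. Because neither limit asks anything of the user, this approach imposes no sensory or cognitive task, provided the limits are generous enough for people who need more attempts to complete an interaction.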
While traditional CAPTCHA and other interactive approaches to limiting the activities of web robots are sometimes effective, they do make using a site more cumbersome. This is often unnecessary, as non-interactive mechanisms exist to check for spam or other invalid content typically introduced by robots.
This category contains three non-interactive approaches: spam filtering, in which an automated tool evaluates the content of a transaction; heuristic checks, which evaluate the behavior of the client; and honeypots, which are designs intended to be ignored by, or even invisible to, users while trapping robotic responses.
These techniques can be regarded as an alternative to the implementation of CAPTCHA. Significantly, they can also complement CAPTCHA, or other strategies identified in this document. Since a CAPTCHA can be circumvented by an attacker (e.g., using crowd-sourcing techniques), cryptographic keys can in some circumstances be misappropriated, and users' interactive sessions can be usurped for malicious purposes, detecting and responding to web robots that have successfully gained access to a web resource is desirable even in the presence of other measures. The advantages for people with disabilities of limiting the imposed sensory and cognitive demands, however, only arise if these non-interactive strategies are used alone, or if they are combined with other CAPTCHA-avoidance approaches.
Applications that use continuous authentication and “hot words” to flag spam content, or Bayesian filtering to detect other patterns consistent with spam, are very popular, and quite effective. While such risk analysis systems may experience false negatives from time to time, properly-tuned systems can achieve results comparable to a traditional visual CAPTCHA, while also removing the added cognitive burden on the user and eliminating access barriers.
Most major blogging software contains spam filtering capabilities, or can be fitted with a plug-in for this functionality. Many of these filters can automatically delete messages that reach a certain spam threshold, and mark questionable messages for manual moderation. More advanced systems can control attacks based on posting frequency, filter content sent using the Trackback protocol, and ban users by IP address range, temporarily or permanently.
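As a rough illustration of Bayesian filtering combined with threshold-based deletion and moderation, the following Python sketch trains a word-level naive Bayes classifier and sorts messages into three outcomes. It is a toy under stated assumptions, not a description of any particular blogging platform's filter: the class, the `triage` function, and the threshold values of 0.95 and 0.6 are all hypothetical.

```python
import math
from collections import Counter


class TinyBayesFilter:
    """Word-level naive Bayes with add-one smoothing (illustrative only)."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text: str, label: str) -> None:
        self.word_counts[label].update(text.lower().split())
        self.message_counts[label] += 1

    def spam_probability(self, text: str) -> float:
        vocabulary = set(self.word_counts["spam"]) | set(self.word_counts["ham"])
        total_messages = sum(self.message_counts.values())
        log_score = {}
        for label in ("spam", "ham"):
            # Smoothed log prior for the class.
            score = math.log((self.message_counts[label] + 1) / (total_messages + 2))
            total_words = sum(self.word_counts[label].values())
            for word in text.lower().split():
                # Smoothed log likelihood; the extra 1 reserves mass for unseen words.
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(vocabulary) + 1))
            log_score[label] = score
        # Convert the two log scores into P(spam | text), clamped to avoid overflow.
        diff = max(min(log_score["ham"] - log_score["spam"], 700.0), -700.0)
        return 1.0 / (1.0 + math.exp(diff))


def triage(probability: float, delete_at: float = 0.95, review_at: float = 0.6) -> str:
    """Map a spam probability to an action; both thresholds are assumptions."""
    if probability >= delete_at:
        return "delete"
    if probability >= review_at:
        return "moderate"
    return "accept"
```

The middle band matters for accessibility as much as for accuracy: routing an uncertain message to a human moderator, rather than challenging its author, keeps the burden of the ambiguity off the user.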
Heuristics are discoveries in a process that seem to indicate a given result. It may be possible to detect the presence of a robotic user based on the volume of data the user requests, series of common pages visited, IP addresses, data entry methods, or other signature data that can be collected.
Again, this requires a careful examination of site data. If pattern-matching algorithms can’t find good heuristics, then this is not a good solution. Also, polymorphism, or the creation of changing footprints, is apt to appear in robots, if it hasn’t already, just as polymorphic (“stealth”) viruses appeared in order to get around virus checkers looking for known viral footprints.
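A heuristic check of this kind might combine a few of the signals named above into a single suspicion score. The sketch below is an assumption-laden illustration: the flagged network (an address range reserved for documentation), the page sequence, the request threshold, and the weights are all placeholders that a site would need to derive from its own data.

```python
import ipaddress

# Placeholder values for illustration only.
FLAGGED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
KNOWN_ROBOT_PATH = ("/login", "/register", "/comment")
MAX_REQUESTS_PER_MINUTE = 60


def robot_suspicion(ip: str, requests_last_minute: int, pages_visited: list) -> int:
    """Return a suspicion score; higher means more robot-like."""
    score = 0
    # Volume of data requested.
    if requests_last_minute > MAX_REQUESTS_PER_MINUTE:
        score += 2
    # A series of pages commonly visited by known robots.
    if tuple(pages_visited[-len(KNOWN_ROBOT_PATH):]) == KNOWN_ROBOT_PATH:
        score += 1
    # Source address inside a range previously associated with abuse.
    address = ipaddress.ip_address(ip)
    if any(address in network for network in FLAGGED_NETWORKS):
        score += 2
    return score
```

Any signal based on data entry methods deserves particular care, since assistive technologies can produce input patterns that differ from those of a typical mouse-and-keyboard user; thresholds should be validated against such users before a score is allowed to block anyone.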
Another heuristic approach identified in Botz-4-Sale: Surviving DDos Attacks that Mimic Flash Crowds [[killbots]] involves the use of CAPTCHA images, with a twist: how the user reacts to the test is as important as whether or not it was solved. This system, which was designed to thwart distributed denial of service (DDoS) attacks, bans automated attackers which make repeated attempts to retrieve a certain page, while protecting against marking humans incorrectly as automated traffic. When the server’s load drops below a certain level, the CAPTCHA-based authentication process is removed entirely.
Providing a CAPTCHA visible to robots but not to humans appears to be sufficiently successful to be supported in several content management systems such as Drupal Honeypots and in several commercial WordPress plugins. The form is created to attract robots and then hidden from the user by markup such as CSS-Hidden. It's an approach that is easily implemented even in hand-authored markup and should be considered. The Hilton Hotel Corporation has used a honeypot CAPTCHA on the Sign In page for Hilton Honors, its loyalty program website, where a prominent focusable field is labeled: "This field is for robots only. Please leave blank."
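On the server side, a honeypot needs very little logic. The following Python sketch assumes the submitted form arrives as a dictionary of field names to string values, and it uses a hypothetical decoy field called `website_url`; neither detail comes from any particular content management system.

```python
# Hypothetical name of the decoy field that human users are expected to leave blank.
HONEYPOT_FIELD = "website_url"


def is_probably_robot(form: dict) -> bool:
    """A human leaves the decoy field blank; a naive robot fills every field."""
    return bool(form.get(HONEYPOT_FIELD, "").strip())
```

If the decoy field remains focusable or is exposed to assistive technologies, it should carry a clear label telling users to leave it blank, as in the Hilton Honors example, so that screen reader users are not mistaken for robots.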
Acquired in 2009 from Carnegie Mellon University, Google's reCAPTCHA overwhelmingly dominates CAPTCHA deployment on the web today. Version 2 provided an API that was most effectively marketed as the "no CAPTCHA reCAPTCHA," and its checkbox proclaiming: "I'm not a robot" became a cultural icon, spawning various cultural offshoots in art, theater, and popular music.
The checkbox was, of course, never a checkbox in the traditional HTML sense. The pseudo-checkbox process became a prodigious collector of user data well beyond mouse movement and keyboard navigation, including the date, the language the browser is set to, all cookies placed by Google over the last 6 months, CSS information for that page, an inventory of mouse clicks made on that screen (or touches if on a touch device), an inventory of plugins installed on the browser, and an itemization of all JavaScript objects, all to determine whether the user is human or robot. Of course Google also generally knows much about individual users, including their customary IP addresses, the telephone numbers and email addresses of their friends, family and colleagues, where they have been at every moment of every day, as well as their web search and YouTube habits. This is why the simple checkbox could keep the CAPTCHA process disarmingly simple, though it also explains why a link to Google's privacy policy has always accompanied the "no CAPTCHA reCAPTCHA". Disclosure and certain provisions of the Privacy Policy are required to satisfy legal requirements in California and in the E.U.
Even though specific WCAG failures were often noted, Google's reCAPTCHA V2 was for a time regarded as the most accessible CAPTCHA solution for one simple reason: it was capable of being comfortably completed using a variety of assistive technologies. More recently it has been widely observed that utilizing keyboard navigation, as many assistive technology users do, no longer works. Instead, users are presented with a traditional inaccessible CAPTCHA as a fall-back mechanism. Our own tests with various browsers on various operating environments have been generally successful with Google's own reCAPTCHA test page. However, browsing in incognito mode, clearing or blocking cookies, and additional factors can apparently trigger a fallback to traditional CAPTCHA these days for many assistive technology users.
One reCAPTCHA V2 innovation seemed most promising. Rather than reproduce characters, users were asked to type the words they saw (or heard). It even appeared unnecessary to spell these correctly or to enter all the words presented in order to be adjudged human.
Most disappointingly, it appears that audio CAPTCHAs previously available with V2 implementations are no longer being provided. Instead users see a message that reads: "Your computer or network may be sending automated queries. To protect our users, we can't process your request right now." Users who have depended on audio CAPTCHA, and who were previously able to function with reCAPTCHA V2, were thus suddenly and completely locked out and denied service on sites still using V2.
Late in 2018 Google released reCAPTCHA V3, promising to eliminate "the need to interrupt users with challenges at all." Google also informed us that their goals with V3 included increasing "the accessibility of the web by removing traditional CAPTCHAs" entirely. Obviously, fully noninteractive Turing testing is a most welcome development direction for accessibility. When the noninteractive Turing test returns a score indicating high confidence that the user is human, or indeed a score indicating high confidence that the user is a robot, and experience has demonstrated the noninteractive engine is reliable, we can only offer praise and gratitude for technological progress that more effectively supports persons with disabilities.
Of course no approach will always return unambiguous results. In such situations Google advises that content providers "use a secondary challenge that makes sense in the context of their site such as two-factor authentication, send the post to moderators, or combine the score with signals specific to their site to make a more informed judgement." Google intends that traditional CAPTCHA no longer be used as a fallback mechanism and has dropped it from reCAPTCHA V3, though it remains in their slightly older, 2017 reCAPTCHA V2 Invisible service.
The reality is that the action taken in response to an ambiguous score returned by V. 3 is in the hands of the content provider. Services like reCAPTCHA gain their market share by offering to relieve the content provider of the hard work inherent in mounting effective and accessible Turing testing. Sadly this leaves the door open to any fallback approach a content provider might choose. It is therefore imperative that methods for disambiguating an ambiguous noninteractive score be well documented and easily implementable in order to better overcome the tendency to simply adopt the old familiar approach.
It has become common for many, though by no means all, users to access various online services through multiple devices such as desktop and mobile computers, smart phones, tablets, and wearables such as smart watches. This proliferation has led to online services delivering identification solutions that take into account a combination of multi-device and multi-platform vectors for simple and effective user authentication, including for persons with disabilities [[auth-mult]]. We note that several major service providers (such as Facebook) now support cross-site user authentication. However, in relation to the specific ability to tell a human and a bot apart, it appears only Google's V. 3 reCAPTCHA API provides cross-site CAPTCHA services without actually passing specific identifying data.
We would expect Google's V. 3 reCAPTCHA system to determine that no CAPTCHA need be presented whenever another browser tab is already properly logged in to a Google product, such as Calendar, on any registered user device. However, while this may prevent the third party site from collecting personal data, it does assist Google in acquiring more user data. This constitutes a significant cost to the user's privacy in an industry so capable of cross-referencing massive amounts of data in the absence of meaningful regulations and controls on where and how that data may be used. This is a very strong accessibility concern as people with disabilities are generally reluctant to disclose any information about their disability on the web except when and only when they expressly choose to reveal that information themselves for their own particular reasons.
The multi-device environment is widely used to authenticate a human user by requiring some action on a registered second device, most often a smart cellular telephone. Known as "dual factor authentication," this process is mandatory at each of the three largest email service providers, Gmail, Yahoo, and Outlook, which accept outbound mail only after the user has authenticated through a telephone number they are required to provide. Similarly, should Twitter spot activity it considers suspicious, it will hold tweets until the user revalidates through both a reCAPTCHA challenge and a telephone call. Yet increasingly, as telemarketing calls proliferate, web users are reluctant to provide data aggregators their personal telephone numbers. Clearly, a voice-only authentication approach also cannot serve deaf and hard of hearing users properly.
Access a Google account service through a new browser or laptop and Google will hold off granting access until the user responds to a pop-up "toast" message on the registered telephone device showing the user's photo and asking: "Was that you?" The user must verify that it was indeed they before access can continue. A variant previously common at Google and still in use elsewhere places a voice call or sends a text message with a short code the user is required to input into a form field to continue.
Another variant of this approach, one employed by Cisco's Webex teleconference service, asks the user to press any DTMF key on their telephone to continue. This is easy enough on a desk phone, but it becomes problematic for the TTS-dependent screen reader user, who must hear the screen reader's speech in order to get the dialpad to pop up and then find a DTMF key, all while the service voice is also speaking, repeating something like: "Press 1 to continue, press 1 to continue ..."
Providing the user the option of contact via voice and/or text is good, but some services offer only text. This disadvantages the user without an accessible text capable device, and there are many such situations. As ever, the rule must be to provide options for the user to choose among, including fallback options.
Many large companies such as Microsoft, Apple, Amazon, Google and the Kantara Initiative have created competing “federated network identity” systems, which can allow a user to create an account, set his or her preferences, payment data, etc., and have that data persist across all sites and devices that use the same service. Due to large companies now requiring a federated identity to use cloud-based services on their respective digital ecosystems, the popularity of federated identities has increased significantly. As a result, many web sites and services allow a portable form of authentication and identification across the Web. Perhaps some of these could also provide cross-site CAPTCHA services.
Single sign-on services provide multiple disparate services to users, including to third parties through APIs. Because of this wide scope, they need to be among the most accessible services on the Web in order to offer equal benefits to people with disabilities. They also pose two significant risks for users. One is the aforementioned opportunity to collect massive amounts of personal data. The other arises should user credentials ever be compromised, in which case users can suddenly find themselves exposed on multiple sites.
Another approach is to use certificates for individuals who wish to verify their identity. A party relying on a certificate offered by a user attempting to access online services can assess the trustworthiness of the certificate's issuer, and the likelihood that the private key has been compromised, in evaluating the risk that the offerer is actually a web robot rather than a human agent. Highly trusted certification authorities such as governments, as in Estonia's e-Residency Program, require evidence of an individual's identity as a basis for issuing a certificate. Provided that the private key is not compromised and cannot be misused by an attacker, there is a high degree of assurance that messages cryptographically signed with it, which could serve to establish the user's identity to web-based services, have genuinely been authorized by the certificate holder.
The use of certificates as an indicator that an access attempt has been authorized by a human discloses the user's identity to the web service provider, and thus should not be deployed in circumstances in which anonymity is necessary. In addition, Transport Layer Security (TLS) client certificate authentication, as defined in TLS 1.2 and earlier versions of the protocol, gives rise to privacy concerns [[tls-tracking]].
A variant of this concept, in which only people with disabilities who are affected by other verification systems would register, is sometimes proposed. Such approaches raise significant privacy and stigmatisation concerns and are usually opposed strongly by people with disabilities themselves and by organizations that serve them. Such approaches should not be confused with situations where people voluntarily self-identify as individuals with disabilities. An example is the U.S. based Bookshare whose services are only available to persons with documented print disabilities under the terms of an international copyright treaty administered by the United Nations' World Intellectual Property Organization (WIPO) and known as the Marrakesh Treaty. [[marrakesh]]
The "cloud" has become a well-known term among computer users. It describes the growing concentration of web content and software service delivery in content delivery networks (CDNs) such as Akamai, Cloudflare, and Amazon CloudFront. These CDNs provide the added value of localized last mile cached content delivery and the ability to effectively deflect various malicious activity such as denial of service (DoS) attacks. As almost two-thirds of Internet content is now delivered by CDNs, they have also been unintentionally forced to become Turing test arbiters. This in turn has resulted in the development of fresh innovative approaches to CAPTCHA such as Privacy Pass [[privacy-pass]], now available as a browser extension on Cloudflare.
While Privacy Pass still begins with a CAPTCHA challenge, it does provide the user a trove of cryptographically blinded tokens which can satisfy further challenges in the background and dramatically reduce interactive CAPTCHA challenges. Most refreshingly it offers meaningful privacy protection, even anonymity, while reliably validating that the user is human. Essentially, the CDN is functioning as a trust broker on the user's behalf. When a user "spends" a token, they're saying to the site they're accessing: "You don't trust me, but you do trust the entity that issued this token, and they're vouching for me." As this approach is developed further, we can reasonably hope the onus of the initial challenge can be further mitigated with robust support for web accessibility, perhaps by expanding available initial CAPTCHA validation approaches, e.g., adding support for biometrics.
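The general idea of a blinded token can be illustrated with a toy sketch. This is not the Privacy Pass protocol itself: it substitutes textbook RSA blind signatures with a deliberately tiny key, purely to show how an issuer can vouch for a token it never sees and therefore cannot link to the later redemption. All names here (`blind`, `issuer_sign`, `unblind`, `Site`) are hypothetical, and the key size offers no real security.

```python
import hashlib
import math
import secrets

# Toy issuer key built from two small Mersenne primes (illustration only).
P, Q = 2**31 - 1, 2**61 - 1
N = P * Q
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # issuer's private exponent


def digest(token: bytes) -> int:
    """Hash a token to an integer modulo N."""
    return int.from_bytes(hashlib.sha256(token).digest(), "big") % N


def blind(token: bytes) -> tuple[int, int]:
    """Client: hide the token's hash behind a random blinding factor r."""
    while True:
        r = secrets.randbelow(N - 2) + 2
        if math.gcd(r, N) == 1:
            break
    return (digest(token) * pow(r, E, N)) % N, r


def issuer_sign(blinded: int) -> int:
    """Issuer: sign the blinded value after the user passes a challenge."""
    return pow(blinded, D, N)


def unblind(blind_sig: int, r: int) -> int:
    """Client: strip the blinding factor, leaving a signature on the token."""
    return (blind_sig * pow(r, -1, N)) % N


class Site:
    """Relying site: accepts each validly signed token exactly once."""

    def __init__(self) -> None:
        self.spent: set[bytes] = set()

    def redeem(self, token: bytes, sig: int) -> bool:
        if token in self.spent or pow(sig, E, N) != digest(token):
            return False
        self.spent.add(token)
        return True
```

In this sketch the client would generate a random token, call `blind()`, obtain a signature from `issuer_sign()` after completing the initial challenge, and later present the unblinded token and signature to `Site.redeem()`. The issuer only ever sees the blinded value, so it cannot connect the issued signature to the token that is eventually spent.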
CAPTCHA development has certainly become more sophisticated over time. This has included the development of several alternatives to text-based characters contained in bitmapped images, some of which have served to support access for persons with disabilities. However, it has also become clear not only that traditional CAPTCHA continues to be challenging for people with disabilities, but also that it is increasingly insecure and arguably now ill suited to the purpose of distinguishing human individuals from their robotic impersonators.
Yet the need for reliable and accessible solutions persists. In fact the need has arguably become more urgent as the limits of authenticated login alone have become more and more evident in misuse of major services around the globe.
It is therefore highly recommended that the purpose and effectiveness of any deployed CAPTCHA solution be carefully considered before adoption, and then closely monitored for effective performance. As with all good software and online content provisioning, analysis should begin with a careful consideration of system requirements and a thorough understanding of user needs, including the needs of persons with disabilities. Clearly, some approaches such as Google's reCAPTCHA, two-step or multi-device verification can be easily and affordably deployed. Yet problems persist even in these systems, especially for non-English speakers. Furthermore, deployers of such approaches should be aware that they are participating in exposing their users to a massive collection of personal data across multiple trans-national data profiling systems, quite apart from any societal governance.
It is important, therefore, also to consider available stand-alone approaches such as honeypots and heuristics, along with current image and aural CAPTCHA libraries that support multiple languages. As always, testing and system monitoring for effectiveness should supply the ultimate determination, even as we recognize that an effective system today may prove ineffective a few years from now.
We summarize our conclusions as follows:
In other words, while some CAPTCHA approaches are better than others, and while more recent approaches offer clear advantages over older approaches, there is still no single, ideal solution. It is important to exercise care that any implemented CAPTCHA technology correctly allow people with disabilities to identify themselves as human.
The following terms are used in this document:
This publication has been funded in part with U.S. Federal funds from the Health and Human Services, National Institute on Disability, Independent Living, and Rehabilitation Research (NIDILRR) under contract number HHSP23301500054C. The content of this publication does not necessarily reflect the views or policies of the U.S. Department of Health and Human Services, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. Government.