I joined IBM Research-Tokyo in 1985 as the only visually impaired researcher, at a time when there were very few female researchers at the lab. Since then, I have brought a diversity perspective to my work in accessibility research, one of the fields in Human-Computer Interaction (HCI). Aiming to optimize Braille book creation and sharing, after joining the lab I participated in the research and development of a digital Braille editing system, a Braille dictionary system, and a Braille information-sharing network system. I could move the research forward because of my visual impairment, which allowed me to understand the value of digitizing Braille. Starting in the mid-1990s, I worked on a talking web browser for the Internet. This idea also emerged from the needs of the visually impaired, and it has since spread in ways I never expected. Today, I am working on new technologies using smartphones, the Internet of Things (IoT), artificial intelligence (AI), and other rapidly advancing technologies to better support people. In this article, I would like to discuss the role of diversity as I have experienced it through these projects.
I lost my sight when I was in junior high school due to a swimming pool accident. Since I was young, I have experienced the social participation issues faced by the visually impaired. It is widely said that there are two major barriers preventing young people with visual impairments from receiving an education and participating in society: first, the information barrier, and second, the mobility barrier. When I lost my sight, there were no personal computers, no Internet, and no smartphones. The only way to read was Braille books created by punching dots into paper. There were few Braille books, and they rarely included any of the textbooks required for higher education. Since Braille translation is time consuming, several months would pass between requesting and obtaining a textbook required for college classes. These experiences inspired me to start the Braille digitization project after joining IBM. With digitization, it became possible to edit text and delete characters as we do on a word processor, and the Braille translation work could be shared among people over a network. In addition, Braille book data could be downloaded and printed on a Braille printer anywhere in Japan. It became possible to search text, and portable electronic Braille dictionaries were produced. [i] These technologies changed education for the visually impaired in important ways.
The amount of available information expanded with the digitization of Braille, but information sources were still limited to Braille and talking books. Then, the Web came on the scene in the mid-1990s. Since the Web was still a new technology at the time, it was only used by engineers and a few other users. When I first accessed the Web with the help of other researchers at the IBM Research lab in Tokyo, I was convinced that the vast amount of text and voice information would become a new information resource for the visually impaired. I started the research and development of a voice browser for the Web combined with a voice synthesis engine.[ii] Later, the effort was turned into a product called Home Page Reader, which became the de facto standard. Gradually, as the need to access the Web by voice became widely recognized, voice access was incorporated into the international standard for the Web as a mandatory consideration, and compatibility with a diverse range of needs, such as access methods, input devices, and screen sizes, became a major focus of Web development. In addition, the websites of federal agencies in the United States must be accessible in a variety of ways in line with the 1998 amendments to Section 508 of the United States' Rehabilitation Act.
As a result, the development of information technology has vastly improved information accessibility for the visually impaired. The information sources available to the visually impaired have grown exponentially, from Braille on paper to digital Braille, and then the Internet. This has also had a great impact on technology standards and government legislation.
It is not well known that the visually impaired played a major role in the development of voice synthesis technologies. The history of these technologies dates back to research and development that began in the 1960s, and the first voices had a robot-like sound. When personal computers became popular in the 1980s, general users had more opportunities to hear synthesized voices, but the voice quality was still a long way from the human voice. Yet, voice synthesis technology was indispensable to the visually impaired when using personal computers on a daily basis to read text information and to create text using word processing software. With the exception of some special applications, the visually impaired were almost the only users of voice synthesis technologies in the 1980s and 1990s. When I developed the Home Page Reader in 1997, many able-bodied people commented that they had difficulty understanding what the voice said, but the visually impaired had no problem. Voice synthesis was revolutionary in the sense that it expanded the sources of information, and the quality of the sound was not an issue at all. The visually impaired had continually used voice synthesis technologies from the days when the sound quality lacked clarity, and they also played a role in their development by providing feedback to developers. Now, in 2017, voice synthesis technologies exist all around us. They are used everywhere, including in car navigation systems, smartphones, train stations, and airports. It would have been difficult to develop these technologies without the efforts of the visually impaired, who persevered and continued using them from the 1980s to the 2000s.
The examples of technologies that were developed and became widespread after emerging from the needs of people with disabilities are too numerous to mention. If we trace history, we find that the telephone was originally invented in the process of developing a communication tool for the hearing impaired. Keyboards are said to have been developed as a means for people with upper limb impairments to write. Character recognition was first used in text reading devices for the visually impaired. Voice recognition technologies were developed as a method for the hearing impaired to converse by voice. Around 2010, a major goal of self-driving cars was to develop cars that could be operated by the visually impaired. The perspectives of diversity and the extreme needs imposed by not being able to see or hear have triggered the creation and development of new technologies.
When I was a child, I watched a television program that featured a bird-shaped robot that assisted a boy going to fight against evil. The robot sat on the boy's shoulder and whispered into his ear, telling him about everything from an approaching opponent to the weather. After I lost my sight, I recalled that TV program and wished for a bird robot of my own. Of course, this robot was simply science fiction drawn in the 1960s. However, as the age of AI and IoT approaches, I think that it is within the range of what technology can do. We refer to AI technologies that will be there for you like that bird robot as cognitive assistant technologies. Cognitive assistant technologies help augment humans' missing or weakened cognitive functions. The cognitive assistant is a new concept in accessibility technologies using AI, and research and development efforts are starting to flower worldwide.
With the help of cognitive assistant technologies, the visually impaired will be able to recognize obstacles at street crossings, traffic lights, and obstacles on sidewalks. Additionally, they will be able to recognize the information they need to walk independently, such as stairways, escalators, and elevators. Cognitive assistant technologies should also be able to recognize the ages and expressions of conference participants and communicate that information to the visually impaired as necessary. By memorizing everything that an elderly person sees, they could also serve as tools to complement memory. Cognitive assistant technologies will always be at a person's side, ready to provide assistance as needed.
Four groups of technologies are indispensable to make cognitive assistant technologies a reality. First are localization technologies. To assist the user in day-to-day environments, it is necessary to measure indoor and outdoor location with a high degree of accuracy. Since GPS technology today does not necessarily offer the level of precision needed and cannot be used indoors, there are ongoing efforts to develop technologies that measure location with a high degree of accuracy using Wi-Fi, Bluetooth Low Energy (BLE) beacons, and image processing. The system called NavCog, developed in collaboration with Carnegie Mellon University, uses BLE beacons to measure position with an accuracy of one to two meters. NavCog has been installed in three buildings of the School of Computer Science at Carnegie Mellon University to guide users to their destinations with high-precision navigation that identifies classrooms and labs inside the buildings.
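The article does not describe NavCog's localization algorithm in detail, so the following is only an illustrative sketch of one common BLE approach: converting each beacon's received signal strength (RSSI) into an approximate distance with a log-distance path-loss model, then estimating the user's position as a distance-weighted centroid of the known beacon coordinates. The beacon layout, calibration values, and weighting scheme here are invented for the example; production systems such as NavCog use more sophisticated fingerprinting and filtering.

```python
# Illustrative sketch of BLE-beacon indoor positioning (not NavCog's actual
# algorithm). Each beacon has known (x, y) coordinates; the phone observes an
# RSSI in dBm for each beacon in range.

def rssi_to_distance(rssi, tx_power=-59, path_loss_exp=2.0):
    """Approximate distance in meters from RSSI using a log-distance
    path-loss model. tx_power is the expected RSSI at 1 m, a per-beacon
    calibration value (assumed here)."""
    return 10 ** ((tx_power - rssi) / (10 * path_loss_exp))

def estimate_position(observations):
    """observations: list of ((x, y), rssi) for beacons currently in range.
    Returns a centroid of beacon positions, weighting each beacon by the
    inverse of its estimated distance (nearer beacons count more)."""
    weighted = []
    for (x, y), rssi in observations:
        d = rssi_to_distance(rssi)
        weighted.append(((x, y), 1.0 / max(d, 0.1)))  # avoid division by zero
    total = sum(w for _, w in weighted)
    est_x = sum(x * w for (x, _), w in weighted) / total
    est_y = sum(y * w for (_, y), w in weighted) / total
    return est_x, est_y

# Example: three beacons along a hallway; the strongest signal (-55 dBm)
# comes from the beacon at (2, 0), so the estimate lands near it.
obs = [((0.0, 0.0), -70), ((2.0, 0.0), -55), ((4.0, 0.0), -75)]
x, y = estimate_position(obs)
```

In practice the one-to-two-meter accuracy the article mentions also depends on beacon density and on smoothing noisy RSSI readings over time, which this sketch omits.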
Next are recognition technologies. Image processing is the most important of these for realizing cognitive assistant technologies. If visually impaired persons are able to recognize people and their expressions, products, structures inside buildings (stairs, escalators, elevators, doors, etc.), and obstacles, they will be able to obtain the information they need for their social lives in a timely fashion. This would be a change comparable to the one brought about by information accessibility.
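As a hypothetical illustration of how such recognition output could reach the user, the sketch below takes detections as a recognition engine might emit them, filters out low-confidence results and unknown objects, and turns the rest into short spoken-style announcements with a clock-position bearing. The labels, confidence threshold, and message wording are all invented for this example, not taken from any real system.

```python
# Hypothetical sketch: turning object-recognition output into announcements
# for a visually impaired user. Each detection is (label, confidence,
# bearing_in_degrees), with bearing 0 meaning straight ahead.

ANNOUNCEMENTS = {
    "stairs": "Caution: stairs",
    "escalator": "Escalator",
    "elevator": "Elevator",
    "door": "Door",
    "person": "Person",
}

def bearing_to_clock(bearing_deg):
    """Convert a bearing in degrees to a 1-12 clock position,
    the convention often used in verbal guidance for the blind."""
    hour = round(bearing_deg / 30) % 12
    return 12 if hour == 0 else hour

def announce(detections, min_confidence=0.6):
    """Keep confident detections of known object types and phrase each
    as a short announcement with its clock position."""
    messages = []
    for label, conf, bearing in detections:
        if conf >= min_confidence and label in ANNOUNCEMENTS:
            clock = bearing_to_clock(bearing)
            messages.append(f"{ANNOUNCEMENTS[label]} at {clock} o'clock")
    return messages

# "cat" is not a known label and the door detection is below threshold,
# so only the stairs are announced.
msgs = announce([("stairs", 0.91, 0.0), ("cat", 0.95, 10.0), ("door", 0.4, -30.0)])
```

The confidence filter matters: announcing every tentative detection would overwhelm the listener, so a real system would tune the threshold and prioritize safety-critical objects such as stairs.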
Third, knowledge is necessary to make use of the outcomes of recognition. Recognizing products, calorie information, and social media reputation is a given, but it may also be possible to make cognitive assistant technologies more relevant by using knowledge about the individual, such as behavior history or health information. Lastly, there are interaction technologies. Voice interaction is a given, but there is also potential for cognitive assistant technologies that can be used seamlessly in daily life with the help of glasses-style interfaces for always-on recognition, or gesture interfaces. It is also important to broaden the field of application beyond devices such as smartphones and wearables to robot technologies.
The technologies needed for cognitive assistants are varied, and together they form a showcase for integrating AI technologies. This is something that a single organization would find difficult to achieve; it can only be accomplished by combining the technologies of universities and the private sector. To enable such integration, open source is likely to play an important role in the future. Today, many companies and universities use TensorFlow, Google's machine learning library. Inception, the object recognition engine based on deep learning that runs on TensorFlow, is an example of the rise in the use of open source. Aiming to popularize these measurement technologies, we open-sourced NavCog after streamlining it into a reusable form.[iii] We hope you will make use of it.
Open data is another important issue. Indoor mapping information is necessary to achieve indoor navigation. However, indoor mapping information is normally not available to the public, as it is the property of the building owner, unlike outdoor mapping, which is managed by the government. A huge amount of image data of product packaging is required as training data in order to recognize and read the package of a candy bar or other product in a store. However, the manufacturer owns the copyright to such image data, and it is not possible to use it freely. As we move into the AI era, we will need new rules for open data. To facilitate reading with NavCog, we are considering setting up an open server to register information in our immediate vicinity, such as store information, sale information, signboards, and information about crowded places.
To make cognitive assistant technologies a reality, we must face the issue of open data. Moreover, it is no exaggeration to say that open data is an issue that society as a whole should engage with as we move toward using AI technologies. As history has shown, the needs of people with disabilities will trigger and facilitate open data and, eventually, the research and development of artificial intelligence. This will add to the list of precedents where diversity has opened up a new future.
We often hear about the importance of diversity in innovation, but concrete examples are difficult to cite. This article has introduced historical examples based on my own experience. In the process of advancing information accessibility, a variety of technologies were created and popularized. To make cognitive assistants a reality, it will likewise be necessary to develop a wide variety of technologies. Every day, I sense the beginnings of great innovation. I hope that readers will find this article useful in familiarizing themselves with innovation through diversity.
Translated from "Tokushu I: Jenda to kagaku no atarashii torikumi ― Tayosei ga hiraku inobeishon (Special Feature I: New Efforts for Gender and Science ― Diversity Opens the Path to Innovation)," Gakujutsu no Doko (Trends in the Sciences), November 2017, pp. 24-28. (Courtesy of Japan Science Support Foundation) [November 2017]