History of Computers - Speech Recognition

Ryder Desenberg ^[1]

Overview

Speech recognition is the translation of spoken words into text. Researchers first began investigating the possibility of speech recognition in the 1930s ^[2] ^[3], although the early technology was extremely limited compared with today's software. In 1952, after years of research, researchers at Bell Labs created Audrey, the first speech recognition system ^[4]. It could understand only spoken digits, and only from a single voice. Ten years later, IBM introduced the Shoebox, which could understand 16 English words ^[5] ^[6].

Researchers around the world continued to build new systems, but the next large step did not come until DARPA began its Speech Understanding Research program in 1971. The five-year program resulted in Carnegie Mellon's Harpy system. Harpy could understand over 1,000 words, about the vocabulary of a three-year-old. Over the next 20 years, speech recognition technology continued to improve, but it did not reach consumers until Dragon Dictate was released in 1990 ^[7] ^[8]. Dragon Dictate worked fairly well, but not well enough to be practical for everyday use, and certainly not well enough to justify its $9,000 price tag. Seven years later, Dragon NaturallySpeaking was released. This program allowed users to speak at up to 100 words per minute, but they first had to "train" the program for 45 minutes.

For the next decade, progress came to a near halt: by about 2008, accuracy had been stuck at about 80% for nearly a decade. Then came Google Voice Search. In addition to the usual analysis of the audio, Google's app drew on what had been said by all the other people using the app to better determine what was probably being said. In 2010, Google began to record users' voice searches to improve the speech model. This helps the program learn how an individual user talks and allows it to take into account factors such as cadence and accent.

Types of Speech Recognition

1. Speaker Dependent: The program studies the user's voice over time or during a "training session" in order to improve accuracy ^[9] ^[10].

2. Speaker Independent: The program does not learn from the user or adapt in any way once it has been installed ^[11] ^[12].
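As a rough illustration of the distinction above, the following Python sketch compares a recognizer with fixed word templates against one that adapts its templates during a "training session". It is a toy model only: real systems work on far richer acoustic features, and every name here (`SpeakerIndependentRecognizer`, `SpeakerDependentRecognizer`, `FACTORY_TEMPLATES`, the two-number feature vectors) is a hypothetical stand-in chosen for the example, not taken from any actual product.

```python
def distance(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


class SpeakerIndependentRecognizer:
    """Toy recognizer whose word templates never change after installation."""

    def __init__(self, templates):
        # Copy the templates so each recognizer owns its own set.
        self.templates = {word: list(vec) for word, vec in templates.items()}

    def recognize(self, features):
        # Pick the word whose template is closest to the spoken features.
        return min(self.templates,
                   key=lambda word: distance(self.templates[word], features))


class SpeakerDependentRecognizer(SpeakerIndependentRecognizer):
    """Toy recognizer that shifts its templates toward one user's voice."""

    def train(self, word, features, rate=0.5):
        # Move the stored template part of the way toward the user's sample.
        old = self.templates[word]
        self.templates[word] = [(1 - rate) * o + rate * f
                                for o, f in zip(old, features)]


# Hypothetical "factory" templates and a user whose "yes" sounds unusual.
FACTORY_TEMPLATES = {"yes": [1.0, 0.0], "no": [0.0, 1.0]}
user_yes = [0.4, 0.6]

fixed = SpeakerIndependentRecognizer(FACTORY_TEMPLATES)
adaptive = SpeakerDependentRecognizer(FACTORY_TEMPLATES)

print(fixed.recognize(user_yes))     # closest factory template wins
adaptive.train("yes", user_yes)      # one step of a "training session"
print(adaptive.recognize(user_yes))  # template has moved toward the user
```

In this sketch the fixed recognizer keeps misreading the unusual "yes", while the adaptive one corrects itself after a single training sample. That is the trade-off the two categories describe: no setup time versus better accuracy for a particular speaker.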

Examples of Speaker Dependent Systems

1. Google Now^[13] ^[14]

2. Google Voice Search^[15] ^[16]

3. Dragon^[17] ^[18]

4. Most high-end software ^[19]

Examples of Speaker Independent Systems

1. Siri^[20]

2. Most speech-to-text applications ^[21] ^[22] ^[23]

Significance

While voice recognition is a handy tool for most people, it is far from vital. For people with disabilities, however, it can be the only way to use much of the technology that is so central to everyday life. For instance, those who cannot type rely solely on speech recognition. When combined with basic AI software, voice recognition can be an invaluable tool and can even serve as an effective assistant.

References

[1] ↑

[2] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[3] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[4] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[5] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[6] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[7] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[8] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[9] http://www.techhive.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html

[10] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[11] http://www.techhive.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html

[12] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[13] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[14] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[15] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[16] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[17] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[18] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[19] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[20] http://www.techhive.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html

[21] http://www.dragon-medical-transcription.com/historyspeechrecognition.html

[22] http://www.techhive.com/article/243060/speech_recognition_through_the_decades_how_we_ended_up_with_siri.html

[23] http://www.dragon-medical-transcription.com/historyspeechrecognition.html
