This speech synthesis article explains what speech synthesis is and how speech software and speech text are used. It also covers the origin and history of speech synthesis worldwide.
Speech synthesis refers to the artificial production of sound that imitates human speech, and the computer system that creates it is called a “speech synthesizer.” The speech may be produced by a “text-to-speech” (TTS) system, which turns regular text into speech, or by a system that works from a specialized representation that is not readily readable. To understand more about speech synthesis, keep reading.
What Is Speech Synthesis?
Speech synthesis can be based on recordings of speech segments, stored in a database and called up by a program that links them together. The size of the segments will depend on the use to which they will be put. Speech synthesis can also incorporate a model that synthesizes a human voice on the spot.
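The database approach described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how any particular synthesizer works: the "recordings" are generated sine tones standing in for real audio, and the names `SEGMENTS`, `tone`, and `concatenate`, the 8 kHz sample rate, and the 40-sample crossfade are all hypothetical choices made for the example.

```python
import math

SAMPLE_RATE = 8000  # samples per second (assumed for this toy example)


def tone(freq_hz, duration_ms):
    """Stand-in for a recorded speech segment: a short sine tone."""
    n = SAMPLE_RATE * duration_ms // 1000
    return [math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE) for i in range(n)]


# Hypothetical segment database: unit label -> stored samples.
# A real system would store recorded phones, diphones, words, or phrases.
SEGMENTS = {
    "hel": tone(220, 100),
    "lo": tone(330, 100),
    "world": tone(440, 200),
}


def concatenate(units, database, crossfade=40):
    """Look up each unit and link the segments with a linear crossfade."""
    out = []
    for label in units:
        seg = database[label]  # raises KeyError if the unit was never recorded
        overlap = min(crossfade, len(out), len(seg))
        start = len(out) - overlap
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)  # weight of the incoming segment
            out[start + i] = out[start + i] * (1 - w) + seg[i] * w
        out.extend(seg[overlap:])
    return out


samples = concatenate(["hel", "lo", "world"], SEGMENTS)
print(len(samples), "samples,", len(samples) / SAMPLE_RATE, "seconds")
```

The segment size is the main design choice: larger units such as whole words or phrases need fewer joins but a bigger database, while smaller units cover more text with fewer recordings at the cost of more joins to smooth.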
The first attempts to produce human speech mechanically may have been those of Christian Kratzenstein, a Danish scientist working at the Russian Academy of Sciences, who in 1779 built a model that could create five sounds, the ones often referred to as “long vowel” sounds. The first machine known to produce consonant sounds was Wolfgang von Kempelen’s Acoustic-Mechanical Speech Machine, unveiled in Vienna in 1791.
Experiments using mechanical and semi-electrical means continued into the 1960s. In the meantime, electrical synthesis of speech was also being developed, beginning in 1922. VODER (Voice Operating Demonstrator), generally considered to be the first speech synthesizer, was shown at the 1939 New York World’s Fair by Homer Dudley. George Rosen made articulatory synthesis possible in 1958 with DAVO (Dynamic Analog of the Vocal Tract), developed at MIT. The first complete text-to-speech system for English was, interestingly enough, created in Japan by Noriko Umeda and associates working at the Electrotechnical Laboratory.
A reading aid that included an optical scanner was first introduced in 1976 as the Kurzweil Reading Machine for the Blind. The 1970s and 1980s saw the development of commercial TTS and speech synthesis products. A newer element in the field is Speech Synthesis Markup Language (SSML) Version 1.1, a specification from the W3C. It aims to standardize control over elements of speech, including volume, pitch, rate, and pronunciation.
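To make the SSML controls concrete, here is a minimal Python sketch that builds a small SSML document using the standard library. The `speak`, `prosody`, and `phoneme` elements and their attributes come from the SSML specification; the helper name `build_ssml`, its parameters, and the sample IPA transcription are assumptions made for this illustration, and whether a given synthesizer honors every attribute depends on the product.

```python
import xml.etree.ElementTree as ET

SSML_NS = "http://www.w3.org/2001/10/synthesis"


def build_ssml(text, volume="medium", pitch="medium", rate="medium", phonemes=None):
    """Wrap text in SSML markup (hypothetical helper, not a library API).

    phonemes: optional dict mapping a word to an IPA pronunciation string.
    """
    speak = ET.Element(
        "speak",
        {"version": "1.1", "xmlns": SSML_NS, "xml:lang": "en-US"},
    )
    # <prosody> carries the volume, pitch, and rate settings.
    prosody = ET.SubElement(
        speak, "prosody", {"volume": volume, "pitch": pitch, "rate": rate}
    )
    prosody.text = ""
    last = None
    for word in text.split():
        if phonemes and word in phonemes:
            # <phoneme> supplies an explicit pronunciation for one word.
            last = ET.SubElement(
                prosody, "phoneme", {"alphabet": "ipa", "ph": phonemes[word]}
            )
            last.text = word
            last.tail = " "
        elif last is None:
            prosody.text += word + " "
        else:
            last.tail += word + " "
    return ET.tostring(speak, encoding="unicode")


ssml = build_ssml(
    "I say tomato",
    volume="loud",
    pitch="high",
    rate="slow",
    phonemes={"tomato": "təˈmɑːtoʊ"},
)
print(ssml)
```

The resulting string would be passed to an SSML-aware synthesizer in place of plain text. The sketch only constructs the markup and does not produce audio itself.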
Speech synthesis is still a developing field. Perhaps the best way to get an idea of its current state is to listen to some examples. A website at the University of Stuttgart in Germany features a collection of Examples of Synthesized Speech, presented by Gregor Möhler: http://www.ims.uni-stuttgart.de/~moehler/synthspeech/#english
Uses of Speech Synthesis
Speech synthesis has a number of applications, some of which you may have heard yourself.
Commercial firms use speech synthesis to let customers access routine information (such as a bank balance) and to handle calls at times when staffing is limited (at night and on weekends).
Speech synthesis is used to help with spelling and pronunciation when a person is learning a new language, and can create opportunities to learn when no human teacher is available. A speech synthesizer can also be a proofreading aid, helping you go through a document aurally, which can complement, and for many improve upon, visual checking for grammar and style issues. Spelling mistakes may also become easier to detect.
Speech synthesis has a variety of applications in assistive technology. Some systems are designed to give Deaf and vocally handicapped persons a means of aural communication, either in person or over the telephone. These systems allow keyboard input as well as a choice of pre-constructed phrases.
Speech synthesis has been an important development for the sight-impaired, whose access to written material was once limited to what a human reader could provide synchronously or to the audio books that various publishers chose to make available. Reading machines using speech synthesis have changed that. Today, speech synthesis can work both with printed materials and with web pages and other virtual documents.
- Global Positioning Systems (GPS)
One of the most recent applications of speech synthesis is in GPS technology: because directions are spoken, drivers do not need to look away from the road to receive information.