Learning from AT&T Labs Text-to-Speech Demo

I'm trying out the AT&T Labs Text-to-Speech demo that came to my attention on Twitter. I encourage you to do the same!

It answered a question that had bothered me for some time. How do text-to-speech tools handle spelling errors and emoticons? Well, a lot depends on context, but I still think people should stop being sloppy with their spelling. (Grrr!)

  • A typical typo for the word "the" is "teh". I fed the TTS demo this sentence: "This is a longer sentence and I want to see if I can hear teh difference when I misspell the word "the"." The misspelled version came out as a garbled sound that I barely caught, and the word in quotes at the end was very abrupt: I would have drawn it out, but "Crystal" (the voice I chose) spat it out in a nanosecond!
  • The abuse of "your" versus "you're" drives me batty. It looks like people who use TTS are spared that anguish. I tried "you're welcome to your opinion" and "your welcome to your opinion", and there was no difference in pronunciation. I also tried "Is this you're book?" and again heard no difference.
  • I ventured into "lolspeak" with cheezburger vs cheeseburger. In the lolspeak version, the "g" was pronounced like the "g" in the word "German". I assume the system defaulted to some basics when it encountered unfamiliar words.
  • Emoticons were familiar to the system. For the smiley, both the two-character and the three-character versions were recognized: colon plus close parenthesis and colon, hyphen, close parenthesis both give you "smile". Colon plus open parenthesis gives you "frowning", which surprised me. That is anatomically correct, so to speak, but I always read that emoticon as sad. Maybe I am just not that fluent in emoticons! (By the way, I spelled out the emoticons because no matter which HTML tags I used, WordPress insisted on converting my characters to visual emoticons!)
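
For the curious, here is a rough sketch in Python of how a text-to-speech front end might turn emoticons into words before speaking them. It is only a guess at the general idea, not AT&T's actual implementation: the table `EMOTICON_WORDS` and the function `normalize_emoticons` are hypothetical names, and the spoken words are simply the ones I heard in the demo.

```python
import re

# Spoken forms as heard in the demo; the lookup table itself is an
# illustrative assumption, not the demo's real data.
EMOTICON_WORDS = {
    ":)": "smile",
    ":-)": "smile",
    ":(": "frowning",
}

# Try longer emoticons first so a short one never shadows a longer one.
_EMOTICON_RE = re.compile(
    "|".join(re.escape(e) for e in sorted(EMOTICON_WORDS, key=len, reverse=True))
)


def normalize_emoticons(text: str) -> str:
    """Replace each known emoticon with the word a voice would speak."""
    return _EMOTICON_RE.sub(lambda m: EMOTICON_WORDS[m.group(0)], text)
```

Calling `normalize_emoticons("Nice work :-)")` returns `"Nice work smile"`. Anything not in the table, such as "teh", passes through untouched, which may be why unfamiliar words fall back on basic pronunciation rules.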

All in all, this is a learning experience. I know there are many, many people out there who are completely unfamiliar with any type of assistive technology (AT). I suggest that those people play with AT tools like this one to gain insight into a different angle on the world! I think this kind of experimentation can help you grow as a technical communicator. There is an FAQ on this site where you can learn much more.

Maybe your next step is trying out tools such as Fire Vox for your Firefox browser or the Opera browser voice options. (For Opera Voice on the Windows platform, read the instructions on the Opera site; for the Mac platform, use the built-in VoiceOver, which you can find through the Opera browser or Mac online help.)

Have fun!


Speech accessibility and communication aids enable people who are unable to talk, or to talk clearly, to communicate. These users may have acquired brain damage, autism, cerebral palsy, Down syndrome, intellectual impairment, or strokes. Many speech recognition systems are unable to recognize the speech of these users because the systems are modeled on average speakers. Because most impaired speech is inconsistent, even speaker-dependent systems do not achieve a high rate of accuracy, so speech recognition should not be the only form of input. In addition, some users with impaired speech may have motor accessibility problems because of impaired dexterity.

Speech accessibility covers difficulties with language and meaning as well as difficulty producing intelligible speech. Language and meaning difficulties can also be related to cognitive impairment. See the tags list below for related resources.

Reference Books and Resources

There are several excellent books related to speech. See the suggested reading list for general information and detailed reference books for your library.

Speech Resources

Learning and Speaking

Find more resources using the Areas of Focus Speech category search.

Recent and Relevant

Advances in Converting Text to Speech

  • Apple® Accessibility Features Vision: built into all Macintosh computers; provides an adjustable keyboard, an ergonomic mouse, CloseView screen magnification software, Easy Access system software (StickyKeys, SlowKeys, MouseKeys), electronic documentation, key-repeat disable, text-to-speech synthesis and voice recognition (PlainTalk), sticky mouse, and visual alert cues. The VoiceOver spoken English interface for Mac OS X is a fully integrated, built-in screen reader technology providing access to the Macintosh through speech, audible cues, and keyboard navigation.
  • NaturalReader, a powerful text-to-speech reader: listen to PDF files, webpages, e-books, e-textbooks, office documents, and printed books.