While speed reading and skimming are generally the go-to methods for more efficient reading, for me using 'text to speech' has always been the best way to power through documents. Text to speech was the first piece of productivity-boosting tech I used when I first looked into software that could help with some of my dyslexic difficulties. Even today, with the wealth of assistive technology (AT) at my fingertips, text to speech is still the main weapon in my arsenal. And even if you are not dyslexic, if you speak English as a second language, want to give your eyes a rest or just want to ramp up your productivity, using a computer reader can be a really effective tool.

Most of my writing is in Microsoft Word and, although it has had text-to-speech capability for a while, it's not something I was ever tempted to use unless there was no other option. Using text to speech in Office has, quite frankly, long been a bit of a faff. The feature was hidden deep within an obscure options menu and, once found, you had to manually add it to the Quick Access Toolbar or Ribbon. Even if you jumped through all of the hoops to gain access, the reading voice felt unnatural, and the lack of settings (no ability to change the voice, reading speed or volume, or to control playback) didn't make the user experience particularly good.

From the beginning, IBM took a statistical approach to speech recognition technology, grouping sound into thousands of different units based on their characteristic combinations of frequencies. Hidden Markov Models, statistical language modeling, and the use of Viterbi and Stack decoders, all now completely ubiquitous, were pioneered by IBM Research in the 1970s by Fred Jelinek and his team. The 1980s saw the development of real-time speech recognition systems embodying these statistical methodologies.

The first real-time large vocabulary dictation system was demonstrated in 1984 by the IBM speech team. At the time, it required an IBM mini-computer and three array processors filling a whole room; within only a couple of years, the team had ported the technology to special-purpose hardware that ran on an IBM PC AT. In 1992, IBM released its first dictation system, the IBM Speech Server Series. The next year brought the IBM Personal Dictation System, the first dictation system for the personal computer. It was later renamed IBM VoiceType Dictation, and was capable of recognizing 32,000 words at a rate of approximately 70 to 100 words per minute, with 97 percent accuracy. Both systems were used mostly in the medical and legal fields, and in business and government.

In 1996, VoiceType Simply Speaking was released. This voice recognition software worked with Windows® applications, making it useful in offices, schools and even homes. The dictation function included a 22,000 to 42,000 word vocabulary (depending on the language), supported US English or Spanish dictation, and came with a spelling dictionary of 100,000 words. IBM MedSpeak/Radiology was also released that year; it was the world's first continuous-speech recognition dictation and work-flow product. With this system, a radiologist would dictate the examination of a patient's X-ray, and MedSpeak/Radiology would convert the comments into a written report.

In 1997, IBM introduced IBM ViaVoice, the first ever continuous dictation product offered in multiple languages. It was no surprise that the technology could work for languages such as German, Spanish, French and Italian, but the team continued to demonstrate the power of the statistical methodologies by also creating highly successful dictation systems for Mandarin and Japanese, in conjunction with colleagues from the China Research Lab and Tokyo Research Lab. The Mandarin system was so impressive that it was demonstrated to the President of China, Jiang Zemin, when it was initially launched.

Today, speech recognition technology appears in a very broad variety of applications that go beyond the desktop. Speech analytics systems, automated speech self-service, mobile devices, automobile navigation systems, car infotainment with climate control systems and media players, hands-free phones, personal navigation devices and other smart devices are all examples of the way speech recognition has penetrated our lives, all originating from IBM's early vision in this area.