Cockpit VR finding applications in medicine

Some call it a peace dividend, but it was the wages of war that purchased most of the voice recognition (VR) technology now being developed for medical and health care applications.

"Cockpit VR was seen as an important tool for the fighter pilot flying at a high rate of speed and trying to operate a number of sophisticated weapons systems at the same time," said Larry Afrin, M.D. Testing civilian VR software packages and tracking their development, Afrin anticipates this cold war spinoff to be both a time (and money) saver for the physician and a significant step toward the prompt and efficient delivery of high quality health care.

But the problem in the early days of (civilian) VR was the slow processing speed of the personal computer (PC). It simply couldn't do the computing for VR fast enough to keep up with someone's speech.

Now that processing speeds have increased, VR technology has advanced. The first wave of PC-based VR systems boasted an 80 to 90 percent accuracy rate. "That sounds good, but you would have to really believe in the technology to make it work for you," Afrin said. "At its best, you would have to endure 50 errors in a 500-word document."

He explained further that the early VR packages were discrete speech systems—you had to pause briefly between each word because the VR algorithms (steps in the process) weren't yet sophisticated enough to pick out individual words in a continuous stream of speech. Also, these systems handled only relatively small vocabularies, needed extensive training to understand the user's speech patterns, and couldn't understand anyone other than the person who had trained them.

"It was much more a case of the user adapting to the computer rather than the computer adapting to the user," Afrin said.

Second wave

By the time the second wave of VR software came out, PC processors had even more horsepower. This not only provided quicker recognition but also allowed for an increased vocabulary. Some of the systems could be programmed with specialty vocabularies, making them more suitable for medical applications.

Despite quicker recognition, though, a second generation system still required discrete speech (pausing between each word) and was still speaker-dependent, only recognizing the speech of the one user who had trained it.

Speaker-independent VR is a major hurdle, Afrin said. It requires the computer to filter out the aspects of sound that make a voice sound high or low (female or male, for example) or accented for one dialect or another, and instead focus on the basic sounds of speech—phonemes—that we string together to form a specific word. The word "ar-tur-ee" (artery), as spoken by a middle-aged native South Carolina female, would only be recognized by the speaker-dependent system that that physician had trained, whereas a speaker-independent system would understand it just as well when spoken by an elderly British male or a young Turkish female.
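The idea behind speaker independence can be sketched in miniature: normalize each speaker's rendition of a word down to its phoneme sequence, then look the word up by that sequence. The following toy Python sketch is purely illustrative (the renditions, phoneme symbols, and lookup tables are all invented for this example and do not reflect any real VR system's internals):

```python
# Hypothetical lexicon: a phoneme sequence maps to one word.
PHONEME_LEXICON = {
    ("AA", "R", "T", "ER", "IY"): "artery",
}

# Different speakers produce different raw renditions of the same word.
# A real system's acoustic front end would do this normalization; here
# it is faked with a hand-written table of invented renditions.
RENDITION_TO_PHONEMES = {
    "ar-tur-ee": ("AA", "R", "T", "ER", "IY"),   # e.g. a South Carolina speaker
    "ah-tuh-ree": ("AA", "R", "T", "ER", "IY"),  # e.g. a British speaker
}

def recognize(rendition):
    """Strip speaker-specific variation, then match on phonemes alone."""
    phonemes = RENDITION_TO_PHONEMES.get(rendition)
    return PHONEME_LEXICON.get(phonemes)
```

Because both renditions collapse to the same phoneme sequence, `recognize("ar-tur-ee")` and `recognize("ah-tuh-ree")` both return `"artery"`; the speaker-dependent systems described above, by contrast, effectively learned only one speaker's renditions.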

Speaker-independent systems are available now, but only with very limited vocabularies. For example, many airlines now have automated phone systems that allow you to either punch numbers on the touch-tone telephone or simply say the numbers to the computer listening on the other end. Driven by the push to convert military technology to civilian use, speaker-independent technology is now making great strides and will soon be available with acceptably large vocabularies.

Editor's note: As MUSC rolls out its Electronic Medical Records systems, the prospect of using computers to turn spoken words into electronically stored data to be retrieved in printed form is an attractive one for physicians. For the next few issues The Catalyst will examine this technology, how it's being used and what it promises for the future.
