SHANE SCHICK
Globe and Mail Update Last updated on Tuesday, Mar. 31, 2009 10:26PM EDT
We should have been able to throw out our keyboards by now.
In old reruns of Star Trek, for example, the crew might punch some buttons and pull some levers to navigate the Enterprise, but no one does any typing. When they need information, they ask for it, and they ask for it out loud.
That was the interface sketched out for television audiences more than 30 years ago. In reality, the only progress we've made is to shift the burden of typing from our fingers (on PCs or laptops) to our thumbs (on mobile phones or handhelds). People who work in the speech recognition industry would tell you that's not progress at all.
Most of us encounter speech recognition for the first time when we order a pizza from a chain like Domino's. You dial a number, and a voice asks you what toppings you would like.
The software behind the system recognizes the commands and sends the order to the appropriate kitchen.
In Domino's case, the software behind that process was made by a company called Tellme out of Mountain View, Calif. Last week, Microsoft bought Tellme, which could mean the promise of speech recognition was more than just talk.
When I first started profiling speech recognition providers back in the 1990s, there were a lot of competitors. Many of those, like Lernout & Hauspie and Conversa, either merged or died off completely.
That Tellme survived despite the dot-com bust suggests it's either doing something right, or it has lasted long enough to take advantage of the increased use of mobile phones for data applications.
"The fact is that right now the phone has been relatively unchanged in decades -- you still pick it up today and you hear two tones meshed together as the dial tone, and that you have to type in a bunch of numbers to get something done, or to reach somebody. We think that that world is ripe for change," Tellme founder Mike McCue said in a conference call after Microsoft announced the acquisition.
"We think that when you pick up a phone, the phone should ask you, 'what do you want to do, who do you want to call.' And you can say what you want."
This is a lot different from how the speech recognition market started.
Originally, the software was used as an alternative to writing, either by the disabled or by people who wanted a glorified dictation device.
That market was pretty much dominated by IBM's ViaVoice, but since Big Blue sold off its PC business to Lenovo, the latter hasn't done a lot to advance the product. Grant Fairley, a Toronto-based consultant who specializes in selling and training users on ViaVoice, said he isn't confident a new version will come out any time soon.
"I think they saw Microsoft coming, and decided there's no point in competing on [software features] that are going eventually to be included for free," he says.
Products like ViaVoice worked well, Mr. Fairley said, but they required customers to "train" the software to understand their individual speech patterns.
The earliest products demanded "discrete" speech. "You had to talk . . . Like . . . This . . .," Mr. Fairley says. "The learning curve begins when you start using it, whereas for Word or other office programs, they tend to build on things you've already learned how to do."
A lot of people just didn't bother, and as a result the speech recognition market didn't really reach its potential as a transcription tool. As a conduit for connecting with databases, though, it can be pretty intuitive.
Think of the 411 service, where callers can tell the system what language they prefer, the name of the person or business they're looking up, and receive a number. Mr. McCue says Tellme can take it one step further, by allowing callers to ask for information and have it sent as text to their mobile phone.
"People are on the go, they want to use their mobile devices, they want that productivity experience to carry over to that," says Jeff Raikes, the president of Microsoft's business division.
"They want the ability to use voice as a way to interface there, whether it be to access information, or whether it be to connect with colleagues."
Cellphones plus speech recognition is a powerful combination. It's just easier to talk than it is to type on the go.
Imagine if the owner of a small business, for example, could dial a number on his cellphone and ask for the latest sales figures stored in the company's Microsoft SQL Server database, or for some background information on a customer that's sitting in a Word file on his desktop. That's the dream, anyway.
"Microsoft will do a good job of creating interest," Mr. Fairley says. "Bill Gates has said speech recognition is important, and I think he means it."
Hopefully, the interest Microsoft generates will attract some other vendors back to this market, so that there will be a couple of compelling software programs with built-in speech recognition capabilities to choose from.
Competition might also mean the technology develops more quickly, and we could get a little closer to that Star Trek scenario.
I don't imagine a lot of small businesses are going to go out and buy speech recognition programs, but as the technology gets easier to use, customers are going to expect it as a form of communication and information retrieval.
Customers can already use speech recognition to obtain flight information or find the closest restaurant, and at some point they'll want other services as well.
Once those customers come calling, your business had better have software that's prepared to listen.
Shane Schick is editor of Computing Canada.