IBM Developing Voice Recognition for Handhelds

Researchers at IBM's Thomas J. Watson Research Center have spent years developing workable speech recognition for Palm handhelds, according to the NY Times. The work builds on decades of company research into speech technology. Liam D. Comerford, a speech engineer, has a heavily modified prototype Palm III that has both speech recognition and text-to-speech capabilities.

He can dictate memos and have the handheld read them back. He can also verbally enter Date Book entries. Instead of beeping for an alarm, Mr. Comerford's Palm can announce that he has an appointment. He can also dictate expense items.

His modifications included adding a speaker, a microphone, and an additional processor to the Palm.

Thanks to Ophir Prusak for the tip. -Ed

Article Comments


Ooookaayy....

I.M. Anonymous @ 10/13/2001 7:51:48 PM #

Useless until all that stuff can fit the 505 form-factor.

Even more importantly...
I.M. Anonymous @ 10/14/2001 1:16:16 AM #
No way, dude!

What it really needs to do is fit in a brand new unit that is 1/8 inch thick, as narrow as the Clie, shorter than the 505, with a larger screen than either that runs at 640x800, with a virtual Graffiti area, 32 bit color, lithium ion batteries with a minimum 1/2 year between recharges, full multi-media support at 60 frames a second and CD quality audio, Bluetooth, always on wireless, support for CF, SD and MemStick, and has an integrated full-size fold out keyboard for when your voice is hurting (but that adds nothing to the size of the unit). And of course it can't cost more than $250 or it will never sell.

Until they do that, voice recognition is pretty much worthless and really a complete waste of time to continue to research. Besides, it'll never amount to anything useful.

Sheesh, some of the posts on this site.

RE: Ooookaayy....
I.M. Anonymous @ 10/14/2001 8:02:37 AM #
That's funny. I can't get voice recognition to work properly on my desktop. What chance is there of tolerable recognition levels on my Palm?

RE: Ooookaayy....
robrecht @ 10/14/2001 9:47:39 AM #
Voice recognition for autodial on cell phones doesn't seem to work that well either.

Thanks, Robrecht
RE: Ooookaayy....
I.M. Anonymous @ 10/14/2001 11:13:50 AM #
I saw this device at an IBM conference for User Centered Design, and it worked fairly well in terms of having the Palm look up an address, read you your appointments, etc. The form factor was a very narrow sled that fit on the back of a Palm III (which was the top of the line when I saw it).

The only situation in which I could see myself using such a device is in the car when using the stylus is difficult and dangerous. Looking things up on a Palm is already very fast.

Dictation is a possibility, especially for some professions (e.g., in the medical field), but I wouldn't use it myself.

RE: Ooookaayy....
I.M. Anonymous @ 10/14/2001 7:08:21 PM #
It don't need to be 505 size. A sled or whatever is fine because as the above people have noted you would really only use it in the car.

It would be great for the car even if it don't work great. (It couldn't be any worse than the garbled crap I input driving at 90mph.)

RE: Ooookaayy....
I.M. Anonymous @ 10/15/2001 12:32:24 PM #
I remember that the first Philips Nino (remember that one?) had limited voice recognition for accessing contacts or other simple functions. It worked so badly it was funny - like those old Doonesbury cartoons of the first Newton's handwriting recognition capabilities.

I would love usable voice recognition on my PDA - especially for dictation of memos, emails, minutes, etc. But I am so self-conscious, I doubt I could ever see myself talking into my PDA, even if it did work.

--Charlie

RE: Ooookaayy....
Zachrey @ 10/12/2002 3:19:09 AM #
Hi Folks,

I believe it is possible to have highly accurate speech recognition on a Treo (cell phone integrated w/Handspring PDA).

Currently, there is simply not enough computing horsepower and memory on a PDA to get good user independent speech recognition. That horsepower needs to reside at the speech recognition server at the cell phone service provider.

If you are talking to someone on the Treo phone and you want to set up an appointment with them, initiate a 3 way call with the speech recognition service and state the date, start time, end time, and location and record a brief voice memo associated with the event. The speech recognition server interprets the speech and pushes the appointment onto your pilot and stores the memo associated with it on your voice mail.

If you needed driving directions, the server could forward the call to TellMe (800-555-8355) and it will give you driving directions for free. If TellMe would be willing, they could make arrangements to push the text of the driving directions onto your Treo, or you could simply record the directions into voice mail.

Pretty neat huh?

Zac

Great for Medical charting

I.M. Anonymous @ 10/13/2001 8:09:57 PM #
nt


Battery Life

palm_pilot_guy @ 10/13/2001 8:18:43 PM #
Is it just me or wouldn't the Palm-on-steroids drink electricity like a drunken sailor?

palm_pilot_guy
-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷-÷
a.k.a. Skinner @ www.trainsimstation.com
The LORD of Palms and Microsoft TrainSim, mostly the Acela and Dash 9!
RE: Battery Life
I.M. Anonymous @ 10/14/2001 10:41:17 AM #
Don't be silly, drunken sailors drink beer, not electricity.

Where's the beef?

I.M. Anonymous @ 10/14/2001 11:23:44 AM #
Well, we've been seeing demonstrations of this IBM voice recognition/voice recording sled thing for two years now. Palm/Workpad hardware has undergone two changes since then (V then the 500 series), so any sled that IBM brings to market for the Palm III shape won't be very useful.

RE: Where's the beef?
GregGaub @ 10/14/2001 11:59:36 AM #
While I'm not a real big Handspring Visor proponent, it sounds like the Springboard technology would be an ideal testing ground for this technology. The Visor already has the microphone. All the module would need is the extra processor, ROM holding the necessary software and system hacks, and a speaker for the text to speech stuff. It might cost $150 at first, but I bet there are a bunch of people chomping at their bits to try this out in the real world. Some might even purchase a Visor to go with it just to try it. I'm not so sure a sled thing for other devices would work all that well, since the HotSync connector doesn't have the same kind of system integration that the Springboard slot does, but their largest market would be for a V/Vx or m50x sled.
-Greg

RE: Where's the beef?
I.M. Anonymous @ 10/14/2001 5:32:20 PM #
Dont be silly, IBM don't sell rebranded Visors. Supporting the competition is not what I'd want to do.

RE: Where's the beef?
I.M. Anonymous @ 10/16/2001 8:43:03 AM #
The Springboard would be an ideal hardware platform if it weren't on the way out due to its size. We're talking two or three years, here, of hardware development. By that time, high-end users will not be using devices that are large enough to support a Springboard slot. Think Edge.

Sounds like a jobs for...A Springboard Module!

TDS @ 10/14/2001 5:45:05 PM #
I think if IBM looked at throwing this hardware into a Springboard module, this could fly. Toss in a nice, fast DSP and some RAM, and it should not take up much room at all. The Visor already has a built-in microphone that does a darn nice job (at least with the dictation modules I have sold).

I wonder if IBM has some kind of marketing agreement with Palm that forbids them from creating Springboard Modules. Being that they make a custom 500 & 505 and all.

The previos poster who mentioned doctors is correct. Dictation is a very important part of a doctor's day, and having a nice Visor Edge sized device with a voice recognition module would be fantastic!

Good for IBM!

RE: Sounds like a jobs for...A Springboard Module!
I.M. Anonymous @ 10/14/2001 6:05:49 PM #
The previos (sic) poster who mentioned, "Dont (sic) be silly, IBM don't sell rebranded Visors. Supporting the competition is not what I'd want to do." is correct.

Don't be silly. Voice recognition is barely functional on PCs right now, but will probably be an important interface Very Soon Now. While voice recognition for PDAs is definitely the Holy Grail, the technology won't be here for at least another year.

Right now, PocketPCs appear to be the most likely to incorporate this into their feature set. Microsoft already has the software experience in voice recognition and we'll probably see chips released that offload voice processing from the PDA's CPU by next year.

My ideal device?
- Palm Vx size
- CLIE N760C color screen
- Voice recognition/transcription
- Seamless running of programs from CompactFlash
- Ability to link with cellphone for wireless access (e.g. via Bluetooth)
- PalmOS (or backward-compatible derivative)

I fully expect Sony will have something similar by the end of 2002, except it will use Memory Stick. Hopefully, CompactFlash-to-Memory Stick adapters will be widely available by then.

RE: Sounds like a jobs for...A Springboard Module!
I.M. Anonymous @ 10/15/2001 8:00:19 PM #
> Right now, PocketPCs appear to be the most likely to incorporate this into their
> feature set. Microsoft already has the software experience in voice recognition
> and we'll probably see chips released that offload voice processing from the PDA's
> CPU by next year.

Microsoft has experience in voice recognition???
chips to offload voice processing???

Since when did Microsoft have "experience in voice recognition"? Since when did Microsoft start looking toward co-processors to offload functionality from the main CPU? Neither is happening on this planet anyhow.


Guys - this is *research*

I.M. Anonymous @ 10/15/2001 3:42:47 PM #
Take off your rush to go to market hats for a minute - the only thing the research group is trying to do is see if they can get something like this to work. These are scientists, in the true sense of the word - IF they can get it to work in a way that is acceptable to them, someone can figure out what shape it will be, etc. This is at the core of R&D for arguably one of the leading US companies in research. So, don't worry - if the experiment works, I trust that Big Blue can figure out how to shrink it, stretch it, paint it a different color, license it, market it, etc. etc.

RE: Guys - this is *research*
Zachrey @ 10/12/2002 3:37:36 AM #
Hi again folks,


Currently, there is simply not enough computing horsepower and memory on a PDA to get good user independent speech recognition. That horsepower needs to reside at the speech recognition server at the cell phone service provider.

If you are talking to someone on the Treo cell-phone/PDA and you want to set up an appointment with them, initiate a 3 way call with the speech recognition service and state the date, start time, end time, and location and record a brief voice memo associated with the event. The speech recognition server interprets the speech and pushes the appointment onto your pilot and stores the memo associated with it on your voice mail.

If you needed driving directions, the server could forward the call to TellMe (800-555-8355) and it will give you driving directions for free. If TellMe would be willing, they could make arrangements to push the text of the driving directions onto your Treo, or you could simply record the directions into voice mail.

I need this service like I need a hole in my head! I'm tired of trying to get my Palm Pilot to recognize the letter "R" on the Graffiti pad! And I still don't know how it can suddenly jump to a completely different date and time while I'm trying to enter the text for the appointment I'm creating.

Who wants to be a millionaire?

Cheers!

Zac

