Comments
Quote:
Originally Posted by mjtomlin
DragonDictate is a Nuance product. The Nuance speech recognition engine is a learning system. The more you use it, the more accurate it gets. Since Apple's use of the engine is through their servers, it of course would be far more accurate after the millions and millions of translations it has performed since last October.
I am amazed at how much better it has become since its release; it almost always recognizes what I say.
I agree with you. I thought it (and Siri with it) was a pile of crap when it first launched. It maybe got 20% of what I said, even if I spoke with exceptional diction. Now I'd say it's up to 80% and seems to be getting better all the time.
Quote:
Originally Posted by SolipsismX
I love dictation in Mountain Lion. In fact I'm using it right now. OK now how do I turn it off. No that's not it. Ah there it...
Hahaha. Smart ass.
These are the times I wish I'd just paid the money to get dev versions of stuff.
Potentially unrelated, but I would like the ability on the iPhone to have Siri read me whatever I'm looking at in Safari's Reader. I.e., I'd pull up a page that's compatible with Reader, hit the home button, and say "Siri, read this to me."
I've never tried that with Reader in Safari, but it is pretty good about reading back text. You can set up a triple-click option in Accessibility to enable and disable VoiceOver. It's not exactly what you want, but it might be effective enough as a stand-in.
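In the meantime, on the Mac side you can script something close to this with OS X's built-in `say` command-line tool. A minimal sketch (the helper names here are mine, and it obviously only speaks aloud on an actual Mac):

```python
import subprocess

def build_say_command(text, voice=None):
    """Build an argument list for OS X's built-in `say` tool.

    `say` ships with OS X and speaks its argument aloud using the
    system text-to-speech voices (the same engine VoiceOver uses).
    """
    cmd = ["say"]
    if voice:
        cmd += ["-v", voice]  # e.g. "Alex", the default US English voice
    cmd.append(text)
    return cmd

def speak(text, voice=None):
    """Speak `text` aloud. Only works on a Mac with audio output."""
    subprocess.run(build_say_command(text, voice), check=True)

# Rough workflow: copy an article out of Safari Reader, then have the
# Mac read the clipboard aloud from Terminal:
#   pbpaste | say
```

On the iPhone there's no equivalent scripting hook, which is why the VoiceOver triple-click route is probably the closest stand-in for now.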
As Siri is server-side, anything with a microphone has the hardware for Siri.
Does anyone know the limitations for dictation?
- How much content can you dictate at once?
- How does it recognize the "end" of your dictation? When it hears a period of silence? Is there a button to click to stop dictation?
- Does it send to Apple after you've finished speaking or clicked the "stop" button (if it exists), or does it send it in chunks while you speak?
-DR
- Essentially limitless.
- You have to start and stop it manually. The default is hitting the otherwise unused fn key to initiate and end the session. It's instant. We aren't quite at Star Trek-level voice commands yet.
- I think it sends it while you are speaking. One could test this by dictating very long strings and seeing how fast it responds after you close the session. It's fast, so you might need to read it War and Peace for a couple of hours.
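That timing test can be sketched concretely. Everything here is hypothetical scaffolding (`close_session` stands in for whatever ends the dictation and blocks until the transcript appears); the point is just the measurement logic:

```python
import time

def measure_finish_latency(close_session):
    """Time how long a dictation session takes to return its transcript
    after the user closes it. `close_session` is a stand-in for whatever
    ends the dictation and blocks until the text comes back."""
    start = time.monotonic()
    result = close_session()
    return result, time.monotonic() - start

def looks_streamed(short_latency, long_latency, tolerance=2.0):
    """Heuristic: if the finish latency for a long passage stays roughly
    flat compared to a short one, the audio was almost certainly being
    streamed to the server in chunks while you spoke. If latency grows
    with passage length, the whole recording was uploaded at the end."""
    return long_latency < short_latency * tolerance
```

So rather than literally reading it War and Peace, you'd dictate a short passage and a long one, and compare the two finish latencies.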
Quote:
Originally Posted by flowney
I hope that Apple exposes this technology to third parties via an API (it may already be available). This could power speech-to-text apps that create captions and subtitles for video. I can see this in iMovie (for home movies), in conferencing software such as Bb Collaborate, and in webcasting apps such as WireCast. Legislation relating to media accessibility is being enforced more rigorously, and accessibility groups are suing non-compliant entities.
It has already been shared. Watch for it in your next version of Microsoft Office.
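If such an API did surface, the captioning side would mostly be a formatting exercise: the hard part (timed transcript segments) would come from the recognizer, and writing them out as, say, SubRip (.srt) subtitles is straightforward. A sketch, with the segment tuples as assumed input:

```python
def srt_timestamp(seconds):
    """Format a time in seconds as an SRT timestamp: HH:MM:SS,mmm."""
    ms = int(round(seconds * 1000))
    h, ms = divmod(ms, 3600_000)
    m, ms = divmod(ms, 60_000)
    s, ms = divmod(ms, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"

def to_srt(segments):
    """Render (start_sec, end_sec, text) segments as SubRip subtitle blocks."""
    blocks = []
    for i, (start, end, text) in enumerate(segments, 1):
        blocks.append(f"{i}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n")
    return "\n".join(blocks)
```

Feed it the recognizer's timed output and you have a caption file that video tools and players already understand.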
Will it be available in the old Bullwinkle voice? God, how I miss him.
Quote:
Originally Posted by mjtomlin
DragonDictate is a Nuance product. The Nuance speech recognition engine is a learning system. The more you use it, the more accurate it gets. Since Apple's use of the engine is through their servers, it of course would be far more accurate after the millions and millions of translations it has performed since last October.
I am amazed at how much better it has become since its release; it almost always recognizes what I say.
Exactly. It's used by countless doctors around the world for dictation.
Well, this kind of sucks. While I'm looking forward to the Dictation feature, I'm not looking forward to losing the Speech Recognition commands. I actually use them quite frequently. Phooey.
Quote:
Originally Posted by Tallest Skil
As Siri is server-side, anything with a microphone has the hardware for Siri.
The iDevices that support Siri have a special chip and mic hardware with a unique noise-cancelling bit. Remember how they focused on the fact that the new MacBook Pro has a new mic with a dual receiver? I think they left the backdoor open for Siri there.
But that's not necessary for Siri to work perfectly. Do you really think they'll be releasing an updated Thunderbolt Display with two microphones and prevent owners of the 24" Cinema Display, 27" Cinema Display, and Thunderbolt Display from using Siri at all? I don't think so.
Hard to say, you know. Apple has become a wee bit brutal of late about which machines can and can't do things. It's all perfectly understandable if hardware limitations are in play, but it can be painful. Look at mid-2010 MBPs running ML: they can't use AirPlay for video even though they run ML fine otherwise. I assume this is related to the CPU.
I got to see a funny and unintentional use of dictation in ML a few days ago. The Mac was attached to an external Apple LCD via USB at the time. Safari was playing a CNN news video, and Dictation was transcribing the audio to text, pretty well too!
I can see the day coming when you can have two Apple devices having a conversation with each other using Siri ... / smile
Is this a serious post?
Ah, I have my proof: there's a single microphone on the iPad 3. Siri does not require multiple microphones as per Apple's own admission.
This will be one number on one line of one .plist to fix.
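For what it's worth, the jailbreak community has reported Siri availability being gated by per-model capability flags. Purely as an illustration (the key name and structure here are my guess, not anything Apple has documented), such a gate might look like:

```xml
<!-- Hypothetical per-model capabilities entry; key name is illustrative only -->
<key>capabilities</key>
<dict>
    <key>assistant</key>
    <true/>
</dict>
```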
I would do your homework on that one...
As an iPhone 4 owner on 3G, I'm presuming the lack of Siri is mostly due to the networking needed to throw stuff back and forth to the servers, and less an issue of processing power. Any evidence to support this?
Yeah. Very little is processed on the device, and it's certainly less resource intensive than a voice call. The issue comes down to what Siri's servers can handle. There are just too many iPhone 4 units in circulation to make that feasible. I'm wondering if Apple isn't moving too slowly on their data center capacity, as we've seen Siri fail at times, even during the first weekend when only 4 million iPhone 4S units had been sold. That's a lot of phones, but I wouldn't expect that to overwhelm a modern data center.
The tests I've seen of Google Now show it being slightly faster than Siri. It's hard to say exactly where the issue lies with those tests, but it is safe to say that server-side computing is Google's wheelhouse. It's interesting that Google has had every single piece (and more) that makes up Siri, but until Apple showed the world how to arrange the pieces they were in the dark. I look forward to this technology exploding over the next few years.