I'm glad to see Apple doing this. I use text-to-speech extensively to proof new books and I've been getting irritated by some of Alex's little verbal quirks. MacFixIt recently had a discussion of them, including the mangling of singulars and plurals.
I'm not sure if it is just the dictionary though. I've seen errors that pop up when reading a paragraph that disappear when reading a phrase. It's like Alex, who's programmed to pause for a breath like real people, is also inadvertently programmed for slips of the tongue. Maybe from time to time that 'breathing' is cutting out part of a word.
OS X's more basic problem is that VoiceOver is far too complex for most users and the basic text-to-speech is too primitive to be of much use.
Basic text-to-speech borders on the useless. It's hideously clumsy to use. Most applications don't support it, requiring a trip to the very long Services menu to start reading and another to stop it. And the only two choices it offers, reading from the beginning if no text is marked or reading a marked text, aren't what we need most of the time. We need a text-to-speech that begins at the cursor, that goes until we stop it, and that lets us stop easily for correcting (when proofing) or when something interrupts us (when just reading).
I recently helped a woman at a local public library who'd just bought a new MacBook and was having trouble finding out how it could work around her vision problems. I showed her VoiceOver and a webpage at Apple to get her started. About a week ago, I checked out VoiceOver for myself. What I found, I told a helpful representative for the disabilities group at Apple, was an application designed for disability "whiz kids"--that slice of those with disabilities who are so talented and so driven that they love a complex interface that lets them do almost anything.
I pointed out to Apple that quite a few people with vision problems can still use the screen to locate things; they simply find it tiring to read on screen. They need something that makes reading anything longer than a few paragraphs easy. Text-to-speech is too primitive and clumsy to do that, while VoiceOver is too complex and irritating.
Last week, I played with VoiceOver for hours, reading its manual and searching online. I couldn't find any way to get it to behave consistently or stop its irritating practice of reading every menu and window title in sight. And I could never figure out what the distinction between a "mouse cursor" and a "VoiceOver cursor" was, even though it seems to matter a lot. In the end, I concluded that VoiceOver was designed by and for people who'd spent their entire lives working with complex programs doing much the same thing. They had no problem figuring out how to use it, but for a neophyte, much less one wrestling with multiple issues that make reading a manual hard, it was worthless. It was, I told them, like using InDesign or Quark to write a letter to a granddaughter. Yes, you can do it, but do you really want the bother and hassle?
I'm not sure VoiceOver can be simplified, but basic text-to-speech could be much improved. It needs to be better integrated with OS X, and something needs to be done to encourage developers to build it into applications. It should be easy to start and stop (perhaps even using the new headphone controls on newer Macs). It should be easier to speed up or slow down on the fly. Nothing that useful should require a trip to a Preferences pane. It should also have (like VoiceOver) a mode where it reads punctuation. That's not just the book editor in me talking. Grandmothers writing an email to a granddaughter don't want to look like ninnies either.
Text-to-speech would also be vastly better, and a step beyond Windows 7, if it would allow users to display a scrolling (perhaps dark and high-contrast) window with the text as it is being read. No amount of text-to-speech can distinguish words that sound alike such as "there" and "their." It'd let all of us proof better, and those much-larger letters would be marvelous for those with vision problems. And make sure it scrolls. As best I can tell, that feature in VoiceOver displays only the first few words of a passage on screen and then stops. That's not much help. And eventually, it'd be nice for everyone if it'd let users fix typos on the reading screen, rather than forcing us to stop text-to-speech, fix a minor typo, and return to text-to-speech.
The first 1984 Mac featured text-to-speech to wow audiences, but in the quarter of a century since then Apple has never transformed it into anything more than a bit of glitz. It needs to improve text-to-speech and make it actually useful. A good illustration of a feature done right is the Zoom feature, which uses Control-scroll or (on more recent trackpads) Control-two-finger-drag to zoom a screen at the cursor. Text-to-speech should be that easy to use. And it shouldn't require developers to take all the steps VoiceOver apparently requires for it to work properly with an application.
In short, Apple needs to create a text-to-speech that "just works."
--Mike Perry, Inkling Books, Seattle