The Misery of Voice Recognition Software

I hates it.

Before, I typed a lot faster. This thing slows me down and drives me crazy.

This software does not learn. Instead it tries to school me. I have had to change the way I speak so it can understand me. Slower, with more precise diction, like I am impersonating a robot. I do not feel like myself when I use it.

I never intended to use it for novel writing, only for e-mail and blogging and Twitter and the like. But even there this software destroys my natural voice. Who spells e-mail with a hyphen! It does not recognise any of the slang, abbreviations, or made-up words that I use and, of course, homonyms are a mighty pain. When I use it I am forced to avoid my habitual language. I don’t sound like me.

It claims that you can teach it. I have spent many hours training it to recognise words I use all the time that are not in its dictionary. I complete the annoying and overly long task and begin dictating. Only for it not to recognise a single word I just taught it.

Here is a list of them. See if you can figure out what I was actually saying:

Swayze
Fattening
X
Oslo
look glorious
one
just team/just Dean

It does not recognise the names of any of the characters in the books I am working on. Thus when I attempt to discuss said books with anyone else via IM or e-mail I spend most of my time having to spell those names out or just going with whatever word this software has decided I’m saying or turning it off and typing, which means unnecessary keystrokes and shortening the amount of time I can spend doing novel writing.

You also have to forget about editing: getting the cursor to go where I want it to go with voice commands has proved impossible. I am able to use it only for first drafts of non-fiction writing, for e-mails and chats, and only with a great deal of frustration.

Even if there were none of these problems, I am a writer. I have been writing since I was little, typing since I was fourteen. My sentences do not come as fluently when I speak. I have never been as good at telling a story as I am at writing it.

On top of that I suspect that the software I’m using is somewhat buggy. There are often long delays.[1] I cannot get the command mode to work except to inadvertently delete great swaths of text. So using it for anything other than dictation is a waste of time. Forget doing research online with this thing. Given that my reason for using this software is to reduce keystrokes it’s more than a little maddening.

I know many people for whom voice recognition software is a revelation. I’m thrilled that it’s helping so many people who otherwise wouldn’t be able to write at all. I also understand that creating software that can deal with different accents and idiolects is really really hard. It really is incredible that it recognises anything I say. But at the same time I can’t help feeling that I have been sold a bill of goods. So many of the people I know who use it rave about it, say it is the best software they’ve ever used. Which meant I was expecting it to be like Harrison Ford in Blade Runner: ‘Enhance. Enhance.’ I expected it to be nigh on perfect. No such magic.

To be fair I have noticed that the latest upgrade is already performing far better than the version I loaded on my computer lo those many months ago. So those who have been using it for a long time really have seen remarkable improvements.

And yet I still hate it. In fact, I get angrier with it than with any other software I have ever used before. And I speak as a card-carrying Microsoft Word hater. Word has never caused me to throw headphones across the room. Word has never set me off on multiple 20-minute uninterrupted[2] vitriolic raging rants.

I have thought of myself as a writer for a very long time. Writing has been central to my sense of myself since I was a small child. Being forced to spend much less time writing has been extremely difficult. I suspect that part of my fury with this voice recognition software is not merely that it is so much slower and less accurate and less me than when I type but that it has come to symbolise the injuries that prevent me from writing with my hands on keyboards as much as I need to.

So, no, I cannot add my voice to the others praising this software. I suspect that would be true even if the software lived up to my expectations. My stories are written with my hands, not my voice.

I am very curious to hear if anyone else feels this way. I have only been using the software for 6 months. Does it get better? Does it ever come to feel like your voice?

  1. This is much better after latest upgrade.
  2. I think Scott ran and hid.

23 comments

  1. Shanella

    For some reason, my head read this post like an automaton …

    I don’t use voice recognition software, but I have the built-in speech-to-text software on my Android and it’s not fun to use.

  2. Julia Rios

    Oh, so frustrating! I have heard a lot of similar complaints, and it makes me sad. I use my hands to type so much, and would feel quite stifled without that freedom.

    I don’t know if something like this would be helpful, but my friend is developing some open source software called Plover to help people learn stenography and apply it in situations where fewer keystrokes and faster speed would be wanted. My friend is a CART provider (Computer Assisted Realtime Transcription), and so she deals with people who need help translating from speech to text in her daily life, and wants more people to have access to faster text communication if they need it. I’m sure she’d be happy to talk to you about it if you had further questions.

  3. Joe Iriarte

    That sucks. 🙁

    (Wish I had something more constructive to add.)

  4. London

    Just team/just Dean is your name isn’t it? 🙁 Sigh. Of all the things to get wrong.
    I too have RSI and have tried out voice recognition software. I hated it, too. I gave up. Wish I had some advice to give you. The only thing it was good for was dashing out answers to practice bar exam questions, and that only because I knew I’d never reread them and no one else would ever read them either.
    I’m sorry your RSI is still so bad. It really is a wretched ailment–it’s almost impossible to describe its wretchedness. When one body part doesn’t hurt, another part does; or they all hurt at once. It’s enough to make you crazy.
    Mine is finally starting to resolve itself, 1 1/2+ years later, so I can say that it does and will get better. Eventually. (The mantra of the RSI sufferer would seem to be: patience, patience, patience.) When I think back on the shape I was in last summer, during the bar exam in fact, I am amazed at how far I’ve come. That will happen for you as well.

  5. Sherwood Smith

    If this is Dragon, do you do the ‘improve accuracy’ thing after every session? Because only when you do that does the thing integrate the new stuff with the old.

    (Arthritis is making me adapt to this, and it IS a slow, frustrating learning curve. Though some days are better than others.)

  6. Levi Montgomery

    Speaking as the world’s worst typist, I have tried every single voice option that has come along. Hey, cool! Live preview of my comment. *tries some italics.* Sorry, ADD strikes again.

    Anyway, my voice is a monotonous drone that has fooled every one of them. Those voice-activated phone menu trees? Hate those, too. Trying to talk to live human beings is bad enough.

    I’ve been told that investing in a “quality” microphone will improve the demon’s (oops, sorry — I mean “dragon’s”) chances of getting things right. I was told to expect to spend $200 on such a microphone, though, and I thought “No way. All my money gets frittered away on groceries and gasoline.”

    But I did think the suggestion was worth passing on.

  7. Veronica Schanoes

    I have a very good friend whose severe cubital tunnel syndrome means that she can neither type nor hold a pen. At all. She has been trying to work with Dragon for the past 2-3 years. I’m sorry to report that it does not get better, according to her.

  8. sirtessa

    i hate it. it is not my voice.

  9. Christie

    This is exactly how I feel about using my VRS.

  10. cameron

    I lost patience and gave up.

    I have another apocryphal RSI cure (ok, not a cure but it helps me manage it). Punching. I do a bag workout program and it seems to help. Even if it doesn’t you’ll be all set to challenge for the YA Author Ultimate Fighter Belt. It’s time for Green to go down!

  11. Jason

    This post breaks my heart. It is so unfair that I can type this when you, who have so much to say and can say it so eloquently, cannot.

  12. Jude

    I transcribed an 800-page book and stuck it on to the Internet. I type around 100 WPM, so it didn’t take too long, and I managed to maintain healthy wrists in the process. It seems as though it would be less frustrating to hire someone poor, yet intelligent, and have that person transcribe you recording what you want to write. This is not a plug for a job, because that would be tacky, but I’m just saying. It seems as though the writing I do in my head is much better than the writing I end up with on paper, so theoretically, that could work. I’m sorry the software is so annoying.

  13. AliceB

    I’m so sorry. I have experimented with Dragon and with Windows 7’s voice recognition software. I find Dragon as frustrating as you have described. Windows 7’s is actually better in some regards, particularly at navigating the web. However, it doesn’t work well with WordPerfect (my favorite writing software), so I’m stuck with Dragon there.

  14. Nancy Kelley

    Justine, I’d like to add my voice to everything you just said. Thanks to an injury to my elbow, I have severe RSI in my right hand. Life forced VRS on me, and while I know I would experience less pain if I used it, I resent the necessity of it. Even if the program could tap directly into my brain and transcribe the words I was thinking onto the screen, I think I would still resent it.

    It isn’t fair that you, or I, or anyone in the world, should have a condition that hampers our ability to do what we love. VRS is a constant, daily reminder that I have to be cautious with how much I write, and I hate that.

    Thank you for posting. Thank you for telling me I’m not alone.

  15. Amber

    The _very closest_ interpretation I ever got out of the early VRS I had to use while at uni was a paragraph about exploring the Kenyans of Alberta, Canada (that would be ‘canyons’). In the end the only thing I used it for was getting it to read back jokes and limericks I had typed in, because the robot computer voice has no sense of pace or timing. Not helpful, but it was really funny.

  16. Yvonne Carts-Powell

    I know. I’ve also needed to use voice recognition software.

    One of the things I do for money is demonstrate Dragon (and some other adaptive technologies) for people who need it. Another thing I do for money is write magazine articles and newsletters.

    I’ve learned:
    No adaptive technology is as convenient as what you are used to, and all of it requires learning to do things in a different way. It’s always a pain in the rear. (For some people, btw, typing is the adaptive technology. They’d rather be speaking.)

    Composing for speaking uses a different part of your brain than composing for typing or handwriting. I don’t talk as well or as easily as I type, and my style is different.

    Of the Dragon products, NaturallySpeaking Premium 11.5 (for PC) is better than Dictate 2.0 (for Mac) today. Who knows what tomorrow may bring.

    Heaven help you if you try to use Dragon when: you cannot type at all, you need to use database software, you have limited vocal stamina, or you need to use Gmail on the web.

    For people with arm problems, a trackball is better than a mouse — you can use your arm or elbow rather than fingers. And foot pedals (or using a trackball with your feet) can spread some of the repetitive movement around.

    Good luck!

  17. Brian

    Justine…read your blog on voice recognition…but what software was it? Others were guessing Dragon but I would be curious to know. Thanks and take care.

    Brian

  18. Justine

    Brian: Yes, I’m using Dragon. The Mac version. From what everyone says it’s the best out there. I.e. the problem is not the particular variety of voice recognition software but the state of the art overall. I have had similar problems with my iPhone’s voice recognition software and various others I’ve tried.

  19. Ruth Diaz

    I’m sorry to hear you’ve had such a miserable experience with Dragon. I, too, was in a situation where the condition of my wrists and elbows suddenly curtailed (and eventually prohibited) my use of a keyboard and mouse. And I know that no software will be perfect and no one solution will be right for everyone.

    With that said, I had far more trouble with it in the beginning and was far more frustrated. I don’t think it took me six months to get past that point, but there were definitely some other factors that made it far more usable to me.

    First–and this turned out to be the biggest thing for me–I was trying to run it on my old computer, which didn’t meet even the minimum spec for the software, let alone exceed it. My experience at that time closely parallels what you’re describing, where it was so much easier to just turn the stupid thing off and type. No matter how much I slowed down, over-enunciated, or spoke in single words, nothing would fix certain problems. But my hands and arms kept getting worse, so I went out on a limb and bought a new computer (mine was five years old at the time, so I was about due for a change).

    Made all the difference in the world. What had been frustrating or agonizing before became suddenly easy. I discovered that when I had been reduced to speaking in individual words, I had actually been doing the worst thing possible (if only the software had been running on a computer that could handle the alternative). Speaking in whole phrases and sentences increased recognition hugely, as context actually helped Dragon to figure out what I was saying.

    I also discovered certain tools and tricks for using the editing functions in ways that were not intuitive to me out of the package. I still don’t try to make it substitute for a mouse on a regular basis (though being able to switch windows just with the voice command is very nice), but I discovered that the easiest way to position my cursor to edit text was to say “insert after” or “insert before,” rather than trying to use the “move up/right/down/left” commands. I also found it was easier to select and correct a whole phrase than just one word within it.

    Certain words, yeah, I’ve never been able to get it to give me the spelling I’m looking for the first time. I actually do spell e-mail with a hyphen, but eBook? Not so much. For those particular words, I alternate between saying, for example, “select e-book” and choosing the alternate spelling I have programmed in or simply typing in the single word I know never comes up right first time. In particularly tricky cases where I know I will never use a spelling that keeps coming up, I’ve been known to delete that spelling from Dragon’s dictionary.

    Likewise, regarding character names, I’ve found two solutions. If it’s a name I’m going to be using a lot, I program it in. The more I use that name in preference to whatever else the software might think I’m trying to say, the more likely it is to come up the first time. If it’s a name I’m not going to use very often, I will frequently use the “spell” command. “Spell E-S-T-E-B-A-N” doesn’t take me very long to say and comes out right the first time. This will probably not be true for every accent or every person, but just discovering that I didn’t have to say “spell that” and wait and then spell the word in and say “OK” each time made it a far more useful process.

    The truth is, if I weren’t in such a bad situation with my wrists and elbows, I don’t know if I would ever have had the motivation to suffer through the learning curve, let alone actually try installing the same software on a faster computer. I would never recommend someone switch to voice recognition software just because they thought it would be faster or easier–the frustration conquers all in the beginning. But when you’re left with few options, sometimes it can be a lifesaver. Again, not necessarily for you, as much as I would love to be wrong about that. But I always try to make sure I’m open to and answer questions for anyone just starting out with Dragon, because I remember how frustrating it was, and I’d so much rather spare as many people as possible that experience.

  20. Ruth Diaz

    Just caught in the comments that you’re using DragonDictate 2.0. I’m on a PC, so I can only speak to Dragon NaturallySpeaking, myself. But I have been told by other people using DragonDictate that version 2.5 is in an entirely different class than version 2.0. Not that it sounds like you have any interest in paying for an upgrade, not with what you’re going through. But if it’s a free upgrade and you haven’t tried it already (upgrading from NaturallySpeaking 11 to 11.5 was a free upgrade for me), it might be worth looking into.

    Mostly, I wish the people working medical marvels would figure out some way to just make us all well again. *g* But I suppose I can wish with one hand and do something else with the other and see which comes through first.

  21. Mike

    I knew about Windows 7 speech recognition, but had not felt a desire to try it until I got a Tablet PC. I’ve also been frustrated with it, which is how I stumbled to here.

    In reading your post, I was thinking that, in regard to character names, a potential workaround could be to use uncommon, clearly articulated words that do work. For example, if the character’s name is “Larbalestier” use “piano”.

    Later use find and replace to change them all at once or periodically in a group.

    I do share your frustration; I find, as well, that the technology is not quite there yet. Best of luck to you.
