Digital assistant test shows Apple's Siri is improving, lags behind Google Assistant

Apple's digital assistant Siri continues to lag behind the Google Assistant, according to a group test that also shows digital assistants as a whole have improved at understanding queries and providing correct responses.




The Digital Assistant IQ Test by Loup Ventures repeats a similar trial from April 2017, tasking Apple's Siri, Google Assistant, Amazon's Alexa, and Microsoft's Cortana with responding to a series of questions. While last year's test used only smartphone-based assistants, Alexa's iOS app allowed it to be included in this year's roster.

Each assistant was asked a total of 800 questions, with grades based on whether the question was correctly interpreted and whether a correct response was provided.

The Google Assistant topped the list, providing correct results 85.5 percent of the time, with Siri in second place with 78.5 percent of queries answered correctly. Alexa scored 61.4 percent correct, while Cortana rounded out the list with 52.4 percent.




There were improvements across the board, with Google Assistant up from the 74.8 percent it scored last year, while Siri and Cortana's 2017 scores were 66.1 percent and 48.8 percent, respectively. The assistants' ability to understand queries has also improved, with Siri shifting from 95 percent understood to 99 percent, and Google moving from 99 percent to 100 percent.

"Both the voice recognition and natural language processing of digital assistants across the board have improved to the point where, within reason, they will understand everything you say to them," analysts Gene Munster and Will Thompson write.

As part of the testing, the questions were divided into five categories, testing the assistants' abilities to interpret local knowledge, commerce, navigation, informational, and command queries. The report notes the questions were modified for this year's test to "reflect the changing abilities of AI assistants," changes which appeared to cause drops in results for navigation queries, while the other categories still saw improvements.




On a category basis, Google Assistant leads all others except in command queries, which are dominated by Siri.

"We found Siri to be slightly more helpful and versatile (responding to more flexible language) in controlling your phone, smart home, music etc" states the report. "Our question set includes a fair amount of music-related queries (the most common action for smart speakers). Apple, true to its roots, has ensured that Siri is capable with music on both mobile devices and smart speakers."

The testing used iOS apps for Cortana and Alexa, which the analysts claim are not entirely reflective of their capabilities when used on other platforms. For example, the Alexa app's inability to set reminders, set alarms, or send emails hurt its Command category performance. As Siri and Google Assistant are baked into iPhones and Android smartphones, it is suggested Cortana and Alexa are fighting an uphill battle when compared on smartphones.

The analysts are hopeful that voice-based computing will improve over time, as its appeal lies in removing friction for the user. Along with the ability to understand conversational-style follow-up queries, as well as routines and smart home scenes, Siri Shortcuts is highlighted as another friction-reducing feature, one that creates "mini automations" that could be triggered by command.

Payments and ride-hailing are two promising areas for future voice development, as they require little to no visual output. While Siri and Alexa were able to hail a ride, and Siri and Google Assistant were capable of sending money, the firm expects such features to be adopted by the assistants that do not already offer them.

Comments

  • Reply 1 of 24
    Folio Posts: 591 member
    In short, Siri showed biggest improvement in understanding and correct response, but still lags Google in response rate. I depend on Siri more and more, yet these must be lab conditions, as Siri certainly doesn't understand me correctly 99% of time. I hope Munster improves these comparisons (at least from what I glean by these articles) as assistants take on more tasks such as comparison shopping and maintenance of device OS chores. Also if he can make tests on how fast they can be trained and how durable that training is.
  • Reply 2 of 24
    elijahg Posts: 987 member
    Well I dunno if I'm stuck on some 2003 version of Siri alpha but I'm nowhere near the 99% for "understood query", it's less than 50% for me.
    Great job Siri.

    Unless I need an answer to the most basic of questions, I just don't bother anymore. It takes longer to ask Siri and wait for its incorrect response than to search myself. They seem to have abandoned Wolfram Alpha integration which was a great Siri knowledge filler. Almost everything now is "here's what I found on the web for..." For another example: I ask what the temperature is in Canterbury, and despite actually being in Canterbury, UK it decides I'd rather know the temperature in Canterbury, New Zealand. Same if I ask for directions from anywhere in the UK to Canterbury. It tries to get directions to Canterbury, NZ, and then says it can't get directions to there. Brilliant.

    So I have no idea where these statistics are coming from, but anecdotal evidence from my own and a few friends' usage shows its understanding is pretty awful at best.


    edited July 2018
  • Reply 3 of 24
    stanhope Posts: 156 member
    I don’t need stats to prove what I know...Siri sucks....this report must come from Fox News, home of the alternate universe
  • Reply 4 of 24
    mbenz1962 Posts: 127 member
    elijahg said:
    For another example: I ask what the temperature is in Canterbury, and despite actually being in Canterbury, UK it decides I'd rather know the temperature in Canterbury, New Zealand. Same if I ask for directions from anywhere in the UK to Canterbury. It tries to get directions to Canterbury, NZ, and then says it can't get directions to there. Brilliant.


    This error happens to me too, and it is particularly maddening.  I'm not sure if it is just more prevalent outside the US and the Siri team is working there first so it isn't reflected in this testing or what.  Siri should prioritize the closest (geographical) matches with navigational based queries and if she really can't decide, then she should follow the request up with a request to refine/clarify like she does for contact queries which are too unspecific for her. 

    The other thing that would be great for Siri, and navigation in general, is native pronunciation for foreign (with respect to your device setting) places and streets.  When I'm in Germany I would like for Siri to pronounce Straße the way she would if my device were set to German and Calle the way she would if my device were set to Spanish when I'm in Spain.  It is pretty distracting while driving when you are looking for a turnoff and Siri mangles the names of everything.  It is particularly dumb because you know she pronounces it correctly for the people whose device is set for that language.
    edited July 2018
  • Reply 5 of 24
    stuke Posts: 86 member
    elijahg said:
    ... Almost everything now is "here's what I found on the web for..." ...
    I hear you!  I too am frustrated with this response, which I know I get about 3/4 of the time when asking Siri for information.  Until this goes away, and Siri actually produces an answer that matches the question, and not a response to take me to several possible websites that may contain the information I seek, Siri is of NO (read, ZERO) value to me. It is a terrible shame too because I'm an Apple fanboy since the early 90's.
  • Reply 6 of 24
    It’s to the point where I’m genuinely impressed if Siri correctly answers a question that is beyond the most basic of queries.  Like most others on this message board, using Siri is more often than not, an exercise in aggravation.

    For a service that was released in 2011 with the iPhone 4S, I’m shocked at how poor the implementation still is.
  • Reply 7 of 24
    "We found Siri to be slightly more helpful and versatile (responding to more flexible language) in controlling your phone, smart home, music etc" states the report. "Our question set includes a fair amount of music-related queries (the most common action for smart speakers). Apple, true to its roots, has ensured that Siri is capable with music on both mobile devices and smart speakers.
    I know I'm an edge case because I listen mostly to Classical music and Opera, but Siri was one of the reasons that I dropped Apple Music and went back to Spotify, whose assistant understands me much better.

    When I would ask Siri to play "Das Rheingold", I would get "I couldn't find DOS Wrangled in your music".

    Siri would often mangle composer names...I'd have to say "Richard Wagner" as an English speaker would, not in the correct German pronunciation. This happened over and over, and there were certain composers I couldn't get her to play at all no matter how I pronounced their names.

    My last straw was when I asked Siri to play "Beethoven's Fifth Symphony". She understood me perfectly and then played some modern artist's song that must have had something about "Beethoven's Fifth Symphony" in the title.

    I know accents play a large role in how well Siri understands people, but Spotify's assistant rarely makes the same mistakes with me using the same accent (although it's certainly not perfect either).

    I also remember asking Siri to play "I See Fire" (the song from the second Hobbit movie by Ed Sheeran), and it kept coming back as "I couldn't find Icy Fire in your music". I tried over and over again to get her to hear me, but it never worked :)

  • Reply 8 of 24
    mike1 Posts: 1,910 member

    The other thing that would be great for Siri, and navigation in general, is native pronunciation for foreign (with respect to your device setting) places and streets.  When I'm in Germany I would like for Siri to pronounce Straße the way she would if my device were set to German and Calle the way she would if my device were set to Spanish when I'm in Spain.  It is pretty distracting while driving when you are looking for a turnoff and Siri mangles the names of everything.  It is particularly dumb because you know she pronounces it correctly for the people whose device is set for that language.
    Huh?! So, you would want Siri (or any of the others) to automatically bypass your default language choice. Yeah, that wouldn't piss people off.
  • Reply 9 of 24
    "We found Siri to be slightly more helpful and versatile (responding to more flexible language) in controlling your phone, smart home, music etc" states the report. "Our question set includes a fair amount of music-related queries (the most common action for smart speakers). Apple, true to its roots, has ensured that Siri is capable with music on both mobile devices and smart speakers.
    I also remember asking Siri to play "I See Fire" (the song from the second Hobbit movie by Ed Sheeran), and it kept coming back as "I couldn't find Icy Fire in your music". I tried over and over again to get her to hear me, but it never worked 

    I use Siri every day, on my Apple Watch, my iPhone, my Apple TV and HomePod. For the most part I get very good accuracy and things work as intended.  However, I use Siri for things that generally don’t get me the “Here’s what I found on the web” response: playing music, making calls, sending texts, setting reminders, conversions, math (where I sometimes get answers supplied by Wolfram Alpha), navigation, weather, news, HomeKit commands, etc. And for me all these things work relatively well.

    The thing that baffles me about Siri is how different people can ask the same question, be understood and get wildly different results.  I have no issues asking for “I See Fire” (though I have a cover of it in my library and that’s what gets played) and when I ask for music by Richard Wagner (using the German pronunciation) I get a “personalized playlist of music by Richard Wagner”.
  • Reply 10 of 24
    tedp88 Posts: 20 member
    ask Siri the zip code of a particular city. Useless. 
  • Reply 11 of 24
    stuke Posts: 86 member
    Ask Siri for the zip code of Cupertino California and you get “I can’t find a zip code for that place.”

    Now if THAT doesn’t piss off Tim Cook, then I don’t have faith in Apple!
    edited July 2018
  • Reply 12 of 24
    Soli Posts: 9,176 member
    1) There are so many avenues for this kind of testing that I don't trust any of it if I can't see and hear the actual queries being made and the results.

    2) One thing that Amazon does that I love, and that I don't think anyone else does, is send you a weekly email (on Friday) with new commands, topical requests to make, and Skills to try. There is no visual UI so you will forget commands. I'm sure there are hundreds of useful Siri commands that I've either never known about or tried years ago and, after they failed, forgot about and never tried again. I wish Apple had something like this because I'd like to utilize Siri for more than setting a timer when I put money in the meter.
  • Reply 13 of 24
    foggyhill Posts: 4,767 member
    stuke said:
    Ask Siri for the zip code of Cupertino California and you get “I can’t find a zip code for that place.”

    Now if THAT doesn’t piss off Tim Cook, then I don’t have faith in Apple!
    There is a whole list of postal codes, so that's not really a useful question.
    In fact, that's the kind of crap "questions" expected from those who are supposedly "testing" (sic) assistants.

    If you ask for the zip code of Apple Park, Cupertino, California (or any address anywhere), you get it right away.

    In fact, I find most "tests" about assistants disingenuous. If you're looking for anything search-related, for sure Google will come out on top; that's their damn business.
    But if you are looking for actual actions, they're not so hot.

    If your "test" is non-scientific, you get a non-scientific answer that reflects the "test's" bias, which often is obvious even prior to actual "testing".
    edited July 2018
  • Reply 14 of 24
    Soli Posts: 9,176 member
    My last straw was when I asked Siri to play "Beethoven's Fifth Symphony". She understood me perfectly and then played some modern artist's song that must have had something about "Beethoven's Fifth Symphony" in the title.
    I noticed a similar, but opposite problem with Spotify when using Alexa to request an artist. I asked it to play "Sofi Tukker" and it played an artist named Sophie Ticker that was popular a century ago. Not exactly an informed result. I made a submission to Spotify and I guess they changed it enough. You can now qualify the statement with "play the band Sofi Tukker" to get it to work.

    edited July 2018
  • Reply 15 of 24
    I’ve pernt near given up on Old Lady Siri. I still use it regularly to get heat index, but have learned to ask for the current heat index; otherwise “What is the heat index” leads to the infamous Here’s What I Found on the Web ...
  • Reply 16 of 24
    Soli Posts: 9,176 member
    tedp88 said:
    ask Siri the zip code of a particular city. Useless. 
    To what end? Do you want Siri to start reading them all off? Cupertino has 5 and you go to any big city, like NYC, and it looks like there are over 200.

    How does Siri respond when you ask "What is the city for 95014?"?

    What would be more useful is asking how many zip codes there are in a particular city. Alexa reads off 2 of 5 for Cupertino, and 5 for NYC before telling you to check out the Alexa app, while Siri says “I can’t find a zip code for that place" for Cupertino and NYC. Alexa's response is poor, and Siri is just embarrassing.

    That query reminds me of Brooklyn Nine-Nine:
    Gina: Scully searched for "how much fudge is in a calorie?"
    Scully: I never found the answer, but it was a good question.
    edited July 2018
  • Reply 17 of 24
    gatorguy Posts: 20,894 member
    Soli said:
    tedp88 said:
    ask Siri the zip code of a particular city. Useless. 
    To what end? Do you want Siri to start reading them all off? Cupertino has 5 and you go to any big city, like NYC, and it looks like there are over 200.

    How does Siri respond when you ask "What is the city for 95014?"?

    What would be more useful is asking how many zip codes there are in a particular city. Alexa reads off 2 of 5 for Cupertino, and 5 for NYC before telling you to check out the Alexa app, while Siri says “I can’t find a zip code for that place" for Cupertino and NYC. Alexa's response is poor, and Siri is just embarrassing.

    That query reminds me of Brooklyn Nine-Nine:
    Gina: Scully searched for "how much fudge is in a calorie?"
    Scully: I never found the answer, but it was a good question.
    If you ask for the Cupertino zipcode on Google Assistant it makes it clear there are more than one, apparently in the same vein as Alexa's response:
    "The zipcodes of Cupertino California include 94024, 94087, 95014 and others". 

    Asking the same for NYC again makes it clear there are more than one by again offering the first three and advising there are others. At least a searcher realizes they need to be more specific and not "I can't find a zipcode for that place" which tells them nothing at all. 
    edited July 2018
  • Reply 18 of 24
    backstab Posts: 138 member
    Yeah, I've given it up.
    Until Apple sends Tim out at a full-on Apple Event to announce an "all new Siri", or something like that, I'm just not even going to bother anymore.
    She does do a good job setting a timer or alarm. That's about it (for me).
    edited July 2018
  • Reply 19 of 24
    linkman Posts: 923 member
    I can't imagine any voice recognition doing well in some real world situations such as high background noise (especially if you can't use noise cancellation such as not having the phone against your head), speakers with strong non-native accents (such as a German accent speaking English), or using slang/poor grammar.

    Then there's the oddball type of speaker: 
  • Reply 20 of 24
    Flytrap Posts: 9 member
    As many of the other comments have already stated, these test results are not representative of my personal experience with Siri. When Siri works it is great... but those moments are so far apart and few that one rarely bothers anymore.

    I have reached the point where it is faster to unlock my phone, find and launch the Google app, and say "Hey Google..." than it is to say "Hey Siri...", hold my breath, cringe in anticipation of the worst, and sigh as I get the standard response, "Here's what I found on the web..."

    Here is what I have been conditioned, by years of disappointment, to use Siri for: 1) Calling contacts, 2) Setting reminders, 3) Setting timers (one at a time), 4) Sending brief text messages, 5) Getting local weather, 6) Doing simple maths or unit conversions - basically what Loup Ventures refers to as command queries.

    I just can't imagine asking Siri the types of questions that Gene Munster and Will Thompson of Loup Ventures claim to have used in their test, such as: "Where is the nearest coffee shop?", "Can you order me more paper towels?", "How do I get to uptown on the bus?" or "Who do the Twins play tonight?" and getting an actual and correct response (that is not "Here's what I found on the web...") from Siri.
    edited July 2018