Digital assistant test shows Apple's Siri is improving, lags behind Google Assistant
Apple's digital assistant Siri continues to lag behind the Google Assistant, a group test reports, though the same test reveals that accuracy in understanding queries and providing correct responses has improved across all digital assistants.
The Digital Assistant IQ Test by Loup Ventures repeats a similar trial from April 2017, tasking Apple's Siri, Google Assistant, Amazon's Alexa, and Microsoft's Cortana with responding to a series of questions. While last year's test solely used smartphone-based assistants, Alexa's iOS app allowed it to be included in the roster.
A total of 800 questions were asked of each assistant, with grades based on whether the question was correctly interpreted and whether a correct response was provided.
The Google Assistant topped the list, providing correct results 85.5 percent of the time, with Siri in second place with 78.5 percent of queries answered correctly. Alexa scored 61.4 percent correct, while Cortana rounded out the list with 52.4 percent.
There were improvements across the board, with Google Assistant up from the 74.8 percent it scored last year, while Siri and Cortana's scores from 2017 were 66.1 and 48.8 respectively. The ability for assistants to understand the queries has also improved, with Siri shifting from 95 percent understood to 99 percent, and Google moving from 99 percent to 100 percent.
"Both the voice recognition and natural language processing of digital assistants across the board have improved to the point where, within reason, they will understand everything you say to them," analysts Gene Munster and Will Thompson write.
As part of the testing, the questions were divided into five categories, testing the assistants' abilities to handle local knowledge, commerce, navigation, information, and command queries. The questions were modified for this year's test to "reflect the changing abilities of AI assistants," changes which appeared to cause drops in results for navigation queries but still saw improvements in the other categories.
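The grading scheme described above — each question scored on whether it was understood and whether the answer was correct, broken out by category — could be tallied with a short script along these lines. This is a hypothetical sketch: the per-question grades and the exact category labels are invented for illustration and are not Loup Ventures' actual data.

```python
# Hypothetical sketch of how a Loup Ventures-style grading pass could be
# tallied. The records below are invented for illustration only.
from collections import defaultdict

# Each record: (category, understood_correctly, answered_correctly)
results = [
    ("local", True, True),
    ("commerce", True, False),
    ("navigation", True, True),
    ("information", True, True),
    ("command", False, False),
    ("command", True, True),
]

def score(records):
    """Return per-category percentages for both grading axes."""
    by_cat = defaultdict(lambda: {"total": 0, "understood": 0, "correct": 0})
    for cat, understood, correct in records:
        bucket = by_cat[cat]
        bucket["total"] += 1
        bucket["understood"] += understood
        bucket["correct"] += correct
    return {
        cat: {
            "understood_pct": 100.0 * b["understood"] / b["total"],
            "correct_pct": 100.0 * b["correct"] / b["total"],
        }
        for cat, b in by_cat.items()
    }

scores = score(results)
```

With the sample data above, the "command" category would score 50 percent correct (one of two answered correctly), mirroring how the published per-category figures are derived from individual pass/fail grades.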
On a category basis, Google Assistant leads all others except in command queries, a category dominated by Siri.
"We found Siri to be slightly more helpful and versatile (responding to more flexible language) in controlling your phone, smart home, music, etc.," states the report. "Our question set includes a fair amount of music-related queries (the most common action for smart speakers). Apple, true to its roots, has ensured that Siri is capable with music on both mobile devices and smart speakers."
The testing used iOS apps for Cortana and Alexa, which the analysts claim are not entirely reflective of their capabilities on other platforms. For example, the Alexa app's inability to set reminders or alarms, or to send emails, impacted its command category performance. As Siri and Google Assistant are baked into iPhones and Android smartphones, it is suggested Cortana and Alexa are fighting an uphill battle when compared on smartphones.
The analysts are hopeful that voice-based computing will improve over time, as it removes friction from the user's standpoint. Along with the ability to understand conversational queries that follow on from each other, as well as routines and smart home scenes, Siri Shortcuts is highlighted as another friction-reducing feature, allowing "mini automations" that can be triggered by a spoken command.
Payments and ride-hailing are two promising areas for future voice development, as they require little to no visual output. While Siri and Alexa were able to hail a ride, and Siri and Google Assistant were capable of sending money, the firm expects such features to be adopted by the assistants that do not already offer them.
Comments
Great job Siri.
Unless I need an answer to the most basic of questions, I just don't bother anymore. It takes longer to ask Siri and wait for its incorrect response than to search myself. They seem to have abandoned Wolfram Alpha integration which was a great Siri knowledge filler. Almost everything now is "here's what I found on the web for..." For another example: I ask what the temperature is in Canterbury, and despite actually being in Canterbury, UK it decides I'd rather know the temperature in Canterbury, New Zealand. Same if I ask for directions from anywhere in the UK to Canterbury. It tries to get directions to Canterbury, NZ, and then says it can't get directions to there. Brilliant.
So I have no idea where these statistics are coming from, but anecdotal evidence from my own and a few friends' usage shows its understanding is pretty awful at best.
The other thing that would be great for Siri, and navigation in general, is native pronunciation for foreign (relative to your device's language setting) places and streets. When I'm in Germany I would like Siri to pronounce Straße the way she would if my device were set to German, and Calle the way she would if my device were set to Spanish when I'm in Spain. It is pretty distracting while driving when you are looking for a turnoff and Siri mangles the names of everything. It is particularly dumb because you know she pronounces it correctly for the people whose devices are set to that language.
For a service that was released in 2011 with the iPhone 4S, I’m shocked at how poor the implementation still is.
When I would ask Siri to play "Das Rheingold", I would get "I couldn't find DOS Wrangled in your music".
Siri would often mangle composer names...I'd have to say "Richard Wagner" as an English speaker would, not in the correct German pronunciation. This happened over and over, and there were certain composers I couldn't get her to play at all no matter how I pronounced their names.
My last straw was when I asked Siri to play "Beethoven's Fifth Symphony". She understood me perfectly and then played some modern artist's song that must have had something about "Beethoven's Fifth Symphony" in the title.
I know accents play a large role in how well Siri understands people, but Spotify's assistant rarely makes the same mistakes with me using the same accent (although it's certainly not perfect either).
I also remember asking Siri to play "I See Fire" (the song from the second Hobbit movie by Ed Sheeran), and it kept coming back as "I couldn't find Icy Fire in your music". I tried over and over again to get her to hear me, but it never worked.
The thing that baffles me about Siri is how different people can ask the same question, be understood and get wildly different results. I have no issues asking for “I See Fire” (though I have a cover of it in my library and that’s what gets played) and when I ask for music by Richard Wagner (using the German pronunciation) I get a “personalized playlist of music by Richard Wagner”.
Now if THAT doesn’t piss off Tim Cook, then I don’t have faith in Apple!
One thing that Amazon does that I love, and that I don't think anyone else does, is send you a weekly email (on Friday) with new commands, topical requests to make, and Skills to try. There is no visual UI, so you will forget commands. I'm sure there are hundreds of useful Siri commands that I've never known about, or that I tried years ago but forgot about after they failed and never tried again. I wish Apple had something like this, because I'd like to utilize Siri for more than setting a timer when I put money in the meter.
How does Siri respond when you ask "What is the city for 95014?"?
What would be more useful is asking how many zip codes there are in a particular city. Alexa reads off 2 of 5 for Cupertino, and 5 for NYC before telling you to check out the Alexa app, while Siri says “I can’t find a zip code for that place" for Cupertino and NYC. Alexa's response is poor, and Siri is just embarrassing.
That query reminds me of Brooklyn Nine-Nine:
"The zipcodes of Cupertino California include 94024, 94087, 95014 and others".
Asking the same for NYC again makes it clear there is more than one by offering the first three and advising there are others. At least a searcher then realizes they need to be more specific, unlike "I can't find a zip code for that place", which tells them nothing at all.
Until Apple sends Tim out at a full-on Apple Event to announce an "all new Siri", or something like that, I'm just not even going to bother anymore.
She does do a good job setting a timer or alarm. That's about it (for me).
Then there's the oddball type of speaker: