Behind closed doors, Apple is embarrassed by its slow Siri rollout, too


One of Apple's top executives has called the delay of the new, more personalized Siri "ugly and embarrassing" and vows to "ship the world's greatest virtual assistant" at some point.




Apple has been facing significant challenges in deploying one of its most highly advertised Apple Intelligence features: an enhanced, personalized Siri.

The company heavily promoted it at both WWDC and the September iPhone event, and commercials referencing it are still airing.

However, the feature didn't arrive at launch.

As we've learned, it's facing some serious hang-ups. Initially expected to roll out in iOS 18.4, it now looks like it may not arrive until iOS 19, if it even shows up then.

While Apple's been slammed by the media for the delay, the company isn't exactly going easy on itself either. Robby Walker, Apple's senior director of Siri and Information Intelligence, called an all-hands meeting to address the issue, sources told Bloomberg.

Walker doesn't have a concrete time frame for when the enhanced Siri will finally launch. The company may be aiming for iOS 19, but the senior director has his doubts.

"We have other commitments across Apple to other projects," Walker reportedly said, citing new software and hardware initiatives. "We want to keep our commitments to those, and we understand those are now potentially more timeline-urgent than the features that have been deferred."

As it turns out, the enhanced Siri was delayed because the company found that it only works properly about two-thirds of the time. Walker urged the team to make more progress so that when the feature finally debuts, it will meet customer expectations.

He believes that there is enough personal accountability to go around, referencing both his boss, head of AI John Giannandrea, and software chief Craig Federighi. However, it doesn't seem like anyone's getting fired just yet.

Walker told staff they should feel proud of getting as much done as they had, commending them for pouring their "hearts and souls into this thing." At the same time, he reportedly felt it was unfair that Apple had heavily promoted a feature that wasn't ready.

He showed examples of the technology working during the meeting to underscore just how much progress they'd made. Many team members are feeling burnt out, and Walker says that his team is entitled to some time away to recharge before diving back into the project.

Regarding the delay, Walker reminded staff that Apple holds itself to a higher standard, pointing out that Apple's competitors have launched virtual assistants in worse states.

That wouldn't be good enough for Apple. Walker ended the meeting by saying Apple would "ship the world's greatest virtual assistant." What he considers that to be is unclear, and wasn't disclosed in Friday's report.

Apple has faced myriad challenges during its artificial intelligence push. Not only did Apple arrive relatively late to the game, but it also released features at a glacial pace.

It's hardly an ideal situation for a company that insists Apple Intelligence will bolster lackluster post-pandemic sales. In fact, poor performance could wind up doing the opposite.

Poor Siri has taken the brunt of the criticism, as it has for the better part of a decade. Delays aside, Siri has also gotten notably worse than it used to be.




Comments

  • Reply 1 of 12
    avon b7 Posts: 8,225 member
    Late to the game, burnt out, not fully working, other software/hardware priorities, no real timeline for delivery...

    It reads terribly, but I think they have spread themselves way too thin and maybe just don't have the resources to juggle everything they have on the table. That would be the worst-case scenario.

    On the other hand, being so late, while embarrassing (given all the marketing), makes things just a question of time (assuming the resources are in fact available). They will get there, eventually. Maybe the best-case scenario?

    The only problem with 'time' is that competitors aren't standing still, so it could turn out to take longer to catch up.

    My guess is that those high-priority software projects involve serious foundational changes to their OSs to get them ready for a fully IoT world. Something like that would require even Apple Intelligence to take a back seat.


  • Reply 2 of 12

    As it turns out, the enhanced Siri was delayed because the company found that it only works properly about two-thirds of the time. 

    Two out of three ain't bad, or at least it isn't by Meat Loaf's standards.

    The Bloomberg article says it worked 60-80% of the time. Color me surprised; the punditry around this whole situation has painted a far worse picture than that, and it sounds much further along than I had expected.


  • Reply 3 of 12

    https://www.youtube.com/watch?v=VakU20APPdw
  • Reply 4 of 12
    mpantone Posts: 2,400 member
    Two-thirds of the time is horrid. Even 90% is useless.

    Put it in perspective using a real-world comparable scenario: a human personal assistant.

    Let's say you give your human P.A. three tasks:
    1. pick up dry cleaning (via TaskRabbit for the AI assistant),
    2. e-mail vendor that their account will be past due tomorrow thus incurring a 1.5% service charge, and
    3. book round trip flight on April 17th from Los Angeles to San Jose (SJC in California)

    Your AI assistant only correctly accomplished two of the three tasks. Now if it's the dry cleaning, that's maybe not a big deal. But the other two are. And there are plenty of ways the AI assistant can screw up. Maybe they told the vendor they would be fired tomorrow. Maybe the AI assistant quotes a 2.5% service charge. Maybe the AI assistant books you to SJO (San Jose, Costa Rica) instead of SJC.

    The problem is you don't get to choose which task the AI assistant fails at.

    Now if you had a human personal assistant, you'd fire them for effing up #2 or #3.

    Realistically a useful AI assistant (or human assistant) really needs to be about 99.8% accurate. Assistants need to be reliable, accurate, and private. And not just two of those three attributes.

    What if your cellular provider didn't deliver 40% of your text messages? Your transit card fails at 40% of fare gates. Your car won't start three days a week? Your credit card fails to authorize a couple times a day?

    Hell, what if the Tokyo Metro subway payment system screwed up 0.02% of transactions every day? That's literally thousands of rides. Or if ATMs gave the wrong amount of cash withdrawals that many times. If you had a Pasmo subway transit card that only worked 40% of the time, you'd probably give up and just buy paper tickets from the ticket vending machine.

    Apple knows this. An AI assistant needs to be way better than the current Siri. It needs to be at least as good as a really, Really, REALLY good human assistant, because going back to clean up someone else's mess (AI or human) takes too much time. And you lose trust in that assistant very quickly.

    "Fake it until you make it" is not a credible business plan in the real world. That's something Elizabeth Holmes would do.

    Apple cannot afford to put out an AI-assisted Siri that only gets things right two-thirds of the time and promise that it'll get better. We already have way too many LLM-powered AI chatbot assistants that dole out garbage on a regular basis. The world is not going to be any better with Yet Another Lame Assistant.
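    The failure-rate argument above can be made concrete with some back-of-the-envelope arithmetic. As a rough sketch, assuming (purely for illustration) that each task in a request fails independently, the chance a multi-task request completes flawlessly is the per-task accuracy raised to the number of tasks; `flawless_request` is just a name for this sketch, not anything from Apple or Bloomberg:

    ```python
    # Rough arithmetic: if each of n tasks succeeds independently with
    # probability p, the whole request comes back fully correct with
    # probability p ** n.
    def flawless_request(p: float, n: int = 3) -> float:
        """Probability that all n independent tasks succeed."""
        return p ** n

    # At roughly two-thirds per-task accuracy, fewer than a third of
    # three-task requests complete with no errors at all.
    print(round(flawless_request(2 / 3), 3))    # 0.296
    # At the 99.8% figure suggested above, nearly all do.
    print(round(flawless_request(0.998), 3))    # 0.994
    ```

    The independence assumption is generous to the assistant; correlated failures (one misunderstood request breaking several tasks) would make the multi-task picture worse, not better.
    
    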
    edited March 14
  • Reply 5 of 12
    In the end, this is just a marketing failure, not a failure of the product itself.
  • Reply 6 of 12
    hypoluxa Posts: 702 member
    I think they will eventually get there, but they hyped it way too early.
  • Reply 7 of 12
    dutchlord Posts: 301 member
    Apple had an advantage when Siri launched, but after that nothing happened. What rollout? It seems like Apple doesn't take Siri seriously; it's one of those famous Apple hobbies, products that don't get much love. Apple TV is another example of such a hobby.
  • Reply 8 of 12
    chasm Posts: 3,724 member
    This is what happens when engineering and marketing aren't communicating well.

    FWIW, so far I've found what features ARE present in Apple Intelligence now are quite good. Genmoji and Image Generator do exactly what they say on the tin, though I will not likely have any use for them -- but they exist, and work completely as advertised.

    The Writing Tools are so far EXCELLENT, and in an age of decreasing literacy are more needed than any other feature (apart from Siri upgrades). I absolutely LOVE the proofreading tool and find the rewriting tools very high-quality when needed (with me personally, making things more concise is sometimes very useful).

    Siri is a little better than it was previously, but as acknowledged in the article, it's far behind what was promised. As near as I can tell, this is the only part of what was promised in Apple Intelligence that hasn't already been delivered, and to be fair, it's the big marquee feature. I get that that's disappointing, but to be entirely fair it was ALWAYS going to have to be an incremental, over-time upgrade as Apple responded to user feedback.

    I'm not trying to paper over the fact that they missed their own deadline on the Siri improvements, but I think there's a lot of hope for it being mostly complete by the end of this year.

    It's important to remember that other AI engines can develop much faster because they don't have those silly privacy concerns -- indeed, they collect and market tons of data about what you request -- and they don't seem too concerned about the ratio of misinformation/"hallucinations" they put out compared to accurate info. Anything 50 percent right is "good enough" for Apple's competitors, but isn't good enough for Apple (and we hold them to a very high double standard on stuff like this).

    In these areas, I'm okay with Apple being slower but getting it right rather than faster and "experimenting" on me, or selling my data. Unlike many (especially on Reddit I notice), I can be patient when I think the reward will be worth it.
  • Reply 9 of 12
    Lettuce Posts: 35 member
    Photo search already has difficulty searching my 150,000+ photo library with simple terms. How will Siri be able to scan it quickly enough for this kind of process?
  • Reply 10 of 12
    mpantone Posts: 2,400 member
    Lettuce said:
    Photo search already has difficulty searching my 150,000+ photo library with simple terms. How will Siri be able to scan it quickly enough for this kind of process?
    It depends. If you use iCloud Photos, most likely Apple Intelligence's Private Cloud Compute (or whatever it's called) will eventually do the cataloguing. It's worth pointing out that Google has used AI photo captioning since before the pandemic.

    If you are locally storing your photos on your Mac, at some point your Mac might be able to do this accurately without sending any information to the cloud.

    If you are just keeping photos on your phone, you might be giving up more accuracy for privacy. At some point, I figure your phone's local Apple Intelligence service will be able to handle this (it might do this at night when plugged in with a 100% charge).

    One thing for sure, it would be silly to expect this to happen overnight with 100% accuracy after a new iOS or a new handset release. Most likely Apple will roll out this sort of thing which will get better over time.

    Remember that Apple already has tools for things like facial recognition, holidays, locations, etc., so the basic framework has existed for years.

    Siri -- the personal assistant -- has been long neglected so it will be some time before Apple can get it to work correctly with the underlying Apple Intelligence features.
    edited March 16
  • Reply 11 of 12
    mpantone said:
    Two-thirds of the time is horrid. Even 90% is useless. [...]
    This is why Apple shot itself in the foot by limiting itself to on-device processing. I strongly suspect there is not enough "horsepower" for an AI assistant to be able to "reason" about the request, come up with an avenue to accomplish the tasks, and then execute them. I personally use Grok 3 (I have not used ChatGPT because of its usage limitations), and I really like the way it shows you how it breaks down the request, how it figures out the answer, and then explains the answer along with its limitations and pros and cons. And how much "horsepower" is behind Grok 3 compared to our iPhones? Apple should have gone with the "privacy on the server" route while it worked toward putting it on the device. Don't get me wrong, I fully appreciate the privacy/security emphasis, but it was the wrong call at the wrong time. Apple already squeezed an LLM into 8 GB of RAM, but that doesn't mean 8 GB is enough to make an AI assistant that is 99.99% accurate. Whoever made that call should be fired.
  • Reply 12 of 12
    Wesley_Hilliard Posts: 449 member, administrator, moderator, editor
    mpantone said:
    Two-thirds of the time is horrid. Even 90% is useless. [...]
    This is why Apple shot itself in the foot by limiting themselves to being on the device. [...]
    You've made several comments about Grok, a chatbot running on a server, and compared it to Apple Intelligence, an on-device model running specific skill sets. They are not the same, nor are they meant to be the same. If you want to compare it to something, compare it to ChatGPT.

    And no, none of these models can reason, any more than a toaster oven can reason. They are just predictive pixel machines. Nothing intelligent about them.