John Giannandrea out as Siri chief, Apple Vision Pro lead in

bingo_wings · March 20, 2025 9:42PM

This has Copland written all over it

mrstep · March 20, 2025 9:59PM

mpantone said:

As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

A couple weeks before Super Bowl I asked half a dozen LLM-powered AI assitant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

Earlier today I asked several chatbots to fill out a 2025 NCAA mens basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024.

I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not programmed yet into consumer facing LLMs.

An AI assistant really need to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like schedule an oil change for your car? Keep your medical information private?

As I said a year ago, all consumer facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erodes customer trust.

"Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.

Agreed. I think it probably needs to be something like 99.999% right, the current "AI" doesn't cut it. The delight of "this is incredible" is a thing ("it's a monkey juggling watermelons... haha"), but that's when you also laugh off bad summaries, 7-fingered hands, and hallucinations - none of which are endearing when asking a question like "when does XYZ close today?" if you have to count on it while planning your day. Obviously Siri offering to search the web for that isn't good, but a confident wrong answer doesn't help either.

Prompting "Siri, I don't think that's right. Double-check that answer' and getting "hmm, it looks like that was wrong..." is the state of current ML models. I don't really see how that gets turned into a significantly better output just because it's Apple investing in it - loads of companies are investing in it, it's just not baked - and that's the current state of the art. They're statistical models with ungodly amounts of information used to train them, so it looks like magic, but it's still science. Cool, sometimes helpful, but damn you better not just assume the output is correct.

mpantone · March 20, 2025 10:11PM

mrstep said:

mpantone said:

As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

A couple weeks before Super Bowl I asked half a dozen LLM-powered AI assitant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

Earlier today I asked several chatbots to fill out a 2025 NCAA mens basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024.

I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not programmed yet into consumer facing LLMs.

An AI assistant really need to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like schedule an oil change for your car? Keep your medical information private?

As I said a year ago, all consumer facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erodes customer trust.

"Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.

Agreed. I think it probably needs to be something like 99.999% right, the current "AI" doesn't cut it. The delight of "this is incredible" is a thing ("it's a monkey juggling watermelons... haha"), but that's when you also laugh off bad summaries, 7-fingered hands, and hallucinations - none of which are endearing when asking a question like "when does XYZ close today?" if you have to count on it while planning your day. Obviously Siri offering to search the web for that isn't good, but a confident wrong answer doesn't help either.

Prompting "Siri, I don't think that's right. Double-check that answer' and getting "hmm, it looks like that was wrong..." is the state of current ML models. I don't really see how that gets turned into a significantly better output just because it's Apple investing in it - loads of companies are investing in it, it's just not baked - and that's the current state of the art. They're statistical models with ungodly amounts of information used to train them, so it looks like magic, but it's still science. Cool, sometimes helpful, but damn you better not just assume the output is correct.

It's not magic, that's for sure. Asking several AI chatbots to fill in a March Madness bracket was a complete joke and illustrated how painfully obtuse LLM-powered AI assistants still are in March 2025. They are years and years away from replicating something that a thoughtful adult could do.

Someday we can look back at this and laugh at it ("boy, those were crazy times"). But right now there are morons paying subscription fees to have early access to alpha software. True insanity but hey, it's not my money they are spending. Feel free to shell out your hard earned dollars to be a tester.

Based on recent failures for these AI assistants in tackling the March Madness bracket as well as identifying the time for the Super Bowl kickoff (asked less than two weeks before the game), I've deleted all AI assistants from my various devices (iPhone, iPad, etc.). Even ChatGPT is gone for the time being.

I'll revisit these in a year or two. Right now they're just an embarrassment and a massive time sink. Or maybe I should keep one and put it in my Games folder on my iPhone.

edited March 20

jas99 · March 20, 2025 10:43PM

mpantone said:

gatorguy said:

mpantone said:

As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

A couple weeks before Super Bowl I asked half a dozen LLM-powered AI assitant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

Earlier today I asked several chatbots to fill out a 2025 NCAA mens basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024.

I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not programmed yet into consumer facing LLMs.

An AI assistant really need to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like schedule an oil change for your car? Keep your medical information private?

As I said a year ago, all consumer facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erodes customer trust.

"Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.

Did you try Gemini, currently 2.0 Flash? In a voice search on my Pixel it listed South Auburn Tigers, West Gators, East Duke, and Midwest Cougars

I did not give Gemini the bracket question. I did give it the Super Bowl question which it failed like the others.

Your comment brings up an important illustrative point. No one has the time to dork around with 7-8 AI chatbots to find one (or more) that gives the correct answer for each question. That's not a sustainable approach.

There's probably some AI chatbot that will might get the right answer to a simple question. The problem is no AI chatbot is reliably accurate enough to instill trust and confidence. I can't ask ten questions to 8 chatbots and wade through the responses. In the same way, having ten human personal assistants isn't a worthwhile approach.

Let's say Grok has a 20% accuracy score and Gemini is 40%. That's double the accuracy for Gemini but it still is way too low to be trusted and deemed reliable.

Like I said I think Apple's senior management is understanding this which is why they've postponed the AI-enabled Siri. Even if it were 60-80% accurate, that's still too low to be useful. You really don't want a personal assistant -- human or AI -- that makes so many messes that you have to go clean up after them or find an alternate personal assistant to might do some of those failed tasks better. In the end for many tasks right now, you are better off using your own brain (and maybe a search engine) to figure many things out because AI chatbots will unapologetically make stuff up.

All of these AI assistants will flub some questions. The problem is you don't know which one will fail which question at any given moment. That's a waste of time. I think the technology will eventually get there but I'm much more pessimistic about the timeline today compared to a year ago because improvements in these LLMs seems to have stalled or even regressed. I don't know why that is but it doesn't matter to Joe Consumer. It just needs to work. Basically all the time. And right now none of them do.

For sure I am not the first person to ask an AI assistant about the Super Bowl and March Madness. And yet these AI ASSistants have zero motivation to improve accuracy even if they are caught fibbing or screw up an answer.

I've used all of the major AI assistants and they all routinely muck up. The fact that I did not try Gemini is simply because I got far enough by the third AI chatbot to deem this a waste of time. I can't keep jumping from one AI chatbot to another until I find one that gives an acceptable result.

In most cases, the AI assistant doesn't know it is wrong. You have to tell the developer (there's often a thumbs up or thumbs down for the answer). And that doesn't make the AI assistant get it right for the next person who asks the same question. Maybe enough people asked Gemini the question and the programmers fixed Gemini to give the proper response. One thing for sure, AI assistants don't have any common senses whatsoever. Hell, a lot of humans don't either so if LLMs are modeled after humans, it's no wonder that AI assistants are so feeble.

Here in March 2025 all consumer-facing AI assistants are silly toys that wastefully use up way too much electricity and water. I still use some AI-assisted photo editing tools because my Photoshop skills are abysmal. But for a lot of other things, I'll wait for AI assistants to mature. A lot more mature.

You are absolutely correct. I believe Apple also agrees with you. They know the masses are in a delusional frenzy about AI-enabled …. whatever. The truth is the LLM fake-intelligence chatbot DOES NOT WORK and is UNACCEPTABLE for Apple’s products.
Apple has to generate their own neural network machine learning instead of this LLM snake oil.
I’m just sorry Apple committed itself to being a purveyor of a useful system called Apple Intelligence when, apparently, it’s based on snake oil LLMs.

rogue01 · March 20, 2025 11:44PM

So now the lead of one failed product is taking over another failed product? What could possibly go wrong?

Vision Pro has been a complete failure for Apple. Overpriced. No developers. No one cares about AR. It doesn't solve any problem. Walk into an Apple Store and no one pays attention to them. For decades, no one has cared about AR because they don't want goggles on their head. Just like no one wanted to wear 3D glasses to watch TV and 3D TVs are non-existent. So why would anyone spend $3500 for a heavy pair of goggles on their head?

And now this guy is going to fix Siri? The same Siri that Apple has pretty much abandoned for the past 14 years? Gets things wrong constantly and it is inconsistent on the Mac, iPhone, AppleTV and the speaker. Good luck with that.

Apple Intelligence has been a huge disappointment so far. Siri still makes constant mistakes on simple dictation for a text message. Siri, do you want me to send this? NO!

ravnorodom · March 20, 2025 11:46PM

mike snoow said:

The problem isn't a one person problem but lack of investment into AI look at how AI is going every week there is a new product and things are moving too fast its not ur standard thing. Apple is already like 5yrs behind and gap is widening since AI is moving at light speed
U need a 10x investment atleast

Apple had wasted a lot of money on AppleCar that went no where land plus they already put lots of money VisionPro that sells like a snail pace. No sure if they can shield out anymore doles.

mrstep · March 20, 2025 11:57PM

jas99 said:

mpantone said:

gatorguy said:

mpantone said:

As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

A couple weeks before Super Bowl I asked half a dozen LLM-powered AI assitant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

Earlier today I asked several chatbots to fill out a 2025 NCAA mens basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024.

I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not programmed yet into consumer facing LLMs.

An AI assistant really need to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like schedule an oil change for your car? Keep your medical information private?

As I said a year ago, all consumer facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erodes customer trust.

"Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.

Did you try Gemini, currently 2.0 Flash? In a voice search on my Pixel it listed South Auburn Tigers, West Gators, East Duke, and Midwest Cougars

I did not give Gemini the bracket question. I did give it the Super Bowl question which it failed like the others.

Your comment brings up an important illustrative point. No one has the time to dork around with 7-8 AI chatbots to find one (or more) that gives the correct answer for each question. That's not a sustainable approach.

There's probably some AI chatbot that will might get the right answer to a simple question. The problem is no AI chatbot is reliably accurate enough to instill trust and confidence. I can't ask ten questions to 8 chatbots and wade through the responses. In the same way, having ten human personal assistants isn't a worthwhile approach.

Let's say Grok has a 20% accuracy score and Gemini is 40%. That's double the accuracy for Gemini but it still is way too low to be trusted and deemed reliable.

Like I said I think Apple's senior management is understanding this which is why they've postponed the AI-enabled Siri. Even if it were 60-80% accurate, that's still too low to be useful. You really don't want a personal assistant -- human or AI -- that makes so many messes that you have to go clean up after them or find an alternate personal assistant to might do some of those failed tasks better. In the end for many tasks right now, you are better off using your own brain (and maybe a search engine) to figure many things out because AI chatbots will unapologetically make stuff up.

All of these AI assistants will flub some questions. The problem is you don't know which one will fail which question at any given moment. That's a waste of time. I think the technology will eventually get there but I'm much more pessimistic about the timeline today compared to a year ago because improvements in these LLMs seems to have stalled or even regressed. I don't know why that is but it doesn't matter to Joe Consumer. It just needs to work. Basically all the time. And right now none of them do.

For sure I am not the first person to ask an AI assistant about the Super Bowl and March Madness. And yet these AI ASSistants have zero motivation to improve accuracy even if they are caught fibbing or screw up an answer.

I've used all of the major AI assistants and they all routinely muck up. The fact that I did not try Gemini is simply because I got far enough by the third AI chatbot to deem this a waste of time. I can't keep jumping from one AI chatbot to another until I find one that gives an acceptable result.

In most cases, the AI assistant doesn't know it is wrong. You have to tell the developer (there's often a thumbs up or thumbs down for the answer). And that doesn't make the AI assistant get it right for the next person who asks the same question. Maybe enough people asked Gemini the question and the programmers fixed Gemini to give the proper response. One thing for sure, AI assistants don't have any common senses whatsoever. Hell, a lot of humans don't either so if LLMs are modeled after humans, it's no wonder that AI assistants are so feeble.

Here in March 2025 all consumer-facing AI assistants are silly toys that wastefully use up way too much electricity and water. I still use some AI-assisted photo editing tools because my Photoshop skills are abysmal. But for a lot of other things, I'll wait for AI assistants to mature. A lot more mature.

You are absolutely correct. I believe Apple also agrees with you. They know the masses are in a delusional frenzy about AI-enabled …. whatever. The truth is the LLM fake-intelligence chatbot DOES NOT WORK and is UNACCEPTABLE for Apple’s products.
Apple has to generate their own neural network machine learning instead of this LLM snake oil.
I’m just sorry Apple committed itself to being a purveyor of a useful system called Apple Intelligence when, apparently, it’s based on snake oil LLMs.

I can't figure out if the dislikes are from somebody thinking they're protecting Apple or what. As you and mpantone say, the bottom line is that the technology *isn't* AI, calling it AI is the marketing BS that took off a few years ago - it's not something that thinks or learns. I assume Apple thought it would be faster to crank up the quality from 50% to 99% on generative models, and I assume the underlying technology just isn't there yet.

There are absolutely great things it can do - identify things in photos, drive things like 'smart fill' and 'smart mask' effects in photos and videos that can save tons of time, and then it drops to giving semi-functional code snippets, generating massive quantities of vapid text, and just getting things wrong. Unfortunately for consumers, spotting 7-finger hands, code that doesn't work, and data that's just wrong is easier for people - the models don't.

Not a dig at Apple, I hope they get it working in a way that also protects our data & privacy..

ravnorodom · March 21, 2025 12:04AM

Another reason that triggered this position reshuffling is people is sueing Apple for not delivering AI in time as promised after they purchased the iPhones for this AI feature.

Source:
https://stocks.apple.com/AGVbmcbe4TA2ueVrIol4EtA

gatorguy · March 21, 2025 12:08AM

mpantone said:

gatorguy said:

mpantone said:

As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

A couple weeks before Super Bowl I asked half a dozen LLM-powered AI assitant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

Earlier today I asked several chatbots to fill out a 2025 NCAA mens basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024.

I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not programmed yet into consumer facing LLMs.

An AI assistant really need to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like schedule an oil change for your car? Keep your medical information private?

As I said a year ago, all consumer facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erodes customer trust.

"Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.

Did you try Gemini, currently 2.0 Flash? In a voice search on my Pixel it listed South Auburn Tigers, West Gators, East Duke, and Midwest Cougars

I did not give Gemini the bracket question. I did give it the Super Bowl question which it failed like the others.

Your comment brings up an important illustrative point. No one has the time to dork around with 7-8 AI chatbots to find one (or more) that gives the correct answer for each question. That's not a sustainable approach.

There's probably some AI chatbot that will might get the right answer to a simple question. The problem is no AI chatbot is reliably accurate enough to instill trust and confidence. I can't ask ten questions to 8 chatbots and wade through the responses. In the same way, having ten human personal assistants isn't a worthwhile approach.

Let's say Grok has a 20% accuracy score and Gemini is 40%. That's double the accuracy for Gemini but it still is way too low to be trusted and deemed reliable.

FWIW, Gemini has made major moves in the past 60 days, so it might be worth giving them more of a shot. But yeah, point make about reliability. 80% isn't good enough to be trusted. That's still solidly in trust but verify territory.

edited March 20

9secondkox2 · March 21, 2025 12:21AM

Tim’s ai, headset, and car haven’t panned out so far.

Would love to see a return to the days of Apple not releasing something until it’s 100% ready and captivating. I’ll give Cook’s car a pass since it was never officially announced.

Even the watch was closely guarded. And it was a success when it finally launched. A return to form in that regard would be most welcome.

No more public experiments and betas please.

wesley_hilliard · March 21, 2025 12:29AM

9secondkox2 said:

Tim’s ai, headset, and car haven’t panned out so far.

Would love to see a return to the days of Apple not releasing something until it’s 100% ready and captivating. I’ll give Cook’s car a pass since it was never officially announced.

Even the watch was closely guarded. And it was a success when it finally launched. A return to form in that regard would be most welcome.

No more public experiments and betas please.

I'll never understand what people mean when they say this other than "I don't like it so it didn't pan out." By what metrics are you measuring the panning?

Other than delaying one feature, Apple Intelligence is doing fine. Apple didn't promise a sentient machine like others in the space did. Is that a bad thing? Them delaying something that isn't ready instead of releasing it anyway is exactly what you're asking Apple to do, yet you're criticizing them for it.

And unless you know something about how Apple Vision Pro has done so far and Apple's expectation for the product, there's no way of knowing "how it panned out." It's been pretty awesome from my perspective.

bulk001 · March 21, 2025 2:37AM

Wesley Hilliard said:

9secondkox2 said:

Tim’s ai, headset, and car haven’t panned out so far.

Would love to see a return to the days of Apple not releasing something until it’s 100% ready and captivating. I’ll give Cook’s car a pass since it was never officially announced.

Even the watch was closely guarded. And it was a success when it finally launched. A return to form in that regard would be most welcome.

No more public experiments and betas please.

I'll never understand what people mean when they say this other than "I don't like it so it didn't pan out." By what metrics are you measuring the panning?

Other than delaying one feature, Apple Intelligence is doing fine. Apple didn't promise a sentient machine like others in the space did. Is that a bad thing? Them delaying something that isn't ready instead of releasing it anyway is exactly what you're asking Apple to do, yet you're criticizing them for it.

And unless you know something about how Apple Vision Pro has done so far and Apple's expectation for the product, there's no way of knowing "how it panned out." It's been pretty awesome from my perspective.

Right you are. Now back to making your emojis or whatever emoji related feature Apple brought to its “intelligence”

wesley_hilliard · March 21, 2025 2:57AM

bulk001 said:

Wesley Hilliard said:

9secondkox2 said:

Tim’s ai, headset, and car haven’t panned out so far.

Would love to see a return to the days of Apple not releasing something until it’s 100% ready and captivating. I’ll give Cook’s car a pass since it was never officially announced.

Even the watch was closely guarded. And it was a success when it finally launched. A return to form in that regard would be most welcome.

No more public experiments and betas please.

I'll never understand what people mean when they say this other than "I don't like it so it didn't pan out." By what metrics are you measuring the panning?

Other than delaying one feature, Apple Intelligence is doing fine. Apple didn't promise a sentient machine like others in the space did. Is that a bad thing? Them delaying something that isn't ready instead of releasing it anyway is exactly what you're asking Apple to do, yet you're criticizing them for it.

And unless you know something about how Apple Vision Pro has done so far and Apple's expectation for the product, there's no way of knowing "how it panned out." It's been pretty awesome from my perspective.

Right you are. Now back to making your emojis or whatever emoji related feature Apple brought to its “intelligence”

Ah you're so close you might actually realize that "AI" is overhyped nonsense that increases productivity in some respects. Genmoji is actually a great example of what Apple got right with its AI initiative in spite of your sad troll attempt. I also happen to get a lot of use out of the Writing Tools function, and summaries in email, messages, and notifications help greatly with triage.

But sure, tell me what AI has done for you.

dewme · March 21, 2025 3:50AM

Easter is coming soon. One of the popular meat dishes for Easter is Lamb. Sounds like Apple got their order in early this year.

mpantone · March 21, 2025 4:17AM

gatorguy said:

mpantone said:

gatorguy said:

mpantone said:

As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

A couple weeks before Super Bowl I asked half a dozen LLM-powered AI assitant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

Earlier today I asked several chatbots to fill out a 2025 NCAA mens basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024.

I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not programmed yet into consumer facing LLMs.

An AI assistant really need to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like schedule an oil change for your car? Keep your medical information private?

As I said a year ago, all consumer facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erodes customer trust.

"Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.

Did you try Gemini, currently 2.0 Flash? In a voice search on my Pixel it listed South Auburn Tigers, West Gators, East Duke, and Midwest Cougars

I did not give Gemini the bracket question. I did give it the Super Bowl question which it failed like the others.

Your comment brings up an important illustrative point. No one has the time to dork around with 7-8 AI chatbots to find one (or more) that gives the correct answer for each question. That's not a sustainable approach.

There's probably some AI chatbot that will might get the right answer to a simple question. The problem is no AI chatbot is reliably accurate enough to instill trust and confidence. I can't ask ten questions to 8 chatbots and wade through the responses. In the same way, having ten human personal assistants isn't a worthwhile approach.

Let's say Grok has a 20% accuracy score and Gemini is 40%. That's double the accuracy for Gemini but it still is way too low to be trusted and deemed reliable.
FWIW, Gemini has made major moves in the past 60 days, so it might be worth giving them more of a shot. But yeah, point make about reliability. 80% isn't good enough to be trusted. That's still solidly in trust but verify territory.

Like I said earlier Gemini also failed to answer the simple question "When is the Super Bowl kickoff?" less than two weeks before the actual game.

Gemini is untrustworthy. All AI ASSistants are untrustworthy. I had a much better attitude about AI ASSistants a year ago. Basically there has been scant improvement in the past year and some notably regressions.

80% is still way too low. I would fire any human personal assistant if that was their success rate. Your standards sucks. And that's basically why today's AI assistants are accepted as they are. Because of people like you.

What if your local coffee shop got 20% of your orders wrong? What if you got traffic tickets 20% of the time you are stopped at a red light? People like you reward piss poor performance. Mediocrity is NOT okay. And Apple knows this is WRONG. If they thought it was right, they'd ship a POS like Grok.

edited March 21

mpantone · March 21, 2025 4:32AM

Wesley Hilliard said:

9secondkox2 said:

Tim’s ai, headset, and car haven’t panned out so far.

Would love to see a return to the days of Apple not releasing something until it’s 100% ready and captivating. I’ll give Cook’s car a pass since it was never officially announced.

Even the watch was closely guarded. And it was a success when it finally launched. A return to form in that regard would be most welcome.

No more public experiments and betas please.

I'll never understand what people mean when they say this other than "I don't like it so it didn't pan out." By what metrics are you measuring the panning?

Other than delaying one feature, Apple Intelligence is doing fine. Apple didn't promise a sentient machine like others in the space did. Is that a bad thing? Them delaying something that isn't ready instead of releasing it anyway is exactly what you're asking Apple to do, yet you're criticizing them for it.

And unless you know something about how Apple Vision Pro has done so far and Apple's expectation for the product, there's no way of knowing "how it panned out." It's been pretty awesome from my perspective.

Nah, Vision Pro weighs too much regardless of the image quality. (Disclaimer: I own an Oculus Rift S).

For the ficticious future "Apple Glass" to be successful, it really needs to be around the same weight as my current eyeglasses: about 30 grams.

One of the AI photo features has been helpful. Of course there are third party tools that function similarly, it's just nice to have the Apple Intelligence one closely integrated in the OS.

You're a writer so you are attuned to the writing benefits of Apple Intelligence. Many on this planet won't benefit. As I mentioned elsewhere, Apple Intelligence is good at generating corpspeak, the stuff one shoves into a work e-mail. There's very little style involved. Writing is highly commoditized these days, it doesn't take much effort for an AI assistant to pump out copy that resembles 99.9% of drivel on the Internet anyhow.

And with LLMs training on Internet accessible content, the quality of their datasets is dropping over time. Old school computer scientists have an expression for this: GIGO (Garbage In, Garbage Out). That's a really good way to describe LLM-powered AI chatbot assistant responses in March 2025.

Note that there are similar examples of regression in generative AI photo tools. Some of the AI generation tools today put out less satisfactory images than they did a year ago.

I expect consumer-facing AI to put out more and more useless crap in the next 12-24 months because these models are training themselves on garbage that their predecessors created. There really needs to be some sort of quantum leap in AI model performance to circumnavigate this decline. The longer this takes, the harder it will be for AI models to overcome the poor quality data.

When I do a standard Internet search engine query, I'm seeing mostly garbage. It's pretty easy for me to identify in a few seconds but LLMs don't have any natural quality distinctions, they will take absorb satire, fact, and outright lies equally. This probably explains why LLMs can't effectively process the junkmail folder in my e-mail. What is obvious to me as junkmail goes unrecognized by AI tools.

edited March 21

wesley_hilliard · March 21, 2025 5:17AM

mpantone said:

Wesley Hilliard said:

9secondkox2 said:

Tim’s ai, headset, and car haven’t panned out so far.

Would love to see a return to the days of Apple not releasing something until it’s 100% ready and captivating. I’ll give Cook’s car a pass since it was never officially announced.

Even the watch was closely guarded. And it was a success when it finally launched. A return to form in that regard would be most welcome.

No more public experiments and betas please.

I'll never understand what people mean when they say this other than "I don't like it so it didn't pan out." By what metrics are you measuring the panning?

Other than delaying one feature, Apple Intelligence is doing fine. Apple didn't promise a sentient machine like others in the space did. Is that a bad thing? Them delaying something that isn't ready instead of releasing it anyway is exactly what you're asking Apple to do, yet you're criticizing them for it.

And unless you know something about how Apple Vision Pro has done so far and Apple's expectation for the product, there's no way of knowing "how it panned out." It's been pretty awesome from my perspective.

Nah, Vision Pro weighs too much regardless of the image quality. (Disclaimer: I own an Oculus Rift S).

For the ficticious future "Apple Glass" to be successful, it really needs to be around the same weight as my current eyeglasses: about 30 grams.

One of the AI photo features has been helpful. Of course there are third party tools that function similarly, it's just nice to have the Apple Intelligence one closely integrated in the OS.

You're a writer so you are attuned to the writing benefits of Apple Intelligence. Many on this planet won't benefit. As I mentioned elsewhere, Apple Intelligence is good at generating corpspeak, the stuff one shoves into a work e-mail. There's very little style involved. Writing is highly commoditized these days, it doesn't take much effort for an AI assistant to pump out copy that resembles 99.9% of drivel on the Internet anyhow.

And with LLMs training on Internet accessible content, the quality of their datasets is dropping over time. Old school computer scientists have an expression for this: GIGO (Garbage In, Garbage Out). That's a really good way to describe LLM-powered AI chatbot assistant responses in March 2025.

Note that there are similar examples of regression in generative AI photo tools. Some of the AI generation tools today put out less satisfactory images than they did a year ago.

I expect consumer-facing AI to put out more and more useless crap in the next 12-24 months because these models are training themselves on garbage that their predecessors created. There really needs to be some sort of quantum leap in AI model performance to circumnavigate this decline. The longer this takes, the harder it will be for AI models to overcome the poor quality data.

When I do a standard Internet search engine query, I'm seeing mostly garbage. It's pretty easy for me to identify in a few seconds but LLMs don't have any natural quality distinctions, they will take absorb satire, fact, and outright lies equally. This probably explains why LLMs can't effectively process the junkmail folder in my e-mail. What is obvious to me as junkmail goes unrecognized by AI tools.

Apple Vision Pro is no heavier than PSVR2, especially when considering the weight of the tethering cable for PlayStation. I couldn't use Meta's headsets, not just because Meta sucks, but because I want to use something on Apple's platforms. The specs and features are secondary to that imo. Vision Pro wins on all fronts there.

And if you've ever read anything I've written on AI, you'll know I think it's all overhyped nonsense.

lukei · March 21, 2025 6:04AM

charlesn said:

Totally not surprised this happened although I didn't expect it to take this long. Tim Cook is the proverbial iron fist in a velvet glove. It's worth recounting this tale from an urgent meeting Cook had called:

"One day back then, he convened a meeting with his team, and the discussion turned to a particular problem in Asia. “This is really bad,” Cook told the group. “Someone should be in China driving this.” Thirty minutes into that meeting Cook looked at Sabih Khan, a key operations executive, and abruptly asked, without a trace of emotion, “Why are you still here?”
Khan, who remains one of Cook’s top lieutenants to this day, immediately stood up, drove to San Francisco International Airport, and, without a change of clothes, booked a flight to China with no return date, according to people familiar with the episode. The story is vintage Cook: demanding and unemotional."

Very much vintage Cook. Not 2025 Cook from what we see recently.

yucam · March 21, 2025 7:42AM

A Winner replaces a Winner…

saarek · March 21, 2025 7:57AM

The thing is, Siri cannot even tell you what month it is. Try it, “Siri, what month is it”.

It’s an absolute embarrassment to Apple, has been since its release and yet they’ve not bothered to fix it.

Yes, we’ve had promises and the occasional enhancement. But after 14 years on the market it’s still a useless pile of shit and will likely to continue to be.

John Giannandrea out as Siri chief, Apple Vision Pro lead in

Comments