mpantone

About

Username: mpantone
Joined:
Visits: 803
Last Active:
Roles: member
Points: 3,772
Badges: 1
Posts: 2,525
  • John Giannandrea out as Siri chief, Apple Vision Pro lead in

    No one can be the best at everything. Clearly Giannandrea wasn't. Neither is Cook. Neither was Jobs. Neither are you. Neither am I.

    It's important to point out that Services revenue has grown massively under the current leadership.

    I think Cook is smart enough to know that when brainstorming future software roadmaps, there might be someone else in the room who should be holding the whiteboard marker for the majority of the time.

    Tim isn't writing code for Apple; he relies on his direct reports to say "Yes, we can make ____ happen by ____." This is not unique; it happens in all sorts of businesses all around the world every single minute. Somewhere on this planet, there is a restaurant kitchen getting slammed. Some line cook is telling their chef "I got this" or "I could use a hand here".

    Clearly some deadlines were missed concerning AI-powered Siri, hence the change. However, it's also important to point out that Siri is not a P&L center, unlike Apple TV+, iCloud, Fitness+, or the Apple Watch hardware division.

    A person can make all the snarky armchair CEO comments they want. But doing so risks unveiling how much that person knows about working in a business, whether it be some mom-and-pop shop, a Fortune 10 megacorp or somewhere in between.
  • John Giannandrea out as Siri chief, Apple Vision Pro lead in

    Let's remember that John Giannandrea was the former head of AI at Google. It's not like he was unqualified for his post at Apple. However, being a good researcher and shipping product are two very different disciplines.

    Unfortunately, Apple took its sweet time making this change, just as it let Project Titan fester for years.

    Let's also remember that consumer-facing AI is new technology still in its infancy. It's not like there's any (consumer) company that has been doing this for 20+ years. Apple only started including machine-learning silicon in its chips in 2017, with the A11 Bionic's Neural Engine.

    Everyone is pretty new to AI, which is why not a single consumer-facing AI assistant is head-and-shoulders better than the competition. It's all alpha quality right now. And it doesn't look like whoever has the most datacenter Nvidia GPUs wins, either.

    Like most Americans with a retirement plan, I am an indirect investor in almost all of the major players. I have a vested interest in seeing some level of success from all of them. Competition is good: it drives quality, innovation, and value. I also appreciate Apple's commitment to privacy; that alone makes me want Apple to be a top competitor in this field.
  • John Giannandrea out as Siri chief, Apple Vision Pro lead in

    gatorguy said:
    mpantone said:
    As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

    A couple of weeks before the Super Bowl I asked half a dozen LLM-powered AI assistant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

    Earlier today I asked several chatbots to fill out a 2025 NCAA men's basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024 tournament.

    I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

    Context, situational awareness, common sense, good taste, humility: those are all things that AI engineers have not yet built into consumer-facing LLMs.

    An AI assistant really needs to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting just one of the four #1 seeds correct (they're published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important, like scheduling an oil change for your car? Keeping your medical information private?

    As I said a year ago, all consumer-facing AI is still alpha software. It is nowhere close to being ready for prime time. In several cases there appears to be some serious regression.

    25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

    Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erode customer trust.

    "Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.
    Did you try Gemini, currently 2.0 Flash? In a voice search on my Pixel it listed South Auburn Tigers, West Gators, East Duke, and Midwest Cougars.
    I did not give Gemini the bracket question. I did give it the Super Bowl question, which it failed like the others.

    Your comment brings up an important illustrative point. No one has the time to dork around with 7-8 AI chatbots to find one (or more) that gives the correct answer for each question. That's not a sustainable approach.

    There's probably some AI chatbot that might get the right answer to a simple question. The problem is that no AI chatbot is reliably accurate enough to instill trust and confidence. I can't put ten questions to eight chatbots and wade through the responses. In the same way, having ten human personal assistants isn't a worthwhile approach.

    Let's say Grok has a 20% accuracy score and Gemini is at 40%. That's double the accuracy for Gemini, but it's still way too low to be trusted and deemed reliable.
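
    To put rough numbers on that point, here's a quick Python sketch. The 20% and 40% figures are the hypothetical ones from my example above, not measurements; the math is just basic probability.

    # Hypothetical per-question accuracy from the example above.
    grok, gemini = 0.20, 0.40

    # Chance that at least one of the two gives the correct answer:
    p_any = 1 - (1 - grok) * (1 - gemini)
    print(f"P(at least one correct) = {p_any:.0%}")  # 52%

    # But without knowing the ground truth you can't tell which answer
    # is the right one, so your usable accuracy is still capped by the
    # best single chatbot (40%). Querying more bots doesn't fix trust.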

    Like I said, I think Apple's senior management understands this, which is why they've postponed the AI-enabled Siri. Even if it were 60-80% accurate, that's still too low to be useful. You really don't want a personal assistant -- human or AI -- that makes so many messes that you have to clean up after them, or find another assistant to redo the failed tasks. In the end, for many tasks right now you are better off using your own brain (and maybe a search engine), because AI chatbots will unapologetically make stuff up.

    All of these AI assistants will flub some questions. The problem is you don't know which one will fail which question at any given moment. That's a waste of time. I think the technology will eventually get there, but I'm much more pessimistic about the timeline today compared to a year ago because improvements in these LLMs seem to have stalled or even regressed. I don't know why that is, but it doesn't matter to Joe Consumer. It just needs to work. Basically all the time. And right now none of them do.

    For sure I am not the first person to ask an AI assistant about the Super Bowl and March Madness. And yet these AI ASSistants have zero motivation to improve accuracy, even when they are caught fibbing or screwing up an answer.

    I've used all of the major AI assistants and they all routinely muck up. I didn't try Gemini simply because by the third chatbot I had seen enough to deem this a waste of time. I can't keep jumping from one AI chatbot to another until I find one that gives an acceptable result.

    In most cases, the AI assistant doesn't know it is wrong. You have to tell the developer (there's often a thumbs-up or thumbs-down for the answer). And that doesn't make the AI assistant get it right for the next person who asks the same question. Maybe enough people asked Gemini the question and the programmers fixed Gemini to give the proper response. One thing is for sure: AI assistants don't have any common sense whatsoever. Hell, a lot of humans don't either, so if LLMs are modeled after humans, it's no wonder that AI assistants are so feeble.

    Here in March 2025, all consumer-facing AI assistants are silly toys that waste way too much electricity and water. I still use some AI-assisted photo-editing tools because my Photoshop skills are abysmal. But for a lot of other things, I'll wait for AI assistants to mature. A lot more mature.
  • John Giannandrea out as Siri chief, Apple Vision Pro lead in

    As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

    A couple of weeks before the Super Bowl I asked half a dozen LLM-powered AI assistant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

    Earlier today I asked several chatbots to fill out a 2025 NCAA men's basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024 tournament.

    I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

    Context, situational awareness, common sense, good taste, humility: those are all things that AI engineers have not yet built into consumer-facing LLMs.

    An AI assistant really needs to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting just one of the four #1 seeds correct (they're published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important, like scheduling an oil change for your car? Keeping your medical information private?
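
    To see why the bar is that high, here's a back-of-the-envelope Python sketch (the step counts are made up for illustration): per-task accuracy compounds, so a chain of requests only works if every single step works.

    # Chance of an error-free run over a multi-step chain of requests.
    for per_task in (0.998, 0.90, 0.80):
        for steps in (10, 50):
            print(f"{per_task:.1%} per task, {steps} steps: "
                  f"{per_task ** steps:.1%} chance of a clean run")

    # 99.8% -> 98.0% clean over 10 steps, 90.5% over 50
    # 90.0% -> 34.9% over 10 steps,  0.5% over 50
    # 80.0% -> 10.7% over 10 steps,  0.0% over 50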

    As I said a year ago, all consumer-facing AI is still alpha software. It is nowhere close to being ready for prime time. In several cases there appears to be some serious regression.

    25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

    Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erode customer trust.

    "Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.
  • Behind closed doors, Apple is embarrassed by its slow Siri rollout, too

    Two-thirds of the time is horrid. Even 90% is useless.

    Put it in perspective using a real-world comparison: a human personal assistant.

    Let's say you give your human P.A. three tasks:
    1. pick up the dry cleaning (via TaskRabbit, for the AI assistant),
    2. e-mail a vendor that their account will be past due tomorrow, thus incurring a 1.5% service charge, and
    3. book a round-trip flight on April 17th from Los Angeles to San Jose (SJC, in California).

    Your AI assistant correctly accomplishes only two of the three tasks. Now if it's the dry cleaning, that's maybe not a big deal. But the other two are. And there are plenty of ways the AI assistant can screw up. Maybe it tells the vendor they're fired tomorrow. Maybe it quotes a 2.5% service charge. Maybe it books you to SJO (San José, Costa Rica) instead of SJC.

    The problem is you don't get to choose which task the AI assistant fails at.

    Now if you had a human personal assistant, you'd fire them for effing up #2 or #3.

    Realistically, a useful AI assistant (or human assistant) really needs to be about 99.8% accurate. Assistants need to be reliable, accurate, and private. And not just two of those three attributes.
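
    As a rough sketch of what those accuracy levels mean in practice (the daily task volume below is a made-up assumption, not data):

    # How many messes pile up per month at a given accuracy?
    tasks_per_day = 30   # hypothetical workload for a busy assistant
    days = 30

    for accuracy in (0.998, 0.90, 2 / 3):
        failures = tasks_per_day * days * (1 - accuracy)
        print(f"{accuracy:.1%} accurate: ~{failures:.0f} failed tasks per month")

    # 99.8% -> ~2 per month; 90% -> 90 per month; two-thirds -> 300 per month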

    What if your cellular provider didn't deliver 40% of your text messages? If your transit card failed at 40% of fare gates? If your car wouldn't start three days a week? If your credit card failed to authorize a couple of times a day?

    Hell, what if the Tokyo Metro subway payment system screwed up 0.02% of transactions every day? That's over a thousand rides, every single day. Or if ATMs dispensed the wrong amount of cash that often. If you had a Pasmo transit card that failed 40% of the time, you'd probably give up and just buy paper tickets from the ticket vending machine.
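
    For what it's worth, the arithmetic roughly checks out. The ridership figure below is my assumption; Tokyo Metro publicly reports on the order of six to seven million rides per day.

    daily_rides = 6_500_000   # assumed ballpark, not an official figure
    failure_rate = 0.0002     # 0.02%
    print(f"~{daily_rides * failure_rate:,.0f} failed transactions per day")
    # ~1,300 per day: over a thousand botched rides, every single day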

    Apple knows this. An AI-powered Siri needs to be way better than the current Siri. It needs to be at least as good as a really, Really, REALLY good human assistant, because going back to clean up someone else's mess (AI or human) takes too much time. And you lose trust in that assistant very quickly.

    "Fake it until you make it" is not a credible business plan in the real world. That's something Elizabeth Holmes would do.

    Apple cannot afford to put out an AI-assisted Siri that only gets things right two-thirds of the time and promise that it'll get better. We already have way too many LLM-powered AI chatbot assistants that dole out garbage on a regular basis. The world is not going to be any better off with Yet Another Lame Assistant.