mpantone

About

Username: mpantone
Joined:
Visits: 805
Last Active:
Roles: member
Points: 3,779
Badges: 1
Posts: 2,528
  • John Giannandrea out as Siri chief, Apple Vision Pro lead in

    tokyojimu said:
    Maybe they should have been investing time and effort into Siri and AI instead of Jony’s car project.
    That ship sailed a long time ago. Now they are playing catchup. Hindsight is always 20-20.

    Apple has an additional hurdle: one of the key pillars of how they operate is their defense of customer privacy. Some of their competition really does not value privacy. In fact, some of them (Meta, Alphabet) make most of their revenue selling users' online activity data.

    In the end I would rather wait for a private and reliable AI assistant than muck around with alpha-quality AI assistant shovelware that is doing God knows what with my activity. At least with Meta and Alphabet, I know they are selling it.
  • John Giannandrea out as Siri chief, Apple Vision Pro lead in

    gatorguy said:
    mpantone said:
    As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

    A couple of weeks before the Super Bowl, I asked half a dozen LLM-powered AI assistant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

    Earlier today I asked several chatbots to fill out a 2025 NCAA men's basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of its performance in the 2024 tournament.

    I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

    Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not yet programmed into consumer-facing LLMs.

    An AI assistant really needs to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting only one of the four #1 seeds correct (all of which are published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important, like scheduling an oil change for your car? Keeping your medical information private?

    As I said a year ago, all consumer-facing AI is still alpha software. It is nowhere close to being ready for prime time. In several cases there appears to be some serious regression.

    25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

    Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erode customer trust.

    "Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.
    Did you try Gemini, currently 2.0 Flash? In a voice search on my Pixel it listed South Auburn Tigers, West Gators, East Duke, and Midwest Cougars.
    I did not give Gemini the bracket question. I did give it the Super Bowl question which it failed like the others.

    Your comment brings up an important illustrative point. No one has the time to dork around with 7-8 AI chatbots to find one (or more) that gives the correct answer for each question. That's not a sustainable approach.

    There's probably some AI chatbot that might get the right answer to a given simple question. The problem is that no AI chatbot is reliably accurate enough to instill trust and confidence. I can't ask ten questions to eight chatbots and wade through the responses. In the same way, having ten human personal assistants isn't a worthwhile approach.

    Let's say Grok has a 20% accuracy score and Gemini is 40%. That's double the accuracy for Gemini but it still is way too low to be trusted and deemed reliable.

    Like I said, I think Apple's senior management understands this, which is why they've postponed the AI-enabled Siri. Even if it were 60-80% accurate, that's still too low to be useful. You really don't want a personal assistant -- human or AI -- that makes so many messes that you have to clean up after them, or find an alternate personal assistant that might do some of those failed tasks better. In the end, for many tasks right now you are better off using your own brain (and maybe a search engine) to figure things out, because AI chatbots will unapologetically make stuff up.
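
    To put rough numbers on that point, here is a minimal back-of-the-envelope sketch in a few lines of Python. The accuracy figures are hypothetical and the requests are assumed to be independent; this is not a measurement of any real assistant.

        # Back-of-the-envelope sketch: if each request succeeds independently with
        # probability p, the chance of handling n requests with zero mistakes is p**n.
        # The accuracy values below are hypothetical, not measurements of any chatbot.
        for p in (0.25, 0.80, 0.998):
            for n in (10, 100):
                print(f"accuracy {p:.1%}: {p**n:6.1%} chance of no mistakes in {n} requests")

    Even at 99.8% per request, roughly one request in five hundred still goes wrong, which is why the bar for a trustworthy assistant is so high.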

    All of these AI assistants will flub some questions. The problem is you don't know which one will fail which question at any given moment. That's a waste of time. I think the technology will eventually get there, but I'm much more pessimistic about the timeline today compared to a year ago because improvements in these LLMs seem to have stalled or even regressed. I don't know why that is, but it doesn't matter to Joe Consumer. It just needs to work. Basically all the time. And right now none of them do.

    For sure I am not the first person to ask an AI assistant about the Super Bowl and March Madness. And yet these AI ASSistants have zero motivation to improve accuracy even if they are caught fibbing or screw up an answer.

    I've used all of the major AI assistants and they all routinely muck up. The fact that I did not try Gemini is simply because I got far enough by the third AI chatbot to deem this a waste of time. I can't keep jumping from one AI chatbot to another until I find one that gives an acceptable result.

    In most cases, the AI assistant doesn't know it is wrong. You have to tell the developer (there's often a thumbs up or thumbs down for the answer). And that doesn't make the AI assistant get it right for the next person who asks the same question. Maybe enough people asked Gemini the question and the programmers fixed Gemini to give the proper response. One thing is for sure: AI assistants don't have any common sense whatsoever. Hell, a lot of humans don't either, so if LLMs are modeled after humans, it's no wonder that AI assistants are so feeble.

    Here in March 2025 all consumer-facing AI assistants are silly toys that wastefully use up way too much electricity and water. I still use some AI-assisted photo editing tools because my Photoshop skills are abysmal. But for a lot of other things, I'll wait for AI assistants to mature. A lot more mature.
  • John Giannandrea out as Siri chief, Apple Vision Pro lead in

    As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

    A couple of weeks before the Super Bowl, I asked half a dozen LLM-powered AI assistant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

    Earlier today I asked several chatbots to fill out a 2025 NCAA men's basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of its performance in the 2024 tournament.

    I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

    Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not yet programmed into consumer-facing LLMs.

    An AI assistant really needs to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting only one of the four #1 seeds correct (all of which are published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important, like scheduling an oil change for your car? Keeping your medical information private?

    As I said a year ago, all consumer-facing AI is still alpha software. It is nowhere close to being ready for prime time. In several cases there appears to be some serious regression.

    25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

    Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erode customer trust.

    "Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.
  • Apple TV+ is losing billions of dollars -- as planned and expected

    8thman said:
    I avoid SERIES lock-ins. I don’t want the time commitment.
    I prefer Movies.

    The Stories are Mediocre and production values are lower than movies.
    That's fine, but Apple TV+ is a subscription service. If you just want movies, you can rent them à la carte from Apple or some of their competitors. Series lock-in is a way to keep customers maintaining their subscriptions. And people who follow series are presumably enjoying the ones they follow.

    And there are some who think of series as a really long movie that has been chopped up into more digestible chunks -- like chapters of a book.

    Multi-part stories go back to the beginning of human civilization. Even the first known story -- the Epic of Gilgamesh -- has multiple chapters. And yes, you could make the Epic of Gilgamesh into a movie. Or a television series.

    For sure, one thing a television series can do is go deeper into a story. A movie with a typical two-hour runtime simply doesn't have enough space to capture complex or intricate details. We have seen this with many movies, like the Lord of the Rings trilogy or the Harry Potter films. Go ahead and rewatch the LotR trilogy and count how many minutes Tom Bombadil appears.

    Of course there's a flip side to television series. The average runtime for a one-hour television show is 42 minutes of content; the rest is commercials. So each TV episode needs to come in around 42 minutes, whereas a movie has no such rigid requirement.
  • New Apple TV+ studio buildings taking shape in Culver City

    Nah, Apple won't buy Ubisoft. Ubisoft has a lot of problems, and undoubtedly the Guillemot family will ask too much for it yet insist on control.

    Apple does not let any of its corporate acquisitions call the shots.

    But this article is about a television studio, not a videogame studio. It'll be interesting to see what comes out of this facility when it is done (likely five years from now).