mpantone

About

Username: mpantone
Joined: November 2006
Visits: 803
Last Active: August 12
Roles: member
Points: 3,772
Badges: 1
Posts: 2,525

Reactions

2.1KLike201Dislike346Informative

Lumon Terminal Pro in Apple's online store is a master-class on media marketing

mpantone

March 27

Good for Apple, they should milk this for all they can. (Disclaimer: I have never seen even a minute's worth of Severance.)

The media consuming public in 2025 is extremely fickle and soon they will move to something else. The window for a company to take advantage of this level of attention is extremely narrow these days.
John Giannandrea out as Siri chief, Apple Vision Pro lead in

mpantone

March 21

Wesley Hilliard said:

9secondkox2 said:

Tim’s ai, headset, and car haven’t panned out so far.

Would love to see a return to the days of Apple not releasing something until it’s 100% ready and captivating. I’ll give Cook’s car a pass since it was never officially announced.

Even the watch was closely guarded. And it was a success when it finally launched. A return to form in that regard would be most welcome.

No more public experiments and betas please.

I'll never understand what people mean when they say this other than "I don't like it so it didn't pan out." By what metrics are you measuring the panning?

Other than delaying one feature, Apple Intelligence is doing fine. Apple didn't promise a sentient machine like others in the space did. Is that a bad thing? Them delaying something that isn't ready instead of releasing it anyway is exactly what you're asking Apple to do, yet you're criticizing them for it.

And unless you know something about how Apple Vision Pro has done so far and Apple's expectation for the product, there's no way of knowing "how it panned out." It's been pretty awesome from my perspective.

Nah, Vision Pro weighs too much regardless of the image quality. (Disclaimer: I own an Oculus Rift S).

For the ficticious future "Apple Glass" to be successful, it really needs to be around the same weight as my current eyeglasses: about 30 grams.

One of the AI photo features has been helpful. Of course there are third party tools that function similarly, it's just nice to have the Apple Intelligence one closely integrated in the OS.

You're a writer so you are attuned to the writing benefits of Apple Intelligence. Many on this planet won't benefit. As I mentioned elsewhere, Apple Intelligence is good at generating corpspeak, the stuff one shoves into a work e-mail. There's very little style involved. Writing is highly commoditized these days, it doesn't take much effort for an AI assistant to pump out copy that resembles 99.9% of drivel on the Internet anyhow.

And with LLMs training on Internet accessible content, the quality of their datasets is dropping over time. Old school computer scientists have an expression for this: GIGO (Garbage In, Garbage Out). That's a really good way to describe LLM-powered AI chatbot assistant responses in March 2025.

Note that there are similar examples of regression in generative AI photo tools. Some of the AI generation tools today put out less satisfactory images than they did a year ago.

I expect consumer-facing AI to put out more and more useless crap in the next 12-24 months because these models are training themselves on garbage that their predecessors created. There really needs to be some sort of quantum leap in AI model performance to circumnavigate this decline. The longer this takes, the harder it will be for AI models to overcome the poor quality data.

When I do a standard Internet search engine query, I'm seeing mostly garbage. It's pretty easy for me to identify in a few seconds but LLMs don't have any natural quality distinctions, they will take absorb satire, fact, and outright lies equally. This probably explains why LLMs can't effectively process the junkmail folder in my e-mail. What is obvious to me as junkmail goes unrecognized by AI tools.
Apple's premature Apple Intelligence ad subject of new lawsuit

mpantone

March 21

Even if the lawsuit is meritless it draws negative attention to Apple.

The best way they can counteract this is to quickly ship a super-useful, super-effective, super-private AI-powered Siri, just like the one they pictured in their now-revoked advertisements just a few months ago.

They talked the talk. Now they have to walk the walk. Remember that Apple themselves put this forth.
John Giannandrea out as Siri chief, Apple Vision Pro lead in

mpantone

March 21

gatorguy said:

mpantone said:

gatorguy said:

mpantone said:

As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

A couple weeks before Super Bowl I asked half a dozen LLM-powered AI assitant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

Earlier today I asked several chatbots to fill out a 2025 NCAA mens basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024.

I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not programmed yet into consumer facing LLMs.

An AI assistant really need to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like schedule an oil change for your car? Keep your medical information private?

As I said a year ago, all consumer facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erodes customer trust.

"Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.

Did you try Gemini, currently 2.0 Flash? In a voice search on my Pixel it listed South Auburn Tigers, West Gators, East Duke, and Midwest Cougars

I did not give Gemini the bracket question. I did give it the Super Bowl question which it failed like the others.

Your comment brings up an important illustrative point. No one has the time to dork around with 7-8 AI chatbots to find one (or more) that gives the correct answer for each question. That's not a sustainable approach.

There's probably some AI chatbot that will might get the right answer to a simple question. The problem is no AI chatbot is reliably accurate enough to instill trust and confidence. I can't ask ten questions to 8 chatbots and wade through the responses. In the same way, having ten human personal assistants isn't a worthwhile approach.

Let's say Grok has a 20% accuracy score and Gemini is 40%. That's double the accuracy for Gemini but it still is way too low to be trusted and deemed reliable.
FWIW, Gemini has made major moves in the past 60 days, so it might be worth giving them more of a shot. But yeah, point make about reliability. 80% isn't good enough to be trusted. That's still solidly in trust but verify territory.

Like I said earlier Gemini also failed to answer the simple question "When is the Super Bowl kickoff?" less than two weeks before the actual game.

Gemini is untrustworthy. All AI ASSistants are untrustworthy. I had a much better attitude about AI ASSistants a year ago. Basically there has been scant improvement in the past year and some notably regressions.

80% is still way too low. I would fire any human personal assistant if that was their success rate. Your standards sucks. And that's basically why today's AI assistants are accepted as they are. Because of people like you.

What if your local coffee shop got 20% of your orders wrong? What if you got traffic tickets 20% of the time you are stopped at a red light? People like you reward piss poor performance. Mediocrity is NOT okay. And Apple knows this is WRONG. If they thought it was right, they'd ship a POS like Grok.
John Giannandrea out as Siri chief, Apple Vision Pro lead in

mpantone

March 20

mrstep said:

mpantone said:

As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

A couple weeks before Super Bowl I asked half a dozen LLM-powered AI assitant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

Earlier today I asked several chatbots to fill out a 2025 NCAA mens basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024.

I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not programmed yet into consumer facing LLMs.

An AI assistant really need to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like schedule an oil change for your car? Keep your medical information private?

As I said a year ago, all consumer facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erodes customer trust.

"Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.

Agreed. I think it probably needs to be something like 99.999% right, the current "AI" doesn't cut it. The delight of "this is incredible" is a thing ("it's a monkey juggling watermelons... haha"), but that's when you also laugh off bad summaries, 7-fingered hands, and hallucinations - none of which are endearing when asking a question like "when does XYZ close today?" if you have to count on it while planning your day. Obviously Siri offering to search the web for that isn't good, but a confident wrong answer doesn't help either.

Prompting "Siri, I don't think that's right. Double-check that answer' and getting "hmm, it looks like that was wrong..." is the state of current ML models. I don't really see how that gets turned into a significantly better output just because it's Apple investing in it - loads of companies are investing in it, it's just not baked - and that's the current state of the art. They're statistical models with ungodly amounts of information used to train them, so it looks like magic, but it's still science. Cool, sometimes helpful, but damn you better not just assume the output is correct.

It's not magic, that's for sure. Asking several AI chatbots to fill in a March Madness bracket was a complete joke and illustrated how painfully obtuse LLM-powered AI assistants still are in March 2025. They are years and years away from replicating something that a thoughtful adult could do.

Someday we can look back at this and laugh at it ("boy, those were crazy times"). But right now there are morons paying subscription fees to have early access to alpha software. True insanity but hey, it's not my money they are spending. Feel free to shell out your hard earned dollars to be a tester.

Based on recent failures for these AI assistants in tackling the March Madness bracket as well as identifying the time for the Super Bowl kickoff (asked less than two weeks before the game), I've deleted all AI assistants from my various devices (iPhone, iPad, etc.). Even ChatGPT is gone for the time being.

I'll revisit these in a year or two. Right now they're just an embarrassment and a massive time sink. Or maybe I should keep one and put it in my Games folder on my iPhone.