jas99

About

Username: jas99
Joined:
Visits: 99
Last Active:
Roles: member
Points: 734
Badges: 1
Posts: 185
  • John Giannandrea out as Siri chief, Apple Vision Pro lead in

    mpantone said:
    gatorguy said:
    mpantone said:
    As far as I can tell, consumer-facing AI isn't improving in leaps and bounds anymore, and probably hasn't for about a year or 18 months.

    A couple of weeks before the Super Bowl, I asked half a dozen LLM-powered AI assistant chatbots when the Super Bowl kickoff was scheduled. Not a single chatbot got it right.

    Earlier today I asked several chatbots to fill out a 2025 NCAA men's basketball tournament bracket. They all failed miserably. Not a single chatbot could even identify the four #1 seeds. Only Houston was identified as a #1 seed by more than one chatbot, probably because of their performance in the 2024 tournament.

    I think Grok filled out a fictitious bracket with zero upsets. There has never been any sort of major athletic tournament that didn't have at least one upset. And yet Grok is too stupid to understand this. It's just a dumb probability calculator that uses way too much electricity.

    Context, situational awareness, common sense, good taste, humility. Those are all things that AI engineers have not yet programmed into consumer-facing LLMs.

    An AI assistant really needs to be accurate 99.8% of the time (or possibly more) to be useful and trustworthy. Getting one of the four #1 seeds correct (published on multiple websites) is appallingly poor. If it can't even identify the 68 actual teams involved in the competition, what good is an AI assistant? Why would you trust it to do anything else? Something more important like scheduling an oil change for your car? Keeping your medical information private?

    As I said a year ago, all consumer-facing AI is still alpha software. It is nowhere close to being ready for primetime. In several cases there appears to be some serious regression.

    25% right isn't good enough. Neither is 80%. If a human assistant failed 3 out of 4 tasks and you told them so, they would be embarrassed and probably afraid that they would be fired. And yes, I would fire them.

    Apple senior management is probably coming to grips with this. If they put out an AI-powered Siri that frequently bungles requests, that's no better than the feeble Siri they have now. And worse, it'll probably erode customer trust.

    "Fake it until you make it" is not a valid business model. That's something Elizabeth Holmes would do. And she's in prison.
    Did you try Gemini, currently 2.0 Flash? In a voice search on my Pixel it listed South: Auburn Tigers, West: Gators, East: Duke, and Midwest: Cougars.
    I did not give Gemini the bracket question. I did give it the Super Bowl question, which it failed like the others.

    Your comment brings up an important illustrative point. No one has the time to dork around with 7-8 AI chatbots to find one (or more) that gives the correct answer for each question. That's not a sustainable approach.

    There's probably some AI chatbot that might get the right answer to a simple question. The problem is no AI chatbot is reliably accurate enough to instill trust and confidence. I can't ask ten questions of 8 chatbots and wade through the responses. In the same way, having ten human personal assistants isn't a worthwhile approach.

    Let's say Grok has a 20% accuracy score and Gemini is 40%. That's double the accuracy for Gemini but it still is way too low to be trusted and deemed reliable.

    Like I said, I think Apple's senior management understands this, which is why they've postponed the AI-enabled Siri. Even if it were 60-80% accurate, that's still too low to be useful. You really don't want a personal assistant -- human or AI -- that makes so many messes that you have to go clean up after them or find an alternate personal assistant that might do some of those failed tasks better. In the end, for many tasks right now, you are better off using your own brain (and maybe a search engine) to figure things out, because AI chatbots will unapologetically make stuff up.

    All of these AI assistants will flub some questions. The problem is you don't know which one will fail which question at any given moment. That's a waste of time. I think the technology will eventually get there, but I'm much more pessimistic about the timeline today compared to a year ago because improvements in these LLMs seem to have stalled or even regressed. I don't know why that is, but it doesn't matter to Joe Consumer. It just needs to work. Basically all the time. And right now none of them do.

    For sure I am not the first person to ask an AI assistant about the Super Bowl and March Madness. And yet these AI ASSistants have zero motivation to improve accuracy, even when they are caught fibbing or screwing up an answer.

    I've used all of the major AI assistants and they all routinely muck up. The fact that I did not try Gemini is simply because I got far enough by the third AI chatbot to deem this a waste of time. I can't keep jumping from one AI chatbot to another until I find one that gives an acceptable result.

    In most cases, the AI assistant doesn't know it is wrong. You have to tell the developer (there's often a thumbs up or thumbs down for the answer). And that doesn't make the AI assistant get it right for the next person who asks the same question. Maybe enough people asked Gemini the question and the programmers fixed Gemini to give the proper response. One thing is for sure: AI assistants don't have any common sense whatsoever. Hell, a lot of humans don't either, so if LLMs are modeled after humans, it's no wonder that AI assistants are so feeble.

    Here in March 2025, all consumer-facing AI assistants are silly toys that wastefully use up way too much electricity and water. I still use some AI-assisted photo editing tools because my Photoshop skills are abysmal. But for a lot of other things, I'll wait for AI assistants to mature. A lot more mature.
    You are absolutely correct. I believe Apple also agrees with you. They know the masses are in a delusional frenzy about AI-enabled … whatever. The truth is the LLM fake-intelligence chatbot DOES NOT WORK and is UNACCEPTABLE for Apple’s products.
    Apple has to develop its own neural-network machine learning instead of this LLM snake oil.
    I’m just sorry Apple committed itself to being a purveyor of a useful system called Apple Intelligence when, apparently, it’s based on snake oil LLMs.
  • How the new Apple Invites app works, and when you want to use it

    Why all the controversy? Apple decided to create a piece of software that integrates well with your entire Apple account.
    I tried it. Seems OK. It might provide a more visually appealing invitation than if it were done through other means.
    This probably did not take much time away from Apple’s other engineering efforts.
    Great! Let’s move on to be upset about things that deserve to raise our hackles. 
  • iOS 18.3 arrives with Visual Intelligence update, notification summary changes

    No issues with the upgrade.
  • If you want to buy the 2022 iPhone SE, do it now

    Afarstar said:
    Who on earth would buy a 2022 model SE?
    You don’t know the situation of everybody else in the world. I personally am not interested, but others may be.

    I appreciate a notification like this from AppleInsider.
  • iPhone 16 Camera Control button -- the ultimate guide

    I think this is really innovative and I like it.