This will actually be solved in a week. All it takes is to add the current time to each input.

I just used the voice feature in my truck to enter an address for Google Maps like always, and it came up as Gemini with a long speech. I repeated the address, and it asked me whether I wanted the location in my home city or one in a city over 400 miles away. Regression with exponential cost.

And every fake-friendly long-winded response consumes more electricity and water than it should, while also being useless.

Makes me so angry. All the problems that could've been solved with that kinda money. Climate crisis. World hunger. Population migration. Housing affordability.

If Trump triggered WW3 and we all got nuked I'd be fine with it. We don't deserve to exist.

Instead, all that money is being used to accelerate our doom. AI datacenters unnecessarily consuming power and drinking water in small towns everywhere. Many just dumping humidity into the air and letting that water literally blow away via lazy evaporative cooling. Most "normal" water consuming processes consume, treat, and return water to the downstream-traveling aquifer.

Now, couple that with an overall warming climate. The warmer the air, the more moisture it can hold. So we end up with more water vapor in the air than normal. With the weirding factor of climate change, this means more water and energy for more powerful and destructive storms, the likes of which humanity has never seen. Which feeds back into more ice melting, oceans rising, permafrost melting, cycle, accelerate, cycle, accelerate.

Also, real curious to see how millions of warehouses belching humidity and heat into the air across the surface of the globe can affect the general weather patterns, but that sadly won't be known until after the damage is done.

There was an ad during the Super Bowl that succinctly sums up how I feel right now: “America deserves Pepsi”

The climate crisis couldn't be solved with such a small sum of money, and addressing world hunger would decimate our economies. A person in Africa wouldn't pick coffee beans for ten cents a day if he wasn't really, really hungry.

The climate crisis couldn't be solved with such a small sum of money

852 billion isn't 'a small sum of money'. And that's just OpenAI - add to it what Google, MS, Meta etc have spent and I'll bet you'll get close.

addressing world hunger would decimate our economies

What will decimate the global economy is the correction that will happen when either AI proves to be a folly or investors realise they'll never recoup what they've paid out. Or when there's mass unemployment cos AI has taken all the human jobs. We're cooked whichever way it goes.

Why's this need to be on the LLM? They control the app, can't they just make a tool call out?

Hey, set a timer for 60 seconds.

ChatGPT analyzes text

You want a timer for 600 seconds, got it!

Sets timer for 600 seconds with api.

This is the most unhinged take from both sides.

Time can’t exist in an LLM by design: it’s just a thing that predicts the next token based on previous tokens. There is no temporal relation between tokens. You can stop and resume generation at any point. How does anyone expect it to “count time”? Based on what? The best you can do is add a time mark to the model input at some interval.
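As a rough illustration of the "add a time mark to the input" idea, here is a minimal Python sketch. The function name `stamp_message` and the bracketed prefix format are made up for this example and are not taken from any real chat product. The point is only that the surrounding app reads the clock and the model just sees the result as text.

```python
from datetime import datetime, timezone

def stamp_message(text: str, now: datetime) -> str:
    """Prefix a message with the wall-clock time observed by the app (hypothetical format)."""
    return f"[{now.isoformat(timespec='seconds')}] {text}"

# Hypothetical two-message conversation; the app, not the model, reads the clock.
t0 = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
t1 = datetime(2025, 1, 1, 12, 1, 30, tzinfo=timezone.utc)

prompt = "\n".join([
    stamp_message("Start a timer.", t0),
    stamp_message("How long has it been?", t1),
])
print(prompt)
```

With stamps like these in the context, the model at least has two times to compare, though whether it subtracts them correctly is a separate question.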

To simplify: even moderately complex biological systems have some kind of clock that actually ticks chemically and produces a signal they can react to.

LLMs can’t do that like at all. They never will. Some other architecture that runs in cycles? Maybe. But transformer shit? Never ever.

Some other architecture that runs in cycles?

Spiking networks!

Nobody really has a good handle on which SNN architecture would be ideal for which task and they're ridiculously hard to train, but if there is ever going to be an AGI (let alone an ASI), my money is on it popping out of something like an SNN that can also simulate neuroplasticity

The issue is that ChatGPT will tell you that it can do those things. Most of the hype for "AI" has been predicated on treating it like actual artificial intelligence and not the LLM parrot it truly is

I don't think anybody is expecting an LLM to do it

what they are expecting is the product, chatGPT, to be a one-stop spot that can do basic tasks like that

Basically just rely on tool calling to a time server of some sort 🤷
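A minimal sketch of what "tool calling to a time server" could look like on the app side, in Python. The tool name `get_current_time`, the JSON shape of the call, and `handle_tool_call` are all assumptions for illustration; real products each define their own tool-call formats.

```python
import json
from datetime import datetime, timezone
from typing import Callable, Dict

def get_current_time() -> str:
    """Return the current UTC time as an ISO 8601 string."""
    return datetime.now(timezone.utc).isoformat(timespec="seconds")

# Hypothetical registry of tools the app exposes to the model.
TOOLS: Dict[str, Callable[[], str]] = {"get_current_time": get_current_time}

def handle_tool_call(raw: str) -> str:
    """Run the tool named in a JSON tool call and return its result as text."""
    call = json.loads(raw)
    name = call.get("name")
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"
    return TOOLS[name]()

# The model would emit something like this instead of guessing the time.
print(handle_tool_call('{"name": "get_current_time"}'))
```

The clock lives entirely in ordinary code here; the model's only job is deciding when to ask for it.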

Tell me the name of a person who needs an AI for setting up timers. It is useless as ****!

To be fair, timers are hard.

Let's give it a try and see how far we get:

00:00:01

00:00:02

00:00:0E

00:00:0F

The hexadecimal value of 00:00:0F relates to an almost black color with a faint of blue — many call it the Commodore black as it was the same color as the ill-fated Commodore CDTV console, which was based on the Commodore 64 home computer. It is not only a nice color — it is also a statement of love for retro Commodore consoles.

Commodore.

Commodore.

Commodore.

Commodore.

Commodore.

Commodore.

Commodore.

Commodore.

ERROR! MODEL COLLAPSE DETECTED! TERMINATING SESSION AND CLEARING CONTEXT WINDOW.

Wednesday

null

I shall gladly pay you Tuesday for a hamburger today!

Seven. A.M. Case the restaurant, run background checks on the staff. Can the cook be trusted? If not, I gotta kill him. Dispose of the body, replace him with my own guy no later than 4:30...

Blenteenth of Farglesblat

if you have a variable clock speed, yeah.

Dear Scat Altman, just add a timestamp at each response that the LLM can read

It's a Large Language Model, not a Large Number Model.

How about a small timer?

Alt-Man is a size queen.

Sam Altman wants funding right?

Here is an idea. I would pay 1000 dollars to get in a boxing ring with this guy, and probably a lot of other people would love to get a shot at that punchable face, no?

We have solved funding.

Shit like this is a reminder to me that a large portion of the people behind some AI products' hype have no clue what these products even do. I wonder how the world would change if these jacks of all trades, who ~~invest~~ waste so much time collecting ideas to fill their pockets, instead spent more time actually understanding the ideas they have chosen and built at least a fundamental knowledge of them.

Upper management in a nutshell

I wonder how the world would change if these jacks of all trades, who ~~invest~~ waste so much time collecting ideas to fill their pockets, instead spent more time actually understanding the ideas they have chosen and built at least a fundamental knowledge of them.

I am afraid they would be even more dangerous

He's going to ask the US Congress for a bailout with taxpayers' money when this all fails, and Congress is most likely going to give it to him because this one company is a huge part of the US economy.

And because they can frame it as being a national security issue because otherwise China's LLMs will dominate.

I don't think so, and I'm on the Ed Zitron train of thought as to why not.

The financial institutions got a bailout in '08 because the economy itself would have stopped functioning. That's different from stocks dropping. Also, there's like nothing to bail out? OpenAI and their ilk are just sucking down capital and returning nothing. Even if they get one bailout, they need a continuous stream of unlimited money forever? I don't think it'll happen.

I hope I'm right, cuz damn that shit is cancerous

If Trump is still in charge when the bubble pops, he'll do everything he can to bail them out. Altman knows how to flatter people, and he's doing that constantly with Trump. A significant part of Trump's base is silicon valley techbros who will lose their shirts if the bubble collapses. They had enough sway to get their guy installed as the VP. Getting a bailout will be easy for them. If they get poor, they won't be able to fund the MAGA movement.

Even if Trump isn't in charge anymore. Businesses that have fired a lot of employees and replaced what they did with LLM slop will say their businesses will be ruined if the bubble suddenly pops, so they'll frame it as the economy collapsing if the LLM bubble is allowed to pop. Not to mention they'll claim it's a national security matter because if American LLMs disappear the only ones left will be Chinese ones, and that would be a threat to national security. The fact that the military is extensively using LLMs in their bombing of Iran shows how integrated they now are into the way the military does things, and you can't ask the military to just go back to how things were done 5 years ago!

I expect that when the LLM bubble starts to pop, there will be enormous bailouts from the government, adding tens of trillions to the US debt. That's a long-term thing and will be someone else's problem.

I think a potential OpenAI "bailout" should go something like this:

  • The investors get their money back.
  • They have to sign a pact that they must not invest into AI anymore for a given amount of years (20+ minimum).
  • Massive regulatory overhaul to make sure stuff like this never happens. Also undo Dodge v. Ford Motor Co.
  • Scam Altman and the others go to life in prison.

... why should the investors get their money back? They invested ludicrous amounts of money into a technology with obvious limitations from the start with the intention of using that technology to replace many people's jobs. Losing that money will be a better lesson than some probably unenforceable "pact".

They have to sign a pact that they must not invest into AI anymore for a given amount of years (20+ minimum).

Problem is this might hurt actual AI research to punish a scam that has absolutely nothing to do with AI other than having coopted the name for marketing purposes.

(Any investment in actual AI research is doomed for decades anyway when this bubble pops, but this would cause even more harm than the bubble has already caused.)

(Also any form of research is probably ruined for decades anyway due to LLM-induced brain rot and having to sift through all the slop to try to recover any remaining fragments of actually useable knowledge, but, again, let's not make it even worse than it already is).

See, it does not understand time, so in order to vibe code in timer functionality, they need to start feeding it clocks.

Even if it could, it would be an order of magnitude more inefficient in terms of convenience than the stopwatch we already have on our phones.

"Hey ChatGPT, do the thing I could have done in 3-4 clicks on my clock app."

Not to mention the sheer wastefulness in terms of energy. A MINECRAFT REDSTONE MACHINE TIMER WOULD BE MORE EFFICIENT. (Not to mention that, unlike SOTA LLMs, it can run offline on a phone)

You are correct but I think you are missing the point.

Remember, from the perspective of all AI companies (OpenAI probably more than most), AI is this monster tech that will surely replace all workers and even your Grandma as it can bake better cookies.

This is yet another display of how lacking AI is in a simple, everyday task... but more importantly, it is a gigantic demonstration of how AI is completely blind to its own weaknesses, which is what makes it really really dangerous when used as prescribed by OpenAI and the others.

This situation is basically the same as when the brand new $700 iPhones (back when that was eye-wateringly expensive for a phone) could not run an alarm in the mornings and Apple's answer was something like "why are you using your Cadillac phone as a cheap alarm?"... it should fucking wake me up with a massage for that cost!

Oh yeah, that's definitely a problem too. And part of the problem is that AI companies seem to be blind to the LLMs' weaknesses as well.

Minecraft is Turing-complete, so, like, you can do a whole lot more than just be a timer.

People have coded Minecraft in Minecraft.

Absolutely. I was thinking of getting back into minecraft Redstone but I'd rather do it in a non-Microsoft alternative. Not to mention at least a dozen other projects on my backlog

Check out Luanti and Voxelibre!

Will do, thanks for the recommendations!

An open-source voxel game creation platform. Play one of our many games solo or together. Mod a game as you see fit, or make your own.

https://www.luanti.org/en/

VoxeLibre is a survival sandbox game for Luanti. Survive, gather, hunt, mine for ores, build, explore, and do much more. Inspired by Minecraft, pushing beyond.

https://content.luanti.org/packages/wuzzy/mineclone2/

Mesecons! They're yellow, they're conductive, and they'll add a whole new dimension to Luanti's gameplay.

Mesecons implements a ton of items related to digital circuitry, such as wires, buttons, lights, and even programmable controllers. Among other things, there are also pistons, solar panels, pressure plates, and note blocks.

Mesecons has a similar goal to Redstone in Minecraft, but works in its own way, with different rules and mechanics.

https://content.luanti.org/packages/Jeija/mesecons/

Yeah, I'd be playing minecraft, too, especially with my nephew, but uhm. that whole microslop thing ruins it for me.

We do enjoy space engineers, though. (lots of mining, lots of building. Nephew loves it when Klang accepts our sacrifices.) It's a bit more involved than minecraft, though.

(but niftishly, there's programmable blocks that will let you write C# code and... do things.) (Space Engineers 2 is in early access if you have the hardware.)

You might like Logic World

It looks interesting, but I'm looking for something with more... world in it. Something to use my logic circuits with (count storage items, simple store, item mail system...)

The public fundamentally misunderstands this tech because salesmen lied to them. An LLM is not AI. It just says the most likely thing based on what is most common in its training data for that scenario. It can't do math or problem-solve. It can only tell you what the most likely answer would be. It can't do functional things. It's like Family Feud, where it says what most of the people surveyed said.

Some of them will "do math" but not with the LLM predictor, they have a math engine and the predictor decides when to use it. What's great is when it outputs results, it's not clear if it engaged the math engine or just guessed.

when it outputs results, it's not clear if it engaged the math engine or just guessed

That depends on the harness though. In the plain model output it will be clear if a tool call happened, and it depends on the application UI around it whether that's directly shown to the user, or if you only see the LLM's final response based on it.

In all the UIs I have seen, not even one will tell you that it called the math engine. Maybe it does happen with "thinking" models, but I never tried.

Edit: I tried with DeepSeek. I don't have enough math knowledge to do crazy stuff, so I did an addition lmao

Quick note on terminology, there's no thing called a "math engine". Most models have the ability to run custom computer code in some way as one of the "tools" they have available, and that's what's used if a model decides to offload the calculation, rather than answer directly.

This is what that looks like in Claude Code:

Notice the lines starting with a green dot and the text Bash(python3...). Those are the model calling the "Bash" tool to run python code to answer the second and third question. The first question it answered (correctly, btw) without doing any tool call, that's just the LLM itself getting it right in a straight shot, similar to DeepSeek in your example. Current models are actually good enough to generally get this kind of simple math correct on their own. I still wouldn't want to rely on that, but I'm not surprised it got it correct without any tool calls.

So I tested my more complex calculations against DeepSeek, and it seems like (at least in the Web UI) it doesn't have any access to a math or code running tool. It just starts working through it in verbose text, basically explaining to itself how to do manual addition like you learn in school, and then doing that. Incredibly wasteful, but it did actually arrive at the correct answers.

Gemini is the only web-based AI app I thought to test right now that seems to have access to a code running tool, here's what that looks like:

It's hidden by default, but you can click on "Show code" in the top right to see what it did.

This is what I mean when I say the harness matters. The models are all pretty similar, but the app you're using to interact with them determines what tools are even made available to the LLM in the first place, and whether/how you're shown when it calls those tools.
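To make the "harness matters" point concrete, here is a small Python sketch of an app that holds the same sequence of events but renders it two ways. The `Event` type, the `kind` labels, and `render` are invented for this illustration and do not describe how Claude Code, Gemini, or DeepSeek actually store or display their transcripts.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Event:
    kind: str  # "tool_call", "tool_result", or "answer" (hypothetical labels)
    text: str

def render(events: List[Event], show_tool_calls: bool) -> str:
    """Render a transcript; tool activity is shown only if the UI opts in."""
    lines = []
    for event in events:
        if event.kind == "answer":
            lines.append(event.text)
        elif show_tool_calls:
            lines.append(f"[{event.kind}] {event.text}")
    return "\n".join(lines)

# Hypothetical run where the model offloaded a multiplication to code.
events = [
    Event("tool_call", "python3 -c 'print(1234 * 5678)'"),
    Event("tool_result", "7006652"),
    Event("answer", "1234 * 5678 = 7006652"),
]

print(render(events, show_tool_calls=False))
print("---")
print(render(events, show_tool_calls=True))
```

In the first rendering the user cannot tell whether the number was computed or guessed; in the second, the same run shows its work.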

I explain it as asking 100 people to Google something and taking the most common answer.

Yeah, that's basically exactly what family feud does.

Yep but instead of "name something a woman keeps in her purse" it's "write my legal document" or "is it ok to lick a lamp socket"

Great question! The answer to all three of your queries is "yes." Would you like me to search for the nearest lamp socket?

Is a human much different? We too require tons of training and we too are prone to stupid mistakes.

Fundamentally yes and no. The original commenter could've saved his breath; if people wanted to be educated on AI they have plenty of resources to do so, but instead they choose to remain ill-informed. The difference is that humans are capable of critical thinking and conceptual connection. We are just as prone to mistakes as AI, we just have a much higher aptitude for mistakes lol. Hence the goal is not to make a perfect AI; it's the much more achievable goal of making AIs that beat us in specific fields. Then to beat us in all fields.

Yes.

I know Lemmy hates AI with a fiery passion (and I too hate it for various reasons), but the ability to make this sort of prediction far more stably than anything that came before in natural language processing (fancy term of the day for those who haven't heard of it), however inefficiently it is built and run, is useful if you can nudge it enough in a certain direction. It can't do functional things reliably, but if you restrict it to parsing human language and extracting very specific information, have it output that in a machine-parsable form, and then use that as input for something you can program, you've essentially built something that feels like it can understand you in human language for a handful of tasks and carry out those tasks (even if the carrying-out part isn't actually done by an LLM). So pedantically, it's not AI, but most people not in tech don't know or care about the difference. It's all magic all the way down, like how computers should just magically do what they're thinking of. That's not changed.
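Here is a hedged Python sketch of that "LLM extracts, ordinary code acts" split, using the timer case from this thread. The JSON schema (`action`, `amount`, `unit`), the `TimerRequest` type, and `parse_timer_request` are all hypothetical. The model's output is treated as untrusted text and checked before anything runs.

```python
import json
import re
from dataclasses import dataclass

UNIT_SECONDS = {"seconds": 1, "minutes": 60, "hours": 3600}

@dataclass(frozen=True)
class TimerRequest:
    seconds: int

def parse_timer_request(user_text: str, model_output: str) -> TimerRequest:
    """Validate the model's JSON (hypothetical schema) before real code acts on it."""
    data = json.loads(model_output)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict) or data.get("action") != "set_timer":
        raise ValueError("unsupported action")
    amount, unit = data.get("amount"), data.get("unit")
    if type(amount) is not int or amount <= 0:
        raise ValueError("amount must be a positive integer")
    if unit not in UNIT_SECONDS:
        raise ValueError("unknown unit")
    # Cheap sanity check: the number must literally appear in what the user typed.
    if str(amount) not in re.findall(r"\d+", user_text):
        raise ValueError("extracted amount not found in the user's message")
    return TimerRequest(amount * UNIT_SECONDS[unit])

request = parse_timer_request(
    "Hey, set a timer for 60 seconds.",
    '{"action": "set_timer", "amount": 60, "unit": "seconds"}',
)
print(request)
```

The literal-number check is deliberately crude: it would catch a "60 became 600" slip, but it would also reject "two minutes" spelled out in words, so a real app would need something better.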

My point though, and this isn't targeting you specifically dear OC, is that we can circlejerk all we want here, but echoing this oversimplification of what LLMs can do is pretty irrelevant to the bigger discourse. Call these companies out on their practices! Their hypocrisy! Their indifference to the collapse of our biosphere, to human suffering, to leaving the most vulnerable high and dry!

Tech is a tool, and if our best argument is calling a tool useless when it's demonstrably useful in specific ways, we're only making a fool of ourselves, turning people away from us and discouraging others from listening to us.

But if your goal is to feel good by letting one out, please be my guest.

Peace

The only way to know if LLM output is accurate is to know what an accurate output should look like, and if you know that, you don't need an LLM. If you don't know what an accurate output should look like, an LLM is equally likely to confidently lie to you as it is to help you, making you dumber the more you use it. The only other situation is if you know what an accurate output should look like, but you want an inaccurate one, which is a bad thing to encourage.

"Demonstrably useful" is a lie. It's a blatant and obvious lie. LLMs are so actively detrimental to their users, and society as a whole, that calling them useless is being generous. And even if they were the most beneficial thing on the planet, there is still no reason to use the billionaire's toxic Nazi plagiarism machine.

The only way to know if LLM output is accurate is to know what an accurate output should look like, and if you know that, you don’t need an LLM

I empathize with your overall standpoint, but that's just plain wrong. There are a lot of problems where verifying an answer is much easier for a human (or non-LLM computer program) than coming up with a correct answer.

Anything that involves language manipulation, for example. I'll have a much easier time checking a translation from English to German for accuracy than doing the full translation myself, assuming the model gets most of it correct and I don't have to rewrite anything major (which is generally the case with current models). Or letting an LLM proof-read a text I wrote - I can't be sure it got everything, but the things it does find are trivial for me to verify, and will often include things that slipped past me and three other people who proof-read the same text. Less useful, but still applicable to the premise: Producing a set of words that rhyme with a given one. Coming up with new ones after the first couple that pop into your head gets pretty hard, but checking if new candidates actually do rhyme is trivially easy.

Moving on from language-stuff, finding security issues in software is a huge one - finding those is often extremely hard, but verifying them is mostly pretty straightforward if the report is well prepared. Models are just now getting good enough to reliably produce good security reports for actual issues.

Answering questions about a big codebase, where the actual value doesn't lie in the specific response the model gives, but pointing me to the correct places in the code where I can check for myself.

Producing code or entire programs is a bit more debatable and it depends heavily on the goal and the skill level of the operator whether complete verification is actually easier than doing it yourself.

Just a couple of examples. As I said, I get where you're coming from, but completely denying any kind of utility does not help your cause at all; it just makes you look like an absolutist who doesn't know what they're talking about.

If you know enough to verify a translation as accurate, or you have the tools to figure out an accurate translation through dictionaries or some such, then you know enough to do the translation yourself. If you don't, then I cannot trust your translation.

And if you can't trust the output to be comprehensive or correct, then why would you trust something like system security to an LLM? Any security analyst who deserves their job would never take that risk. You don't cut those corners.

Quick reminder: rhyming dictionaries exist. LLMs solved a solved problem, but worse.

Once again, even if the billionaire's toxic Nazi plagiarism machine was useful, it is so morally repugnant that it should never be used, which makes it functionally useless. This is an absolute statement, but trying to "um actually" that makes you look like either a boot-licker, a pollutant, a Nazi, a plagiarist, an idiot, or some combination of those.

I would rather look like an absolutist. How about you?

If you know enough to verify a translation as accurate, or you have the tools to figure out an accurate translation through dictionaries or some such, then you know enough to do the translation yourself.

Correct. But it's going to take me a lot more work and time, possibly to the point of not being feasible and probably even matching the energy cost of using the LLM over the entirety of the task.

why would you trust something like system security to an LLM?

I wouldn't. I don't know where you got that. Adding LLM-based analysis to your toolkit to spot important issues that otherwise might not have been found is just that: an addition. Not replacing anything. And it is demonstrably useful for that at this point, there's just no denying that.

Once again, even if the billionaire’s toxic Nazi plagiarism machine was useful, it is so morally repugnant that it should never be used, which makes it functionally useless.

My point is that if you are this confidently wrong about the capabilities of LLM-based tools, then why should I believe you to be any less wrong about the moral and ethical issues you're raising? It looks like you're either completely misinformed or deliberately fighting a strawman for a part of your argument, so it gives anyone on the other side an easy excuse to just not engage with the rest of it and just dismiss it entirely. That's what I'm trying to get across here.

Surely, the energy cost to verify the translation would be the same as translating it? If you're struggling that much, why are you translating it at all? I cannot trust your translation.

If you tell an LLM to generate reports, it will, regardless of the actual quality of the environment. It doesn't know what's secure and what isn't. All you've shown it to do is convince the kinds of security analysts with a system so insecure as to have a LOT of good reports that their system is more secure than it is. Which is useless at best, detrimental at worst.

It's useless for translation. It's useless for security analysis. It's useless for rhyming (I notice you didn't mention that one). You're trying so hard to prove how useful it is, and your failure demonstrates how useless it is.

You can't condemn confident wrongness and defend LLMs. And you can't defend the billionaire's toxic Nazi plagiarism machine while questioning someone else's morals. You can't cherry-pick my argument and claim I'm the one fighting a strawman. ...Well, not if you're arguing in good faith.

Sometimes I use AI even if I know the answer because I am a lazy person, and holy shit, I can confirm that it lies a lot and tells wrong shit

We already have tools that can give us incorrect answers in natural human language.

And they post their videos to youtube for free.

Wow, the only thing Siri is generally competent at.

My first thought as well, lol.

Scam Altman sounds like it's a name straight from an hltv comment section, I love it

That is genuinely hilarious!

Oh lol that's Husk, I didn't realise he made videos like that. I see his gaming videos from time to time and they're fun

It is pretty wild seeing him become the focus of world attention and big politics, when he is just a guy who made funny Hell Let Loose and DayZ videos.

Everyone’s getting their knickers in a twist over nothing here.

Of course an AI can track time, if it’s given access to a timer MCP server.

Can we track time without tools, just in our heads? Certainly not very accurately. We can, however, track it reasonably accurately if given access to a quartz stopwatch (typically around +/-15 s/month).

A language model is based around language and reasoning by words/symbols. It’s not a surprise it doesn’t have timing capability.

What Altman SHOULD be embarrassed about is that the model lies about its capabilities. That implies that the context is still not right - it should be adequately trained and given context to prevent the lying. That implies a much more worrying issue - and something that Anthropic handles far better, IMHO (when asked if it can track time, it says “no, not on my own”, and then proceeds to build a JavaScript timer that it offers up to track time).

well messages are clearly not stateless (otherwise there would be no context), but in general yes the issue is not the lack of capability, it's the complete unawareness of it and the insistence on lying about it.

THIS time it is ridiculously obvious but what if it does this after checking a very large data set where there would be no (good) way to verify its answer?

This is why AI, in its current form, is basically useless. If you cannot trust it NOT to lie, and must/should verify everything yourself, you might as well skip the useless step of asking.

To call AI useless is quite a strong statement.

There’s a million places to use it!

The problem is that the market thinks there’s a billion places to use it. And right now we’re funding 999 million places that shouldn’t be using AI but have the funding to do that dumb thing, so we can figure out the one million places where it makes fantastic sense.

I get your point and yes, I was exaggerating for effect but... are there a million places to use AI where you can blindly trust its output/work?

I do not really think it's completely useless; however, I do think the uses are very very limited (compared to the hype) and the cost of running these models for the benefit they provide makes them even less practical

I don't use them but I follow the news about them loosely. The reason for this is epistemic humility. Claude has a pretty good idea of what its capabilities are and where the ceiling is. ChatGPT has no clue what its limits are, so it believes it can do everything. Basically, ChatGPT has a lot of info and no idea where the gaps live, while Claude has a fair idea of when to search or use some external function to handle something. Gemini has less than Claude but more than ChatGPT. Grok has little to no epistemic humility, but it did manage to accurately portray Musk as a world champion piss drinker, something none of the others were able to do.

I say that, but it's been a few months since I looked. That could have changed because shit moves fast. By the looks of what it's trying to do with the timer, ChatGPT has less than it used to. Possibly because of the way the model is trained to be helpful and confident.

It could simply save a timestamp of the "begin timer" message and compare it to the timestamp of the "end" message. It's not that complicated, and writing a script and executing it is overkill... It just needs access to a calculator skill.
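The timestamp comparison described here is just a subtraction once the app keeps receipt times for messages. A minimal Python sketch, assuming the app stores each message with the time it arrived; the `Message` tuple, the phrase matching, and `elapsed_between` are illustrative stand-ins, not how any real assistant detects timer commands.

```python
from datetime import datetime, timedelta, timezone
from typing import List, Tuple

Message = Tuple[datetime, str]  # (time the app received it, message text)

def elapsed_between(log: List[Message], start_phrase: str, end_phrase: str) -> timedelta:
    """Time from the first start message to the first end message at or after it."""
    # next() raises StopIteration if either phrase never appears in the log.
    start = next(ts for ts, text in log if start_phrase in text.lower())
    end = next(ts for ts, text in log if end_phrase in text.lower() and ts >= start)
    return end - start

base = datetime(2025, 1, 1, 9, 0, 0, tzinfo=timezone.utc)
log = [
    (base, "Begin timer"),
    (base + timedelta(seconds=42), "unrelated chat"),
    (base + timedelta(minutes=3, seconds=5), "End timer"),
]

print(elapsed_between(log, "begin timer", "end timer").total_seconds())  # 185.0
```

No model is involved in the arithmetic at all, which is the point: the only thing the LLM would need to do is recognize which two messages matter.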

Yes, it handles it better, but it's still a dumb approach and waste of energy.

Aren’t we saying exactly the same? Give it an MCP server or a native skill that CAN track time.

Lol. Why don't they ask the AI how to program an AI?

They should just vibe code the feature. They'll have it done in an afternoon, right?

"It's making too many errors. Make it smarter."

Yo dawg I heard you like AI so we put an AI in your AI so it can AI while you AI!

You would already be doing a great service to the world if you produced a really well-tuned search engine / information digger with LLMs, but no, you had to periodically hype it as AGI because it can memorize entire textbooks with some accuracy. You did this to yourselves, and if you fall it will be because of these expectations, which are not met.

Okay, so, in case the headline is confusing anyone else, it's literal. Like, you know how there are those cringe-ass Alexa ads that are about how it does AI language processing and assistant shit? Yeah, ChatGPT can't I guess.

It seems out of scope for the tool, IMHO.

Just make Codex write the code for it. Should be easy. Don’t even need humans. Right?

Odd, because Home Assistant can use a locally run LLM to do so?

Because they probably work as agents, so they don't count by themselves. They use another app to do it. ChatGPT could probably also do that if it were integrated properly with your phone's software.

It's the Elon strategy. Works just right when the most powerful country in the world is full of people who can't read at a 6th grade level and a bunch of psychopaths.
