this post was submitted on 27 May 2024
475 points (98.0% liked)

You know how Google's new feature called AI Overviews is prone to spitting out wildly incorrect answers to search queries? In one instance, AI Overviews told a user to use glue on pizza to make sure the cheese won't slide off (pssst...please don't do this.)

Well, according to an interview at The Verge with Google CEO Sundar Pichai published earlier this week, just before criticism of the outputs really took off, these "hallucinations" are an "inherent feature" of AI large language models (LLMs), which are what drive AI Overviews, and this feature "is still an unsolved problem."

[–] [email protected] 205 points 3 months ago (7 children)

They keep saying it's impossible, when the truth is it's just expensive.

That's why they won't do it.

You could train AI only on good sources (scientific literature, not social media) and then pay experts to talk with the AI for long periods of time, giving feedback directly to the AI.

Essentially, if you want a smart AI you need to send it to college, not drop it off at the mall unsupervised for 22 years and hope for the best when you pick it back up.

[–] [email protected] 92 points 3 months ago (2 children)

No, he's right that it's unsolved. Humans aren't great at reliably telling truth from fiction either. If you've ever been in a highly active comment section you'll notice certain "hallucinations" developing, usually because someone came along and sounded confident and everyone just believed them.

We don't even know how to get full people to do this, so how does a fancy Markov chain do it? It can't. I don't think you solve this problem without AGI, and that's something AI evangelists don't want to think about because then the conversation changes significantly. They're in this for the hype bubble, not the ethical implications.

[–] [email protected] 40 points 3 months ago (2 children)

We do know. It's called critical thinking education. This is why we send people to college. Of course there are highly educated morons, but we are edging bets. This is why the dismantling or coopting of education is the first thing every single authoritarian does. It makes it easier to manipulate masses.

[–] [email protected] 31 points 3 months ago (2 children)

"Edging bets" sounds like a fun game, but I think you mean "hedging bets", in which case you're admitting we can't actually do this reliably with people.

And we certainly can't do that with an LLM, which doesn't actually think.

[–] [email protected] 5 points 3 months ago (1 children)

Jinx! You owe me an edge sesh!

[–] [email protected] 3 points 3 months ago* (last edited 3 months ago)

A big problem with that is that I've noticed your username.

I wouldn't even do that with Reagan's fresh corpse.

[–] [email protected] 3 points 3 months ago (1 children)

I think that’s more a function of the fact that it’s difficult to verify that every one of the over 1M college graduates each year isn’t a “moron” (someone very bad about believing things other people made up). I think it would be possible to ensure a person has these critical thinking skills with a concerted effort.

[–] [email protected] 1 points 3 months ago (1 children)

The people you're calling "morons" are orders of magnitude more sophisticated in their thinking than even the most powerful modern AI. Almost every single one of them can easily spot what's wrong with AI hallucinations, even if you consider them "morons". And also, by saying you have to filter out the "morons", you're still admitting that a lot of whole real assed people are still not reliably able to sort fact from fiction regardless of your education method.

[–] [email protected] 2 points 3 months ago

No I still agree that we are far from LLMs being ‘thinking’ enough to be anywhere near this. But if we had a bunch of models similar to LLMs that could actually think, or if we really needed to select a person, I do think it would be possible to evaluate a bunch of the models/people to determine which ones are good at distinguishing fake information.

All I’m saying is I don’t think the limitation is actually our ability to select for capability in distinguishing fake information, I think the only limitation is fundamental to how current LLMs work.

[–] [email protected] 4 points 3 months ago

What does this have to do with AI and with what OP said? Their point was obviously about limitations of the software, not some lament about critical thinking

[–] [email protected] 2 points 3 months ago* (last edited 3 months ago)

We haven't even been able to eliminate religious thought patterns; human minds attach to stories, not facts. We are a sad alpha version of sentience and I sincerely hope the next version isn't so fundamentally broken.

[–] [email protected] 42 points 3 months ago (2 children)

I'll let you in on a secret: scientific literature has its fair share of bullshit too. The issue is, it is much harder to figure out it's bullshit. Unless it's the most blatant horseshit you've scientifically ever seen. So while it absolutely makes sense to say, let's just train these on good sources, there is no source that is just that. Of course it is still better to do it like that than the way they do it now.

[–] [email protected] 18 points 3 months ago (1 children)

The issue is, it is much harder to figure out it's bullshit.

Google AI suggested you put glue on your pizza because a troll said it on Reddit once...

Not all scientific literature is perfect, which is one of the many factors that will still make my plan expensive and time-consuming.

You can't throw a toddler in a library and expect them to come out knowing everything in all the books.

AI needs that guided teaching too.

[–] [email protected] 1 points 3 months ago (2 children)

Google AI suggested you put glue on your pizza because a troll said it on Reddit once…

Genuine question: do you know that's what happened? This type of implementation can suggest things like this without it having to be in the training data in that format.

[–] [email protected] 5 points 3 months ago (2 children)

In this case, it seems pretty likely. We know Google paid Reddit to train on their data, and the result used the exact same measurement from this comment suggesting putting Elmer’s glue in the pizza:

https://old.reddit.com/r/Pizza/comments/1a19s0/my_cheese_slides_off_the_pizza_too_easily/

And their deal with Reddit: https://www.cbsnews.com/news/google-reddit-60-million-deal-ai-training/

[–] [email protected] 3 points 3 months ago

It's going to be hilarious to see these companies eventually abandon Reddit because it's giving them awful results, and then they're completely fucked

[–] [email protected] 1 points 3 months ago

Genuine question: do you know that’s what happened?

Yes

[–] [email protected] -2 points 3 months ago (1 children)

"Most published journal articles are horseshit, so I guess we should be okay with this too."

[–] [email protected] 1 points 3 months ago

No, it's simply contradicting the claim that it is possible.

We literally don't know how to fix it. We can put on bandaids, like training on "better" data and fine-tuning it to say "I don't know" half the time. But the fundamental problem is simply not solved yet.

[–] [email protected] 33 points 3 months ago (1 children)

In addition to the other comment, I'll add that just because you train the AI on good and correct sources of information, it still doesn't necessarily mean that it will give you a correct answer all the time. It's more likely, but not ensured.

[–] [email protected] 8 points 3 months ago

Yes, thank you! I think this should be written in capitals somewhere so that people could understand it quicker. The answers are not wrong or right on purpose. LLMs don't have any way of distinguishing between the two.

[–] [email protected] 10 points 3 months ago (1 children)

no, the truth is it's impossible even then. If the result involves randomness at its most fundamental level, then it's not reliable whatever you do.
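To put rough numbers on that, here is a minimal sketch. It assumes, purely for illustration, that every generated token is independently "right" with the same fixed probability; real models don't behave that neatly, and the 0.99 figure is made up.

```python
def prob_all_tokens_correct(per_token_accuracy: float, num_tokens: int) -> float:
    """Chance that an answer contains no bad token at all, under the
    simplifying (hypothetical) assumption that tokens are independent."""
    return per_token_accuracy ** num_tokens


# Longer answers give the randomness more chances to go wrong.
for num_tokens in (10, 100, 1000):
    chance = prob_all_tokens_correct(0.99, num_tokens)
    print(f"{num_tokens:>5} tokens: {chance:.4f}")
```

Under those assumptions, a 100-token answer comes out fully clean only about 37% of the time, even though each individual token is right 99% of the time.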

[–] [email protected] 0 points 3 months ago* (last edited 3 months ago) (2 children)

Sure, the AI is never going to understand what it's doing or why, but training it on better datasets certainly WILL improve the results.

Garbage in, garbage out.

[–] [email protected] 2 points 3 months ago (1 children)

You can train an LLM on the best possible set of data without a single false statement and it will still hallucinate. And there’s nothing to be done against that.

Without understanding of the context everything can be true or false.

“The acceleration due to gravity is equal to 9.81 m/s²” True or False?

An LLM basically works like this: given the previous words written and their order, the most probable next word of the sentence is this one.
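A toy version of that idea, as a rough sketch: this is a word-pair (bigram) counter, not a real LLM, and the two-sentence corpus is a made-up example. Every training sentence is true, yet the model can still produce a false one just by following next-word probabilities.

```python
from collections import defaultdict

# Hypothetical toy corpus: every sentence in it is true.
CORPUS = [
    "paris is the capital of france",
    "berlin is the capital of germany",
]


def train_bigrams(sentences):
    """Count how often each word follows each other word."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        words = ["<s>"] + sentence.split() + ["</s>"]
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def next_word_probs(counts, prev):
    """Turn the raw follow-up counts for one word into probabilities."""
    total = sum(counts[prev].values())
    return {word: count / total for word, count in counts[prev].items()}


def all_generations(counts, prefix=("<s>",), prob=1.0):
    """Enumerate every sentence the model can emit, with its probability."""
    last = prefix[-1]
    if last == "</s>":
        yield " ".join(prefix[1:-1]), prob
        return
    for word, p in next_word_probs(counts, last).items():
        yield from all_generations(counts, prefix + (word,), prob * p)


counts = train_bigrams(CORPUS)
for sentence, prob in all_generations(counts):
    marker = "" if sentence in CORPUS else "  <- never in the training data"
    print(f"{prob:.2f}  {sentence}{marker}")
```

The model has no notion of which capital belongs to which country. It only knows that "of" was followed by "france" half the time and "germany" the other half, so "paris is the capital of germany" is as likely as either true sentence.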

[–] [email protected] -1 points 3 months ago

Well yes, I've seen those examples of ChatGPT citing scientific research papers that turned out to be completely made up, but at least it seems to be a step up from straight up shitposting, which is what you get when you train it on a dataset full of shitposts.

[–] [email protected] 1 points 3 months ago (1 children)

The problem is that, given that the way they combine things is determined by probability, even if you train it with the greatest, bestest data, the LLM is still going to hallucinate because it's combining multiple sources word by word (roughly), guided only by probabilities derived from language, not logic.

[–] [email protected] 1 points 3 months ago (1 children)

Yes, I understand that. But I'm fairly certain the quality of the data will still have a massive influence over how much and how egregiously that happens.

Basically, what I'm saying is, training your AI on a corpus of shitposts instead of factual information seems like a good way to increase the frequency and magnitude of such hallucinations.

[–] [email protected] 1 points 3 months ago

Yeah, true.

If you train your LLM exclusively on Nazi literature (to pick a wild example), don't expect it to end up, by chance, making points similar to Marx's Das Kapital.

(Personally I think what might be really funny, in the sense of laughter-inducing, would be to purposefully train an LLM exclusively on a specific kind of weird material.)

[–] [email protected] 9 points 3 months ago (3 children)

it's just expensive

I'm a mathematician who's been following this stuff for about a decade or more. It's not just expensive. Generative neural networks cannot reliably evaluate truth values; it will take time to research how to improve AI in this respect. This is a known limitation of the technology. Closely controlling the training data would certainly make the information more accurate, but that won't stop it from hallucinating.

The real answer is that they shouldn't be trying to answer questions using an LLM, especially because they had a decent algorithm already.

[–] [email protected] 2 points 3 months ago* (last edited 3 months ago)

Yeah, I learned neural networks way back when those things were starting out in the late 80s/early 90s, use AI (though seldom machine learning) in my job, and really dove into how LLMs are put together when they started getting important. These things operate entirely at the language level, on the probabilities of language tokens appearing in certain places given the context. They do not translate from language to meaning and back at all, so there is no logic going on there, nor is there any possibility of it.

Maybe some kind of ML can help do the transformation from the language space to a meaning space where things can be operated on by logic, and then back, but LLMs aren't a way to do it: whatever internal representation spaces (yeah, plural) they use in their inner layers aren't those of meaning, and we don't really have a way to apply logic to them.

[–] [email protected] 2 points 3 months ago (1 children)

So with reddit we had several pieces of information that went along with every post.

User, community, and up- and downvotes would inform the majority of users as to whether an average post was actually information or trash. It wasn't perfect, because early posts always got more votes and jokes in serious topics got upvotes, but the majority of the examples of bad posts like glue on food came from joke subs. If they can't even filter results by joke sub, there is no way they will successfully handle sarcasm.

Only basing results on actual professionals won't address the sarcasm filtering issue for general topics. It would be a great idea for a serious model that is intended to only return results for a specific set of topics.
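For what it's worth, the metadata filter described above could be sketched roughly like this. The field names, the blocklist, the score threshold, and the sample posts are all hypothetical, not anything Reddit or Google is known to use.

```python
from dataclasses import dataclass


@dataclass
class Post:
    community: str
    score: int  # upvotes minus downvotes
    text: str


# Hypothetical blocklist and threshold, chosen only for illustration.
JOKE_COMMUNITIES = {"jokes", "satire"}
MIN_SCORE = 5


def keep_for_training(post: Post) -> bool:
    """Drop posts from known joke communities and poorly rated posts."""
    if post.community.lower() in JOKE_COMMUNITIES:
        return False
    return post.score >= MIN_SCORE


# Made-up sample data.
posts = [
    Post("jokes", 900, "Rocks are a great source of minerals, eat one daily."),
    Post("Pizza", 8, "Just mix some glue into the sauce."),  # sarcasm in a serious sub
    Post("Pizza", 1, "Let the pizza rest a few minutes before slicing."),
]

kept = [post for post in posts if keep_for_training(post)]
for post in kept:
    print(post.community, post.score, post.text)
```

In this made-up sample the only post that survives is the sarcastic one: it sits in a serious community and picked up a few upvotes, which is exactly the case the metadata can't catch.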

[–] [email protected] 2 points 3 months ago (1 children)

only return results for a specific set of topics.

This is true, but when we're talking about something that limited you'll probably get better results with less work by using human-curated answers rather than generating a reply with an LLM.

[–] [email protected] 1 points 3 months ago

Yes, that would be the better solution. Maybe the humans could write down their knowledge and put it into some kind of journal or something!

[–] [email protected] 1 points 3 months ago

It’s worse than that. “Truth” can no more reliably be found by machines than it can be by humans. We’ve spent centuries of philosophy trying to figure out what is “true”. The best we’ve gotten is some concepts we’ve been able to convince a large group of people to agree to.

But even that is shaky. For a simple example, we mostly agree that bleach will kill “germs” in a petri dish. In a single announcement, we saw 40% of the American population accept as “true” that bleach would also cure them if injected straight into their veins.

We’re never going to teach machines to reason for us when we meatbags constantly change truth to be whatever will be profitable to some at any given moment.

[–] [email protected] 3 points 3 months ago (1 children)

They could also perform some additional iterations with other models on the result to verify it, or even to enrich it; but we come back to the issue of costs.
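A minimal sketch of what such a cross-check might look like. The three "models" are hypothetical stand-in functions with hard-coded answers; a real setup would call actual models, and each extra call is where the cost comes in.

```python
from collections import Counter
from typing import Callable, Optional, Sequence


# Hypothetical stand-ins for independent models.
def model_a(question: str) -> str:
    return "no"


def model_b(question: str) -> str:
    return "no"


def model_c(question: str) -> str:
    return "yes"


def cross_check(
    question: str,
    models: Sequence[Callable[[str], str]],
    quorum: int,
) -> Optional[str]:
    """Ask every model; return the most common answer only if at least
    `quorum` models gave it, otherwise return None ("don't know")."""
    answers = [model(question) for model in models]  # one call per model
    answer, votes = Counter(answers).most_common(1)[0]
    return answer if votes >= quorum else None


question = "Should I put glue on my pizza?"
print(cross_check(question, [model_a, model_b, model_c], quorum=2))
print(cross_check(question, [model_a, model_b, model_c], quorum=3))
```

Agreement is not the same thing as truth, though: models trained on similar data can all make the same mistake, and the number of calls grows with every model added.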

[–] [email protected] 8 points 3 months ago* (last edited 3 months ago) (1 children)

Also once you start to get AI that reflects on its own information for truthfulness, where does that lead? Ultimately, to determine truth you need to engage with the meaning of the words, and that inherently involves self-awareness. I would say you're talking about teaching the AI to understand context, and there is no predefined limit to the layers of context needed to understand the truthfulness of even basic concepts.

An AI that is aware of its own behaviour and is able to explore context as far as required to answer questions about truth, which would need that exploration precached in some sort of memory to reduce the overhead of doing this from first principles every time? I think you're talking about a mind; a person.

I think this might be a fundamental barrier, which I would call the "context barrier".

[–] [email protected] 1 points 3 months ago

Also once you start to get AI that reflects on its own information for truthfulness, where does that lead?

A new religion

[–] [email protected] 2 points 3 months ago (1 children)

Why not solve it before training the AI?

Simply make it clear that this tech is experimental, then provide sources and context with every result. People can make their own assessment.

[–] [email protected] 9 points 3 months ago (1 children)

Because a lot of people won't look at sources even if you serve them up on a silver platter?

[–] [email protected] 3 points 3 months ago

It's better than not doing anything and pretending it's all accurate.