OpenAI says ChatGPT will always make things up, but it could get better at admitting uncertainty


AI systems like ChatGPT will always make things up, but they may soon get better at recognizing their own uncertainty.

OpenAI says language models will always hallucinate, meaning they'll sometimes generate false or misleading statements, also known as "bullshit." This happens because these systems are trained to predict the next most likely word, not to tell the truth. Since they have no concept of what's true or false, they can produce convincing but inaccurate answers just as easily as correct ones. While that might be fine for creative tasks, it's a real problem when users expect reliable information.
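To make the mechanism concrete, here is a minimal sketch of next-token sampling. The prompt, the candidate tokens, and their probabilities are all made up for illustration; the point is only that nothing in this step checks whether the sampled continuation is true.

```python
import random

# Hypothetical next-token distribution for the prompt
# "The capital of Australia is". The probabilities are invented
# for illustration; a real model learns them from training data.
next_token_probs = {
    "Canberra": 0.55,   # correct
    "Sydney": 0.35,     # fluent but wrong
    "Melbourne": 0.10,  # fluent but wrong
}

def sample_next_token(probs: dict[str, float]) -> str:
    """Pick a token in proportion to its probability.

    There is no truth check anywhere in this step: a convincing
    but inaccurate continuation is sampled just as mechanically
    as the correct one.
    """
    tokens, weights = zip(*probs.items())
    return random.choices(tokens, weights=weights, k=1)[0]

print(sample_next_token(next_token_probs))
```

Run this a few times and it will occasionally print "Sydney" with complete fluency, which is essentially what a hallucination looks like from the inside.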

OpenAI breaks down different types of hallucinations. Intrinsic hallucinations directly contradict the prompt, like answering "2" when asked "How many Ds are in DEEPSEEK?" (the correct answer is one). Extrinsic hallucinations go against real-world facts or the model's training data, such as inventing fake quotes or biographies. Then there are "arbitrary fact" hallucinations, which show up when the model tries to answer questions about things rarely or never seen in its training, like specific birthdays or dissertation titles. In those cases, the model just guesses.

To reduce hallucinations, OpenAI says it uses several strategies: reinforcement learning from human feedback (RLHF), external tools like calculators and databases, and retrieval-augmented generation (RAG). Fact-checking subsystems add another layer. Over time, OpenAI wants to build a modular "system of systems" that makes models more reliable and predictable.
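The retrieval-augmented piece can be sketched in a few lines. The toy document store, keyword retriever, and prompt template below are stand-ins; OpenAI has not published the internals of its production pipeline, so this only illustrates the general idea of grounding an answer in retrieved text instead of the model's parameters alone.

```python
# Toy document store; a real system would use a vector database.
DOCUMENTS = [
    "Canberra has been the capital of Australia since 1913.",
    "Sydney is the most populous city in Australia.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    """Rank documents by simple word overlap with the query."""
    def overlap(text: str) -> int:
        return len(set(query.lower().split()) & set(text.lower().split()))
    return sorted(DOCUMENTS, key=overlap, reverse=True)[:k]

def build_grounded_prompt(question: str) -> str:
    """Prepend retrieved passages so the model answers from sources
    rather than guessing from memory."""
    context = "\n".join(retrieve(question))
    return (
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer using only the context above."
    )

print(build_grounded_prompt("What is the capital of Australia?"))
```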


Models should admit uncertainty

OpenAI says hallucinations can't be eliminated completely. But future models should at least be able to recognize when they're unsure - and say so. Instead of guessing, they should use external tools, ask for help, or stop responding altogether. While that doesn't mean the model actually understands what's true, it does mean it can flag when it's not confident about its own answers.
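One simple way to picture this behavior is an abstention rule on top of the model's own token probabilities, sketched below. The 0.8 threshold and the wording of the refusal are arbitrary choices for illustration, not anything OpenAI has described, and confidence scores are only a rough proxy: low confidence doesn't prove an answer is wrong, and high confidence doesn't prove it is right.

```python
import math

def answer_or_abstain(token_logprobs: list[float], threshold: float = 0.8) -> str:
    """Refuse to answer when the model's own average token probability is low.

    token_logprobs: log-probabilities the model assigned to the tokens of its
    draft answer (many APIs can return these). The threshold would need tuning.
    """
    avg_prob = math.exp(sum(token_logprobs) / len(token_logprobs))
    if avg_prob < threshold:
        return "I'm not sure - I'd rather look this up than guess."
    return "ANSWER"  # placeholder for returning the drafted answer

# Example: a shaky draft whose tokens averaged roughly 55% probability.
print(answer_or_abstain([math.log(0.5), math.log(0.6)]))
```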

The goal is to make models behave more like people when they're uncertain. Humans don't know everything, and sometimes it's better to admit not knowing than to make something up. Of course, people sometimes guess anyway, which causes problems - just like with AI.

The GPT-5-thinking-mini model is designed to admit uncertainty much more often than the o4-mini model. | Image: OpenAI

There's already some progress outside of OpenAI's own benchmarks. A Stanford math professor recently spent a year testing an unsolved problem on OpenAI's models. Earlier versions gave incorrect answers, but the latest model finally admitted it couldn't solve the problem. It also chose not to guess on the toughest question from this year's International Mathematical Olympiad. OpenAI says these improvements should make their way into commercial models in the coming months.
