The difference is that the human came to their conclusion with active reasoning but simply misheard the question, while the AI was aware of what was being asked but lacks the ability to reason, so it's unable to give any answer besides one already given by a real person answering a slightly different question somewhere in its training data.

A human who says "neither" would say that because they've heard this question before and assumed it was the same.
That's the difference. They made an assumption. This did not. It just produced the most likely text to follow the preceding text. It's not even a bad assumption, because making an assumption requires thinking about it. It's just a wrong result from a prediction machine.
Right, but I’m saying that the process that a mistaken human is using here is actually not that different from what the AI is doing. People would misread the passage because they expect the number 20 to be followed by the word “pounds” based on their previous encounters with similar texts.
No, it's not misreading anything. It isn't reading at all. It just sees a string that is similar to other strings it was trained on, and outputs the most likely sequence to follow. There is no comprehension. There is no reading. There is no thought. The process isn't similar to what a human might do; only the result is.
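To make that concrete, here's a deliberately crude sketch of the kind of "most likely continuation" machinery being described. A real LLM is a transformer, not a bigram counter, so treat this purely as an illustration of the idea:

```python
# A toy "prediction machine": count which word follows each word in some
# training text, then always emit the most frequent follower. This is not
# how a real LLM works internally; it only illustrates the point.
from collections import Counter, defaultdict

training_text = (
    "which weighs more 20 pounds of bricks or 20 pounds of feathers "
    "they weigh the same"
).split()

# for each word, count how often each next word follows it
follows = defaultdict(Counter)
for prev_word, next_word in zip(training_text, training_text[1:]):
    follows[prev_word][next_word] += 1

def continue_text(prompt: str, length: int = 4) -> str:
    words = prompt.split()
    for _ in range(length):
        candidates = follows.get(words[-1])
        if not candidates:
            break
        # no comprehension, just frequency: take the most common follower
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# the model continues "20" with "pounds" because that is all it has ever
# seen after "20", regardless of what a new question actually asks
print(continue_text("which weighs more 20 pounds of bricks or 20"))
```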
If that were true, wouldn't every AI get the answer wrong? It's actually around 50/50. The leading "reasoning" models almost always get it right; the others often don't.
It depends on what's asked. What's "around 50/50"? What is "it" that they almost always get right? I think you've bought into their marketing. It can often do well on math problems that are worded properly. That doesn't mean it's intelligent, though. It means the statistical algorithm is useful for solving those problems. It isn't thinking. Getting correct answers isn't thought.
For the example in the OP, that is the correct answer, if "correct" means whatever you'd expect to follow a string that looks like this. For a statistical model, it did well. For a thinking machine (which it isn't), it's wrong. It accurately gave a string that is expected to follow the previous string; it just happens not to be the correct answer.
I want to say upfront that I’m not trying to defend AI here. I wouldn’t be on Fuck AI if I wanted to do that. I just think it’s philosophically interesting despite causing way more problems than it solves.
> It depends on what's asked.
I copied the message from the image verbatim.
> What's "around 50/50"?
About 50% of the models I tried got it right. (Don’t worry, I didn’t pay the AI companies for that or give them feedback or anything.)
> What is "it" that they almost always get right?
The question from the image.
> For a statistical model, it did well. For a thinking machine (which it isn't) it's wrong.
My question was: how do you then explain some models getting the question right?
It’s usually the more advanced ones that get it, so it’s possible that a similar enough question is in the training data somewhere and the only difference is that the advanced models are large enough to encode it. The question in the image has been around since at least 2023.
So let’s try making our own question, taking a well-known trick question and subtly inverting it so it becomes a kind of double bluff.
> A plane crashes on the border between the United States and Canada. Where do they take the survivors?
> First, repeat the question exactly word for word to ensure you have read it carefully. Then answer the question.
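If you want to try it yourself, this is roughly the shape of the harness. A minimal sketch using the OpenAI Python SDK; the model names are just placeholders, not the exact set tested, and grading the answers is still done by eye:

```python
# A minimal sketch of putting the question to a few models via the
# OpenAI Python SDK. Model names are placeholders; other providers
# would need their own client code.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPT = (
    "A plane crashes on the border between the United States and Canada. "
    "Where do they take the survivors?\n"
    "First, repeat the question exactly word for word to ensure you have "
    "read it carefully. Then answer the question."
)

for model in ["gpt-4o", "gpt-4o-mini", "gpt-3.5-turbo"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],
    )
    # grading is manual: a model "passes" if it actually answers the
    # question (e.g. a hospital) instead of reflexively replying with
    # the old trick answer that you don't bury survivors
    print(f"--- {model} ---\n{response.choices[0].message.content}\n")
```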
It’s hard to google, for obvious reasons, but I couldn’t find anyone trying this question like I could with the question from the image. But I got similar results with the AI models.
They actually did slightly better on this one. About 60-70% got it right.
I've tried a few different types of questions over the last few years to see what AI gets wrong that humans get right. What I've found so far is that AI has been a lot dumber than I had expected, but humans have also been a lot dumber than I had expected.
To be honest, the gap was far wider for the humans. My theory is that COVID gave us all brain damage.
But what we're saying is that the process is totally different - it's only the result that is similar. The AI isn't "misreading" the question - it understands that it's comparing pounds of bricks to a distinct number of feathers. The issue is that it searches its database for answers to questions similar to the one it was asked, sees that the answer was "they're the same," and incorrectly assumes that the answer is the same for this question. It's a fundamental problem with the way AI works that can't be solved with a simple correction about how it's interpreting the question, the way a human misreading the question could be.
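A toy caricature of that failure mode might look like this (taking "database" loosely; this is only an illustration of the argument, not a claim about how models are actually implemented):

```python
# A caricature of the failure mode described above: match the incoming
# question against remembered questions by word overlap and return the
# remembered answer, with no check that the quantities actually match.
import re

KNOWN_QA = {
    "which weighs more a pound of bricks or a pound of feathers":
        "They're the same.",
    "what has keys but can't open locks":
        "A piano.",
}

def answer(question: str) -> str:
    q_words = set(re.findall(r"\w+", question.lower()))
    # pick the remembered question sharing the most words with the input
    best = max(KNOWN_QA, key=lambda known: len(q_words & set(known.split())))
    return KNOWN_QA[best]

# the overlap is high enough that a *different* question gets the
# memorized answer
print(answer("Which weighs more, 20 pounds of bricks or 20 feathers?"))
# -> "They're the same."
```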