AI Chatbots: Solving Problems with Analogies
Like Gen Z and TikTok or Donald Trump and his Diet Coke button, humans and **analogies** go hand in hand. Beyond being an annoying part of every SAT, analogies help us solve problems and even learn new skills by connecting them to concepts we’re already familiar with. For example, you might tell a child learning to ride a bike for the first time that it’s a lot like balancing on a seesaw.
For a long time, analogical reasoning was thought to be a uniquely human way of solving problems. However, recent research from psychologists at the University of California, Los Angeles suggests that **AI chatbots** can also reason by analogy, much as humans do.
Discovering the Analogical Reasoning Ability of AI Chatbots
The team published a study Monday in the journal Nature Human Behaviour finding that OpenAI’s large language model GPT-3 performed as well as college students on analogy problems similar to those found on tests like the SAT. This suggests that AI chatbots may possess a hallmark of human intelligence.
Senior author Hongjing Lu, a psychology professor at UCLA, expressed surprise at the chatbot’s reasoning abilities, saying, “Language learning models are just trying to do word prediction so we’re surprised they can do reasoning. Over the past two years, the technology has taken a big jump from its previous incarnations.”
Analogical Reasoning Tests and Results
To test GPT-3’s analogical reasoning, the researchers used problems based on Raven’s Progressive Matrices, which require test takers to predict the next image in a series of shapes. Because GPT-3 accepts only text as input, the images were converted into a text format so the model could “see” and solve the problems.
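To make that conversion concrete, here is a minimal sketch of how a matrix-style puzzle might be rendered as text for a language model. The digit-grid encoding, function name, and prompt wording below are assumptions for illustration, not the study’s actual format.

```python
# Hypothetical sketch: turning a Raven's-style 3x3 matrix into a text prompt.
# Digits stand in for shape attributes; the real study's encoding may differ.

def matrix_to_prompt(grid):
    """Render a 3x3 problem grid as plain text, with '?' for the blank cell."""
    rows = ["[" + "  ".join(cell if cell else "?" for cell in row) + "]"
            for row in grid]
    return ("Here is a puzzle. Each row follows the same rule.\n"
            + "\n".join(rows)
            + "\nWhat belongs in place of the '?'?")

# Example: each row repeats one token three times; the intended answer is "3".
problem = [
    ["1", "1", "1"],
    ["2", "2", "2"],
    ["3", "3", None],  # None marks the missing cell the model must fill in
]

print(matrix_to_prompt(problem))
```

Printed out, the prompt reads like a tiny word problem, which is all a text-only model needs to attempt the same pattern-completion task a human sees as images.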
Both GPT-3 and 40 undergraduate UCLA students took the same test: the chatbot correctly solved 80 percent of the problems, while the students averaged around 60 percent, though the highest human scores fell within GPT-3’s range.
“Surprisingly, not only did GPT-3 do about as well as humans but it made similar mistakes as well,” Lu said.
The researchers also tested GPT-3 on SAT analogy questions that they believed had never been published on the internet, meaning the model could not have been trained on those specific items. Comparing its results with actual college applicants’ SAT scores, they found that the chatbot outperformed the average human test taker.
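For readers curious what such a query looks like in practice, below is a minimal sketch of posing an SAT-style analogy to GPT-3, assuming the legacy openai Python SDK (pre-1.0) and an API key set in the OPENAI_API_KEY environment variable. The model name, prompt wording, and example question are illustrative, not taken from the study.

```python
# Minimal sketch of an SAT-style analogy query to a GPT-3-family model,
# assuming the legacy openai Python SDK (<1.0), which reads the API key
# from the OPENAI_API_KEY environment variable by default.
import openai

prompt = (
    "Complete the analogy with the best choice.\n"
    "love is to hate as rich is to:\n"
    "(a) poor  (b) wealthy  (c) money  (d) greedy\n"
    "Answer:"
)

response = openai.Completion.create(
    model="text-davinci-003",  # illustrative GPT-3-family model name
    prompt=prompt,
    max_tokens=5,
    temperature=0,  # deterministic output suits multiple-choice answers
)
print(response.choices[0].text.strip())
```

The key point is that the model receives nothing but a flat string; any reasoning it does about the relationship between the word pairs happens inside next-token prediction.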
The Limitations of AI Chatbots
While these results are impressive, they do not necessarily mean that chatbots are smarter than humans or possess human-level intelligence and reasoning. Chatbots like GPT-3 are language models trained on massive datasets, including crawls of much of the internet, and they are performing the task they were trained to do: predicting text.
Co-author Keith Holyoak, a psychology professor at UCLA, explained: “GPT-3 might be kind of thinking like a human, but on the other hand, people did not learn by ingesting the entire internet, so the training method is completely different.”
The research team is still investigating whether GPT-3 truly employs human-like reasoning methods or whether it has developed something entirely new in the field of artificial intelligence.
Editor Notes
AI chatbots like GPT-3 continue to push the boundaries of what machines are capable of. The ability to use analogies and solve complex problems demonstrates the potential for AI to assist us in various domains. However, it is crucial to maintain a critical perspective and understand the limitations of these technologies. AI is a powerful tool that can augment our capabilities, but it is not a substitute for human intelligence.
For more news and updates on the latest advancements in AI and technology, visit GPT News Room.