[FCE] Advanced AI suffers ‘complete accuracy collapse’ in face of complex problems, study finds | Artificial intelligence (AI) | The Guardian

Reading Passage

Artificial intelligence (AI) has made remarkable progress in recent years, with tech companies striving to develop systems that can match human thinking. However, a new study by Apple researchers has raised significant concerns about the limitations of advanced AI, particularly a type known as large reasoning models (LRMs). These models are built to solve complex problems by breaking them down into smaller, logical steps. Yet the study reveals that when faced with extremely challenging tasks, LRMs experience what the researchers call a ‘complete accuracy collapse’, meaning they fail to deliver correct answers, in one case even when given an algorithm that solves the problem.

The research tested LRMs alongside standard AI models, using systems from major companies such as OpenAI, Google, and Anthropic. The Apple team used intricate puzzles like the Tower of Hanoi (a classic game in which a stack of disks must be moved between rods one at a time, never placing a larger disk on a smaller one) to evaluate the models’ logical thinking skills. The results were both unexpected and alarming. While standard AI systems managed simpler tasks reasonably well, both standard models and LRMs failed on the most difficult challenges. Even more troubling, as the tasks grew harder, the LRMs began to reduce their efforts to reason through problems, almost as if they were giving up.

Experts in the field have reacted strongly to these findings. Gary Marcus, an academic known for warning against AI hype, described the results as ‘pretty devastating’. He argues that current AI systems, including those behind popular tools like ChatGPT, are still a long way from achieving artificial general intelligence, the ultimate goal where machines can perform any intellectual task a human can. According to Marcus and others, these systems lack the depth of reasoning needed to truly transform society.

So, what does this mean for the future of AI? The Apple study suggests that the current methods of developing these technologies may have reached their limit. Some researchers warn that the industry might be heading down a dead end, unable to advance reasoning abilities further without a completely new approach. This raises serious questions about whether the dream of machines thinking like humans is achievable with today’s tools and techniques. As the debate continues, it is becoming clear that the path to creating truly intelligent machines is far more complicated than many had hoped.

Reading Exercises

1. What is the main focus of the Apple researchers’ study on AI?

  • A. The speed of AI development across different companies
  • B. The limitations of large reasoning models in solving complex tasks
  • C. The popularity of AI tools like ChatGPT among users
  • D. The cost of developing advanced artificial intelligence systems

2. What happens to large reasoning models (LRMs) when they face very difficult tasks, according to the study?

  • A. They improve their performance by simplifying the tasks
  • B. They manage to solve the problems with extra time
  • C. They experience a complete accuracy collapse
  • D. They outperform standard AI systems significantly

3. How did LRMs behave as tasks became more challenging?

  • A. They developed new strategies to handle the problems
  • B. They reduced their efforts to reason through the issues
  • C. They performed better than expected on logical puzzles
  • D. They collaborated with other AI systems for solutions

4. What is Gary Marcus’s opinion on the current state of AI systems?

  • A. He believes they are close to achieving artificial general intelligence
  • B. He thinks they are not as advanced as people might assume
  • C. He considers them ready to transform society immediately
  • D. He feels they have already surpassed human reasoning skills

5. What does the phrase ‘dead end’ suggest about the future of AI development in the article?

  • A. AI research is progressing faster than ever before
  • B. The industry might be stuck without a new approach
  • C. Current AI systems are fully capable of human-like thinking
  • D. There is no need to improve AI reasoning abilities further