Can language models reason?

Illustration by Rajashree Rajadhyax

Large language models are an impressive technology that excels at answering questions from vast knowledge of the world. However, they often struggle with reasoning and logic-based questions. To overcome this limitation, the Chain-of-Thought (CoT) method was introduced. Just as humans break a complex problem into smaller, manageable parts and work through them one at a time, CoT guides the model to reason step-by-step before answering. By mimicking this natural, structured problem-solving process, CoT improves the models’ accuracy and reliability on logic-driven tasks.
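
To make the idea concrete, here is a minimal sketch of the two prompt styles. The puzzle is a classic example; how the prompt is actually sent to a model (which API, which model) is left open.

```python
# A minimal sketch of Chain-of-Thought prompting; the puzzle is a classic
# example, and sending the prompt to an actual model is left to whatever
# API you use.
question = (
    "A bat and a ball cost $1.10 in total. The bat costs $1.00 more "
    "than the ball. How much does the ball cost?"
)

# Direct prompting: the model must jump straight to an answer.
direct_prompt = f"Q: {question}\nA:"

# CoT prompting: one extra instruction nudges the model to write out its
# intermediate steps (ball = x, bat = x + 1.00, 2x + 1.00 = 1.10, so
# x = 0.05) before committing to a final answer.
cot_prompt = f"Q: {question}\nA: Let's think step by step."

print(cot_prompt)
```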


This approach, however, requires more time and computational effort because it involves processing more tokens. Tokens are the building blocks of text, like words or parts of words, that the model uses to understand and generate responses. Since the Chain-of-Thought method involves explaining each step in detail, it needs more tokens to represent the reasoning process, which increases the time and resources required.
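
As a rough illustration, the snippet below counts tokens using OpenAI's open-source tiktoken tokenizer (one of many; other models tokenize differently), comparing a bare answer with a step-by-step one.

```python
# pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # tokenizer used by several OpenAI models

answer_only = "The ball costs $0.05."
with_reasoning = (
    "Let the ball cost x. Then the bat costs x + 1.00, so "
    "2x + 1.00 = 1.10 and x = 0.05. The ball costs $0.05."
)

# The step-by-step version consumes several times more tokens.
print(len(enc.encode(answer_only)), "tokens (answer only)")
print(len(enc.encode(with_reasoning)), "tokens (with reasoning)")
```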


Expanding on the idea of step-by-step reasoning in AI, Meta’s FAIR (Fundamental AI Research) team has introduced a new method called Chain of Continuous Thought (Coconut). Unlike the traditional Chain-of-Thought (CoT) method, which spells out its reasoning in words, Coconut lets the model think internally without converting every step into natural language, making reasoning faster and more efficient. Meta draws an analogy with the human brain, citing neuroimaging studies which show that the ‘language network’ – a set of brain regions responsible for language comprehension and production – remains largely inactive during various reasoning tasks.
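
As a rough illustration (not Meta's actual implementation), the toy sketch below uses a fixed random map as a stand-in for a transformer, just to show the key control-flow difference: Coconut feeds the model's last hidden state straight back in as the next input embedding, instead of first decoding it into a word.

```python
import numpy as np

# Toy stand-in for a transformer: a fixed random map from an input
# embedding to a "last hidden state". A real model attends over the whole
# sequence; this only demonstrates the control flow that distinguishes
# Coconut from Chain-of-Thought.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)) * 0.1

def forward(embedding):
    """Return the last hidden state for this input embedding."""
    return np.tanh(W @ embedding)

question_embedding = rng.standard_normal(8)

# Chain-of-Thought would decode each hidden state into a word and feed
# that word's embedding back in. Coconut skips decoding: the hidden state
# itself is reused as the next input embedding, so the reasoning never
# leaves continuous space.
hidden = forward(question_embedding)
for _ in range(3):  # a fixed number of "continuous thoughts"
    hidden = forward(hidden)

# Only after the latent steps would the model decode a final answer.
print("final continuous thought:", hidden.round(3))
```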


Here are some key findings related to Coconut:


Improved Performance on Logical Tasks:

  • Coconut excelled in logical reasoning, achieving nearly perfect scores on certain tests like ProntoQA (99.8%) and significantly outperforming CoT in complex planning tasks, such as ProsQA (97% vs. 77.5%).
  • However, for math problems, it performed slightly lower than CoT but still much better than models without step-by-step reasoning.

Efficiency Gains:

  • Coconut needs far fewer "thought" tokens than CoT. For example, it generated about 9 tokens versus CoT's 92.5 on one logic test, making it faster and less resource-intensive.

Unique Thinking Approach:

  • Instead of writing out every reasoning step in words, Coconut analyzes options internally, keeping several candidate paths open and gradually eliminating wrong ones, much like brainstorming several solutions before deciding on one. The researchers liken this to a breadth-first search.

Challenges and Future Potential:

  • Training Coconut to think in this way was complex: it required a multi-stage curriculum in which written reasoning steps are gradually replaced with continuous ones (see the sketch after this list). Still, the researchers believe the method has great potential to make AI systems better at reasoning across a wider range of tasks.
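
As a rough illustration of that gradual process, the sketch below stages a training example the way the paper describes its curriculum: at stage k, the first k written reasoning steps are swapped for latent "thought" slots. The example text and the one-thought-per-step simplification are assumptions made for illustration.

```python
# Minimal sketch of the gradual training curriculum: at stage k, the first
# k written reasoning steps are replaced by latent "continuous thought"
# slots (shown here as a placeholder string). The example steps and the
# one-thought-per-step simplification are assumptions for illustration.
reasoning_steps = [
    "Step 1: let the ball cost x.",
    "Step 2: then the bat costs x + 1.00.",
    "Step 3: 2x + 1.00 = 1.10, so x = 0.05.",
]

def stage_example(steps, k):
    """Replace the first k written steps with latent-thought placeholders."""
    return ["<thought>"] * k + steps[k:]

for k in range(len(reasoning_steps) + 1):
    print(f"stage {k}:", stage_example(reasoning_steps, k))
```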


Coconut is still experimental. While it shows promising results, it has only been tested on specific tasks and is in the early stages of development. The researchers believe it has great potential but needs further work before it can be widely applied.


By Rajashree Rajadhyax

Co-Founder, Cere Labs
