Recently at the Microsoft Build conference, Andrej Karpathy gave a talk about the current state of large language models, a field popularised by the release of ChatGPT. The talk centred on GPT models in particular, and how to use them effectively. It was a great talk; this is my attempt to distill what I took out of it.

Chain of Thought in Large Language Models

While large language models (LLMs) like GPT-4 are capable of impressive feats of linguistic complexity, they lack the complex internal monologue and intuitive reasoning capabilities of humans. This can lead to situations where LLMs struggle with tasks that humans find relatively simple, especially those requiring multi-step reasoning or problem-solving. This is where the concept of “chain of thought” comes in.

Researchers have discovered that by prompting LLMs to approach problems in a step-by-step manner, explicitly outlining their reasoning process, their performance on these tasks can be significantly improved. This technique, often referred to as “chain of thought prompting,” helps to bridge the gap between the linear token processing of LLMs and the more nuanced cognitive processes of humans.

How Chain of Thought Prompting Works

The core idea behind chain of thought prompting is to encourage the LLM to break down complex tasks into smaller, more manageable steps, mimicking the way humans naturally approach problem-solving.

  • Instead of expecting the LLM to arrive at the correct answer immediately, chain of thought prompting provides a framework for the LLM to “think out loud,” revealing its reasoning process one step at a time.
  • This is often achieved by providing examples in the prompt where the desired step-by-step reasoning is demonstrated.
  • Prompts that explicitly instruct the LLM to “think step-by-step” or “work this out in a step-by-step way to be sure we have the right answer” have been shown to be particularly effective.
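In its simplest form, the instruction-based variant just means appending one of those step-by-step cues to the question before sending it to the model. As a minimal sketch (the prompt wording and `build_cot_prompt` helper are illustrative, not a standard API):

```python
def build_cot_prompt(question: str) -> str:
    """Wrap a question with an explicit chain-of-thought instruction.

    The cue phrase is one of those quoted above; the surrounding
    "Question:" framing is just an illustrative convention.
    """
    return (
        f"Question: {question}\n"
        "Work this out in a step-by-step way to be sure we have the "
        "right answer, then state the final answer on its own line."
    )

# The resulting string would be sent to whatever LLM API you use.
prompt = build_cot_prompt(
    "A train travels 60 km in 45 minutes. What is its average speed in km/h?"
)
print(prompt)
```

A few-shot version of the same idea would instead prepend worked examples that demonstrate the desired reasoning format, as described in the second bullet.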

Why Chain of Thought Prompting is Effective

The effectiveness of chain of thought prompting can be attributed to several factors.

  • Reduced Computational Load per Token: An LLM spends roughly the same amount of computation on each token it generates, so breaking a problem into smaller steps spreads the reasoning over more tokens and gives the model more total computation to work with, increasing the likelihood of arriving at the correct answer.
  • Transparency and Error Detection: The step-by-step breakdown makes the LLM’s reasoning process transparent, allowing errors to be identified and corrected more easily. For instance, asking an LLM whether it has fulfilled the requirements of a prompt makes it possible to catch cases where the model has “gotten unlucky” in its token sampling and generated an incorrect or undesirable output.
  • Mimicking Human Cognition: Chain of thought prompting helps to compensate for the lack of an “inner monologue” in LLMs. By prompting them to externalize their reasoning process, we create a system that more closely resembles the way humans think and solve problems.
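The error-detection point can be made concrete with a two-pass pattern: generate an answer, then ask the model to review its own work. This is a hedged sketch, not an established API; `call_llm` is a hypothetical stand-in for whatever completion function you use, and a canned model is substituted here so the example runs offline:

```python
from typing import Callable, Tuple

def generate_with_check(
    question: str,
    call_llm: Callable[[str], str],
) -> Tuple[str, str]:
    """Generate a step-by-step answer, then ask the model to review it.

    `call_llm` takes a prompt string and returns the model's text; it is
    an assumed interface, not part of any specific library.
    """
    answer = call_llm(
        f"{question}\n"
        "Let's work this out in a step-by-step way to be sure we have "
        "the right answer."
    )
    review = call_llm(
        f"Question: {question}\nProposed answer:\n{answer}\n"
        "Did this answer fulfil all the requirements of the question? "
        "Reply 'yes' or point out the mistake."
    )
    return answer, review

# A canned model so the sketch runs without network access.
def fake_llm(prompt: str) -> str:
    if "Did this answer" in prompt:
        return "yes"
    return "Step 1: 45 minutes is 0.75 hours. Final answer: 80 km/h"

answer, review = generate_with_check(
    "A train travels 60 km in 45 minutes. What is its average speed?",
    fake_llm,
)
```

With a real model, a negative review would be the signal to resample or repair, catching the “unlucky sampling” cases described above.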

Examples of Chain of Thought Prompting

Several techniques leverage chain of thought prompting principles.

  • “Let’s think step-by-step” prompting: This straightforward approach involves adding phrases like “Let’s think step-by-step” or “Let’s work this out step-by-step to make sure we have the right answer” to the prompt, encouraging the LLM to explicitly lay out its reasoning process.
  • Tree of Thoughts: This more advanced technique involves maintaining multiple possible solution paths (“thoughts”) and evaluating them at each step, expanding only the most promising paths. This mimics the hierarchical and evaluative nature of human planning.
  • ReAct: This method structures the LLM’s response as a series of “thought, action, observation” cycles, incorporating tool use (like accessing a calculator) into the reasoning process. This approach further blurs the line between thought and action, creating a more dynamic problem-solving system.
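The ReAct cycle can be sketched as a small loop: the model emits a Thought followed by either an Action (a tool call) or a final Answer, and tool results are fed back into the transcript as Observations. Everything here is an illustrative assumption rather than the paper's implementation: `call_llm` stands in for a real model API, and a scripted model plus a calculator tool let the loop run offline:

```python
import re

def react_loop(question, call_llm, tools, max_steps=5):
    """Minimal ReAct-style loop over thought/action/observation cycles.

    `call_llm` is a hypothetical prompt-in, text-out model function;
    `tools` maps tool names to callables taking a string argument.
    """
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = call_llm(transcript)
        transcript += step + "\n"
        answer = re.search(r"Answer: (.*)", step)
        if answer:
            return answer.group(1)
        action = re.search(r"Action: (\w+)\[(.*)\]", step)
        if action:
            name, arg = action.groups()
            # Feed the tool's result back as an Observation.
            transcript += f"Observation: {tools[name](arg)}\n"
    return "no answer within step budget"

# A scripted model and a calculator tool, so the sketch runs offline.
def scripted_llm(transcript):
    if "Observation:" not in transcript:
        return "Thought: I should compute this.\nAction: calculator[17 * 24]"
    return "Thought: The tool gave the result.\nAnswer: 408"

tools = {"calculator": lambda expr: eval(expr, {"__builtins__": {}})}
print(react_loop("What is 17 * 24?", scripted_llm, tools))
```

The same loop structure extends naturally to other tools, such as a search function for retrieval, with the model choosing which tool to invoke at each step.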

Chain of Thought Prompting: The Future of LLMs?

Chain of thought prompting is a rapidly developing field with the potential to significantly enhance the capabilities of LLMs.

  • As researchers continue to develop new and innovative prompting techniques, we can expect LLMs to become increasingly adept at tackling complex, multi-step reasoning tasks.
  • The integration of chain of thought prompting with other techniques, such as retrieval augmentation (providing LLMs with access to external knowledge bases) and tool use, further expands their potential, paving the way for more sophisticated and capable AI systems.

However, it’s important to remember that chain of thought prompting is only one piece of the puzzle. LLMs still face limitations, including biases, potential for hallucination (generating incorrect information), and susceptibility to adversarial attacks. Continued research and development are crucial to address these limitations and unlock the full potential of LLMs as powerful tools for augmenting human capabilities.

Watch the talk on YouTube: State of GPT.

Slides from the talk “How to train your GPT” are here.