Chain-of-thought prompting is a prompting technique where you ask an AI model to reason step by step before giving an answer. Instead of immediately presenting a conclusion, the model writes out its thinking steps. For complex questions, this often produces better and verifiable results.
The technique was introduced in 2022 by researchers at Google, led by Jason Wei. They showed that language models perform significantly more accurately on arithmetic, logic and reasoning tasks when they explicitly formulate their intermediate steps. Since then, chain-of-thought prompting has become one of the most widely used techniques in prompting.
How does chain-of-thought prompting work?
Chain-of-thought prompting works by encouraging the language model to generate intermediate steps instead of jumping straight to a final answer. With a standard prompt, you often only get a conclusion. With a chain-of-thought prompt, you also see how the model arrives at that conclusion.
That difference is similar to the difference between someone saying “the answer is 42” and someone showing what calculation preceded it. The written-out reasoning allows you as a user to judge whether the logic is correct, and to see exactly where things might go wrong.
There are two common variants of chain-of-thought prompting. Which one you choose depends on the complexity of your task and the time you have to prepare your prompt.
Zero-shot chain-of-thought prompting
Zero-shot chain-of-thought prompting is the simplest variant. You add a short instruction to your prompt, such as “Think about this step by step” or “Explain how you arrived at your answer”. The model is not given examples, but only the instruction to show its reasoning.
This variant is quick and practical. You don't need to prepare examples and can apply it directly to any question. Even on simple reasoning tasks, the results are noticeably better than without a chain-of-thought instruction.
An example. Suppose you ask ChatGPT: “A project team has 12 tasks. Three team members divide the tasks equally. Then 6 tasks are added and shared between two team members. How many tasks does each team member now have?” Without chain-of-thought instruction, you sometimes get the wrong answer. If you add “Calculate this step by step”, the model writes out each step and almost always arrives at the correct answer.
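The zero-shot variant can be sketched in a few lines of code. This is a minimal illustration, not a fixed recipe: the helper name and the exact instruction wording are my own, and the arithmetic below simply writes out the intermediate steps the model is expected to produce.

```python
# A minimal sketch of zero-shot chain-of-thought prompting: the only change
# is appending a reasoning instruction to an otherwise unchanged question.
def zero_shot_cot(question: str) -> str:
    """Turn a plain question into a zero-shot chain-of-thought prompt."""
    return f"{question}\nCalculate this step by step."

prompt = zero_shot_cot(
    "A project team has 12 tasks. Three team members divide the tasks "
    "equally. Then 6 tasks are added and shared between two team members. "
    "How many tasks does each team member now have?"
)
print(prompt)

# The intermediate steps the model should write out for this question:
per_member = 12 // 3   # step 1: 12 tasks over 3 members -> 4 tasks each
extra = 6 // 2         # step 2: 6 new tasks over 2 members -> 3 extra each
print(per_member + extra)  # the two members who take on extra tasks now have 7
```

The point of the sketch is how little effort the variant takes: one appended sentence turns any question into a chain-of-thought prompt.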
Few-shot chain-of-thought prompting
Few-shot chain-of-thought prompting goes a step further. You give the model one or more examples in which you yourself show what a reasoning pattern looks like. The model then follows that pattern with the new question.
This variant is particularly effective in tasks where the desired reasoning style is not obvious. Think of situations where you want to follow a specific consideration framework, or where reasoning needs to consider multiple perspectives. By showing a concrete example, you steer the model in the right direction.
The downside is that few-shot prompts take more preparation. You have to formulate a good example yourself, and the quality of that example directly affects the outcome. In practice, few-shot chain-of-thought prompting is especially valuable in recurring tasks where you can recoup the investment in a good example.
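The few-shot variant differs only in that a worked example precedes the new question. The sketch below is illustrative: the example question, its written-out reasoning, and the helper name are all my own invention, chosen to show the pattern rather than prescribe one.

```python
# A sketch of few-shot chain-of-thought prompting: one worked example shows
# the model the reasoning pattern to imitate on the new question.
EXAMPLE = (
    "Q: A warehouse holds 8 pallets. 2 trucks each deliver 3 more pallets. "
    "How many pallets are there now?\n"
    "A: Start with 8 pallets. 2 trucks x 3 pallets = 6 new pallets. "
    "8 + 6 = 14. The answer is 14.\n"
)

def few_shot_cot(question: str) -> str:
    """Prepend a worked example; the trailing 'A:' invites the model
    to continue in the same step-by-step style."""
    return f"{EXAMPLE}\nQ: {question}\nA:"

print(few_shot_cot("A team has 12 tasks and 3 members. How many tasks each?"))
```

Because the example carries the reasoning style, its quality matters more than the instruction wording, which matches the observation above that the investment pays off in recurring tasks.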
Chain-of-thought prompting in practice
Chain-of-thought prompting is explained in most articles using arithmetic problems. This makes sense, as the effect is most clearly measurable in computational tasks. But the technique is at least as valuable in professional tasks that require consideration or analysis.
Example for an analysis
Chain-of-thought prompting in an analysis helps you check the model's reasoning. Suppose you ask an AI chatbot to compare three suppliers based on price, delivery time and quality. Without a chain-of-thought instruction, you get a conclusion (“Supplier B is the best choice”) without substantiation.
If you add: “Weigh each factor separately and explain how you arrived at your recommendation”, you can see exactly what assumptions the model makes. Maybe it weights price more than quality, while for your situation it should be the other way around. That visibility makes the difference between blind trust and conscious judgement.
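The effect of hidden weighting can be made concrete with a small worked example. All numbers and supplier names below are hypothetical, chosen only to show that the same scores can produce a different winner depending on the weights, which is exactly what visible reasoning lets you catch.

```python
# Hypothetical scores (1-10) for two suppliers on three factors.
suppliers = {
    "A": {"price": 7, "delivery": 6, "quality": 9},
    "B": {"price": 9, "delivery": 8, "quality": 6},
}

def ranked(weights: dict) -> str:
    """Return the supplier with the highest weighted total score."""
    totals = {
        name: sum(scores[factor] * weights[factor] for factor in weights)
        for name, scores in suppliers.items()
    }
    return max(totals, key=totals.get)

print(ranked({"price": 0.5, "delivery": 0.3, "quality": 0.2}))  # price-heavy -> B
print(ranked({"price": 0.2, "delivery": 0.2, "quality": 0.6}))  # quality-heavy -> A
```

If the model only reports “Supplier B is the best choice”, you cannot tell which of these two weightings it implicitly used; the written-out reasoning exposes that assumption.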
Example in business consideration
Chain-of-thought prompting in a business trade-off forces the model to work out arguments for and against. Suppose you ask Claude whether it is wise to implement a new software package for your team. A standard answer lists benefits and ends with “it depends on your situation”.
With a chain-of-thought instruction like “Walk through the pros and cons step by step, weigh them against each other and come to a concrete recommendation”, you get a structured analysis. The model separately names costs, implementation time, learning curve and expected productivity gains. This makes the output useful as a starting point for a real decision, rather than a non-committal overview.
When does chain-of-thought prompting work best?
Chain-of-thought prompting delivers most on tasks that require multiple steps or trade-offs. Think arithmetic problems, logic puzzles, comparisons between options, risk analysis and policy considerations. The common feature is that the answer cannot be achieved in a single step.
The more complex the task, the greater the difference with standard prompting. For simple questions with a direct answer (“What is the capital of the Netherlands?”), chain-of-thought prompting adds nothing. But as soon as several variables come into play or a trade-off is needed, you notice that the quality of the output increases.
Professionals who know how to ask AI good questions often combine chain-of-thought prompting with other techniques. For example, you can combine it with a role description (“You are a financial analyst. Go through the following data step by step...”) or with a specific output format.
When does chain-of-thought prompting not make sense?
Chain-of-thought prompting is not a technique you should apply to every prompt by default. In simple factual questions, short writing tasks or creative assignments, it does not produce better results. The model then makes the answer unnecessarily long without improving quality.
A concrete example: if you ask “Write a concise e-mail rescheduling a meeting to Thursday”, step-by-step reasoning is unnecessary. The model does not need to reason through intermediate steps; it just needs to generate a text.
The rule of thumb is: if you can formulate the answer yourself in one step, chain-of-thought prompting makes little sense. As soon as you notice that you should take a moment to think about the right approach yourself, it is a good time to give the model that same thinking space.
Longer answers are also not necessarily better answers. A model that reasons extensively but is on the wrong track produces more text without more value. Assessing the reasoning remains your responsibility, and that requires you to have sufficient knowledge of the subject yourself.
Chain-of-thought and reasoning models
Chain-of-thought prompting was originally something that you as a user had to activate yourself through your prompt. That is changing. Recent reasoning models have chain-of-thought reasoning built into their architecture by default.
OpenAI's o-series (o1, o3, o4-mini) is the best-known example here. These models are trained to go through a reasoning process internally before giving an answer. You can see this in the interface as a “thinking” phase where the model builds its own logic. As a user, you no longer need to give an explicit chain-of-thought instruction; the model does it automatically.
DeepSeek R1 goes one step further. This model shows its entire reasoning process visibly in its output, including moments when it corrects itself. The model sometimes recognises halfway through a reasoning process that it is on the wrong track, adjusts its approach and still arrives at a better answer.
For professionals, this is a relevant development. With reasoning models, the skill shifts from “writing the correct chain-of-thought prompt” to “critically assessing the reasoning shown”. You no longer have to tell the model how to think, but you have to be able to assess whether the thinking steps shown are correct.
If you use a model without built-in reasoning, such as Gemini or a standard ChatGPT model (GPT-4o), chain-of-thought prompting remains a valuable technique that you activate yourself. The effect differs from model to model, but the principle is the same: step-by-step reasoning produces better output in complex tasks.
Limitations and pitfalls of chain-of-thought prompting
Chain-of-thought prompting improves output on complex tasks, but it does not guarantee correct answers. There are two pitfalls you need to be alert to as a professional.
Persuasive but flawed reasoning
Chain-of-thought prompting can lead to convincing but incorrect reasoning. The model generates text that seems logical, step by step, but is built on false assumptions. A language model recognises patterns in text and predicts the most likely next word. It does not “check” its own logic like a human does.
In practice, this means always checking the reasoning, especially in tasks where errors have consequences. A financial statement that looks neat but is built on a wrong assumption is more dangerous than a visibly messy answer. The written-out steps allow you to find errors, but only if you actually read and assess them.
Longer answers are not always better answers
Chain-of-thought prompting produces longer answers. Sometimes that is exactly what you need, but it can also add noise. A model that writes out seven steps where three would have sufficed makes it harder to get to the heart of the matter.
For simple tasks, chain-of-thought prompting can actually worsen output. Research by Wei et al. (2022) showed that for simple questions requiring only one step, chain-of-thought prompting did not improve results and sometimes made them worse. So use the technique purposefully, not by default.
Chain-of-thought prompting and quality control
Chain-of-thought prompting fits well with a way of working in which you systematically check AI output before using it. The written-out reasoning gives you concrete leads to assess whether the answer is correct.
You can check the reasoning at three levels. Are the facts used by the model correct? Are the intermediate steps logical? And does the conclusion follow from those intermediate steps? If one of these three is incorrect, you know exactly where the problem is and can make targeted adjustments.
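The three-level check above can be sketched as a simple checklist structure. This is my own illustration of the check described in the text; the class and field names are assumptions, and in practice the assessment itself remains human work.

```python
# The three-level check on written-out reasoning, as a simple checklist.
from dataclasses import dataclass

@dataclass
class ReasoningCheck:
    facts_correct: bool       # level 1: are the facts the model used correct?
    steps_logical: bool       # level 2: are the intermediate steps logical?
    conclusion_follows: bool  # level 3: does the conclusion follow from them?

    def passed(self) -> bool:
        """The answer is usable only if all three levels hold."""
        return self.facts_correct and self.steps_logical and self.conclusion_follows

# One failing level tells you exactly where to make targeted adjustments.
check = ReasoningCheck(facts_correct=True, steps_logical=True, conclusion_follows=False)
print(check.passed())  # False: the conclusion does not follow, so reject the answer
```

The value of the checklist is localisation: a failed check points at one specific level to fix, rather than forcing you to redo the whole prompt.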
Anyone who understands how ChatGPT works also understands why such checking is needed. The model generates text based on patterns, not factual knowledge. Persuasive reasoning is not proof that the conclusion is correct. Chain-of-thought prompting makes it easier to see that distinction, but assessment remains human work.
The ChatGPT course from LearnLLM teaches you how to combine chain-of-thought prompting with fixed checkpoints, so you can justify AI output to colleagues and clients.
Chain-of-thought prompting is thus more than a trick for better answers. It is a way to make AI output checkable, which is exactly what professionals delivering output under their own name need. Want to learn how to use this technique structurally in your work? The ChatGPT e-learning from LearnLLM addresses chain-of-thought prompting as part of a complete, repeatable workflow with checkpoints.

