Chain of Thought & Planning

Chain of Thought Reasoning

Why Planning Matters

The Problem: LLMs often fail on complex tasks when asked to jump directly to an answer. They make reasoning errors and miss important steps.

The Solution: Chain of Thought (CoT) prompting encourages the model to think step-by-step, dramatically improving accuracy on reasoning-intensive tasks.

Real Impact: In the original chain-of-thought paper (Wei et al., 2022), CoT prompting roughly tripled PaLM's accuracy on GSM8K math word problems (from about 18% to 57%), and it enables agents to tackle multi-step planning tasks far more reliably.

Real-World Analogy

Think of CoT like showing your work on a math test:

  • Direct Answer = Writing just "42" -- might be wrong, hard to debug
  • Chain of Thought = Writing each step -- easier to verify and correct
  • Task Decomposition = Breaking a big problem into smaller solvable parts
  • Tree of Thoughts = Exploring multiple solution paths simultaneously

CoT Techniques

Zero-Shot CoT

Simply append "Let's think step by step" to your prompt. Surprisingly effective for many reasoning tasks.

Few-Shot CoT

Provide examples with detailed reasoning chains. The model learns to mirror the reasoning style.
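A minimal sketch of few-shot CoT prompt construction; the worked examples below are invented for illustration, not drawn from any benchmark:

```python
# Few-shot CoT: prepend worked examples whose answers spell out the reasoning,
# so the model mirrors that style on the new question.
FEW_SHOT_EXAMPLES = """\
Q: A baker makes 12 muffins and sells 5. How many are left?
A: The baker starts with 12 muffins. Selling 5 leaves 12 - 5 = 7. The answer is 7.

Q: Tom reads 8 pages a day for 3 days. How many pages in total?
A: Tom reads 8 pages per day. Over 3 days that is 8 * 3 = 24. The answer is 24.
"""

def build_few_shot_prompt(question):
    """Return a prompt ending at 'A:' so the model continues with its own chain."""
    return f"{FEW_SHOT_EXAMPLES}\nQ: {question}\nA:"
```

Ending the prompt at "A:" is the key design choice: the model completes the pattern, producing a reasoning chain before its final answer.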

Self-Consistency

Generate multiple reasoning paths and take the majority vote. Reduces errors from any single chain.
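In code, self-consistency reduces to sampling several chains (e.g. the same prompt at temperature > 0) and voting on their final answers. A minimal sketch, with hand-written chains standing in for model samples:

```python
import re
from collections import Counter

def extract_answer(chain):
    """Pull the final number from a reasoning chain like '... total = 20'."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", chain)
    return numbers[-1] if numbers else None

def self_consistent_answer(chains):
    """Majority vote over the final answers of several sampled chains."""
    votes = Counter(a for a in (extract_answer(c) for c in chains) if a is not None)
    return votes.most_common(1)[0][0]

chains = [
    "4 apples * $2 = $8; 6 bananas * $1 = $6; 2 oranges * $3 = $6; total = 20",
    "8 + 6 + 6 = 20",
    "4*2 + 6*1 + 2*3 = 21",  # one faulty chain is simply outvoted
]
print(self_consistent_answer(chains))  # → 20
```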

Plan-and-Execute

First create a complete plan, then execute each step. Separates planning from execution.

Key Takeaway: Chain of Thought prompting works by making the LLM show its reasoning steps explicitly. Adding "Let's think step by step" to a prompt can improve accuracy by 10-40% on reasoning tasks, especially math, logic, and multi-step problems.

Output (with vs. without CoT)

Question: Roger has 5 tennis balls. He buys 2 cans of tennis balls with 3 balls each. How many balls does he have now?

Without CoT: "Roger has 5 tennis balls." (WRONG -- the model restates the premise instead of answering)
With CoT: "Roger started with 5 balls. He bought 2 cans of 3 balls each = 6 balls.
  5 + 6 = 11 balls total." (CORRECT)

Zero-Shot CoT

chain_of_thought.py
# Zero-Shot Chain of Thought
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{
        "role": "user",
        "content": "A store has 3 types of fruit. Apples cost $2, bananas $1, oranges $3. I buy 4 apples, 6 bananas, and 2 oranges. How much do I spend? Think step by step."
    }]
)
# Model breaks down: 4*$2 = $8, 6*$1 = $6, 2*$3 = $6, total = $20

# Plan-and-Execute Pattern
class PlanAndExecute:
    """Skeleton: planner, executor, and synthesizer are separate
    LLM-backed components injected at construction time."""

    def __init__(self, planner, executor, synthesizer):
        self.planner = planner
        self.executor = executor
        self.synthesizer = synthesizer

    def run(self, task):
        # Step 1: Create a complete plan up front
        plan = self.planner.create_plan(task)

        # Step 2: Execute each step, passing prior results as context
        results = []
        for step in plan.steps:
            result = self.executor.execute(step, results)
            results.append(result)

        # Step 3: Synthesize the final answer from all step results
        return self.synthesizer.combine(results)
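A concrete planner can be as simple as asking the model for a numbered list and parsing it. `PLANNER_PROMPT` and `parse_plan` below are illustrative names, not part of any library:

```python
import re

# Hypothetical prompt template for the planner component.
PLANNER_PROMPT = (
    "Break the following task into 3-5 numbered steps. "
    "Output only the numbered list.\n\nTask: {task}"
)

def parse_plan(text):
    """Parse a numbered list ('1. ...', '2) ...') into a list of step strings."""
    return [m.group(1).strip()
            for m in re.finditer(r"^\s*\d+[.)]\s*(.+)$", text, re.MULTILINE)]

steps = parse_plan("1. Search for sources\n2) Summarize findings\n3. Draft report")
print(steps)  # → ['Search for sources', 'Summarize findings', 'Draft report']
```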

Task Decomposition

Planning Tree Diagram

Complex Task
├── Sub-task 1 → Result 1
├── Sub-task 2 → Result 2
└── Sub-task 3 → Result 3
            ↓
    Synthesized Answer

Common Mistake

Wrong: Breaking tasks into too many fine-grained steps

Why it fails: Over-decomposition causes the model to lose sight of the overall goal. Each step adds context overhead and increases the chance of compounding errors.

Instead: Decompose into 3-5 meaningful sub-tasks that each produce a clear intermediate result. Let the LLM handle the granular reasoning within each sub-task.

Planning Strategies

| Strategy | How It Works | Best For |
|---|---|---|
| Linear Plan | Sequential step list | Well-defined workflows |
| DAG Plan | Steps with dependencies | Parallelizable tasks |
| Adaptive Plan | Replan after each step | Uncertain environments |
| Hierarchical | High-level then detailed | Very complex tasks |
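A DAG plan can be sketched with the standard library's graphlib: each step declares its dependencies, and steps that become ready together could be executed in parallel. The step names are hypothetical:

```python
from graphlib import TopologicalSorter

# Each step maps to the set of steps it depends on (hypothetical research task).
dag = {
    "search_papers": set(),
    "fetch_data": set(),
    "summarize_papers": {"search_papers"},
    "run_analysis": {"fetch_data"},
    "draft_report": {"summarize_papers", "run_analysis"},
}

order = list(TopologicalSorter(dag).static_order())
# Dependencies always come before the steps that need them;
# independent steps (search_papers, fetch_data) could run in parallel.
```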

Tree of Thoughts (ToT)

  • Generate: Create multiple candidate reasoning steps
  • Evaluate: Score each candidate for promise
  • Expand: Continue the most promising branches
  • Backtrack: Abandon dead-end branches and try alternatives
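The generate/evaluate/expand loop above is essentially a beam search over partial reasoning chains. A toy sketch, with strings standing in for thoughts and a trivial scoring function in place of an LLM evaluator:

```python
import heapq

def tree_of_thoughts(root, expand, score, beam_width=2, depth=3):
    """Keep only the `beam_width` best partial chains at each depth;
    low-scoring branches are dropped (implicit backtracking)."""
    frontier = [root]
    for _ in range(depth):
        candidates = [child for node in frontier for child in expand(node)]
        if not candidates:
            break
        frontier = heapq.nlargest(beam_width, candidates, key=score)
    return max(frontier, key=score)

# Toy problem: grow a digit string with the largest digit sum.
expand = lambda s: [s + d for d in "0123456789"]
score = lambda s: sum(int(d) for d in s)
best = tree_of_thoughts("", expand, score, beam_width=3, depth=3)
print(best)  # → 999
```

In a real agent, `expand` would sample candidate next thoughts from the model and `score` would ask the model (or a heuristic) to rate how promising each partial chain is.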

Deep Dive: When CoT Hurts Performance

Chain of thought does not always help. For simple factual lookups ("What is the capital of France?"), CoT adds unnecessary tokens without improving accuracy. For creative tasks, step-by-step reasoning can make outputs feel robotic. CoT works best for multi-step reasoning, math, logical deduction, and tasks where showing work reveals errors. Always benchmark CoT vs direct prompting on your specific task before committing to one approach.
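Benchmarking the two styles only needs a small harness; `predict` below is any function from question to answer (e.g. one wrapping a CoT prompt, one wrapping a direct prompt), mocked here instead of calling a model:

```python
def accuracy(predict, dataset):
    """dataset: list of (question, gold_answer) pairs."""
    correct = sum(predict(q) == gold for q, gold in dataset)
    return correct / len(dataset)

# Mock predictor standing in for a real LLM call.
dataset = [("2+2?", "4"), ("3*3?", "9")]
direct = lambda q: "4"  # always answers "4"
print(accuracy(direct, dataset))  # → 0.5
```

Run both predictors over the same held-out questions and keep whichever wins on your task.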

Key Takeaway: Tree of Thoughts extends CoT by exploring multiple reasoning paths in parallel and evaluating which branches are most promising. Use it for complex planning tasks where the first reasoning path may not be optimal.

Quick Reference

| Technique | Description | Accuracy Gain |
|---|---|---|
| Zero-Shot CoT | "Think step by step" | +20-40% |
| Few-Shot CoT | Examples with reasoning chains | +30-50% |
| Self-Consistency | Multiple paths, majority vote | +5-15% over CoT |
| Tree of Thoughts | Branch and evaluate paths | Best for creative tasks |
| Plan-and-Execute | Separate planning from doing | Best for multi-step tasks |