Chapter 5 · Section 4

MIPRO Optimizer

Joint instruction and demonstration optimization for multi-stage pipelines.

~20 min read

⚡ Introduction

MIPRO (Multiprompt Instruction PRoposal Optimizer) represents a significant advance in automated prompt optimization. Unlike simpler approaches that only optimize examples (such as BootstrapFewShot), MIPRO simultaneously optimizes both instructions (prompts) and demonstrations (examples) for each module in a multi-stage pipeline.

📊

Research results: MIPRO achieves a 63% relative improvement on HotpotQA (32.0 → 52.3 F1) compared to manual prompting!

📈 Performance Benchmarks

Dataset      Task Type           Manual    MIPRO     Improvement
HotpotQA     Multi-hop QA        32.0 F1   52.3 F1   +63%
GSM8K        Math Reasoning      28.5%     33.8%     +19%
CodeAlpaca   Code Generation     63.1%     64.8%     +3%
FEVER        Fact Verification   71.2%     78.9%     +11%

🏗️ Dual-Component Optimization

MIPRO's power comes from jointly optimizing two key elements:

📝

Instruction Generation

Generates candidate instructions using meta-prompting conditioned on program structure, module signatures, and dataset characteristics

📋

Demonstration Selection

Bootstraps candidate demonstrations from the training data and selects among them using utility scoring and greedy search

┌────────────────────────────────────────────────────────────────────┐
│                      MIPRO Optimization Loop                       │
├────────────────────────────────────────────────────────────────────┤
│  ┌────────────────┐      ┌────────────────┐                        │
│  │  Meta-Prompt   │      │  Demonstration │                        │
│  │  Generation    │      │  Selection     │                        │
│  └───────┬────────┘      └───────┬────────┘                        │
│          │                       │                                 │
│          ▼                       ▼                                 │
│  ┌──────────────────────────────────────────┐                      │
│  │        Candidate Configurations          │                      │
│  │   (Instruction + Demonstration Pairs)    │                      │
│  └────────────────────┬─────────────────────┘                      │
│                       │                                            │
│                       ▼                                            │
│  ┌──────────────────────────────────────────┐                      │
│  │      Bayesian Optimization Search        │                      │
│  └────────────────────┬─────────────────────┘                      │
│                       │                                            │
│                       ▼                                            │
│  ┌──────────────────────────────────────────┐                      │
│  │           Best Configuration             │                      │
│  └──────────────────────────────────────────┘                      │
└────────────────────────────────────────────────────────────────────┘
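
The loop above can be sketched in a few lines of plain Python. This is only a schematic illustration of the idea, not DSPy's actual implementation: propose_instructions, bootstrap_demo_sets, and score are hypothetical stand-ins for the meta-prompting, demonstration-bootstrapping, and metric-evaluation steps.

import random

# Hypothetical stand-ins for MIPRO's real steps (meta-prompting an LM,
# bootstrapping demos by running the program, scoring with the user metric).
def propose_instructions(n):
    return [f"Instruction variant #{i}" for i in range(n)]

def bootstrap_demo_sets(n):
    return [[f"demo-{i}-{j}" for j in range(3)] for i in range(n)]

def score(instruction, demos):
    return random.random()  # placeholder for the metric on a minibatch

def mipro_style_search(num_candidates=10, num_trials=20):
    instructions = propose_instructions(num_candidates)
    demo_sets = bootstrap_demo_sets(num_candidates)
    best, best_score = None, float("-inf")
    for _ in range(num_trials):
        # The real optimizer uses a surrogate model to pick the next pair;
        # this sketch simply samples at random.
        config = (random.choice(instructions), random.choice(demo_sets))
        s = score(*config)
        if s > best_score:
            best, best_score = config, s
    return best, best_score

print(mipro_style_search())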

📦 Basic Usage

import dspy
from dspy.teleprompt import MIPRO

# Assumes an LM is already configured, e.g. dspy.settings.configure(lm=...)

# Define your program
class ReasoningQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")
    
    def forward(self, question):
        return self.generate(question=question)

# Define metric
def answer_accuracy(example, pred, trace=None):
    return example.answer.lower() in pred.answer.lower()

# Create MIPRO optimizer
optimizer = MIPRO(
    metric=answer_accuracy,
    num_candidates=10,          # Instruction candidates per module
    init_temperature=0.7,       # Diversity in generation
    verbose=True
)

# Compile with MIPRO
compiled = optimizer.compile(
    ReasoningQA(),
    trainset=trainset,
    num_trials=3,               # Optimization iterations
    max_bootstrapped_demos=8    # Max demos per module
)
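
Once compilation finishes, you will usually want to measure the gain and persist the optimized program. The snippet below uses DSPy's Evaluate utility and the module save/load methods; devset is assumed to be a held-out list of dspy.Example objects.

from dspy.evaluate import Evaluate

# Score the un-optimized and MIPRO-optimized programs on a held-out dev set.
evaluator = Evaluate(devset=devset, metric=answer_accuracy,
                     num_threads=4, display_progress=True)
baseline_score = evaluator(ReasoningQA())
mipro_score = evaluator(compiled)
print(f"Baseline: {baseline_score}  |  MIPRO: {mipro_score}")

# Persist the learned instructions/demos and reload them into a fresh module.
compiled.save("reasoning_qa_mipro.json")
reloaded = ReasoningQA()
reloaded.load("reasoning_qa_mipro.json")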

🎛️ Auto Configuration Modes

MIPRO provides convenient auto-configuration presets:

# Light mode: Quick optimization for simple tasks
# Equivalent to: num_candidates=5, init_temperature=0.8
optimizer = MIPRO(metric=accuracy, auto="light")

# Medium mode: Balanced optimization (recommended default)
# Equivalent to: num_candidates=10, init_temperature=1.0
optimizer = MIPRO(metric=accuracy, auto="medium")

# Heavy mode: Extensive optimization for complex tasks
# Equivalent to: num_candidates=20, init_temperature=1.2
optimizer = MIPRO(metric=accuracy, auto="heavy")

Parameter                Default   Range     Description
num_candidates           10        5-30      Instruction candidates per module
init_temperature         1.0       0.5-1.5   Meta-prompt sampling temperature
num_trials               3         1-10      Optimization iterations
max_bootstrapped_demos   8         0-16      Max demos per module

🔗 Multi-Stage Pipeline Optimization

MIPRO excels at optimizing multi-stage pipelines:

class MultiHopQA(dspy.Module):
    """Multi-hop QA pipeline."""
    
    def __init__(self):
        super().__init__()
        # Stage 1: Query generation
        self.generate_queries = dspy.Predict("question -> search_queries")
        
        # Stage 2: Retrieval
        self.retrieve = dspy.Retrieve(k=5)
        
        # Stage 3: Answer generation
        self.generate_answer = dspy.ChainOfThought("question, context -> answer")
    
    def forward(self, question):
        # Stage 1
        queries = self.generate_queries(question=question)
        
        # Stage 2
        passages = self.retrieve(query=queries.search_queries).passages
        context = "\n".join(passages[:5])
        
        # Stage 3
        answer = self.generate_answer(question=question, context=context)
        
        return dspy.Prediction(
            answer=answer.answer,
            reasoning=answer.rationale
        )

# MIPRO optimizes EACH module in the pipeline
optimizer = MIPRO(
    metric=qa_metric,
    num_candidates=15,  # More candidates for multi-stage
    auto="medium"
)

optimized = optimizer.compile(
    MultiHopQA(),
    trainset=trainset,
    num_trials=5,
    max_bootstrapped_demos=6  # Per module
)
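
Here qa_metric is assumed to be any callable with the (example, pred, trace=None) signature, like answer_accuracy above. After compiling, you can check what MIPRO chose for each stage; named_predictors() is DSPy's standard way to walk a module's predictors, though the exact attribute names can vary between DSPy versions, so treat this as a sketch.

# Inspect the instruction and demonstrations selected for each pipeline stage.
for name, predictor in optimized.named_predictors():
    print(f"=== {name} ===")
    print("Instruction:", predictor.signature.instructions)
    print("Demos:      ", len(predictor.demos))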

🎯 Zero-Shot vs Few-Shot

The MIPRO research shows that an optimized zero-shot program can often beat manually written few-shot prompts:

# Zero-shot optimization (instructions only)
mipro_zeroshot = MIPRO(metric=accuracy, num_candidates=20)
zeroshot_compiled = mipro_zeroshot.compile(
    program,
    trainset=trainset,
    max_bootstrapped_demos=0  # No demonstrations!
)

# Few-shot optimization (instructions + demos)
mipro_fewshot = MIPRO(metric=accuracy, num_candidates=20)
fewshot_compiled = mipro_fewshot.compile(
    program,
    trainset=trainset,
    max_bootstrapped_demos=8
)

# Compare results
print(f"Zero-shot: {evaluate(zeroshot_compiled, testset):.1%}")
print(f"Few-shot:  {evaluate(fewshot_compiled, testset):.1%}")
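
The evaluate helper above is assumed rather than provided by DSPy. A minimal version, assuming a DSPy release where Evaluate returns a 0-100 score, could wrap dspy.evaluate.Evaluate and return a fraction so the :.1% formatting works:

from dspy.evaluate import Evaluate

def evaluate(program, dataset):
    # Evaluate returns a percentage (0-100); convert it to a 0-1 fraction.
    scorer = Evaluate(devset=dataset, metric=accuracy, display_progress=False)
    return scorer(program) / 100.0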
💡

Key insight: Zero-shot optimization saves context window space for reasoning while still achieving strong performance!

🔥 Bayesian Optimization Search

MIPRO uses Bayesian optimization to search the space of instruction and demonstration combinations efficiently:

🌡️

Diverse Proposals

Sample instruction candidates at a higher LM temperature (init_temperature) so the search starts from a broad, varied pool

📊

Surrogate Modeling

Learn which instruction/demonstration combinations correlate with high metric scores and propose promising ones for the next trial

🏔️

Explore vs. Exploit

Balance trying new combinations against refining strong ones, so the search escapes local optima instead of settling on the first decent prompt
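
DSPy runs this search internally, but the idea is easy to picture with Optuna, a general-purpose Bayesian optimization library. The standalone sketch below treats the instruction index and demo-set index as categorical choices and lets the TPE sampler decide which combination to try next; score_on_minibatch is a hypothetical stand-in for running your program with that configuration and averaging the metric over a minibatch.

import random
import optuna

NUM_INSTRUCTIONS = 10   # candidate instructions per module
NUM_DEMO_SETS = 8       # bootstrapped demonstration sets

def score_on_minibatch(instruction_idx, demo_set_idx):
    # Placeholder: in a real run this would execute the program with the
    # chosen instruction + demos and average the user metric.
    return random.random()

def objective(trial):
    instruction_idx = trial.suggest_categorical("instruction", list(range(NUM_INSTRUCTIONS)))
    demo_set_idx = trial.suggest_categorical("demos", list(range(NUM_DEMO_SETS)))
    return score_on_minibatch(instruction_idx, demo_set_idx)

study = optuna.create_study(direction="maximize",
                            sampler=optuna.samplers.TPESampler())
study.optimize(objective, n_trials=30)
print("Best combination:", study.best_params)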

📝 Key Takeaways

MIPRO optimizes both instructions and demonstrations jointly

Significant improvements: roughly 3-63% relative gains on the benchmarks above

Use auto modes for easy configuration (light/medium/heavy)

Excels at multi-stage pipelines with multiple modules

Optimized zero-shot can beat manual few-shot!