⚡ Introduction
MIPRO (Multiprompt Instruction PRoposal Optimizer) represents a significant advancement in automated prompt optimization. Unlike simpler approaches that only optimize examples, MIPRO jointly optimizes both instructions (prompts) and demonstrations (examples) for each module in a multi-stage pipeline.
Research results: MIPRO achieves a 63% improvement on HotpotQA (32.0 → 52.3 F1) compared to manual prompting!
📊 Performance Benchmarks
| Dataset | Task Type | Manual | MIPRO | Improvement |
|---|---|---|---|---|
| HotpotQA | Multi-hop QA | 32.0 F1 | 52.3 F1 | +63% |
| GSM8K | Math Reasoning | 28.5% | 33.8% | +19% |
| CodeAlpaca | Code Generation | 63.1% | 64.8% | +3% |
| FEVER | Fact Verification | 71.2% | 78.9% | +11% |
🏗️ Dual-Component Optimization
MIPRO's power comes from jointly optimizing two key elements:
- **Instruction Generation**: generates candidate instructions using meta-prompting conditioned on program structure, module signatures, and dataset characteristics.
- **Demonstration Selection**: selects demonstrations using data-driven selection, utility scoring, and greedy algorithms.
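To make utility-scored greedy selection concrete, here is a minimal sketch. It is illustrative only, not MIPRO's actual implementation: `greedy_select`, the toy demos, and the toy scoring function are all invented for the example (a real utility score would be something like validation accuracy with those demos in the prompt).

```python
def greedy_select(candidates, score_fn, k):
    """Greedily pick up to k demonstrations, at each step adding the
    candidate that most improves the utility score (illustrative only)."""
    selected = []
    for _ in range(k):
        best, best_score = None, score_fn(selected)
        for cand in candidates:
            if cand in selected:
                continue
            s = score_fn(selected + [cand])
            if s > best_score:
                best, best_score = cand, s
        if best is None:  # no remaining candidate improves the score
            break
        selected.append(best)
    return selected

# Toy stand-in for a real utility score: prefer short answers.
demos = [
    {"q": "2+2?", "a": "4"},
    {"q": "capital of France?", "a": "Paris"},
    {"q": "colour of sky?", "a": "blue"},
]
score = lambda sel: sum(1.0 / (1 + len(d["a"])) for d in sel)

print(greedy_select(demos, score, k=2))  # picks the "4" and "blue" demos
```

Greedy selection is cheap (it evaluates at most `k × len(candidates)` configurations) but myopic, which is why MIPRO pairs it with a global search over instruction/demonstration combinations.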
```
┌──────────────────────────────────────────────────────────────────┐
│                     MIPRO Optimization Loop                      │
├──────────────────────────────────────────────────────────────────┤
│   ┌─────────────────┐          ┌─────────────────┐               │
│   │   Meta-Prompt   │          │  Demonstration  │               │
│   │   Generation    │          │    Selection    │               │
│   └────────┬────────┘          └────────┬────────┘               │
│            │                            │                        │
│            ▼                            ▼                        │
│   ┌──────────────────────────────────────────────┐               │
│   │         Candidate Configurations             │               │
│   │    (Instruction + Demonstration Pairs)       │               │
│   └──────────────────────┬───────────────────────┘               │
│                          │                                       │
│                          ▼                                       │
│   ┌──────────────────────────────────────────────┐               │
│   │         Simulated Annealing Search           │               │
│   └──────────────────────┬───────────────────────┘               │
│                          │                                       │
│                          ▼                                       │
│   ┌──────────────────────────────────────────────┐               │
│   │             Best Configuration               │               │
│   └──────────────────────────────────────────────┘               │
└──────────────────────────────────────────────────────────────────┘
```
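The loop above can be sketched in a few lines. This is a simplification invented for illustration (`optimize`, the candidate lists, and the toy evaluator are not MIPRO APIs): exhaustive search over a tiny space stands in for the annealing search in the diagram, and the real optimizer scores configurations with the user's metric on a dev set.

```python
import itertools

def optimize(instructions, demo_sets, evaluate):
    """Pair instruction candidates with demonstration sets, score each
    configuration, and keep the best one."""
    best_cfg, best_score = None, float("-inf")
    for instr, demos in itertools.product(instructions, demo_sets):
        s = evaluate(instr, demos)  # e.g. validation accuracy
        if s > best_score:
            best_cfg, best_score = (instr, demos), s
    return best_cfg, best_score

instructions = ["Answer concisely.", "Think step by step, then answer."]
demo_sets = [[], [("2+2?", "4")]]
# Toy evaluator: pretend longer instructions and more demos score higher.
evaluate = lambda instr, demos: len(instr) / 100 + 0.2 * len(demos)

cfg, s = optimize(instructions, demo_sets, evaluate)
print(cfg[0])  # "Think step by step, then answer."
```

Exhaustive enumeration only works for tiny spaces; with many modules, candidates, and demo sets, the configuration space explodes, which is what motivates the annealing search described later.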
📦 Basic Usage
```python
import dspy
from dspy.teleprompt import MIPRO

# Define your program
class ReasoningQA(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.generate(question=question)

# Define metric
def answer_accuracy(example, pred, trace=None):
    return example.answer.lower() in pred.answer.lower()

# Create MIPRO optimizer
optimizer = MIPRO(
    metric=answer_accuracy,
    num_candidates=10,     # Instruction candidates per module
    init_temperature=0.7,  # Diversity in generation
    verbose=True,
)

# Compile with MIPRO
compiled = optimizer.compile(
    ReasoningQA(),
    trainset=trainset,
    num_trials=3,              # Optimization iterations
    max_bootstrapped_demos=8,  # Max demos per module
)
```
🎛️ Auto Configuration Modes
MIPRO provides convenient auto-configuration presets:
```python
# Light mode: quick optimization for simple tasks
# Equivalent to: num_candidates=5, init_temperature=0.8
optimizer = MIPRO(metric=accuracy, auto="light")

# Medium mode: balanced optimization (recommended default)
# Equivalent to: num_candidates=10, init_temperature=1.0
optimizer = MIPRO(metric=accuracy, auto="medium")

# Heavy mode: extensive optimization for complex tasks
# Equivalent to: num_candidates=20, init_temperature=1.2
optimizer = MIPRO(metric=accuracy, auto="heavy")
```
| Parameter | Default | Range | Description |
|---|---|---|---|
| `num_candidates` | 10 | 5-30 | Instruction candidates per module |
| `init_temperature` | 1.0 | 0.5-1.5 | Meta-prompt sampling temperature |
| `num_trials` | 3 | 1-10 | Optimization iterations |
| `max_bootstrapped_demos` | 8 | 0-16 | Max demos per module |
🔄 Multi-Stage Pipeline Optimization
MIPRO excels at optimizing multi-stage pipelines:
```python
class MultiHopQA(dspy.Module):
    """Multi-hop QA pipeline."""

    def __init__(self):
        super().__init__()
        # Stage 1: Query generation
        self.generate_queries = dspy.Predict("question -> search_queries")
        # Stage 2: Retrieval
        self.retrieve = dspy.Retrieve(k=5)
        # Stage 3: Answer generation
        self.generate_answer = dspy.ChainOfThought("question, context -> answer")

    def forward(self, question):
        # Stage 1
        queries = self.generate_queries(question=question)
        # Stage 2
        passages = self.retrieve(query=queries.search_queries).passages
        context = "\n".join(passages[:5])
        # Stage 3
        answer = self.generate_answer(question=question, context=context)
        return dspy.Prediction(
            answer=answer.answer,
            reasoning=answer.rationale,
        )

# MIPRO optimizes EACH module in the pipeline
optimizer = MIPRO(
    metric=qa_metric,
    num_candidates=15,  # More candidates for multi-stage
    auto="medium",
)

optimized = optimizer.compile(
    MultiHopQA(),
    trainset=trainset,
    num_trials=5,
    max_bootstrapped_demos=6,  # Per module
)
```
🎯 Zero-Shot vs Few-Shot
MIPRO research shows that an optimized zero-shot program can often beat manual few-shot prompting:
```python
# Zero-shot optimization (instructions only)
mipro_zeroshot = MIPRO(metric=accuracy, num_candidates=20)
zeroshot_compiled = mipro_zeroshot.compile(
    program,
    trainset=trainset,
    max_bootstrapped_demos=0,  # No demonstrations!
)

# Few-shot optimization (instructions + demos)
mipro_fewshot = MIPRO(metric=accuracy, num_candidates=20)
fewshot_compiled = mipro_fewshot.compile(
    program,
    trainset=trainset,
    max_bootstrapped_demos=8,
)

# Compare results
print(f"Zero-shot: {evaluate(zeroshot_compiled, testset):.1%}")
print(f"Few-shot:  {evaluate(fewshot_compiled, testset):.1%}")
```
Key insight: Zero-shot optimization saves context window space for reasoning while still achieving strong performance!
🔥 Simulated Annealing Search
MIPRO uses simulated annealing to efficiently search the prompt configuration space:
1. **High Temperature Start**: accept many changes initially to explore broadly.
2. **Gradual Cooling**: become more selective as optimization progresses.
3. **Escape Local Optima**: occasionally accept "uphill" (worse-scoring) moves to find the global optimum.
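The three steps above can be sketched as a generic simulated-annealing loop. This is not MIPRO's code; `simulated_annealing`, the neighbor function, and the toy one-dimensional "configuration space" are invented to show the mechanism.

```python
import math
import random

def simulated_annealing(initial, neighbors, score, steps=200, t0=1.0, cooling=0.98):
    """Maximize `score` over a discrete space (illustrative sketch only)."""
    random.seed(0)  # deterministic for the example
    current = best = initial
    t = t0
    for _ in range(steps):
        candidate = random.choice(neighbors(current))
        delta = score(candidate) - score(current)
        # At high temperature, worse-scoring moves are accepted often
        # (exploration); as t cools, exp(delta / t) shrinks and the
        # search becomes selective (exploitation).
        if delta > 0 or random.random() < math.exp(delta / t):
            current = candidate
            if score(current) > score(best):
                best = current
        t *= cooling  # gradual cooling
    return best

# Toy 1-D search space: integer "configurations" 0..20, best score at 13.
score = lambda x: -(x - 13) ** 2
neighbors = lambda x: [max(0, x - 1), min(20, x + 1)]
print(simulated_annealing(0, neighbors, score))
```

On this smooth toy landscape the search settles at the maximum; the uphill-acceptance rule matters most on rugged landscapes, where pure greedy search would get stuck in a local optimum.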