Introduction
Reflective Prompt Evolution (RPE) is an innovative optimizer that treats prompt engineering as an evolutionary process. Unlike standard gradient-based optimization, RPE maintains a population of prompt candidates and evolves them using mutation and selection, guided by the language model's own ability to "reflect" on what's working and what isn't.
Core Concept: RPE simulates "Survival of the Fittest" for prompts. The fittest prompts (those that perform best on your metric) survive and reproduce (mutate) to form the next generation.
What Makes RPE Special?
- Population-Based: Explores multiple directions simultaneously, avoiding local optima better than single-path greedy search.
- Self-Reflection: Uses the LM to critique its own prompts ("Why did I fail this example?") to generate smarter mutations.
- Gradient-Free: Works with any black-box LM API, as it doesn't require access to model weights or gradients.
Basic Usage
```python
import dspy
from dspy.teleprompt import ReflectivePromptEvolution

# 1. Define your module (e.g., ChainOfThought)
class Reasoner(dspy.Module):
    def __init__(self):
        super().__init__()
        self.prog = dspy.ChainOfThought("question -> answer")

    def forward(self, question):
        return self.prog(question=question)

# 2. Configure RPE
optimizer = ReflectivePromptEvolution(
    metric=your_metric_function,
    population_size=10,      # Keep 10 active candidates
    generations=5,           # Evolve for 5 rounds
    mutation_rate=0.3,       # 30% chance to mutate a prompt
    selection_pressure=0.5,  # Keep top 50% performers
)

# 3. Compile
optimized_reasoner = optimizer.compile(
    Reasoner(),
    trainset=train_examples,
    valset=val_examples,
)
```
The Evolution Process
- Initialization: Create an initial population of diverse prompts (e.g., using different instruction styles).
- Evaluation (Fitness): Test all candidates on the training set and assign a fitness score based on your metric.
- Reflection: For lower-performing prompts, ask the LM to analyze why they failed, e.g., "Analyze this prompt's performance. Which instructions were unclear? What reasoning was missing?"
- Mutation: Create new candidates by applying changes suggested by the reflection (e.g., "Add a step to check for negative numbers").
- Selection: Keep the best prompts from the current pool and the new mutations.
- Repeat: Continue for N generations or until convergence.
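The steps above can be sketched as a plain evolutionary loop. Everything here is illustrative: `evaluate`, `reflect_and_mutate`, and the candidate representation are stand-ins, not RPE's actual internals.

```python
import random

def evolve(initial_population, evaluate, reflect_and_mutate,
           generations=5, selection_pressure=0.5, mutation_rate=0.3):
    """Generic evolve loop: score candidates, keep the best, mutate survivors."""
    population = list(initial_population)
    for _ in range(generations):
        # Evaluation (fitness): rank every candidate by its metric score.
        ranked = sorted(population, key=evaluate, reverse=True)
        # Selection: keep the top fraction of performers.
        survivors = ranked[:max(1, int(len(ranked) * selection_pressure))]
        # Reflection + mutation: some survivors spawn revised children.
        children = [reflect_and_mutate(p) for p in survivors
                    if random.random() < mutation_rate]
        population = survivors + children
    return max(population, key=evaluate)
```

In RPE itself, `reflect_and_mutate` is where the LM critiques a prompt's failures and proposes a rewritten variant.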
Advanced Tactics
Diversity Maintenance
To prevent the population from collapsing onto near-identical prompts (premature convergence), RPE can enforce diversity constraints: it computes the cosine similarity between prompt embeddings and penalizes or removes redundant candidates.
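A minimal sketch of such a diversity filter, assuming prompts have already been embedded as vectors. The threshold value and the greedy keep-first strategy are illustrative choices, not RPE's documented behavior.

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def enforce_diversity(candidates, embeddings, threshold=0.95):
    """Greedily keep candidates whose embedding is not too close to any kept one."""
    kept, kept_vecs = [], []
    for cand, vec in zip(candidates, embeddings):
        if all(cosine_similarity(vec, kv) < threshold for kv in kept_vecs):
            kept.append(cand)
            kept_vecs.append(vec)
    return kept
```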
Custom Mutations
You can define domain-specific mutation operators. For a coding task, you might add a mutation that specifically inserts "Check for edge cases" instructions.
```python
class CustomMutationOperator:
    def domain_specific_mutation(self, prompt, domain):
        # For coding tasks, append a performance-oriented instruction.
        if domain == "code":
            return prompt + "\nEnsure time complexity is O(n)."
        return prompt
```
When to Use RPE?
| Scenario | RPE Suitability | Reason |
|---|---|---|
| Complex Reasoning | ✅ High | Evolution finds creative reasoning paths humans miss. |
| Simple Classification | ⚠️ Medium | Overkill; BootstrapFewShot is faster and sufficient. |
| Black-Box APIs | ✅ High | No gradients needed; reflection makes efficient use of API calls. |