Key Research Findings
VMware research (2024) surfaced several surprising findings:
- LLM-Generated Prompts Outperform Human Ones: Automatic optimizers produced prompts that a human prompt engineer would likely reject, yet those prompts scored higher on the evaluated tasks
- "Positive Thinking" Prompts Are Suboptimal: Manual additions such as "This will be fun!" provide little measurable benefit
- Open Source Models Can Self-Optimize: Even 7B-parameter models (Mistral-7B) can optimize prompts effectively with as few as 100 test samples (see the sketch after this list)
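The loop behind that last finding is simple to outline: an optimizer model proposes candidate prompts, and each candidate is scored against a small fixed evaluation set of roughly 100 samples. The sketch below is a minimal illustration of that idea, not the paper's code; propose_candidate and score_on_samples are hypothetical callables standing in for the optimizer LM and the task evaluator.

    def optimize_prompt(propose_candidate, score_on_samples, samples, n_rounds=10):
        """Hypothetical sketch: keep the best of n_rounds LLM-proposed prompts."""
        best_prompt, best_score = None, float("-inf")
        for _ in range(n_rounds):
            candidate = propose_candidate()                # optimizer LM writes a prompt
            score = score_on_samples(candidate, samples)   # e.g. accuracy on ~100 samples
            if score > best_score:
                best_prompt, best_score = candidate, score
        return best_prompt, best_score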
DSPy Implementation
import dspy

class AutomaticPromptOptimizer:
    def __init__(self, base_model="gpt-3.5-turbo", optimizer_model="mixtral-8x7b"):
        # LM whose task performance the discovered prompts are meant to improve
        self.base_lm = dspy.OpenAI(model=base_model)
        # LM that proposes candidate prompts; assumes a vLLM server running locally
        self.optimizer_lm = dspy.HFClientVLLM(model=optimizer_model, port=8000, url="http://localhost")

    def discover_eccentric_prompts(self, task_description, examples):
        """Generate unexpected but effective prompts based on research insights."""
        prompt_generator = dspy.ChainOfThought(
            "task_description, examples -> creative_system_prompt, persona_prompt"
        )
        # Generate prompts with the optimizer model rather than the default LM
        with dspy.settings.context(lm=self.optimizer_lm):
            return prompt_generator(task_description=task_description, examples=examples)
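A short usage sketch follows; the task description and example string are illustrative placeholders, not taken from the research.

    optimizer = AutomaticPromptOptimizer()
    result = optimizer.discover_eccentric_prompts(
        task_description="Solve grade-school math word problems step by step.",
        examples="Q: Tom has 3 apples and buys 2 more. How many does he have? A: 5",
    )
    print(result.creative_system_prompt)  # may read oddly to a human, yet score well
    print(result.persona_prompt)

Because ChainOfThought returns a dspy.Prediction, the two output fields named in the signature are available directly as attributes on the result.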