Chapter 5

Bayesian Optimization

Navigate the prompt space intelligently using Gaussian Processes to balance exploration and exploitation.

Introduction

Bayesian Optimization (BO) is a global optimization technique for expensive black-box functions: it aims to find good solutions in as few evaluations as possible. In DSPy, scoring a prompt configuration means running language model calls against a metric, so each evaluation is costly, and BO provides a principled way to search for strong prompt configurations within a limited budget.

Core Components

  • Search Space: The domain of possible configurations (e.g., instruction styles, temperature).
  • Surrogate Model: A probabilistic model (usually a Gaussian Process) approximating the performance landscape.
  • Acquisition Function: A scoring rule that uses the surrogate's predictions to select the next point to evaluate.

# Defining a search space for Bayesian Optimization
def _define_search_space(self):
    return {
        "instruction_length": {"type": "discrete", "values": [10, 20, 30]},
        "instruction_style": {"type": "categorical", "values": ["direct", "detailed"]},
        "temperature": {"type": "continuous", "bounds": [0.0, 1.0]},
    }
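
A search space in this dictionary format is only useful once configurations can be drawn from it and turned into numbers for the surrogate. The sketch below shows one way to do both. The helper names sample_configuration and encode are hypothetical, not part of DSPy, and one-hot encoding for categorical values is an assumed design choice.

```python
import random

SEARCH_SPACE = {
    "instruction_length": {"type": "discrete", "values": [10, 20, 30]},
    "instruction_style": {"type": "categorical", "values": ["direct", "detailed"]},
    "temperature": {"type": "continuous", "bounds": [0.0, 1.0]},
}

def sample_configuration(space, rng):
    """Draw one random configuration (hypothetical helper)."""
    config = {}
    for name, spec in space.items():
        if spec["type"] == "continuous":
            low, high = spec["bounds"]
            config[name] = rng.uniform(low, high)
        else:
            # "discrete" and "categorical" both pick from a finite list.
            config[name] = rng.choice(spec["values"])
    return config

def encode(config, space):
    """Turn a configuration into a numeric vector (hypothetical helper)."""
    vector = []
    for name, spec in space.items():
        value = config[name]
        if spec["type"] == "categorical":
            # One-hot encoding, because categories have no natural order.
            vector.extend(1.0 if value == option else 0.0 for option in spec["values"])
        else:
            vector.append(float(value))
    return vector

rng = random.Random(0)  # fixed seed so the draw is reproducible
config = sample_configuration(SEARCH_SPACE, rng)
vector = encode(config, SEARCH_SPACE)
```

Here the encoded vector has four entries: the instruction length, two one-hot slots for the style, and the temperature. In practice the numeric dimensions would likely need rescaling to comparable ranges before being passed to a Gaussian Process.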

Gaussian Process Surrogate

The Gaussian Process (GP) models both the expected performance and the uncertainty across the search space. This allows the optimizer to identify regions that are likely to be high-performing (high mean) or are unexplored (high uncertainty).

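# Example of fitting a GP surrogate (assumes a scikit-learn-style estimator)
self.surrogate_model.fit(self.X_observed, self.y_observed)
# Predicted mean score and standard deviation for configurations not yet evaluated
mean, std = self.surrogate_model.predict(new_configs, return_std=True)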
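
For reference, the mean and standard deviation returned by a GP have a closed form. The equations below are the standard posterior for a zero-mean GP prior. The notation is an assumption introduced here: k is the kernel, K is the matrix with entries k(x_i, x_j) over the observed configurations, k_* is the vector of kernel values between the observed configurations and a new configuration x_*, y is the vector of observed scores, and sigma_n^2 is the observation noise variance.

```latex
\begin{align}
\mu(x_*) &= k_*^\top \left(K + \sigma_n^2 I\right)^{-1} y, \\
\sigma^2(x_*) &= k(x_*, x_*) - k_*^\top \left(K + \sigma_n^2 I\right)^{-1} k_* .
\end{align}
```

The second term of the variance shrinks as x_* moves away from the observed configurations, which is why unexplored regions show high uncertainty.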

Acquisition Functions

Acquisition functions determine where to sample next by balancing:

  • Exploitation: Sampling where the model predicts high value.
  • Exploration: Sampling where the model is uncertain.

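Common choices include Expected Improvement (EI), which scores a point by how much it is expected to beat the best result observed so far, and Upper Confidence Bound (UCB), which adds a multiple of the predicted standard deviation to the predicted mean.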
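
Both functions can be computed directly from the surrogate's mean and standard deviation at a candidate point. The following is a minimal sketch for a maximization problem, using only the standard library. The parameter names xi (an exploration margin for EI) and kappa (the exploration weight for UCB), along with their default values, are illustrative assumptions rather than settings taken from this chapter.

```python
import math

def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def normal_cdf(z):
    """Standard normal cumulative distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def expected_improvement(mean, std, best_so_far, xi=0.01):
    """Expected Improvement over best_so_far (maximization)."""
    improvement = mean - best_so_far - xi
    if std <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(improvement, 0.0)
    z = improvement / std
    return improvement * normal_cdf(z) + std * normal_pdf(z)

def upper_confidence_bound(mean, std, kappa=2.0):
    """Upper Confidence Bound: optimistic estimate of the score."""
    return mean + kappa * std

best_so_far = 0.75
# Two hypothetical candidates: (predicted mean, predicted std).
safe = (0.74, 0.01)    # close to the best score, well explored
risky = (0.70, 0.10)   # lower mean, but much more uncertain

ei_safe = expected_improvement(*safe, best_so_far)
ei_risky = expected_improvement(*risky, best_so_far)
ucb_safe = upper_confidence_bound(*safe)
ucb_risky = upper_confidence_bound(*risky)
```

In this example both criteria prefer the uncertain candidate even though its predicted mean is lower, which is the exploration side of the trade-off. Lowering kappa or xi shifts the balance toward exploitation.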

Practical Implementation

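A full Bayesian Optimization pipeline starts by defining the task and the metric. It then repeats three steps: fit the surrogate to the observations collected so far, select the next configuration with the acquisition function, and evaluate that configuration.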

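# Running the optimization loop
# (assumes self.X_observed and self.y_observed already hold a few initial observations)
for iteration in range(max_iterations):
    # Fit the surrogate to all observations so far
    self.surrogate_model.fit(self.X_observed, self.y_observed)

    # Select the next point using the acquisition function
    next_config = self._select_next_configuration()

    # Evaluate it and record the result for the next iteration
    score = self._evaluate_configuration(next_config)
    self._add_observation(next_config, score)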
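
To see the loop run end to end, the sketch below optimizes a single continuous parameter with a small hand-written GP and a UCB acquisition function, using only the standard library. Everything in it is an illustrative assumption: the evaluate function is a hypothetical stand-in for an expensive metric, and the kernel length scale, noise level, candidate grid, and kappa are arbitrary choices, not DSPy defaults.

```python
import math
import random

def kernel(a, b, length_scale=0.2):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_posterior(x_obs, y_obs, x, noise=1e-4):
    """Posterior mean and std at x, with scores centered on their average."""
    offset = sum(y_obs) / len(y_obs)
    K = [[kernel(a, b) + (noise if i == j else 0.0) for j, b in enumerate(x_obs)]
         for i, a in enumerate(x_obs)]
    k_star = [kernel(a, x) for a in x_obs]
    alpha = solve(K, [y - offset for y in y_obs])
    weights = solve(K, k_star)
    mean = offset + sum(k * a for k, a in zip(k_star, alpha))
    var = kernel(x, x) - sum(k * w for k, w in zip(k_star, weights))
    return mean, math.sqrt(max(var, 0.0))

def evaluate(temperature):
    """Hypothetical stand-in for an expensive metric; peaks at 0.3."""
    return 0.8 - (temperature - 0.3) ** 2

def optimize(max_iterations=10, kappa=2.0, seed=0):
    rng = random.Random(seed)
    candidates = [i / 50 for i in range(51)]   # grid over [0, 1]
    x_obs = rng.sample(candidates, 3)          # initial random observations
    y_obs = [evaluate(x) for x in x_obs]
    for _ in range(max_iterations):
        def ucb(x):
            # Refits the GP for every candidate: simple, but not efficient.
            mean, std = gp_posterior(x_obs, y_obs, x)
            return mean + kappa * std
        unseen = [x for x in candidates if x not in x_obs]
        next_x = max(unseen, key=ucb)          # select next point
        x_obs.append(next_x)                   # evaluate and record
        y_obs.append(evaluate(next_x))
    best = max(range(len(y_obs)), key=lambda i: y_obs[i])
    return x_obs[best], y_obs[best]

best_x, best_score = optimize()
```

The structure mirrors the loop above: the GP is refit on all observations, the candidate with the highest acquisition value is selected, and its score is appended to the observations. A real pipeline would replace evaluate with the task metric and would typically rely on an established GP library rather than a hand-written solver.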