Chapter 1 · Section 5

Language Models

DSPy works with various language model providers. Learn how to configure different LMs and choose the right model for your task.

~15 min read

🔧 Configuring Language Models

DSPy uses a consistent interface for all language models:

import dspy

# Create an LM instance
lm = dspy.LM(model="provider/model-name", api_key="your-key")

# Set it as the default
dspy.configure(lm=lm)

Once configured, all DSPy modules will use this LM automatically.
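
For example, assuming the LM configured above, any module will run against it with no extra wiring (the signature string here is just an illustration):

# Uses the default LM set via dspy.configure
qa = dspy.Predict("question -> answer")
result = qa(question="What is DSPy?")
print(result.answer)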

๐ŸŒ Supported Providers

💳 OpenAI

Models available:

  • gpt-4o - Capable flagship model
  • gpt-5-mini - Fast, cost-effective
  • gpt-4-turbo - Previous-generation flagship

lm = dspy.LM(
    model="openai/gpt-5-mini",
    api_key="sk-your-key-here",
    temperature=0.7,
    max_tokens=500
)
dspy.configure(lm=lm)

Best for: General-purpose tasks, proven reliability

🤖 Anthropic (Claude)

Models available:

  • claude-4-5-sonnet-20241022 - Latest, most capable
  • claude-4-5-haiku-20241022 - Fast, economical
  • claude-4-5-opus-20240229 - Maximum capability

lm = dspy.LM(
    model="anthropic/claude-4-5-sonnet-20241022",
    api_key="your-anthropic-key",
    temperature=0.7,
    max_tokens=1000
)
dspy.configure(lm=lm)

Best for: Long contexts, detailed analysis, coding

🏠 Local Models (Ollama)

Models available:

  • llama3, llama3.1 - Meta's open models
  • mistral, mixtral - Mistral AI models
  • phi3 - Microsoft's small model

# No API key needed - assumes an Ollama server is running locally
# and the model has been pulled (e.g. `ollama pull llama3`)
lm = dspy.LM(
    model="ollama/llama3",
    api_base="http://localhost:11434"
)
dspy.configure(lm=lm)

Best for: Privacy, no API costs, experimentation

🌡️ Temperature Guide

Temperature controls output randomness:

Value       Behavior                 Use Case
0.0 - 0.3   Deterministic, focused   Classification, extraction
0.4 - 0.8   Balanced                 General Q&A, summaries
0.9 - 1.5   Creative, diverse        Creative writing, brainstorming
1.6 - 2.0   Very random              Experimental, exploration

# For factual tasks - low temperature
factual_lm = dspy.LM(model="openai/gpt-5-mini", temperature=0.1)

# For creative tasks - higher temperature
creative_lm = dspy.LM(model="openai/gpt-5-mini", temperature=1.2)

🔀 Using Multiple Models

You can use different models for different tasks:

import dspy

# Fast model for simple tasks
fast_lm = dspy.LM(model="openai/gpt-5-mini")

# Powerful model for complex tasks
smart_lm = dspy.LM(model="openai/gpt-4o")

class Pipeline(dspy.Module):
    def __init__(self):
        super().__init__()
        self.classify = dspy.Predict("text -> category")
        self.analyze = dspy.ChainOfThought("text, category -> analysis")

    def forward(self, text):
        # Use fast model for classification
        with dspy.context(lm=fast_lm):
            category = self.classify(text=text).category

        # Use smart model for complex analysis
        with dspy.context(lm=smart_lm):
            analysis = self.analyze(text=text, category=category).analysis

        return analysis
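
Calling the pipeline is like calling any other DSPy module; a quick illustrative invocation (the sample text is made up):

# Instantiate once, then call like a function (this invokes forward)
pipeline = Pipeline()
analysis = pipeline(text="Quarterly revenue grew 12% despite supply-chain delays.")
print(analysis)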

📊 Model Selection Guide

By Task Type

Classification / Extraction   gpt-5-mini or claude-4-5-haiku
Question Answering            gpt-5-mini or claude-4-5-sonnet
Complex Reasoning             gpt-4o or claude-4-5-sonnet
Long Context                  grok-4-1-fast (2M tokens), gemini-3-pro-preview (1M tokens)
Code Generation               gpt-4o or claude-4-5-sonnet

By Budget

Free / Low Cost       Ollama (local), gpt-5-mini
Balanced              gpt-4o, claude-4-5-sonnet
Maximum Capability    gemini-3-pro-preview, claude-4-5-opus
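
If you want this guide in code form, a small lookup table works; the mapping below mirrors the tables above, while the helper name and structure are purely illustrative:

import dspy

# Illustrative sketch: pick an LM per task type (models taken from the tables above)
MODEL_BY_TASK = {
    "classification": dspy.LM(model="openai/gpt-5-mini"),
    "reasoning": dspy.LM(model="openai/gpt-4o"),
    "coding": dspy.LM(model="anthropic/claude-4-5-sonnet-20241022"),
}

def lm_for(task: str) -> dspy.LM:
    # Unknown task types fall back to the cheap model
    return MODEL_BY_TASK.get(task, MODEL_BY_TASK["classification"])

# Scope the choice to a single call with dspy.context
with dspy.context(lm=lm_for("reasoning")):
    answer = dspy.ChainOfThought("question -> answer")(question="Why is the sky blue?")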

✅ Best Practices

📈 Start Small, Scale Up

# Development: Use small, fast models
dev_lm = dspy.LM(model="openai/gpt-5-mini")

# Production: Upgrade when needed
prod_lm = dspy.LM(model="openai/gpt-4o")

# Easy to switch (IS_DEVELOPMENT is a flag you define, e.g. from an env var)
lm = dev_lm if IS_DEVELOPMENT else prod_lm
dspy.configure(lm=lm)

🔐 Use Environment Variables

import os
from dotenv import load_dotenv

load_dotenv()

# Never hardcode API keys!
lm = dspy.LM(
    model="openai/gpt-5-mini",
    api_key=os.getenv("OPENAI_API_KEY")
)
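
The .env file itself stays out of version control; a minimal example might look like this (the key value is a placeholder):

# .env - add this file to .gitignore
OPENAI_API_KEY=sk-your-key-here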
โฑ๏ธ

Set Appropriate Timeouts

# Default timeout might be too short for complex tasks
lm = dspy.LM(
    model="openai/gpt-4o",
    timeout=60  # 60 seconds for complex reasoning
)

💾 Enable Caching in Development

# DSPy caches LM calls; caching is controlled per-LM and is on by default
lm = dspy.LM(model="openai/gpt-5-mini", cache=True)
dspy.configure(lm=lm)

# Repeated identical calls are served from the cache - faster dev runs, lower costs
📋 Quick Reference

OpenAI           lm = dspy.LM(model="openai/gpt-5-mini", api_key=key)
Anthropic        lm = dspy.LM(model="anthropic/claude-4-5-sonnet-20241022", api_key=key)
Ollama (local)   lm = dspy.LM(model="ollama/llama3", api_base="http://localhost:11434")
Switch models    with dspy.context(lm=different_lm): result = module(input=data)

📝 Summary

Key Concepts:

  • DSPy supports multiple LM providers (OpenAI, Anthropic, local, etc.)
  • Configure once with dspy.configure(lm=...)
  • Use dspy.context() to temporarily switch models
  • Choose models based on task complexity and budget
  • Start with smaller models, scale up as needed

Best Practices:

  • Use environment variables for API keys
  • Set appropriate timeouts and token limits
  • Enable caching during development
  • Choose the right model for each task