1. Building a Mini-RAG System
Objective: Create a simplified version of the enterprise RAG system for a personal knowledge base.
Requirements:
- Document ingestion from PDF files
- Vector storage using ChromaDB
- Retrieval and answer generation
- Basic citation support
Python
# Step 3: Answer Generation with DSPy
from typing import Dict, List

import dspy


class RAGAnswerSignature(dspy.Signature):
    """Generate answer from retrieved context."""

    context = dspy.InputField(desc="Retrieved document chunks")
    question = dspy.InputField(desc="User question")
    answer = dspy.OutputField(desc="Answer based on context")
    sources = dspy.OutputField(desc="Source information")


class RAGAnswerer(dspy.Module):
    def __init__(self):
        super().__init__()
        self.generate = dspy.Predict(RAGAnswerSignature)

    def forward(self, question: str, retrieved_docs: List[Dict]):
        # Join the retrieved chunks into a single context string.
        context = "\n\n".join(doc["content"] for doc in retrieved_docs)
        return self.generate(context=context, question=question)
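For the "basic citation support" requirement, one option is to number each retrieved chunk before it goes into the context, so the model can refer to sources by index. The helper below is a minimal sketch, not part of the exercise code: the function name build_cited_context and the optional 'source' key on each document are assumptions made for illustration, and it uses only the standard library.

```python
from typing import Dict, List, Tuple


def build_cited_context(retrieved_docs: List[Dict]) -> Tuple[str, List[str]]:
    """Number each chunk and return (context string, list of source labels).

    Assumes each doc has a 'content' key and, optionally, a 'source' key.
    The 'source' key is hypothetical; adapt it to whatever metadata your
    ingestion step actually stores.
    """
    blocks = []
    sources = []
    for i, doc in enumerate(retrieved_docs, start=1):
        source = doc.get("source", "unknown")
        blocks.append(f"[{i}] {doc['content']}")
        sources.append(f"[{i}] {source}")
    return "\n\n".join(blocks), sources


# Example input: the second doc has no 'source', so it falls back to "unknown".
docs = [
    {"content": "DSPy modules compose predictors.", "source": "notes.pdf, p. 2"},
    {"content": "ChromaDB stores embeddings."},
]
context, sources = build_cited_context(docs)
```

The numbered context string could then be passed to the predictor in place of the plain join in forward, and the sources list shown alongside the generated answer.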
2. STORM Writing Assistant Implementation
Objective: Build a simplified version of the STORM writing assistant for generating articles.
Requirements:
- Multi-perspective research simulation
- Outline generation from research
- Section-by-section content generation
- Basic citation integration
Python
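import dspy


class ContentGenerator(dspy.Module):
    def __init__(self):
        super().__init__()
        # Drafts one section from its title, the research notes, and a target length.
        self.generate_content = dspy.Predict(
            "section_title, research_data, word_count -> content"
        )
        # Second pass: attaches citations drawn from the same research notes.
        self.add_citations = dspy.Predict(
            "content, research_data -> cited_content"
        )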
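The excerpt declares the two predictors but does not show how they are chained. One plausible flow for "section-by-section content generation" is to draft each outline section and then run the citation pass over the draft. The sketch below illustrates that flow without DSPy so it runs standalone; generate_article, the even word-budget split, and the two stand-in functions are all assumptions for illustration, not the exercise's reference solution.

```python
from typing import Callable, Dict, List


def generate_article(
    outline: List[str],
    research_data: str,
    draft_fn: Callable[[str, str, int], str],
    cite_fn: Callable[[str, str], str],
    total_words: int = 900,
) -> Dict[str, str]:
    """Draft each outline section, then run a citation pass over the draft."""
    if not outline:
        return {}
    # Assumption: split the word budget evenly across sections.
    words_per_section = total_words // len(outline)
    article = {}
    for title in outline:
        draft = draft_fn(title, research_data, words_per_section)
        article[title] = cite_fn(draft, research_data)
    return article


# Stand-ins for the two predictors so the sketch runs without a language model.
def fake_draft(title: str, research: str, word_count: int) -> str:
    return f"{title}: about {word_count} words."


def fake_cite(content: str, research: str) -> str:
    return f"{content} [1]"


article = generate_article(
    ["Background", "Methods", "Results"], "notes", fake_draft, fake_cite
)
```

Inside ContentGenerator, draft_fn and cite_fn would presumably wrap self.generate_content and self.add_citations, reading the content and cited_content fields from the returned predictions.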