DSPy Pipelines: Wiring Steps Without Writing Prompts

In the first post, we got everything set up to start exploring DSPy. Here we’ll continue that journey by looking at the idea of pipelines.

In that first post, we established the foundation: a Signature declares what a task needs and produces, a predictor compiles that declaration into a prompt, and swapping the predictor strategy, literally just one word, generates a structurally different prompt without touching the Signature at all. The declaration and the strategy are decoupled.

That foundation handled a single step. One question in, one answer out. But most useful LLM pipelines aren’t single-step. You might need to expand a question into a detailed answer, then compress that answer into something concise. Or retrieve relevant context, then reason over it. Or validate an intermediate result before passing it forward.

This post introduces DSPy’s pipeline model: what it looks like when a single module owns multiple predictors, how data flows between steps, and what DSPy compiles when two Signatures are involved instead of one.

What Changes Architecturally

In the previous scripts, each module had exactly one predictor. The forward() method called it once and returned the result. That’s the degenerate case of a pipeline: a pipeline of length one.

DSPy’s module model doesn’t impose any limit on this. A single dspy.Module can own as many predictors as the task requires, and the forward() method defines how data flows between them. Each predictor is compiled against its own Signature independently, which means each step in the pipeline gets its own compiled prompt. One forward() call can trigger multiple LLM calls in sequence.

The other thing that changes is Signature scope. In a single-step program, your Signature describes the whole task. In a pipeline, each Signature describes only its local contract: what it receives from the previous step and what it hands to the next. A step doesn’t need to know about anything upstream or downstream from it. This is the same principle as function composition in ordinary programming: each function sees only its own inputs and outputs, and the caller is responsible for wiring them together.
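As a rough analogy in plain Python, with no DSPy involved, the local-contract idea might be sketched like this. The function names and bodies are purely illustrative stand-ins for LLM-backed steps:

```python
def expand(question: str) -> str:
    """Step one: sees only the question."""
    return f"Short answer to '{question}'. Followed by a longer elaboration."


def summarize(detailed_answer: str) -> str:
    """Step two: sees only step one's output, never the original question."""
    first_sentence = detailed_answer.split(".")[0]
    return first_sentence + "."


def pipeline(question: str) -> str:
    """The caller owns the wiring between the two steps."""
    detailed = expand(question)
    return summarize(detailed)


print(pipeline("why"))
```

Neither step knows the other exists. Only pipeline() knows the order and which output feeds which input, which is the role forward() plays in a DSPy module.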

The Script

The third script, dspy3.py, implements a two-step pipeline. Step one takes a question and produces a detailed answer. Step two takes that detailed answer and distills it to a single concise sentence. The same default question as before goes in at the top; only step one sees it directly.

Run it the same way as the scripts in the previous post. Either:

  python dspy3.py

Or, if you don’t want the default question, provide your own:

  python dspy3.py "What year did the Berlin Wall fall?"

One thing to notice before you run it: inspect_history is called with n=2 this time, not n=1. That’s because the pipeline makes two LLM calls, one per step, and you want to see both compiled prompts.

What the Output Tells You

The Prediction Object

=== Prediction ===
Prediction(
    reasoning="The detailed answer provides a specific and well-known piece of information from Douglas Adams' *Hitchhiker's Guide to the Galaxy*. The core task is to provide a succinct summary of this answer.",
    summary="The ultimate answer to life, the universe, and everything, according to *Hitchhiker's Guide to the Galaxy*, is 42."
)

The Prediction object carries reasoning and summary. These are the fields from SummarySignature, the final step. The detailed_answer that step one produced is not here. It was consumed internally by the pipeline’s forward() method when it passed expanded.detailed_answer into step two. Intermediate results don’t accumulate on the final Prediction; only the last step’s output fields are returned.

This is deliberate. The pipeline’s caller asked for a summary. It doesn’t need the intermediate expansion; that was a means to an end. If you needed it, you could return it explicitly from forward(), but returning only the final step’s Prediction keeps the interface clean.

The First Compiled Prompt

System message:

Your input fields are:
1. `question` (str):
Your output fields are:
1. `reasoning` (str):
2. `detailed_answer` (str):

...

User message:

[[ ## question ## ]]
What is the answer to life, the universe and everything?

Respond with the corresponding output fields, starting with the field
`[[ ## reasoning ## ]]`, then `[[ ## detailed_answer ## ]]`, and then
ending with the marker for `[[ ## completed ## ]]`.

This is the compiled prompt for DetailedAnswerSignature. It looks structurally familiar from the previous post: ChainOfThought injected the reasoning field again, ahead of the declared output field. The objective is to produce detailed_answer from question. This is step one’s local contract, compiled in full.

The Second Compiled Prompt

System message:

Your input fields are:
1. `detailed_answer` (str):
Your output fields are:
1. `reasoning` (str):
2. `summary` (str): A single concise sentence.

...

User message:

[[ ## detailed_answer ## ]]
According to the *Hitchhiker's Guide to the Galaxy*, the answer to the ultimate
question of life, the universe, and everything is 42.

Respond with the corresponding output fields, starting with the field
`[[ ## reasoning ## ]]`, then `[[ ## summary ## ]]`, and then ending with
the marker for `[[ ## completed ## ]]`.

Several things are worth comparing against the first prompt.

The input field is detailed_answer, not question. This prompt has no knowledge of the original question; it only sees what step one produced. That’s the local contract in action. SummarySignature declared its own scope, and DSPy compiled a prompt that reflects exactly that scope.

The user message contains the actual text of step one’s response, slotted directly into the [[ ## detailed_answer ## ]] section. This is the data flow made visible: the output of the first LLM call became the literal input of the second, passed through as a plain string via expanded.detailed_answer. No manual formatting. No string stitching. The Prediction object’s attribute access handled the handoff.

And the summary field in the system message reads: summary (str): A single concise sentence. That trailing description wasn’t in the previous scripts. It’s the first appearance of something worth isolating.

Field Descriptions: When Declarations Carry Intent

In the previous scripts, Signature fields declared a name and a type. That was enough for DSPy to compile a working prompt. But names and types don’t always convey what you actually want from a field. summary (str) tells DSPy what to call the field and what type to expect. It says nothing about what a good summary looks like.

The desc parameter on OutputField changes that:

  summary: str = dspy.OutputField(desc="A single concise sentence.")

Look at what DSPy compiled into the system message: summary (str): A single concise sentence. The description was promoted directly into the prompt schema, alongside the field name and type. It’s not a comment. It’s not documentation that disappears at runtime. It’s part of the compiled artifact that the model sees.

This is the first sign that Signatures can carry intent, not just structure. A field description is a constraint expressed inside the declaration, compiled into the prompt automatically. You don’t write “please keep this to one sentence” somewhere in a prompt string you’re maintaining by hand. Instead, you declare it where the field is defined, and DSPy handles the rest.

That distinction matters more as pipelines grow. When you have multiple steps, each with their own fields and constraints, keeping intent co-located with structure, rather than scattered across prompt strings, is what keeps things maintainable. And as we’ll see in the next post, it becomes load-bearing when retrieval enters the picture.

What Two Prompts From One Call Reveal

The inspect_history(n=2) output is worth pausing on as a whole, not just step by step. You made one call, in this case pipeline(question=q), and got back two structurally different compiled prompts, each reflecting a different Signature, each with its own field set and objective statement.

This is what it means for DSPy to treat a pipeline as a program. The forward() method is the program logic: it defines the execution order and the data flow. DSPy’s job is to compile the prompt for each step independently and handle the parsing at each boundary. In essence, you write the wiring; DSPy writes the prompts.

As pipelines grow more complex, this separation becomes increasingly valuable. You can add a step, change a Signature, swap a predictor strategy on one step without touching the others, and inspect_history will show you exactly what each change compiled into, step by step.

Where This Goes Next

The pipeline model opens up a broad space of possibilities, but one is immediately practical: retrieval-augmented generation, or RAG. We certainly looked at this quite a bit in my AI and Testing series. The pattern is straightforward in DSPy terms: a retrieval step produces context, and a generation step reasons over it. But it introduces new Signature design questions, new field types, and the first real test of whether the declaration-first approach holds up when the inputs are dynamic and external.

That’s the focus of the next post. We’ll build a RAG pipeline in DSPy, walk through what gets compiled at each step, and look at how field descriptions become genuinely load-bearing when the model needs to know not just what a field is called, but what to do with what’s in it.

Stories from a Software Tester
