Understanding the execution model helps you predict what the library does and does not do for you.
Three layers compose cleanly: Lutum at the boundary, turn builders as stateless facades, and
Session as an explicit transcript accumulator.
Three Layers
| Layer | Type | Role |
|---|---|---|
| Execution boundary | Lutum | Validates ModelInput, reserves budget via OwnedLease, emits tracing, reduces the event stream |
| Turn builder | TextTurn / StructuredTurn<O> | Stateless builder that configures a single turn and produces a stream or collected result |
| Transcript accumulator | Session | Orders ModelInputItem values and gates writes behind explicit commit_* calls |
Nothing is hidden. Session is a convenience wrapper that adds user/assistant/tool messages to a
ModelInput; it is not a runtime, an event loop, or an approval gate.
Lutum — The Execution Boundary
Lutum is the only execution boundary in the library. When you call .collect() or .stream() on a turn builder, Lutum:
- Validates ModelInput (role ordering, required fields)
- Reserves a budget lease via RequestBudget
- Emits a tracing llm_turn span with request metadata
- Passes the request to the TurnAdapter
- Reduces the returned event stream into typed results
- Finalises the lease with actual usage on completion
Adapters are public because providers need an SPI boundary, but calling an adapter directly bypasses
the execution contract. Prefer Lutum::new(adapter, budget) over direct adapter use.
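To make the reserve and finalise steps in the list above concrete, here is a minimal, self-contained sketch of a lease lifecycle. It is a toy model, not lutum's implementation: `Budget`, `Lease`, `BudgetError`, and the token-counting policy are hypothetical names and assumptions chosen for illustration.

```rust
// Hypothetical toy model of a budget lease; these are NOT the lutum types.

#[derive(Debug, PartialEq)]
pub enum BudgetError {
    Exhausted { requested: u64, available: u64 },
}

/// Tracks the tokens that remain and the tokens currently held by leases.
#[derive(Debug)]
pub struct Budget {
    remaining: u64,
    reserved: u64,
}

/// A reservation that is later settled with the real usage.
#[derive(Debug)]
pub struct Lease {
    amount: u64,
}

impl Budget {
    pub fn new(total: u64) -> Self {
        Budget { remaining: total, reserved: 0 }
    }

    /// Tokens that are neither spent nor held by an open lease.
    pub fn available(&self) -> u64 {
        self.remaining.saturating_sub(self.reserved)
    }

    /// Reserve an estimate before the request is sent.
    pub fn reserve(&mut self, estimate: u64) -> Result<Lease, BudgetError> {
        let available = self.available();
        if estimate > available {
            return Err(BudgetError::Exhausted { requested: estimate, available });
        }
        self.reserved += estimate;
        Ok(Lease { amount: estimate })
    }

    /// Release the reservation and charge what was actually used.
    pub fn finalise(&mut self, lease: Lease, actual: u64) {
        self.reserved -= lease.amount;
        self.remaining = self.remaining.saturating_sub(actual);
    }
}
```

In this sketch the estimate is held while the request is in flight, and only the actual usage is charged once the turn completes.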
Mixing Adapters with from_parts
When you need a completion adapter (for structured_completion) or a usage recovery adapter
(for OpenRouter), wire them explicitly:
use lutum::Lutum;
// turn adapter + completion adapter + usage recovery + budget
let llm = Lutum::from_parts(turn_adapter, completion_adapter, recovery_adapter, budget);
The type-erased turn adapter (dyn TurnAdapter), completion adapter (dyn CompletionAdapter), and
usage recovery adapter (dyn UsageRecoveryAdapter) are each injected separately. This lets you mix,
e.g., a Claude turn adapter with an OpenRouter usage recovery adapter.
Streaming Pipeline
Lutum::text_turn(input)
→ TurnAdapter::stream(request) [provider SSE → LlmEvent]
→ TextTurnReducer [folds events into TextTurnState]
→ TextTurnResult [sealed when Completed event arrives]
The reducer accumulates text deltas, tool call deltas, usage, finish reason, and — critically —
stores the CommittedTurn emitted inside the Completed event. Adapters never reduce; they
translate provider wire format into LlmEvent.
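The fold that the reducer performs can be pictured with a small, self-contained sketch. The `Event` and `TurnState` types below are simplified stand-ins invented for this example; the real LlmEvent and TextTurnState carry more (tool call deltas, the CommittedTurn), and their exact shapes are not shown on this page.

```rust
// Hypothetical, simplified stand-ins for LlmEvent and TextTurnState.

#[derive(Debug, Clone, PartialEq)]
pub enum Event {
    TextDelta(String),
    Usage { input_tokens: u32, output_tokens: u32 },
    Completed { finish_reason: String },
}

#[derive(Debug, Default, PartialEq)]
pub struct TurnState {
    pub text: String,
    pub input_tokens: u32,
    pub output_tokens: u32,
    pub finish_reason: Option<String>,
}

impl TurnState {
    /// Fold one event into the accumulated state.
    pub fn apply(&mut self, event: Event) {
        match event {
            Event::TextDelta(delta) => self.text.push_str(&delta),
            Event::Usage { input_tokens, output_tokens } => {
                self.input_tokens = input_tokens;
                self.output_tokens = output_tokens;
            }
            Event::Completed { finish_reason } => {
                self.finish_reason = Some(finish_reason);
            }
        }
    }
}

/// Reduce a whole event sequence. Returns None when no Completed event
/// arrived, mirroring the idea that a result is sealed only on completion.
pub fn reduce(events: impl IntoIterator<Item = Event>) -> Option<TurnState> {
    let mut state = TurnState::default();
    for event in events {
        state.apply(event);
    }
    state.finish_reason.is_some().then_some(state)
}
```

Keeping the fold in one place is what lets adapters stay pure translators: in this model they only produce `Event` values and never touch `TurnState`.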
You can consume the stream directly:
use futures::StreamExt as _;
use lutum::TextTurnEvent;

let mut stream = session.text_turn(&llm).stream().await?;
while let Some(event) = stream.next().await {
    match event? {
        // Print text as it arrives.
        TextTurnEvent::TextDelta(delta) => print!("{delta}"),
        // The transcript changes only through this explicit commit.
        TextTurnEvent::Completed(result) => {
            session.commit_text(result);
        }
        // Ignore every other event kind in this example.
        _ => {}
    }
}

Or collect the whole result at once with .collect().await?. The streaming path is always
primary; collection is a thin wrapper around it.
Transcript Model
Lutum stores committed turns as exact, adapter-owned values — not as a canonical IR. This preserves provider-specific fields (thinking blocks, cache control, reasoning traces) for lossless same-adapter replay without any normalization step at commit time.
CommittedTurn
CommittedTurn = Arc<dyn TurnView + Send + Sync>
Each adapter (Claude, OpenAI) produces its own committed turn type
(ClaudeCommittedTurn, OpenAiCommittedTurn) and implements TurnView + ItemView over it.
When a committed turn is added back into a ModelInput as ModelInputItem::Turn(committed_turn),
the adapter:
- Tries committed_turn.as_any().downcast_ref::<ClaudeCommittedTurn>() for exact replay
- Falls back to TurnView/ItemView projection for cross-adapter turns
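The downcast-then-fallback pattern described above can be sketched with the standard library alone. Everything here is a simplified assumption: the `TurnView` trait is reduced to two methods, and `ClaudeTurn`, `OtherTurn`, `Replay`, and `replay_for_claude` are hypothetical names, not lutum's types or functions.

```rust
use std::any::Any;
use std::sync::Arc;

// Hypothetical, reduced version of a view trait with an `as_any` escape hatch.
pub trait TurnView: Send + Sync {
    fn as_any(&self) -> &dyn Any;
    /// Portable projection that any adapter can read.
    fn text(&self) -> String;
}

/// Stand-in for an adapter-owned turn that keeps provider-specific blocks.
pub struct ClaudeTurn {
    pub raw_blocks: Vec<String>,
}

/// Stand-in for a turn produced by some other adapter.
pub struct OtherTurn {
    pub text: String,
}

impl TurnView for ClaudeTurn {
    fn as_any(&self) -> &dyn Any { self }
    fn text(&self) -> String { self.raw_blocks.concat() }
}

impl TurnView for OtherTurn {
    fn as_any(&self) -> &dyn Any { self }
    fn text(&self) -> String { self.text.clone() }
}

pub type CommittedTurn = Arc<dyn TurnView>;

#[derive(Debug, PartialEq)]
pub enum Replay {
    /// Same adapter: the exact provider-specific data is reused.
    Exact(Vec<String>),
    /// Different adapter: only the portable projection is available.
    Projected(String),
}

pub fn replay_for_claude(turn: &CommittedTurn) -> Replay {
    match turn.as_any().downcast_ref::<ClaudeTurn>() {
        Some(own) => Replay::Exact(own.raw_blocks.clone()),
        None => Replay::Projected(turn.text()),
    }
}
```

The downcast succeeds only for the adapter's own concrete type, so provider-specific fields survive same-adapter replay, while turns from elsewhere degrade to whatever the view exposes.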
View Traits
Views are read-only, fine-grained accessors over committed turns:
| Trait | Provides |
|---|---|
| TurnView | role(), iteration over items via ItemView |
| ItemView | as_text(), as_tool_call(), as_reasoning(), as_refusal() |
Views exist for inspection and user-driven reduction into application-specific models. They are not the mechanism for same-adapter replay, which always goes through the adapter with exact data; view projection is only the fallback for cross-adapter turns.
If you want a portable application-specific IR (e.g. to store turns in a database), derive it
from TurnView/ItemView in your own code.
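As a rough illustration of deriving such an IR, the sketch below projects view accessors into a plain, storable enum. The accessor signatures are assumptions: this page lists as_text() and as_tool_call() but not their return types, so the `ItemView` trait here is a hypothetical reduction, and `StoredItem`, `ToyItem`, and `to_stored` are invented for the example.

```rust
// Hypothetical reduction of an item view; return types are assumptions.
pub trait ItemView {
    fn as_text(&self) -> Option<&str>;
    /// Assumed shape: (tool name, JSON-encoded arguments).
    fn as_tool_call(&self) -> Option<(&str, &str)>;
}

/// Application-owned IR: plain data that is easy to persist.
#[derive(Debug, Clone, PartialEq)]
pub enum StoredItem {
    Text(String),
    ToolCall { name: String, arguments: String },
    /// Anything the application chooses not to model.
    Unknown,
}

/// Project one item through the view accessors into the portable form.
pub fn to_stored(item: &dyn ItemView) -> StoredItem {
    if let Some(text) = item.as_text() {
        StoredItem::Text(text.to_owned())
    } else if let Some((name, arguments)) = item.as_tool_call() {
        StoredItem::ToolCall {
            name: name.to_owned(),
            arguments: arguments.to_owned(),
        }
    } else {
        StoredItem::Unknown
    }
}

/// Toy item type standing in for an adapter-owned item.
pub enum ToyItem {
    Text(String),
    Call { name: String, args: String },
    Reasoning(String),
}

impl ItemView for ToyItem {
    fn as_text(&self) -> Option<&str> {
        match self {
            ToyItem::Text(t) => Some(t.as_str()),
            _ => None,
        }
    }

    fn as_tool_call(&self) -> Option<(&str, &str)> {
        match self {
            ToyItem::Call { name, args } => Some((name.as_str(), args.as_str())),
            _ => None,
        }
    }
}
```

The projection is deliberately lossy: the reasoning item maps to `Unknown`, which is acceptable for a stored copy because replay is expected to use the adapter-owned turn, not this IR.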
ModelInput and ModelInputItem
ModelInput is the canonical request algebra — a flat ordered list of ModelInputItem values:
| Variant | Content |
|---|---|
| Message { role, content } | User, system, or developer messages |
| Assistant(AssistantInputItem) | Manually constructed assistant message items |
| ToolResult(ToolResult) | Tool results following a tool-call turn |
| Turn(CommittedTurn) | An exact adapter-owned committed turn at its natural position |
The Turn variant is what makes context engineering precise: committed turns and new messages live
in the same ordered list. [user_1, Turn_0, user_2, Turn_1] is representable by construction.
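A reduced model shows why the interleaving is representable by construction. The `Item`, `Role`, `TurnView`, and `ToyTurn` types below are hypothetical stand-ins with only two of the four variants; they mirror the shape of the table above and are not lutum's definitions.

```rust
use std::sync::Arc;

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Role {
    System,
    Developer,
    User,
}

/// Hypothetical, minimal view trait for the sketch.
pub trait TurnView: Send + Sync {
    fn label(&self) -> String;
}

/// Toy committed turn identified by its index.
pub struct ToyTurn(pub u32);

impl TurnView for ToyTurn {
    fn label(&self) -> String {
        format!("Turn_{}", self.0)
    }
}

/// Messages and committed turns are variants of one enum, so a single
/// Vec holds them in any order.
pub enum Item {
    Message { role: Role, content: String },
    Turn(Arc<dyn TurnView>),
}

/// Render the list in order, one label per item.
pub fn describe(items: &[Item]) -> Vec<String> {
    items
        .iter()
        .map(|item| match item {
            Item::Message { role, content } => format!("{role:?}:{content}"),
            Item::Turn(turn) => turn.label(),
        })
        .collect()
}
```

Because position in the list is the only ordering mechanism in this model, reordering or dropping context is an ordinary `Vec` operation.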
ModelInput and ModelInputItem do not implement Serialize/Deserialize. The canonical
request algebra is an execution concern, not a persistence concern.