lash api · basics · lash docs

Runtime Alongside The Model

A turn is committed as one RuntimeCommit against a durable SessionGraph, atomically for the whole turn. The two execution modes, standard (native tool calls) and rlm (Lashlang in a no-syscall VM), share the same commit and trace shape.

Durable session graph: Conversation records, tool events, protocol events, and plugin nodes live in one SessionGraph per session. SessionReadView and ChronologicalProjection read it; persistence writes stay behind the runtime store.
Per-turn atomic commit: RuntimeCommit writes graph delta, checkpoint blobs, usage deltas, and head revision in one atomic store commit fenced by the session execution lease, with optimistic CAS on expected_head_revision as the stale-writer backstop. Partial turn = no commit.
Typed plugin capabilities: Tools see a named ToolContext surface at execute (processes(), sessions(), direct_completions(), attachments) plus tool_catalog() and snapshot reads on the ToolPrepareContext at prepare time, not a general host handle. The surface exposes no ToolContext::host() escape hatch.
Sandboxed code load, ready: RLM mode runs model-emitted Lashlang programs in a VM with no direct filesystem, OS-process, or network surface. Every effect crosses Lashlang's ExecutionHost, and the linked host surface decides which resource and background-process abilities exist. Use it when the model should compose multiple tool calls per turn instead of one.
Subagent capability boundaries: Capability::build_session_request(ctx) resolves a constrained SessionCreateRequest for the child against the parent policy. Hosts hide interactive-only tools from child surfaces through the factory's hidden-tool set (with_hidden_tools); at the maximum spawn depth the runtime hides spawn_agent itself.
Semantic event stream: Identity-bearing TurnActivity items: assistant prose deltas, reasoning deltas, tool start/complete pairs with correlation ids, code-block start/complete, terminal FinalValue/ToolValue, per-call and rolling usage. Assistant prose deltas are live previews; the settled transcript comes from TurnResult / SessionReadView after the runtime commit.
Session observation: ObservableSession::current_observation() returns the current SessionReadView plus an opaque SessionCursor. Persistent browser or worker transports should use subscribe_and_recover, which is a futures_util::Stream that replays buffered SessionObservationEvents, yields recoverable gap items, and resubscribes from gap.latest_cursor internally. resume_from_cursor is the one-shot poll alternative; subscribe_from_cursor is the lower-level primitive for hosts that want to handle resubscribe loops themselves. See Streaming and reconnect for browser folding and gap recovery.
Tracing as a first-class sink: Attach TraceSink implementations for structured runtime records. JSONL tracing and the HTML viewer cover turns, tools, LLMs, prompts, stream deltas, and usage; OpenTelemetry export is optional. Lashlang execution graphs use a separate opt-in sink so foreground blocks, durable process runs, and subagent child work can be observed without mutating process-registry state. Use LashCore::processes() for host process commands and TraceLashlangGraphStore for optional trace-derived graph snapshots.
Snapshot and restore seams: Plugins, tool state, and the current AgentFrame's Lashlang VM each persist through versioned snapshot writers so a parked session resumes intact across process restarts.
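
The per-turn commit rule above (apply everything or nothing, and reject a writer whose expected_head_revision is stale) can be pictured with a small in-memory model. This is only a sketch of the compare-and-swap idea, not lash code: ToyStore, StagedCommit, and CommitError are hypothetical names, and the real store also involves the session execution lease and checkpoint blobs, which are left out here.

```rust
/// Hypothetical stand-in for one turn's staged writes; not a lash type.
#[derive(Debug, Clone, PartialEq)]
pub struct StagedCommit {
    pub expected_head_revision: u64,
    pub graph_delta: Vec<String>,
    pub usage_delta: u64,
}

#[derive(Debug, PartialEq)]
pub enum CommitError {
    /// Another writer advanced the head since this turn started.
    StaleHead { expected: u64, actual: u64 },
}

/// Toy in-memory store: graph nodes, usage, and head revision move together.
#[derive(Debug, Default)]
pub struct ToyStore {
    pub head_revision: u64,
    pub nodes: Vec<String>,
    pub total_usage: u64,
}

impl ToyStore {
    /// Apply the whole commit or nothing: the revision check runs before any write.
    pub fn commit(&mut self, staged: StagedCommit) -> Result<u64, CommitError> {
        if staged.expected_head_revision != self.head_revision {
            return Err(CommitError::StaleHead {
                expected: staged.expected_head_revision,
                actual: self.head_revision,
            });
        }
        self.nodes.extend(staged.graph_delta);
        self.total_usage += staged.usage_delta;
        self.head_revision += 1;
        Ok(self.head_revision)
    }
}
```

In this model a rejected commit leaves nodes, usage, and head revision exactly as they were, which is the "partial turn = no commit" property in miniature.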

Out of scope: tenancy, artifact lifecycle, procedure discovery, cross-session shared state. Those belong to the embedder.

Session Facade Abstractions

The public lash API is session-centered. Hosts keep one stable LashSession handle per app conversation or task; the runtime may switch the active internal AgentFrame underneath it without changing that host handle.

Host-facing handles

LashCore: Shared runtime factory for provider, model, mode presets, plugin stack, tool providers, effect host, runtime stores, tracing, and durable policies. It is cheap to clone and does not represent a conversation.
SessionBuilder: Short-lived opener for one session id. It resolves SessionSpec into a concrete SessionPolicy, chooses the mode, attaches or creates the store, reloads persisted state when present, and then yields a LashSession.
LashSession: The stable host-owned work identity. Use it to start turns, resume leased turns, read committed state, queue inputs, cancel background work, and reach explicit admin controls. continue_as does not replace this handle; spawn_agent is the operation that creates another session.

Turn-facing handles

TurnInput: The user or host payload plus per-turn prompt/plugin context. It is data, not a running turn.
TurnBuilder: An ephemeral run request created by session.turn(input). It layers cancellation (a per-turn CancellationToken, or LashSession::cancel_running_turns() from any clone of the session; see turns · cancellation), provider/model overrides, protocol turn options, prompt edits, typed plugin input, and optional durable handler effects, then drives the turn through run(), stream_to(...), or pull-style stream().
TurnResult / TurnOutput: The committed result. TurnResult is the terminal outcome and usage; TurnOutput adds the collected semantic TurnActivity stream for app UIs and logs.
SessionObservation: The reconnect surface for hosts that observe a whole session. It is session-level and cursor-based; turn activity APIs remain local conveniences for the currently running turn. Persistent hosts should stream subscribe_and_recover and keep app product state separate from Lash observation rows. The host recipe lives in Streaming and reconnect.
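
The cursor-and-gap behaviour described for session observation can be illustrated with a bounded replay buffer. The sketch below is a stand-alone model, not the lash observation API: Cursor, Replay, and ReplayBuffer are hypothetical types, and it assumes a gap means "the events right after your cursor were evicted, so re-read state and continue from latest_cursor".

```rust
use std::collections::VecDeque;

/// Hypothetical cursor: sequence number of the last event a client has seen.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Cursor(pub u64);

#[derive(Debug, PartialEq)]
pub enum Replay {
    /// Every event after the cursor is still buffered.
    Events(Vec<(u64, String)>),
    /// Events after the cursor were evicted; re-read state, then continue here.
    Gap { latest_cursor: Cursor },
}

/// Bounded buffer that keeps only the newest `capacity` events.
pub struct ReplayBuffer {
    capacity: usize,
    next_seq: u64,
    events: VecDeque<(u64, String)>,
}

impl ReplayBuffer {
    pub fn new(capacity: usize) -> Self {
        Self { capacity, next_seq: 1, events: VecDeque::new() }
    }

    /// Append an event and return the cursor that points at it.
    pub fn publish(&mut self, payload: &str) -> Cursor {
        let seq = self.next_seq;
        self.next_seq += 1;
        self.events.push_back((seq, payload.to_string()));
        if self.events.len() > self.capacity {
            self.events.pop_front();
        }
        Cursor(seq)
    }

    /// Replay everything after `cursor`, or report a gap if part of it is gone.
    pub fn replay_from(&self, cursor: Cursor) -> Replay {
        let latest = Cursor(self.next_seq - 1);
        // Oldest retained sequence number; an empty buffer retains nothing.
        let oldest = self
            .events
            .front()
            .map(|(seq, _)| *seq)
            .unwrap_or(self.next_seq);
        if cursor.0 + 1 < oldest {
            return Replay::Gap { latest_cursor: latest };
        }
        Replay::Events(
            self.events
                .iter()
                .filter(|(seq, _)| *seq > cursor.0)
                .cloned()
                .collect(),
        )
    }
}
```

A client in this model that receives Gap would reload its view from a fresh snapshot and call replay_from(latest_cursor) next, which mirrors the resubscribe step the text attributes to subscribe_and_recover.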

Internal identity	What owns it	Why hosts care
`Session`	`lash-core` runtime state, graph, plugin instances, tool state, usage ledger, queue state, and frame list.	This is the durable unit behind `LashSession`. The session id is what app databases and stores should keep.
`AgentFrame`	The current agent assignment: effective policy, plugin options/source, tool access, subagent context, protocol turn options, execution snapshot, provenance, and prior-frame link.	`continue_as` appends a new frame in the same session with explicit `task` + `seed`. `/compact` appends a compaction frame seeded by an assistant summary. The facade keeps driving the same session until a real final result or stop.
`SessionGraph`	Durable nodes for inputs, assistant output, tool calls, protocol events, plugin nodes, and frame-switch boundaries. Nodes carry their `agent_frame_id`.	Read views can show the full session history, while model projection normally reads only the current frame unless a projection explicitly widens scope.
`RuntimeSessionState`	Persisted session envelope: graph, frames, current frame id, policy, plugin/tool snapshots, usage, pending turn input, queued work, checkpoints, and head metadata.	Hosts normally never construct it directly, but stores and advanced tools can persist or inspect it through the curated facade modules.
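
The relationship between frames and graph nodes in the table can be shown with a toy model: every node is tagged with the frame that produced it, the full read view ignores the tag, and the model projection keeps only the current frame. Node and ToySession below are hypothetical types written for this sketch, and the continue_as method only imitates the "append a new frame with a seed" behaviour described above.

```rust
/// Hypothetical graph node; only the frame tag matters for this sketch.
#[derive(Debug, Clone, PartialEq)]
pub struct Node {
    pub agent_frame_id: u32,
    pub text: String,
}

/// Toy session: one node list shared by all frames.
#[derive(Debug, Default)]
pub struct ToySession {
    pub current_frame_id: u32,
    pub nodes: Vec<Node>,
}

impl ToySession {
    /// Append a node tagged with the current frame.
    pub fn append(&mut self, text: &str) {
        self.nodes.push(Node {
            agent_frame_id: self.current_frame_id,
            text: text.to_string(),
        });
    }

    /// Start a new frame in the same session, seeded with one node.
    pub fn continue_as(&mut self, seed: &str) {
        self.current_frame_id += 1;
        self.append(seed);
    }

    /// Read view: the whole session history, across frames.
    pub fn full_history(&self) -> Vec<&str> {
        self.nodes.iter().map(|n| n.text.as_str()).collect()
    }

    /// Model projection: only nodes from the current frame.
    pub fn model_projection(&self) -> Vec<&str> {
        self.nodes
            .iter()
            .filter(|n| n.agent_frame_id == self.current_frame_id)
            .map(|n| n.text.as_str())
            .collect()
    }
}
```

After a frame switch in this model, the earlier question and answer stay visible in full_history() but drop out of model_projection(), which starts again from the seed.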

Rule of thumb: keep app code on LashCore -> SessionBuilder -> LashSession -> TurnBuilder. Drop below the facade only for custom runtime hosts, persistence implementations, protocol plugins, or diagnostics that genuinely need internal state.

Core Shape

One LashCore per app. One LashSession per conversation or task.

lash vs lash-core boundary

use std::sync::Arc;

use lash::{TurnEvent, TurnInput, plugins::runtime_plugin_stack, tools::*};

// `provider` and `AppTools` come from the host app and are not shown here.
let data_dir = std::path::PathBuf::from(".lash-data");
let store_factory = Arc::new(lash_sqlite_store::SqliteSessionStoreFactory::new(
    data_dir.join("sessions"),
));
let artifact_store =
    Arc::new(lash_sqlite_store::Store::open(&data_dir.join("artifacts.db")).await?);

let factory = lash::rlm::RlmProtocolPluginFactory::new(
    lash::rlm::RlmProtocolPluginConfig::default(),
    artifact_store,
);
let core = lash::LashCore::rlm_builder(factory)
    .provider(provider)
    .model(
        lash::ModelSpec::from_token_limits("anthropic/claude-sonnet-4.6", None, 200_000, None)
            .expect("valid model metadata"),
    )
    .plugins(runtime_plugin_stack())
    .tools(Arc::new(AppTools) as Arc<dyn ToolProvider>)
    .store_factory(store_factory)
    .effect_host(Arc::new(lash::durability::InlineEffectHost::default()))
    .attachment_store(Arc::new(lash::persistence::FileAttachmentStore::new(
        data_dir.join("attachments"),
    )))
    .build()?;

let session = core.session("chat-123").open().await?;
let result = session
    .turn(TurnInput::text("Use the app tools."))
    .run()
    .await?;
// Prose deltas are live previews; `assistant_message()` on the output gives the settled text.
let assistant_text: String = result
    .activities
    .iter()
    .filter_map(|activity| match &activity.event {
        TurnEvent::AssistantProseDelta { text } => Some(text.as_str()),
        _ => None,
    })
    .collect();
println!("{assistant_text}");

LashCore: Cloneable shared config. The generic LashCore::builder() is for advanced custom protocol hosts; app code normally starts with LashCore::standard_builder() or LashCore::rlm_builder(factory).
LashCore::standard_builder() / LashCore::rlm_builder(factory): Sugar entry points for the built-in protocols. standard_builder() opens native provider-tool sessions; rlm_builder(factory) opens Lashlang/RLM sessions from a host-configured RLM protocol factory.
SessionSpec: Provider, model/variant, max context tokens, max turns, prompt layer. Root cores: SessionSpec::new(). Children and subagents: SessionSpec::inherit().
PluginStack: Ordered plugin factory list. LashCore::standard_builder() and LashCore::rlm_builder(factory) pre-seed their protocol plugin and the default runtime stack; generic LashCore::builder() requires an explicit protocol plugin. All paths still require explicit effect and attachment facets before build(); RLM also requires a Lashlang artifact store.
LashSession: One conversation or task, with a parked/resumable runtime and an optional per-session store. It exposes: turn(TurnInput); enqueue(TurnInput), which stores durable pending input for the next turn unless an active-turn ingress is specified; pending_turn_inputs() plus typed pending-input cancellation, for user-input previews and edit reconciliation; queued_work(), for admin inspection of non-user process wakes and session commands; await_queued_work_batch(batch_id), which resolves when a non-user queued-work batch you enqueued is drained or cancelled by any handle or process, without claiming it yourself (bound it with tokio::time::timeout); read_view(); and admin().
TurnBuilder: Per-turn config: cancellation, typed plugin input, RLM projected bindings. .stream_to(&sink) for a push sink, .stream() for a pull-style futures_util::Stream, and .run() for a collected activity log. Durable handlers add .turn_id(...).effects(&controller).

Session Specs And Plugins

Root builder methods wrap one SessionSpec. Pass a complete spec via .session_spec(...); the same type configures child sessions.

Root defaults

use std::sync::Arc;

use lash::{SessionSpec, plugins::PluginFactory};

let root_spec = SessionSpec::new().provider_id(provider.kind()).model(
    lash::ModelSpec::from_token_limits("gpt-5.4", None, 200_000, None)
        .expect("valid model metadata"),
);

let factory = lash::rlm::RlmProtocolPluginFactory::new(
    lash::rlm::RlmProtocolPluginConfig::default(),
    Arc::new(lash::persistence::InMemoryLashlangArtifactStore::new()),
);
let core = lash::LashCore::rlm_builder(factory)
    .session_spec(root_spec)
    .effect_host(Arc::new(lash::durability::InlineEffectHost::default()))
    .attachment_store(Arc::new(lash::persistence::InMemoryAttachmentStore::new()))
    .configure_plugins(|plugins| {
        plugins.push(Arc::new(AppPluginFactory) as Arc<dyn PluginFactory>);
    })
    .build()?;

Explicit stacks

use std::sync::Arc;

use lash::plugins::{PluginFactory, runtime_plugin_stack};

let plugins = runtime_plugin_stack().configure(|plugins| {
    plugins.replace(Arc::new(CustomBudgetPlugin) as Arc<dyn PluginFactory>);
    plugins.push(Arc::new(AppPluginFactory) as Arc<dyn PluginFactory>);
});

let factory = lash::rlm::RlmProtocolPluginFactory::new(
    lash::rlm::RlmProtocolPluginConfig::default(),
    Arc::new(lash::persistence::InMemoryLashlangArtifactStore::new()),
);
let core = lash::LashCore::rlm_builder(factory)
    .session_spec(root_spec)
    .plugins(plugins)
    .effect_host(Arc::new(lash::durability::InlineEffectHost::default()))
    .attachment_store(Arc::new(lash::persistence::InMemoryAttachmentStore::new()))
    .build()?;

.plugin(...) appends; .plugins(...) replaces; .configure_plugins(...) mutates in place.
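
The difference between the three builder methods is easiest to see side by side. The following is a minimal model of the append / replace / mutate-in-place semantics stated above, using plain strings in place of plugin factories. ToyStack, ToyBuilder, and the default plugin names are all hypothetical; the real builder and PluginStack types are not reproduced here.

```rust
/// Hypothetical ordered plugin list; strings stand in for plugin factories.
#[derive(Debug, Default, Clone, PartialEq)]
pub struct ToyStack(pub Vec<String>);

impl ToyStack {
    pub fn push(&mut self, name: &str) {
        self.0.push(name.to_string());
    }
}

/// Hypothetical builder that starts from a pre-seeded default stack.
#[derive(Debug, Default)]
pub struct ToyBuilder {
    stack: ToyStack,
}

impl ToyBuilder {
    pub fn with_defaults() -> Self {
        let mut stack = ToyStack::default();
        stack.push("default_a");
        stack.push("default_b");
        Self { stack }
    }

    /// Append one plugin to whatever stack is already there.
    pub fn plugin(mut self, name: &str) -> Self {
        self.stack.push(name);
        self
    }

    /// Replace the whole stack, discarding the defaults.
    pub fn plugins(mut self, stack: ToyStack) -> Self {
        self.stack = stack;
        self
    }

    /// Let the caller edit the existing stack in place.
    pub fn configure_plugins(mut self, edit: impl FnOnce(&mut ToyStack)) -> Self {
        edit(&mut self.stack);
        self
    }

    pub fn build(self) -> Vec<String> {
        self.stack.0
    }
}
```

In this model, plugins(...) is the only call that can drop the pre-seeded entries, so a host that wants the defaults plus its own plugin would reach for plugin(...) or configure_plugins(...) instead.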

Run A Turn

TurnBuilder::run() returns a TurnOutput: terminal TurnResult plus ordered Vec<TurnActivity>.

Collected result

let collected = session
    .turn(TurnInput::text("Summarize this task."))
    .run()
    .await?;

let live_preview: String = collected
    .activities
    .iter()
    .filter_map(|activity| match &activity.event {
        TurnEvent::AssistantProseDelta { text } => Some(text.as_str()),
        _ => None,
    })
    .collect();
let settled_answer = collected.assistant_message().unwrap_or_default();
let total = collected.result.total_usage(); // parent + children
let parent_usage = collected.result.usage; // parent's own LLM tokens
let children = collected.result.children_usage; // per-(source, model) child entries
let outcome = collected.result.outcome;

Live stream

let ui_sink = Arc::new(AppEvents::new(tx));
let turn = session
    .turn(TurnInput::text(user_text))
    .stream_to(ui_sink.as_ref())
    .await?;

persist(
    turn.assistant_message().unwrap_or_default(),
    turn.total_usage(),
)?;

Fold TurnActivity directly for live UI state. At turn completion, replace the current-turn display from TurnResult.state.read_view(); do not append a separate final assistant event. Use TurnActivity.correlation_id as the stable identity for multi-phase rows: start events insert, completion events update.
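
The "start events insert, completion events update" rule can be sketched as a small fold over UI rows keyed by correlation id. The types below (ToyEvent, ToyActivity, Row) are hypothetical simplifications invented for this example; the real TurnActivity and TurnEvent carry more variants and fields than this model assumes.

```rust
/// Hypothetical two-phase event; real turn events have more variants.
#[derive(Debug, Clone, PartialEq)]
pub enum ToyEvent {
    ToolStart { name: String },
    ToolComplete { ok: bool },
}

/// Hypothetical activity item carrying the stable row identity.
#[derive(Debug, Clone, PartialEq)]
pub struct ToyActivity {
    pub correlation_id: String,
    pub event: ToyEvent,
}

/// One row in the live UI for a multi-phase operation.
#[derive(Debug, Clone, PartialEq)]
pub struct Row {
    pub correlation_id: String,
    pub label: String,
    pub status: &'static str,
}

/// Fold one activity into the rows: starts insert, completions update.
pub fn fold(rows: &mut Vec<Row>, activity: &ToyActivity) {
    match &activity.event {
        ToyEvent::ToolStart { name } => {
            // Ignore a replayed start so the row is never duplicated.
            let exists = rows
                .iter()
                .any(|row| row.correlation_id == activity.correlation_id);
            if !exists {
                rows.push(Row {
                    correlation_id: activity.correlation_id.clone(),
                    label: name.clone(),
                    status: "running",
                });
            }
        }
        ToyEvent::ToolComplete { ok } => {
            // Update the matching row in place; unknown ids are skipped.
            if let Some(row) = rows
                .iter_mut()
                .find(|row| row.correlation_id == activity.correlation_id)
            {
                row.status = if *ok { "done" } else { "failed" };
            }
        }
    }
}
```

Keying on the correlation id rather than on arrival order keeps the fold idempotent for starts, which is a design choice of this sketch and helps if a host replays buffered events after a reconnect.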

Pending Input Reconciliation

Apps that let users edit submitted messages should reconcile from pending-input receipts, then cancel the runtime suffix before replacing product text.

use lash::{
    PendingTurnInputCancelOutcome, PendingTurnInputCancelTarget,
    PendingTurnInputSuffixCancelOutcome, TurnInput,
};

session
    .enqueue(TurnInput::text("first draft"))
    .id("message:1")
    .send()
    .await?;
let second = session
    .enqueue(TurnInput::text("second draft"))
    .id("message:2")
    .send()
    .await?;

// Queue previews come from runtime admission receipts, not local draft
// state. Persist `input_id` or `source_key` beside the product message.
let pending = session.pending_turn_inputs().await?;
render_pending_inputs(&pending);

// Before editing a product message, atomically cancel the runtime suffix
// rooted at that submitted revision.
let anchor = second
    .source_key
    .clone()
    .map(PendingTurnInputCancelTarget::source_key)
    .unwrap_or_else(|| PendingTurnInputCancelTarget::input_id(second.input_id.clone()));
match session.cancel_pending_turn_input_suffix(anchor).await? {
    PendingTurnInputSuffixCancelOutcome::AnchorNotFound { .. } => {
        render_pending_inputs(&session.pending_turn_inputs().await?);
    }
    PendingTurnInputSuffixCancelOutcome::Outcomes { outcomes, .. } => {
        for outcome in outcomes {
            match outcome {
                PendingTurnInputCancelOutcome::Cancelled(input)
                | PendingTurnInputCancelOutcome::AlreadyCancelled(input) => {
                    remove_pending_input_preview(&input.input_id);
                }
                PendingTurnInputCancelOutcome::AlreadyClaimed { input, .. }
                | PendingTurnInputCancelOutcome::AlreadyCompleted(input) => {
                    reconcile_pending_input_state(Some(&input.input_id));
                }
                PendingTurnInputCancelOutcome::NotFound => {
                    reconcile_pending_input_state(None);
                }
            }
        }
    }
}

let replacement = session
    .enqueue(TurnInput::text("updated second draft"))
    .id("message:2:v2")
    .send()
    .await?;
render_pending_inputs(std::slice::from_ref(&replacement));

PendingTurnInput is runtime admission evidence. Keep product editing state in your app, store the returned input_id or source_key with your message revision, and treat claimed/completed outcomes as reconciliation signals instead of forcibly editing already-applied input.
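
To make the suffix-cancel outcomes concrete, here is a toy queue that cancels the anchor and everything queued after it, and reports already-claimed or already-cancelled inputs instead of touching them. It is a sketch under assumptions: ToyQueue, ToyInput, State, and Outcome are hypothetical, the suffix is assumed to include the anchor itself, and None stands in for the AnchorNotFound case.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum State {
    Pending,
    Claimed,
    Cancelled,
}

/// Hypothetical per-input result, loosely mirroring the outcomes named above.
#[derive(Debug, PartialEq)]
pub enum Outcome {
    Cancelled(String),
    AlreadyCancelled(String),
    AlreadyClaimed(String),
}

#[derive(Debug, Clone, PartialEq)]
pub struct ToyInput {
    pub id: String,
    pub state: State,
}

/// Toy pending-input queue in submission order.
#[derive(Debug, Default)]
pub struct ToyQueue {
    pub inputs: Vec<ToyInput>,
}

impl ToyQueue {
    /// Cancel the anchor and every later input; `None` means the anchor is unknown.
    pub fn cancel_suffix(&mut self, anchor: &str) -> Option<Vec<Outcome>> {
        let start = self
            .inputs
            .iter()
            .position(|input| input.id.as_str() == anchor)?;
        let outcomes = self.inputs[start..]
            .iter_mut()
            .map(|input| match input.state {
                State::Pending => {
                    // Only still-pending inputs actually change state.
                    input.state = State::Cancelled;
                    Outcome::Cancelled(input.id.clone())
                }
                State::Cancelled => Outcome::AlreadyCancelled(input.id.clone()),
                State::Claimed => Outcome::AlreadyClaimed(input.id.clone()),
            })
            .collect();
        Some(outcomes)
    }
}
```

In this model a claimed input is reported but left alone, which matches the advice above to treat claimed or completed outcomes as reconciliation signals rather than something to overwrite.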
