lash guide

session/persistence

Sessions persist their runtime state through a small interface, RuntimePersistence. The default first-party implementation is SQLite-backed (lash-sqlite-store) and is what the CLI uses. Host apps wire it up by passing a SessionStoreFactory to the core builder, or by handing a per-session store directly to SessionBuilder.

The Contract

RuntimePersistence is the runtime-facing trait. Most app hosts never implement it themselves — they use lash-sqlite-store via its factory.

#[async_trait]
pub trait RuntimePersistence: Send + Sync {
    async fn load_session(&self, scope: SessionReadScope)
        -> Result<Option<PersistedSessionRead>, StoreError>;
    async fn load_node(&self, node_id: &str)
        -> Result<Option<SessionNodeRecord>, StoreError>;
    async fn commit_runtime_state(&self, commit: RuntimeCommit)
        -> Result<RuntimeCommitResult, StoreError>;
    async fn save_session_meta(&self, meta: SessionMeta) -> Result<(), StoreError>;
    async fn load_session_meta(&self) -> Result<Option<SessionMeta>, StoreError>;
    async fn tombstone_nodes(&self, ids: &[String]) -> Result<(), StoreError>;
    async fn vacuum(&self) -> Result<VacuumReport, StoreError>;
    async fn gc_unreachable(&self) -> Result<GcReport, StoreError>;
}

A SessionStoreFactory produces stores on demand — one per session — keyed by the session id, parent id (for subagents / forks), and policy:

pub trait SessionStoreFactory: Send + Sync {
    fn create_store(
        &self,
        request: &SessionStoreCreateRequest,
    ) -> Result<Arc<dyn RuntimePersistence>, String>;
}
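A custom backend plugs in through the same shape. As a hedged sketch (the types below are cut-down stand-ins for illustration, not the real lash definitions), a factory that hands out one cached store per session id could look like:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

// Stand-ins for the real lash types, reduced to what this sketch needs.
pub struct SessionStoreCreateRequest {
    pub session_id: String,
    pub parent_session_id: Option<String>,
}

pub trait RuntimePersistence: Send + Sync {}

struct NoopStore; // placeholder backend
impl RuntimePersistence for NoopStore {}

// One store per session id: created on first request, cached thereafter,
// so repeated opens of the same session share a single store.
pub struct CachingStoreFactory {
    stores: Mutex<HashMap<String, Arc<dyn RuntimePersistence>>>,
}

impl CachingStoreFactory {
    pub fn new() -> Self {
        Self { stores: Mutex::new(HashMap::new()) }
    }

    pub fn create_store(
        &self,
        request: &SessionStoreCreateRequest,
    ) -> Result<Arc<dyn RuntimePersistence>, String> {
        let mut stores = self.stores.lock().map_err(|e| e.to_string())?;
        let store = stores
            .entry(request.session_id.clone())
            .or_insert_with(|| Arc::new(NoopStore) as Arc<dyn RuntimePersistence>)
            .clone();
        Ok(store)
    }
}
```

Whether the real SqliteSessionStoreFactory caches per session id or recreates stores is an implementation detail; the sketch only illustrates the keying described above.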

Concurrent Writes And CAS

Every RuntimeCommit carries an expected_head_revision: Option<u64> read from the session state when the turn started. The store checks that field against the row's current revision under the same SQL transaction that applies the delta; mismatch means another writer has advanced the head since this turn read it, and the transaction is rolled back. The runtime does not retry — partial state never lands.

What that means for hosts: at most one turn per session can commit at a time. If you open the same session id from two processes (or two tasks in one process) and start turns concurrently, exactly one commits and the loser surfaces an error.

match session.turn(input).run().await {
    Ok(turn) => persist(turn),
    Err(err) if err.code == "store_commit_failed" => {
        // Another writer raced us. Re-open the session to read the new
        // head, then decide whether to retry, surface to the user, or
        // merge intent.
        let session = core.session(chat_id).open().await?;
        retry_or_surface(err, session);
    }
    Err(other) => bail!(other),
}

The conflict surfaces as a turn-level Err carrying RuntimeError { code: "store_commit_failed", .. }, not as a TurnStop variant. No partial commit and no half-finished turn in the graph; the next read of the session is identical to the read before the failed call. Hosts that expect concurrent access (multi-tab UIs, agent fleets writing to the same chat id) should either serialize turns at their own layer or treat the conflict as a recoverable signal and re-open the session against the refreshed head.

In-memory sessions (no store_factory set) enforce the same CAS check against an in-process revision counter, so the contract is identical regardless of backing store.
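The revision check itself reduces to a small compare-and-swap. A minimal std-only sketch, with hypothetical names (Head, InMemoryStore, commit) standing in for the real machinery:

```rust
use std::sync::Mutex;

// Toy session head: a revision plus the current head pointer.
struct Head {
    revision: u64,
    node_id: String,
}

struct InMemoryStore {
    head: Mutex<Head>,
}

impl InMemoryStore {
    fn new() -> Self {
        Self { head: Mutex::new(Head { revision: 0, node_id: String::new() }) }
    }

    // Read the revision at the start of a turn.
    fn read_revision(&self) -> u64 {
        self.head.lock().unwrap().revision
    }

    // Commit succeeds only if nobody advanced the head since `expected`.
    fn commit(&self, expected: u64, new_node: &str) -> Result<u64, &'static str> {
        let mut head = self.head.lock().unwrap();
        if head.revision != expected {
            return Err("store_commit_failed"); // reject: no partial state lands
        }
        head.revision += 1;
        head.node_id = new_node.to_string();
        Ok(head.revision)
    }
}
```

Two turns that read the same revision race to commit; the first advances the head, the second gets the conflict error and must re-read before trying again.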

Embedders driving sessions from an external workflow runtime (Temporal, Restate, similar) bypass this whole-turn commit boundary in favour of per-effect durability — see Architecture → Durability for the lash-sansio checkpoint / restore seam.

Wiring It Up

The common path is a single factory installed on the core. Every session opened from that core uses it.

use std::sync::Arc;
use lash::{LashCore, ModeId, ModePreset};
use lash_sqlite_store::SqliteSessionStoreFactory;

let store_factory = Arc::new(SqliteSessionStoreFactory::new("./.lash-data"));

let core = LashCore::builder()
    .install_mode(ModePreset::rlm())
    .default_mode(ModeId::rlm())
    .provider(provider)
    .model(model, None)
    .max_context_tokens(200_000)
    .store_factory(store_factory)
    .build()?;

Override per session — useful for tests that want an isolated in-memory or temp-directory store:

let session = core
    .session(chat_id)
    .store(Arc::new(my_custom_persistence))
    .open()
    .await?;

If no store_factory is set, sessions run in-memory: they accept turns and produce results, but nothing is written to disk and nothing can be resumed after the process exits. Lash ships no separate "in-memory" implementation — absence of a factory is the mechanism.

What Gets Persisted

The SQLite schema is stable and covers everything a session needs to resume. A single session's database holds:

Graph nodes

Every conversation event — user inputs, runtime-materialized assistant responses, tool calls, mode events, prompt snapshots — lives in graph_nodes with a tombstoned flag for soft deletion. Standard and RLM modes return terminal outcomes; the shared runtime commit writes the settled assistant transcript exactly once.

Session head

A singleton row tracking the current logical head pointer and its revision. Optimistic concurrency uses the revision to detect conflicting writes.

Checkpoint blobs

Compressed snapshots of tool state, plugin state, execution state, and prompt material, keyed by content hash so identical blobs are deduplicated across resumes.
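The content-hash dedup can be sketched with a toy blob table (std's DefaultHasher stands in here for whatever content hash the real store uses; all names are hypothetical):

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

// Content-addressed blob table: identical bytes hash to the same key,
// so a blob is stored once no matter how many checkpoints reference it.
struct BlobTable {
    blobs: HashMap<u64, Vec<u8>>,
}

impl BlobTable {
    fn new() -> Self {
        Self { blobs: HashMap::new() }
    }

    // Insert returns the content key; inserting a duplicate blob is a no-op.
    fn insert(&mut self, bytes: &[u8]) -> u64 {
        let mut hasher = DefaultHasher::new();
        bytes.hash(&mut hasher);
        let key = hasher.finish();
        self.blobs.entry(key).or_insert_with(|| bytes.to_vec());
        key
    }

    fn len(&self) -> usize {
        self.blobs.len()
    }
}
```

Resuming a session that checkpoints the same tool state repeatedly therefore costs one blob, not one per resume.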

Usage deltas

Per-turn token counts: input_tokens, output_tokens, cached_input_tokens, reasoning_tokens. The runtime aggregates these into the session usage ledger surfaced by SessionReadView.
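The aggregation step is plain summation over the per-turn deltas; a sketch with field names mirroring the ones above (the struct itself is hypothetical, not the real lash type):

```rust
#[derive(Default, Clone, Copy)]
struct UsageDelta {
    input_tokens: u64,
    output_tokens: u64,
    cached_input_tokens: u64,
    reasoning_tokens: u64,
}

// Fold per-turn deltas into the running session ledger.
fn aggregate(deltas: &[UsageDelta]) -> UsageDelta {
    deltas.iter().fold(UsageDelta::default(), |acc, d| UsageDelta {
        input_tokens: acc.input_tokens + d.input_tokens,
        output_tokens: acc.output_tokens + d.output_tokens,
        cached_input_tokens: acc.cached_input_tokens + d.cached_input_tokens,
        reasoning_tokens: acc.reasoning_tokens + d.reasoning_tokens,
    })
}
```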

Session metadata

Session id, human-readable name, model, cwd, parent session id, creation timestamp. This is what the resume picker shows.

Attachments

Image/file attachments are stored as blobs and referenced from graph nodes by BlobRef. The same blob is reused by traces and exports.

Garbage Collection

Two operations on RuntimePersistence reclaim space. Both are safe to run from a background task or a CLI maintenance command.

gc_unreachable()

Walks reachable blob references from the session head and active checkpoints, marks orphaned blobs for deletion, and returns a GcReport with the number of blobs removed. Run this after pruning a branch or discarding a long-unused thread.

vacuum()

Physically removes tombstoned graph-node rows that gc has already detached, returning a VacuumReport with the row count. Use this after a gc_unreachable pass to reclaim file size.
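The ordering matters: vacuum only reclaims what gc has already detached. A toy synchronous model of that two-phase reclaim (the real methods are async and return GcReport / VacuumReport; everything below is a hypothetical stand-in):

```rust
// Toy model: gc detaches tombstoned rows from the live graph,
// vacuum physically drops only rows that are tombstoned AND detached.
struct Row {
    tombstoned: bool,
    detached: bool,
}

struct Store {
    rows: Vec<Row>,
}

impl Store {
    // Phase 1: mark tombstoned rows as detached from the live graph.
    fn gc_unreachable(&mut self) -> usize {
        let mut marked = 0;
        for row in &mut self.rows {
            if row.tombstoned && !row.detached {
                row.detached = true;
                marked += 1;
            }
        }
        marked
    }

    // Phase 2: physically remove rows gc already detached.
    fn vacuum(&mut self) -> usize {
        let before = self.rows.len();
        self.rows.retain(|r| !(r.tombstoned && r.detached));
        before - self.rows.len()
    }
}
```

A vacuum run before any gc pass reclaims nothing, which is why the guide suggests running it after gc_unreachable.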

End-To-End Example

The agent-service example shows the canonical wire-up: a SQLite factory created at boot, sessions opened from the same core by chat id, and runtime persistence handled transparently while the app keeps its own product database (chat rows, board state, frontend events) alongside.

// One factory at boot, shared across every chat.
let store_factory = Arc::new(SqliteSessionStoreFactory::new(
    data_dir.join("lash-sessions"),
));

let core = LashCore::builder()
    .install_mode(ModePreset::rlm())
    .default_mode(ModeId::rlm())
    .provider(provider)
    .model(model.clone(), Some(model_variant.clone()))
    .max_context_tokens(200_000)
    .store_factory(store_factory)
    .build()?;

// Per request: open a session keyed by the app's chat id.
let session = core.session(chat_id).rlm().open().await?;

The full source is at examples/agent-service/src/main.rs.