#[async_trait]
pub trait RuntimePersistence: Send + Sync {
    // Reads: the session head and individual graph nodes.
    async fn load_session(&self, scope: SessionReadScope)
        -> Result<Option<PersistedSessionRead>, StoreError>;
    async fn load_node(&self, node_id: &str)
        -> Result<Option<SessionNodeRecord>, StoreError>;

    // The whole-turn commit boundary; fails if another writer won the race.
    async fn commit_runtime_state(&self, commit: RuntimeCommit)
        -> Result<RuntimeCommitResult, StoreError>;

    // Session metadata (id, name, model, ...) shown by the resume picker.
    async fn save_session_meta(&self, meta: SessionMeta) -> Result<(), StoreError>;
    async fn load_session_meta(&self) -> Result<Option<SessionMeta>, StoreError>;

    // Maintenance: soft-delete nodes, reclaim tombstoned rows, drop orphaned blobs.
    async fn tombstone_nodes(&self, ids: &[String]) -> Result<(), StoreError>;
    async fn vacuum(&self) -> Result<VacuumReport, StoreError>;
    async fn gc_unreachable(&self) -> Result<GcReport, StoreError>;
}
A SessionStoreFactory produces stores on demand — one per session — keyed by the session id, parent id (for subagents / forks), and policy:
pub trait SessionStoreFactory: Send + Sync {
    fn create_store(
        &self,
        request: &SessionStoreCreateRequest,
    ) -> Result<Arc<dyn RuntimePersistence>, String>;
}
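For illustration only, the smallest possible factory hands every session one shared store. Nothing below is part of lash; a real factory would key on the request's session id, parent id, and policy to pick or create a store per session:
use std::sync::Arc;

// Hypothetical example, not a lash type: a factory that returns the same
// shared store for every session. A real implementation would inspect the
// request to route each session to its own store.
struct SharedStoreFactory {
    store: Arc<dyn RuntimePersistence>,
}

impl SessionStoreFactory for SharedStoreFactory {
    fn create_store(
        &self,
        _request: &SessionStoreCreateRequest,
    ) -> Result<Arc<dyn RuntimePersistence>, String> {
        Ok(self.store.clone())
    }
}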
For hosts, the commit boundary means at most one turn per session can commit at a time. If you open the same session id from two processes (or two tasks in one process) and start turns concurrently, exactly one commits and the loser surfaces an error:
match session.turn(input).run().await {
    Ok(turn) => persist(turn),
    Err(err) if err.code == "store_commit_failed" => {
        // Another writer raced us. Re-open the session to read the new
        // head, then decide whether to retry, surface to the user, or
        // merge intent.
        let session = core.session(chat_id).open().await?;
        retry_or_surface(err, session);
    }
    Err(other) => bail!(other),
}
The conflict surfaces as a turn-level Err carrying RuntimeError { code: "store_commit_failed", .. }, not as a TurnStop variant. No partial commit and no half-finished turn in the graph; the next read of the session is identical to the read before the failed call. Hosts that expect concurrent access (multi-tab UIs, agent fleets writing to the same chat id) should either serialize turns at their own layer or treat the conflict as a recoverable signal and re-open the session against the refreshed head.
In-memory sessions (no store_factory set) enforce the same CAS check against an in-process revision counter, so the contract is identical regardless of backing store.
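Hosts that would rather serialize than retry can gate turns with a per-session async lock. A minimal sketch, assuming tokio; TurnGate and acquire are host-side names invented here, not lash API:
use std::collections::HashMap;
use std::sync::Arc;
use tokio::sync::Mutex;

// Hypothetical host-side gate: one async mutex per chat id, so turns
// against the same session run (and therefore commit) one at a time.
#[derive(Default)]
pub struct TurnGate {
    locks: Mutex<HashMap<String, Arc<Mutex<()>>>>,
}

impl TurnGate {
    pub async fn acquire(&self, chat_id: &str) -> Arc<Mutex<()>> {
        let mut locks = self.locks.lock().await;
        locks
            .entry(chat_id.to_string())
            .or_insert_with(|| Arc::new(Mutex::new(())))
            .clone()
    }
}

// Usage: hold the per-session guard for the whole turn.
// let lock = gate.acquire(chat_id).await;
// let _guard = lock.lock().await;
// let turn = session.turn(input).run().await?;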
Embedders driving sessions from an external workflow runtime (Temporal, Restate, similar) bypass this whole-turn commit boundary in favour of per-effect durability — see Architecture → Durability for the lash-sansio checkpoint / restore seam.
use std::sync::Arc;

use lash::{LashCore, ModeId, ModePreset};
use lash_sqlite_store::SqliteSessionStoreFactory;

let store_factory = Arc::new(SqliteSessionStoreFactory::new("./.lash-data"));

let core = LashCore::builder()
    .install_mode(ModePreset::rlm())
    .default_mode(ModeId::rlm())
    .provider(provider)
    .model(model, None)
    .max_context_tokens(200_000)
    .store_factory(store_factory)
    .build()?;
Override per session — useful for tests that want an isolated in-memory or temp-directory store:
let session = core
    .session(chat_id)
    .store(Arc::new(my_custom_persistence))
    .open()
    .await?;
If no store_factory is set, sessions run in-memory: they accept turns and produce results, but nothing is written to disk and nothing can be resumed after the process exits. Lash ships no separate "in-memory" implementation — absence of a factory is the mechanism.
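A test harness, for example, can simply omit the call (a sketch reusing the builder methods shown above):
// No .store_factory(...) call: every session is ephemeral and gone when
// the process exits.
let core = LashCore::builder()
    .install_mode(ModePreset::rlm())
    .default_mode(ModeId::rlm())
    .provider(provider)
    .model(model, None)
    .build()?;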
Graph nodes
Every conversation event — user inputs, runtime-materialized assistant responses, tool calls, mode events, prompt snapshots — lives in graph_nodes with a tombstoned flag for soft deletion. Standard and RLM modes return terminal outcomes; the shared runtime commit writes the settled assistant transcript exactly once.
Session head
A singleton row tracking the current logical head pointer and its revision. Optimistic concurrency uses the revision to detect conflicting writes.
Checkpoint blobs
Compressed snapshots of tool state, plugin state, execution state, and prompt material, keyed by content hash so identical blobs are deduplicated across resumes.
Usage deltas
Per-turn token counts: input_tokens, output_tokens, cached_input_tokens, reasoning_tokens. The runtime aggregates these into the session usage ledger surfaced by SessionReadView.
Session metadata
Session id, human-readable name, model, cwd, parent session id, creation timestamp. This is what the resume picker shows.
Attachments
Image / file attachments are stored as blobs and referenced from graph nodes by BlobRef. The same blob is reused by traces and exports.
gc_unreachable()
Walks reachable blob references from the session head and active checkpoints, marks orphaned blobs for deletion, and returns a GcReport with the number of blobs removed. Run this after pruning a branch or discarding a long-unused thread.
vacuum()
Physically removes tombstoned graph-node rows that gc has already detached, returning a VacuumReport with the row count. Use this after a gc_unreachable pass to reclaim file size.
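Put together, a periodic maintenance pass over any RuntimePersistence can chain the two (a sketch using only the trait methods above; gc runs first so vacuum sees the detached rows):
// Sketch of a host-driven maintenance pass.
async fn maintain(store: &dyn RuntimePersistence) -> Result<(), StoreError> {
    // gc first, so vacuum finds the rows it just detached.
    let _gc_report = store.gc_unreachable().await?;
    let _vacuum_report = store.vacuum().await?;
    Ok(())
}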
// One factory at boot, shared across every chat.
let store_factory = Arc::new(SqliteSessionStoreFactory::new(
    data_dir.join("lash-sessions"),
));

let core = LashCore::builder()
    .install_mode(ModePreset::rlm())
    .default_mode(ModeId::rlm())
    .provider(provider)
    .model(model.clone(), Some(model_variant.clone()))
    .max_context_tokens(200_000)
    .store_factory(store_factory)
    .build()?;

// Per request: open a session keyed by the app's chat id.
let session = core.session(chat_id).rlm().open().await?;
The full source is at examples/agent-service/src/main.rs.