Persist sessions by installing a SessionStoreFactory on the core or passing a store per session. Without one, sessions are in-process only. This page is the app-facing guide; effect replay and workflow durability live in the architecture docs.
Session persistence stores committed runtime state for a session id: transcript graph, AgentFrame records, plugin/tool snapshots, usage deltas, pending turn inputs, queued work, metadata, and attachment references.
A session store makes a session id reopenable. It is separate from app tables and separate from in-flight workflow replay.
Use .store_factory(...) when sessions must reopen across requests or process restarts, or when they carry pending turn inputs, queued work, process wakes, or managed child sessions.
Omit a session store for throwaway tests and local single-process demos where every session can reset on restart.
Install one session-store factory plus explicit artifact and attachment stores on the core; every session inherits them.
use std::sync::Arc;
use lash_sqlite_store::{SqliteSessionStoreFactory, Store};

let data_dir = std::path::PathBuf::from("./.lash-data");
let store_factory = Arc::new(SqliteSessionStoreFactory::new(data_dir.join("sessions")));
let artifact_store = Arc::new(Store::open(&data_dir.join("artifacts.db")).await?);
let factory = lash::rlm::RlmProtocolPluginFactory::new(
    lash::rlm::RlmProtocolPluginConfig::default(),
    artifact_store,
);
let core = lash::LashCore::rlm_builder(factory)
    .provider(provider)
    .model(
        lash::ModelSpec::from_token_limits(model.clone(), None, 200_000, None)
            .expect("valid model metadata"),
    )
    .store_factory(store_factory)
    .effect_host(Arc::new(lash::durability::InlineEffectHost::default()))
    .attachment_store(Arc::new(lash::persistence::FileAttachmentStore::new(
        data_dir.join("attachments"),
    )))
    .build()?;
Use the core-level factory for app servers. Use a per-session override only for tests or hosts that already own a concrete store.
Per-session override:
let session = core
    .session(chat_id)
    .store(Arc::new(my_custom_persistence))
    .open()
    .await?;
The examples use the first-party SQLite adapter because it is the smallest local durable setup. The runtime contract is SessionStoreFactory / RuntimePersistence; a host can provide another database-backed implementation when it preserves the same session execution lease, fenced head commits, checkpoint/blob references, pending turn-input claims, queued work, and idempotency semantics.
Without .store_factory(...) or .store(...), turns still succeed but state stays in memory. You still provide explicit in-memory effect, artifact, and attachment facets.
Use lash-postgres-store plus lash-s3-store when multiple workers must share one durable runtime state and one attachment byte store.
use std::sync::Arc;
use lash_postgres_store::PostgresStorage;
use lash_s3_store::S3AttachmentStore;

let storage = PostgresStorage::connect(&database_url).await?;
let attachments = S3AttachmentStore::builder("lash-attachments", "us-east-1")
    .endpoint_url("http://localhost:9000") // omit for AWS S3
    .access_key_id("minioadmin")
    .secret_access_key("minioadmin")
    .path_style(true)
    .prefix("prod/lash")
    .build()?;
let factory = lash::rlm::RlmProtocolPluginFactory::new(
    lash::rlm::RlmProtocolPluginConfig::default(),
    Arc::new(storage.lashlang_artifact_store()),
);
let core = lash::LashCore::rlm_builder(factory)
    .store_factory(Arc::new(storage.session_store_factory()))
    .process_registry(Arc::new(storage.process_registry()))
    .trigger_store(Arc::new(storage.trigger_store()))
    .attachment_store(Arc::new(attachments))
    // provider, model, effect host...
    .build()?;
PostgresStorage::connect creates the exact supported schema on an empty database and rejects mismatched component versions. Runtime checkpoints, pending turn inputs, queued work, process rows, triggers, attachment manifests, and Lashlang artifacts live in Postgres. Attachment bytes live in S3-compatible object storage under content-addressed keys; MinIO uses the same implementation with an endpoint URL and path-style mode.
PostgresStoreConfig applies connection-level backstops by default: a 10 second lock_timeout and a 30 second statement_timeout. Mutating session work first claims the durable session execution lease; session commits then verify that lease fence and perform the head-revision CAS in one transaction. Postgres serialization failures, deadlocks, and lock-acquisition timeouts on the backstop write surface as retryable conflicts so hosts can reload and retry instead of treating ordinary write contention as an opaque backend failure.
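The retryable-conflict mapping described above can be pictured with a small classifier over Postgres SQLSTATE codes. This is a minimal sketch, not the store's actual code: `WriteOutcome` and `classify_sqlstate` are hypothetical names, and treating every other code (including `57014`, which Postgres raises when `statement_timeout` fires) as fatal is an assumption made for the example.

```rust
/// Hypothetical retry decision for a failed write; not a Lash type.
#[derive(Debug, PartialEq, Eq)]
pub enum WriteOutcome {
    /// The host can reload the session and retry.
    RetryableConflict,
    /// Surface to the caller as a backend failure.
    Fatal,
}

/// Maps a Postgres SQLSTATE code to a retry decision.
pub fn classify_sqlstate(code: &str) -> WriteOutcome {
    match code {
        // serialization_failure
        "40001" => WriteOutcome::RetryableConflict,
        // deadlock_detected
        "40P01" => WriteOutcome::RetryableConflict,
        // lock_not_available, raised when lock_timeout expires
        "55P03" => WriteOutcome::RetryableConflict,
        // Everything else is treated as fatal in this sketch (assumption).
        _ => WriteOutcome::Fatal,
    }
}
```

A host-side wrapper would typically call something like this on the error code reported by its Postgres driver before deciding whether to reopen the session.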
Run the distributed integration harness with the just recipe restate-postgres-workers-e2e. It starts Postgres, MinIO, Restate, a deterministic OpenAI-compatible mock provider, two worker processes behind a Caddy h2c proxy, and a runner. The workers build a normal downstream-style LashCore with PostgresStorage, S3AttachmentStore, PostgresLashlangArtifactStore, PostgresTriggerStore, PostgresProcessRegistry, RestateRuntimeEffectController, RestateProcessDeployment, and JSONL trace sinks.
The scenario runs real session.turn(input).stream_to(...) turns through the public facade. It covers foreground tools, attachment creation through the configured attachment store, parent/nested/parallel Lashlang processes, durable sleep and wake paths, public trigger registration and delivery, live replay from a stored observation cursor, MinIO byte/metadata assertions, and failover where worker A exits from the real turn path after terminal finish or another durable effect and worker B completes through Restate replay. The runner asserts one provider call per journaled effect, one committed turn, no duplicate runtime rows, and no active Lash Restate invocations left in pending, ready, running, backing-off, or suspended.
Use LashCore::delete_session(..., scope) when an app-level delete should remove factory-backed runtime state for a session id.
let effect_host = core.effect_host();
let scope = effect_host.scoped(lash::runtime::ExecutionScope::runtime_operation(format!(
    "delete-session:{chat_id}"
)))?;
let report = core.delete_session(chat_id, scope).await?;
if let Some(process_report) = report.process {
    audit_process_cleanup(process_report)?;
}
Deletion requires a core-level SessionStoreFactory; per-session stores cannot be rediscovered from a bare id. If a ProcessRegistry is installed, Lash revokes that session's process handle grants, deletes wake bookkeeping addressed to it, deletes its trigger subscriptions, and reports any still-running zero-grant processes as orphans for host policy. It does not cancel processes automatically.
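The orphan rule can be illustrated with a self-contained model. This is a sketch under stated assumptions, not the ProcessRegistry API: `ProcessRow`, its fields, and `revoke_and_find_orphans` are hypothetical, and grants are modeled as a plain set of session ids.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical registry row; the real schema belongs to the store.
pub struct ProcessRow {
    pub running: bool,
    /// Session ids that currently hold a handle grant on this process.
    pub grants: HashSet<String>,
}

/// Revokes `session_id`'s grants and returns the ids of processes that are
/// still running with no grants left. Nothing is cancelled here; the caller
/// decides what to do with the orphans.
pub fn revoke_and_find_orphans(
    processes: &mut HashMap<String, ProcessRow>,
    session_id: &str,
) -> Vec<String> {
    let mut orphans = Vec::new();
    for (process_id, row) in processes.iter_mut() {
        let revoked = row.grants.remove(session_id);
        if revoked && row.running && row.grants.is_empty() {
            orphans.push(process_id.clone());
        }
    }
    orphans.sort(); // stable output for audit logs
    orphans
}
```

The host policy applied to the returned ids (cancel, adopt, or leave running) is deliberately outside the sketch, mirroring the statement that processes are not cancelled automatically.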
Think in runtime concepts first. The exact SQL layout belongs to the store implementation.
Transcript graph: user inputs, assistant responses, tool calls, protocol events, prompt snapshots, and frame boundaries.
Session head and metadata: current revision, current AgentFrame id, session policy, model/provider identity, and metadata for resume UIs.
Plugin and tool snapshots: plugin state, tool state, current RLM execution state, prompt state, and Lashlang artifacts needed for resume.
Pending turn inputs: user-visible model input awaiting an active checkpoint or the next idle turn. Active-turn input is anchored to a live turn id and checkpoint boundary; interrupt finalization completes accepted inputs and defers only unaccepted ones to the next turn.
Queued work: durable ingress for SessionCommand mutations and non-user TurnWork such as process wakes, with leases and completed claim ids cleared by the consuming commit. Triggers are runtime-level occurrences, and timers are host-owned scheduling.
Process environments: process execution environments are captured at start as content-addressed Lashlang artifacts. Persisted process and trigger rows reference those immutable environment blobs; this alpha cutover is not backward-compatible with pre-env-ref process rows.
Usage deltas: per-turn token deltas for uncached input, total output, cache-read input, cache-write input, and reasoning-output counts that aggregate into session.usage_report().
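As an illustration of how per-turn deltas might roll up into a session total, here is a minimal sketch. `UsageDelta`, its field names, and `aggregate` are hypothetical stand-ins chosen to match the five counters listed above; the real report type returned by session.usage_report() may be shaped differently.

```rust
/// Hypothetical per-turn token delta with the five counters named above.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct UsageDelta {
    pub uncached_input: u64,
    pub output: u64,
    pub cache_read_input: u64,
    pub cache_write_input: u64,
    pub reasoning_output: u64,
}

impl std::ops::Add for UsageDelta {
    type Output = Self;

    /// Field-wise sum; saturating so a corrupt row cannot overflow the total.
    fn add(self, other: Self) -> Self {
        Self {
            uncached_input: self.uncached_input.saturating_add(other.uncached_input),
            output: self.output.saturating_add(other.output),
            cache_read_input: self.cache_read_input.saturating_add(other.cache_read_input),
            cache_write_input: self.cache_write_input.saturating_add(other.cache_write_input),
            reasoning_output: self.reasoning_output.saturating_add(other.reasoning_output),
        }
    }
}

/// Folds the stored per-turn deltas into one session-level total.
pub fn aggregate(deltas: &[UsageDelta]) -> UsageDelta {
    deltas
        .iter()
        .copied()
        .fold(UsageDelta::default(), |total, delta| total + delta)
}
```

Storing deltas rather than running totals means a rolled-back commit never needs to subtract anything; the total is always recomputable from committed rows.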
At most one mutating runner holds the session execution lease at a time. Stores still check the expected head revision in the same transaction as the fenced commit; mismatch rolls back with no partial state.
Concurrent turns from two workers on the same session id normally resolve at the lease boundary. A busy session returns session_execution_busy; a runner that loses the lease before commit returns session_execution_lease_lost. If the head CAS backstop fires, the loser receives a turn-level error such as store_commit_failed; these are not TurnStop variants.
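The two checks a commit performs, lease fence first and head-revision compare-and-swap second, can be modeled in memory. This is a minimal sketch under stated assumptions: `MemoryStore`, `CommitError`, and the integer fencing token are hypothetical, `claim` always takes the lease over (it does not model the busy response), and a real store runs both checks inside one database transaction.

```rust
use std::collections::HashMap;

/// Hypothetical commit failures; loosely mirrors the two error paths above.
#[derive(Debug, PartialEq, Eq)]
pub enum CommitError {
    /// Another runner claimed the lease after this one.
    LeaseLost,
    /// The head moved since this runner loaded it.
    HeadMismatch { expected: u64, actual: u64 },
}

#[derive(Debug, Default)]
struct SessionRow {
    head_revision: u64,
    lease_fence: u64, // fencing token of the current lease holder
}

#[derive(Default)]
pub struct MemoryStore {
    rows: HashMap<String, SessionRow>,
}

impl MemoryStore {
    /// Takes the lease and returns a fresh fencing token.
    pub fn claim(&mut self, session_id: &str) -> u64 {
        let row = self.rows.entry(session_id.to_string()).or_default();
        row.lease_fence += 1;
        row.lease_fence
    }

    pub fn head(&self, session_id: &str) -> u64 {
        self.rows.get(session_id).map_or(0, |row| row.head_revision)
    }

    /// Verifies the fence, then the expected head, then advances the head.
    /// Either failure leaves the row untouched.
    pub fn commit(
        &mut self,
        session_id: &str,
        fence: u64,
        expected_head: u64,
    ) -> Result<u64, CommitError> {
        let row = self.rows.entry(session_id.to_string()).or_default();
        if row.lease_fence != fence {
            return Err(CommitError::LeaseLost);
        }
        if row.head_revision != expected_head {
            return Err(CommitError::HeadMismatch {
                expected: expected_head,
                actual: row.head_revision,
            });
        }
        row.head_revision += 1;
        Ok(row.head_revision)
    }
}
```

In this model a stale runner fails at the fence and never reaches the CAS, which matches the description of the head check as a backstop rather than the primary guard.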
Durable workflow hosts that already serialize one logical invocation can open the session with SessionBuilder::session_execution_owner(LeaseOwnerIdentity::opaque(...)). Reentry requires the same owner id and the same incarnation id; a retry that represents a new process incarnation must claim or reclaim through the fenced lease path instead of clearing the old owner. Local CLI runtimes attach process liveness metadata so a crashed same-host holder can be reclaimed quickly, while opaque or cross-host holders fall back to the TTL backstop. Composing owner identity, lease timings, and drain into a failover policy is covered in running in production.
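The reentry rule (same owner id and same incarnation id) and the TTL backstop can be sketched as a small state machine. Everything here is hypothetical: `Owner`, `Claim`, and `Lease` are illustration types, not LeaseOwnerIdentity or the store's lease record, and keeping the fence unchanged on reentry is an assumption of the sketch.

```rust
/// Hypothetical owner identity: a stable id plus a per-process incarnation.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Owner {
    pub id: String,
    pub incarnation: String,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Claim {
    /// Same owner and incarnation re-entered a lease it already holds.
    Reentered { fence: u64 },
    /// The lease was free or expired and is now held with a new fence.
    Acquired { fence: u64 },
    /// A different live holder exists; try again later.
    Busy,
}

#[derive(Debug, Default)]
pub struct Lease {
    holder: Option<(Owner, u64)>, // (owner, expires_at_ms)
    fence: u64,
}

impl Lease {
    pub fn claim(&mut self, who: &Owner, now_ms: u64, ttl_ms: u64) -> Claim {
        // Is there an unexpired holder, and is it exactly `who`?
        let live_holder_is_caller = match &self.holder {
            Some((held, expires_at_ms)) if *expires_at_ms > now_ms => Some(held == who),
            _ => None,
        };
        match live_holder_is_caller {
            Some(true) => {
                // Reentry: refresh the expiry, keep the same fence.
                self.holder = Some((who.clone(), now_ms + ttl_ms));
                Claim::Reentered { fence: self.fence }
            }
            Some(false) => Claim::Busy,
            None => {
                // Free or expired: a new fence invalidates the old holder.
                self.fence += 1;
                self.holder = Some((who.clone(), now_ms + ttl_ms));
                Claim::Acquired { fence: self.fence }
            }
        }
    }
}
```

A retry with the same owner id but a new incarnation is treated as a different holder here, so it waits for expiry instead of clearing the old owner. Liveness-based fast reclaim for same-host holders is not modeled.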
use lash::runtime::RuntimeErrorCode;

match session.turn(input).run().await {
    Ok(turn) => persist(turn)?,
    Err(lash::EmbedError::Runtime(err))
        if err.code == RuntimeErrorCode::SessionExecutionBusy =>
    {
        // Another runner holds the lease: back off and try again later.
        retry_later(err)?;
    }
    Err(lash::EmbedError::Runtime(err))
        if err.code == RuntimeErrorCode::SessionExecutionLeaseLost =>
    {
        // The durable lane moved to another owner before commit: reopen and retry.
        let session = core.session(chat_id).open().await?;
        retry_or_report(err, session)?;
    }
    Err(lash::EmbedError::Runtime(err)) if err.code == RuntimeErrorCode::StoreCommitFailed => {
        // The CAS backstop fired: reload and retry.
        let session = core.session(chat_id).open().await?;
        retry_or_report(err, session)?;
    }
    Err(other) => bail!(other),
}
Hosts expecting concurrent access should serialize at their layer or re-open the session after conflict and decide how to merge intent.
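One way a host might structure the re-open-and-retry path is a bounded retry loop around the whole attempt. The sketch below is synchronous and self-contained so it stays independent of any runtime: `TurnError` is a hypothetical enum standing in for the three conflict codes, and `retry_on_conflict` is an illustration helper, not a Lash function.

```rust
/// Hypothetical stand-in for the conflict codes a turn can return.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum TurnError {
    SessionExecutionBusy,
    SessionExecutionLeaseLost,
    StoreCommitFailed,
    Other(String),
}

impl TurnError {
    /// Conflicts are worth retrying; anything else is reported as-is.
    pub fn is_retryable(&self) -> bool {
        !matches!(self, TurnError::Other(_))
    }
}

/// Calls `attempt` with a 1-based attempt number until it succeeds, fails
/// with a non-retryable error, or `max_attempts` is reached. The closure is
/// expected to reopen the session itself, so every attempt starts from the
/// current head. At least one attempt is always made.
pub fn retry_on_conflict<T, F>(max_attempts: u32, mut attempt: F) -> Result<T, TurnError>
where
    F: FnMut(u32) -> Result<T, TurnError>,
{
    let mut attempt_number = 1;
    loop {
        match attempt(attempt_number) {
            Ok(value) => return Ok(value),
            Err(err) if err.is_retryable() && attempt_number < max_attempts => {
                attempt_number += 1;
            }
            Err(err) => return Err(err),
        }
    }
}
```

The loop only decides whether to try again. Merging intent (for example, dropping an input that another worker already committed) still belongs inside the closure, and a production version would likely add backoff between attempts.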
Run reclamation from a process, maintenance route, or CLI command after pruning branches or deleting sessions.
gc_unreachable(): Walks reachable blobs from head and active checkpoints, marks orphans for deletion, returns a GcReport. Run after pruning a branch.
vacuum(): Removes tombstoned graph-node rows already detached by gc and prunes terminal pending-turn-input evidence rows, returns a VacuumReport. Run after gc_unreachable to reclaim file size.
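The reachability walk behind gc_unreachable is a mark phase over a blob graph. The following is a minimal sketch of that idea, not the store's implementation: `BlobId`, `GcSketchReport`, and `find_unreachable` are hypothetical, and the graph is modeled as a map from each blob to the blobs it references.

```rust
use std::collections::{HashMap, HashSet};

pub type BlobId = u32;

/// Hypothetical report shape; the real GcReport may carry different fields.
#[derive(Debug, PartialEq, Eq)]
pub struct GcSketchReport {
    pub reachable: usize,
    pub orphaned: Vec<BlobId>,
}

/// Marks every blob reachable from `roots` (the head plus active
/// checkpoints) and reports the rest as orphans. Nothing is deleted here.
pub fn find_unreachable(
    blobs: &HashMap<BlobId, Vec<BlobId>>,
    roots: &[BlobId],
) -> GcSketchReport {
    let mut seen: HashSet<BlobId> = HashSet::new();
    let mut stack: Vec<BlobId> = roots.to_vec();
    while let Some(id) = stack.pop() {
        // Skip dangling references and blobs that are already marked.
        let Some(children) = blobs.get(&id) else { continue };
        if !seen.insert(id) {
            continue;
        }
        stack.extend(children.iter().copied());
    }
    let mut orphaned: Vec<BlobId> = blobs
        .keys()
        .copied()
        .filter(|id| !seen.contains(id))
        .collect();
    orphaned.sort_unstable();
    GcSketchReport { reachable: seen.len(), orphaned }
}
```

Splitting the work into a mark step and a later physical removal matches the two-call shape above: orphans are identified first, and file size is reclaimed only by the follow-up vacuum.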
The process registry is a separate store with its own retention lever. prune_terminal_processes(cutoff_epoch_ms) physically deletes terminal process rows — with their events, wakes, handle grants, and leases — older than a cutoff, so a host that projects process outcomes into its own store keeps the registry bounded; non-terminal rows are never touched. Run it on the same maintenance cadence, with a window comfortably longer than any in-flight ProcessWorkDriver::await_terminal (a shorter cutoff can prune a process out from under a late await). See docs/adr/0017-process-observation-is-best-effort-push-over-state-truth.md.
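A host could derive the cutoff from a retention window and refuse windows that are not longer than its longest expected await. This is a sketch under assumptions: `prune_cutoff_epoch_ms` is a hypothetical helper, and the `u64` result type is a guess at what prune_terminal_processes accepts for cutoff_epoch_ms.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Returns the epoch-millisecond cutoff for pruning terminal process rows,
/// or None when the retention window is not longer than the longest
/// in-flight await the host expects (pruning then risks removing a process
/// that a late await still needs) or when the clock is before the epoch.
pub fn prune_cutoff_epoch_ms(
    now: SystemTime,
    retention: Duration,
    longest_await: Duration,
) -> Option<u64> {
    if retention <= longest_await {
        return None;
    }
    let now_ms = now.duration_since(UNIX_EPOCH).ok()?.as_millis();
    let cutoff_ms = now_ms.saturating_sub(retention.as_millis());
    u64::try_from(cutoff_ms).ok()
}
```

Rows that became terminal before the returned value are eligible for pruning. How much longer than the longest await counts as "comfortably longer" is left to the host; the sketch only enforces the strict minimum.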
The agent-service example keeps app storage and runtime storage separate: chat rows in one database, Lash sessions in the runtime store.
// One factory at boot, shared across every chat.
let store_factory = Arc::new(SqliteSessionStoreFactory::new(
    data_dir.join("lash-sessions"),
));
let artifact_store =
    Arc::new(lash_sqlite_store::Store::open(&data_dir.join("lash-artifacts.db")).await?);
let factory = lash::rlm::RlmProtocolPluginFactory::new(
    lash::rlm::RlmProtocolPluginConfig::default(),
    artifact_store,
);
let core = lash::LashCore::rlm_builder(factory)
    .provider(provider)
    .model(
        lash::ModelSpec::from_token_limits(
            model.clone(),
            Some(model_variant.clone()),
            200_000,
            None,
        )
        .expect("valid model metadata"),
    )
    .store_factory(store_factory)
    .effect_host(Arc::new(lash::durability::InlineEffectHost::default()))
    .attachment_store(Arc::new(lash::persistence::FileAttachmentStore::new(
        data_dir.join("attachments"),
    )))
    .build()?;

// Per request: open a session keyed by the app's chat id.
let session = core.session(chat_id).open().await?;
Full source: examples/agent-service.
This page owns app-facing persistence. Replay and store internals are intentionally elsewhere.