agent/service

The runnable browser example in examples/agent-service shows the app-facing lash API in a realistic host: Axum routes, OpenRouter, RLM mode, typed session plugin activation, app-owned tools, semantic streaming through session observation live replay, per-chat model selection, SQLite runtime stores, separate product persistence, and optional Restate-backed turns.

Run It

The example uses OpenRouter through the OpenAI-compatible provider. Environment variables define defaults; the browser UI can override model and variant per chat.

OPENROUTER_API_KEY=... cargo run -p agent-service
# then open http://127.0.0.1:3000

OPENROUTER_MODEL
Default model id for new chats. Defaults to anthropic/claude-sonnet-4.6.

OPENROUTER_MODEL_VARIANT
Default reasoning variant for new chats. Defaults to high; the UI offers low, medium, and high.

AGENT_SERVICE_DATA_DIR
App database, runtime session stores, and traces. Defaults to .agent-service.
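
The defaults above can be resolved in one place at startup. The following is a minimal sketch rather than the example's actual code: the Settings struct and resolve_settings function are hypothetical names, and treating OPENROUTER_API_KEY as required and blank values as unset are assumptions. Taking a lookup function instead of reading the environment directly keeps the logic testable.

```rust
use std::path::PathBuf;

/// Hypothetical settings struct; the field names are illustrative only.
#[derive(Debug, PartialEq)]
struct Settings {
    api_key: String,
    model: String,
    model_variant: String,
    data_dir: PathBuf,
}

/// Resolve settings through `lookup` (for example `|k| std::env::var(k).ok()`).
/// Assumption: blank values count as unset and the API key is mandatory.
fn resolve_settings(lookup: impl Fn(&str) -> Option<String>) -> Result<Settings, String> {
    let get = |key: &str| lookup(key).filter(|value| !value.trim().is_empty());

    let api_key = get("OPENROUTER_API_KEY")
        .ok_or_else(|| "OPENROUTER_API_KEY is required".to_string())?;

    Ok(Settings {
        api_key,
        // Defaults below are the ones documented for the example.
        model: get("OPENROUTER_MODEL")
            .unwrap_or_else(|| "anthropic/claude-sonnet-4.6".to_string()),
        model_variant: get("OPENROUTER_MODEL_VARIANT")
            .unwrap_or_else(|| "high".to_string()),
        data_dir: get("AGENT_SERVICE_DATA_DIR")
            .map(PathBuf::from)
            .unwrap_or_else(|| PathBuf::from(".agent-service")),
    })
}
```

A host would call resolve_settings(|key| std::env::var(key).ok()) once at boot and pass the result into core construction; per-chat overrides from the UI then apply on top of these defaults.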

What It Demonstrates

Read this example end-to-end when checking whether an app uses the lash facade correctly.

Shared core

One LashCore holds provider, model defaults, RLM mode, effect host, runtime stores, and trace sink for the whole app.

Session per chat

Each request opens a LashSession from the chat id and store. The app does not keep live runtime sessions in a process map.

Per-chat model

The product DB stores model and model_variant; turns apply the current selection through TurnBuilder::model(...).

App-owned board

The board lives in the app database. Prompt hooks and tools load the canonical chat_boards row by session id; message payload snapshots are only for browser replay.

Session replay stream

The browser renders live RemoteSessionObservationEvent rows from an opaque session cursor for assistant prose, reasoning deltas, code blocks, tool cards, usage, and final terminal values. The route emits replay_cursor first. If the bounded replay window was missed, it emits replay_gap carrying a RemoteLiveReplayGap plus a RemoteSessionObservation. Prose completion is settled from the returned TurnOutput, not from a separate final stream event.

Split persistence

lash-sqlite-store persists runtime state; the app database stores chat rows, board snapshots, model choice, titles, and rendered activity rows.

Process retention

A boot-time maintenance loop calls ProcessRegistry::prune_terminal_processes on a fixed cadence, dropping terminal process rows older than a 7-day window (ADR 0017). This retention is host-scheduled and sits alongside the session-store reclamation on persistence.
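
To make the retention rule concrete, here is a small sketch of the pruning step in isolation. It does not use the real ProcessRegistry API: ProcessRow, its fields, and prune_terminal are hypothetical stand-ins, and keeping rows whose timestamp lies in the future is an assumption about clock-skew handling.

```rust
use std::time::{Duration, SystemTime};

/// Retention window from the text: 7 days.
const RETENTION: Duration = Duration::from_secs(7 * 24 * 60 * 60);

/// Hypothetical process row; the real registry schema is not shown here.
#[derive(Debug, Clone, PartialEq)]
struct ProcessRow {
    id: u64,
    /// `None` while the process is live; `Some(t)` once it became terminal at `t`.
    finished_at: Option<SystemTime>,
}

/// Remove terminal rows older than `RETENTION` and return how many were dropped.
fn prune_terminal(rows: &mut Vec<ProcessRow>, now: SystemTime) -> usize {
    let before = rows.len();
    rows.retain(|row| match row.finished_at {
        // Live processes are never pruned.
        None => true,
        Some(finished) => match now.duration_since(finished) {
            Ok(age) => age <= RETENTION,
            // Timestamp is ahead of `now` (clock skew): keep the row (assumption).
            Err(_) => true,
        },
    });
    before - rows.len()
}
```

A host loop would call a function like this on a timer with SystemTime::now(); passing the clock in as a parameter keeps the rule easy to test.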

Core Wiring

Build one core for the process. The example installs RLM explicitly and wires the SQLite session factory, Lashlang artifact store, file attachment store, and effect host on the normal builder.

let provider = ProviderHandle::new(
    OpenAiCompatibleProvider::new(api_key, OPENROUTER_BASE_URL)
        .with_options(ProviderOptions {
            expose_thinking: true,
            ..ProviderOptions::default()
        })
        .into_components(),
);
let artifact_store = std::sync::Arc::new(
    lash_sqlite_store::Store::open(&data_dir.join("lash-artifacts.db")).await?,
);
let process_env_store = std::sync::Arc::new(
    lash_sqlite_store::Store::open(&data_dir.join("process-env.db")).await?,
);
let trigger_store = std::sync::Arc::new(
    lash_sqlite_store::SqliteTriggerStore::open(&data_dir.join("triggers.db")).await?,
);

let factory = lash::rlm::RlmProtocolPluginFactory::new(
    lash::rlm::RlmProtocolPluginConfig::default(),
    artifact_store,
);
let core = lash::LashCore::rlm_builder(factory)
    .provider(provider)
    .model(
        lash::ModelSpec::from_token_limits(
            model.clone(),
            Some(model_variant.clone()),
            200_000,
            None,
        )
        .expect("valid model metadata"),
    )
    // store_factory is the SQLite session store factory, built earlier in the example (not shown).
    .store_factory(store_factory)
    .effect_host(std::sync::Arc::new(
        lash::durability::InlineEffectHost::default(),
    ))
    .process_env_store(process_env_store)
    .trigger_store(trigger_store)
    .attachment_store(std::sync::Arc::new(
        lash::persistence::FileAttachmentStore::new(data_dir.join("attachments")),
    ))
    .trace_sink(trace_sink)
    .trace_level(TraceLevel::Extended)
    .build()?;

Turn Path

The route stores the user message and UI replay board snapshot first, updates the canonical chat_boards row, opens the Lash session, then runs a turn with the chat's model selection.

let session = state.open_session(&chat_id).await?;
let replay_cursor = session.observe().current_observation().cursor;

use lash::rlm::RlmTurnBuilderExt as _;

let turn = session
    .turn(TurnInput::text(text))
    .model(
        lash::ModelSpec::from_token_limits(
            model_selection.model,
            model_selection.model_variant,
            200_000,
            None,
        )
        .expect("valid model metadata"),
    )
    .require_finish()?;

let output = turn.stream_to(&ui_events).await?;
let assistant_text = assistant_text_for_persistence(
    &output,
    &turn_state.lock().expect("turn state lock").assistant_prose,
);

The board is app-owned SQLite state. User message payloads keep board snapshots so the browser can replay the game, while the plugin prompt and the read_board / play_move tools read and mutate the canonical row. assistant_text_for_persistence derives the product chat row from terminal semantics: final values render from TurnFinish::FinalValue, and prose comes from TurnFinish::AssistantMessage; streamed prose serves only as a live UI preview.

The route starts live replay from replay_cursor through ObservableSession::subscribe_and_recover_remote(...) and emits an NDJSON replay_cursor row followed by RemoteSessionObservationEvent observation rows. If the cursor is trimmed or unavailable, it emits replay_gap with RemoteLiveReplayGap fields for session_id, requested_cursor, latest_cursor, latest_revision, and reason, plus a fresh RemoteSessionObservation snapshot inside the helper item; the browser replaces product state from its normal endpoints and continues from the latest cursor. Restate mode reuses this observation stream for semantic progress while its app outbox persists product rows for route restart and workflow handoff.
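
The consumer side of this protocol can be sketched as a small state machine. The types below are simplified, hypothetical stand-ins for the NDJSON rows (the real rows carry more fields, such as session_id and latest_revision), and counting reloads stands in for refetching product state from the normal endpoints.

```rust
/// Simplified, hypothetical view of the rows on the replay stream.
#[derive(Debug, Clone, PartialEq)]
enum ReplayRow {
    ReplayCursor { cursor: String },
    Observation { cursor: String, payload: String },
    ReplayGap { latest_cursor: String, reason: String },
}

/// Hypothetical client-side state; cursors are opaque strings.
#[derive(Debug, Default, PartialEq)]
struct ClientState {
    cursor: Option<String>,
    live_events: Vec<String>,
    /// How many times product state had to be reloaded after a gap.
    reloads: u32,
    last_gap_reason: Option<String>,
}

impl ClientState {
    fn apply(&mut self, row: ReplayRow) {
        match row {
            // First row: remember where live replay starts.
            ReplayRow::ReplayCursor { cursor } => self.cursor = Some(cursor),
            // Normal case: render the event and advance the cursor.
            ReplayRow::Observation { cursor, payload } => {
                self.live_events.push(payload);
                self.cursor = Some(cursor);
            }
            // Window missed: discard the partial live view, reload product
            // state, and continue from the latest cursor.
            ReplayRow::ReplayGap { latest_cursor, reason } => {
                self.live_events.clear();
                self.reloads += 1;
                self.last_gap_reason = Some(reason);
                self.cursor = Some(latest_cursor);
            }
        }
    }
}
```

The point of the sketch is that a gap is a recoverable state, not an error: the client never tries to reconstruct missed events and instead resynchronizes from durable product rows.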

Restate Mode

When the service is built with the restate feature and run with --durability restate (or AGENT_SERVICE_DURABILITY=restate; durability defaults to local), browser turns submit AgentServiceTurnWorkflow/{turn_id}/run/send through Restate ingress. The workflow request carries stable turn, chat, text, model, and model-variant data only; board state remains app-owned SQLite state.

The workflow id is the turn id. There is no durable submitted/running work-item row in front of it: Restate owns in-flight replay and cancellation, Lash owns the effect boundary plus final turn commit, and the app outbox stores product-visible rows keyed by turn_id.
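
One way to picture an outbox keyed by turn_id is as a write-once map, so that a replayed workflow step cannot produce a second product row. The sketch below is an in-memory illustration under that assumption; Outbox, OutboxRow, and record are hypothetical names, and the example's real outbox lives in the app's SQLite database.

```rust
use std::collections::hash_map::Entry;
use std::collections::HashMap;

/// Hypothetical product-visible row produced by a turn.
#[derive(Debug, Clone, PartialEq)]
struct OutboxRow {
    chat_id: String,
    assistant_text: String,
}

/// Hypothetical in-memory outbox keyed by turn id.
#[derive(Debug, Default)]
struct Outbox {
    rows: HashMap<String, OutboxRow>,
}

impl Outbox {
    /// Record the row for `turn_id` unless one already exists.
    /// Returns `true` when the row was newly written, `false` on a repeat.
    fn record(&mut self, turn_id: &str, row: OutboxRow) -> bool {
        match self.rows.entry(turn_id.to_string()) {
            // A replayed write for the same turn is a no-op.
            Entry::Occupied(_) => false,
            Entry::Vacant(slot) => {
                slot.insert(row);
                true
            }
        }
    }
}
```

In a SQL-backed outbox the same effect would typically come from a primary key or unique constraint on turn_id combined with an insert that ignores conflicts.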

just agent-service-restate-e2e

The one-command E2E starts restatedev/restate:1.7.0 with host networking, binds the in-process agent-service endpoint on 127.0.0.1:19080, registers it through Restate Admin, submits a turn through ingress, verifies app outbox/message persistence, and removes the container on exit. In app runs, AGENT_SERVICE_ADDR serves the Axum UI and AGENT_SERVICE_RESTATE_ADDR serves the Restate endpoint from the same process.

Browser Controls

The UI treats model choice as product state. New chats inherit defaults from /api/settings, the sidebar displays each chat's selected model, and edits are saved to /api/chats/{chat_id}/model.

function selectedModel() {
  return {
    model: modelInput.value.trim() || settings.default_model,
    model_variant: variantInput.value || null
  };
}

await api(`/api/chats/${activeChat}/messages`, {
  method: 'POST',
  body: JSON.stringify({
    text,
    board: currentBoard(),
    ...selectedModel()
  })
});