usezombie is in stealth-mode testing and pre-production. APIs and agent behavior may change between releases without long deprecation windows. Email [email protected] if you want a hand calibrating an agent or to join as a design partner.
Jun 07, 2026
Breaking, What's new, UI, API

Set up a bring-your-own-key model provider without leaving the Models page

The Models page is now a guided, two-step wizard for bring-your-own-key (BYOK) setup. Choosing “use my own provider key” with an empty vault used to dead-end at a disabled button that sent you off to Credentials. Now you paste an API key, the provider and a default model fill themselves in from the key’s prefix, you save the credential inline, pick the model, and activate, all on one screen. Platform-managed keys stay a one-click choice. A short note explains that switching providers applies to new runs while in-flight agents finish on their current one.

Behind the wizard, the public model catalogue moved: the unauthenticated _um/<key>/model-caps.json document is now cap.json. It carries the global run and event rates, the starter credit, and the free-trial window alongside the per-model catalogue, so client configuration comes from a single document that needs no auth.

Upgrading

  • _um/<key>/model-caps.json is now _um/<key>/cap.json. The old path returns 404 (no alias). The per-model models[] shape is unchanged, so if you read the public catalogue directly, only the path moves — switch to cap.json. If you only use the dashboard or zombiectl, there is nothing to do; they handle this for you.

What’s new

  • Inline credential create on the Models page — a structured provider / API key / model form that names the credential after the provider and detects the provider from the key prefix. No trip to Credentials.
  • Catalogue-backed model picker — the model field is populated from cap.json, with free-text entry as a fallback for a model not in the catalogue.
  • Global config in cap.json — the document now includes a rates block (run and event rates) and a billing block (starter credit, free-trial window), so a client no longer hardcodes those constants.

API reference

  • GET /_um/<key>/cap.json — unauthenticated; replaces model-caps.json. Returns { version, models, rates, billing }. Each models[] row carries id, provider, context_cap_tokens, and the per-model token rates (unchanged); rates carries run_nanos_per_sec and event_nanos; billing carries starter_credit_nanos, free_trial_end_ms, and free_trial_stage_nanos. The optional ?model= filter narrows models[]; the global blocks are always present. A wrong key returns 404; an empty catalogue returns 503.
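A minimal sketch of how a client might consume this document, assuming only the field names listed above. The sample payload values and the helper name read_caps are hypothetical and are not taken from the real catalogue.

```python
import json

# Hypothetical sample shaped like the fields listed above; every value is made up.
SAMPLE = json.loads("""
{
  "version": 1,
  "models": [
    {"id": "example-model", "provider": "example-provider", "context_cap_tokens": 128000}
  ],
  "rates": {"run_nanos_per_sec": 100, "event_nanos": 10},
  "billing": {"starter_credit_nanos": 1000, "free_trial_end_ms": 0, "free_trial_stage_nanos": 0}
}
""")

REQUIRED_BLOCKS = ("version", "models", "rates", "billing")


def read_caps(doc, model_id=None):
    """Check the top-level blocks and optionally narrow models[] client-side."""
    missing = [key for key in REQUIRED_BLOCKS if key not in doc]
    if missing:
        raise ValueError(f"cap.json is missing blocks: {missing}")
    models = doc["models"]
    if model_id is not None:
        # Client-side equivalent of the ?model= filter: only models[] narrows.
        models = [m for m in models if m["id"] == model_id]
    return {"models": models, "rates": doc["rates"], "billing": doc["billing"]}
```

Reading the rates and billing blocks from the document, rather than hardcoding them, is the point of the change: the global blocks are returned whether or not a model filter is applied.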
Jun 06, 2026
What's new, API

Pin an agent to capable runners with tags

An agent’s tags: now decide where it runs. Add tags: [gpu, us-east] to your SKILL.md frontmatter and the agent runs only on a runner whose advertised labels cover every tag, so capability-bound work lands on a host that can serve it. An agent with no tags runs on any runner, exactly as before. The tags you already write in the manifest drive this; there is no separate field to set.

If no enrolled runner currently advertises the required labels, the agent waits for a matching runner rather than running on an unsuitable host.
A waiting agent surfaces no separate status today. If an agent never starts, confirm an enrolled runner advertises every tag the agent requires.

API reference

  • required_tags on an agent — derived from the SKILL.md tags: on create and re-derived when you PATCH a new source_markdown. A runner claims an agent only when required_tags is a subset of the runner’s labels. Each tag is 1–64 characters; an agent carries at most 32. Either bound exceeded rejects the request with UZ-REQ-001.
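The claim rule and the two bounds above can be sketched as follows. The function names are hypothetical; only the subset rule and the 1–64 / 32 limits come from the text.

```python
MAX_TAGS = 32      # an agent carries at most 32 tags
MAX_TAG_LEN = 64   # each tag is 1-64 characters


def validate_tags(tags):
    """Return True when the tag list stays within the documented bounds."""
    if len(tags) > MAX_TAGS:
        return False
    return all(1 <= len(tag) <= MAX_TAG_LEN for tag in tags)


def runner_can_claim(required_tags, runner_labels):
    """A runner qualifies only when every required tag is among its labels."""
    return set(required_tags) <= set(runner_labels)
```

An empty required_tags list is a subset of any label set, which matches the statement that an agent with no tags runs on any runner.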
Jun 04, 2026
Breaking, What's new, UI, CLI, Security

Enroll runners from the dashboard, plus a regrouped left navigation

You can now add a runner to your fleet from the dashboard instead of the command line. A platform admin opens Runners, clicks Add runner, and copies a one-time runner token to install on the host, so your platform-admin login never touches a shell. The runner list shows each host’s liveness honestly: a host you just enrolled reads registered until it first checks in, instead of a false online.

The dashboard’s left navigation is regrouped into Operations (Agents, Approvals, Events), Configuration (Credentials, Model, and Runners for platform admins), and Organization (Settings, Billing). The model-and-provider settings move out of Settings to their own Model entry at /settings/models.

Upgrading

  • zombie-runner register is removed. The runner CLI no longer accepts --token, --host-id, or ZOMBIE_TOKEN. To enroll a host, a platform admin mints a runner token from the dashboard (Runners → Add runner) and installs it on the host as ZOMBIE_RUNNER_TOKEN. Runners already enrolled keep working — only the enrollment step changed.
  • Server errors on non-idempotent requests are no longer retried. The dashboard and zombiectl HTTP clients now retry a server 5xx only for idempotent methods (GET/PUT/DELETE/HEAD); a failed POST or PATCH surfaces immediately instead of risking a duplicate write. Network errors and Retry-After responses (429/503/504) retry as before.
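The retry rule in the second bullet can be read as a small decision function. This is a sketch under one assumption: "Retry-After responses" is taken to mean a 429, 503, or 504 that carries a Retry-After header. The function name and parameters are hypothetical and do not mirror the real client code.

```python
IDEMPOTENT_METHODS = {"GET", "PUT", "DELETE", "HEAD"}
RETRY_AFTER_STATUSES = {429, 503, 504}


def should_retry(method, status=None, has_retry_after=False, network_error=False):
    """Decide whether a failed request may be retried."""
    if network_error:
        return True  # network errors retry as before
    if status in RETRY_AFTER_STATUSES and has_retry_after:
        return True  # the server asked the client to come back later
    if status is not None and 500 <= status <= 599:
        # A bare 5xx retries only when repeating the request is safe.
        return method.upper() in IDEMPOTENT_METHODS
    return False
```

Under this reading, a failed POST with a plain 500 surfaces immediately, while a GET with the same status is retried.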

What’s new

  • Fleet list in the dashboard — platform admins see every enrolled runner with its derived liveness: registered, online, busy, or offline.

API reference

  • GET /v1/fleet/runners — platform-admin only; paginated. Each item carries liveness ∈ {registered, online, busy, offline} and never includes the token hash. A tenant token gets 403.

CLI

  • zombie-runner register is gone. The subcommand and its --token / --host-id flags no longer exist; see Upgrading above for the dashboard enrollment flow.
Jun 03, 2026
What's new, UI

One name across the product: the thing you install is an “agent”

The dashboard, the marketing site, and these docs now call the thing you install an agent, and define it the moment you meet the word: “An agent is a long-lived runtime you install once. It sleeps until an event wakes it, runs your skill against that event, and reports back with evidence.” The same concept used to appear as zombie, “the agent”, or “a runtime” in different places — it is one noun now. The brand stays usezombie, and the CLI (zombiectl), the routes, and the API fields are unchanged; only the words you read changed.

What’s new

  • The dashboard navigation, empty states, install buttons, and the stop/resume/kill dialogs all read “agent”.
  • First-touch surfaces — the “What is usezombie?” FAQ, the dashboard first-run card, and the agents-list empty state — now carry the definition above.
Jun 02, 2026
Breaking, Internal

zombied runs on a single database role — the worker datastore credentials retire

Following the runner-fleet split, the deleted worker process left behind a worker_runtime Postgres role, a worker Redis access-control list (ACL) user, and DATABASE_URL_WORKER / REDIS_URL_WORKER deploy variables that zombied still required at startup but never used to connect. zombied already writes every event-path row through its API database role, so the worker role and its variables are removed. The control plane now runs on a single api_runtime role for both reads and writes; a host-resident runner continues to hold no datastore credentials at all.

Upgrading

  • Drop DATABASE_URL_WORKER and REDIS_URL_WORKER from the zombied deployment. The server boots on DATABASE_URL_API + REDIS_URL_API alone; the worker variables are no longer read.
  • Deploy order matters: ship the updated zombied first, then remove the worker_runtime Postgres role and the worker Redis ACL user. The old binary still authenticates as worker_runtime, so dropping the role before the new build is live would cut its connection.
Jun 02, 2026
What's new, API

Usage-based billing — agent runs metered by the second

Agent run time is now billed by the second while an agent is actively running, at 0.0001/sec (≈ 0.36/hr), plus the model’s per-token cost on platform-key runs. This replaces the flat estimate taken when work started: a long run is charged for the time it actually used, a short one is not over-charged, and an agent that is not running is not billed. The run rate is the same whether you bring your own model key or use a platform key; bring-your-own-key runs record token usage but are charged for run time only.

Credit drains as the run proceeds: each background lease renewal meters the elapsed slice plus any new token usage, so a multi-minute run bills continuously instead of in one lump at the end. A run that exhausts its balance stops at the next renewal rather than going negative.
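The arithmetic behind the hourly figure, and the way a balance drains slice by slice, can be sketched as below. The per-second rate comes from the text; the function names, the slice format, and the choice to clamp the final charge at the remaining balance are illustrative assumptions, not the server's actual metering code.

```python
from decimal import Decimal

RUN_RATE_PER_SEC = Decimal("0.0001")  # run rate quoted above


def run_fee(seconds):
    """Run fee for a number of seconds of active run time."""
    return RUN_RATE_PER_SEC * Decimal(seconds)


def meter_run(balance, slices):
    """Drain a balance across renewal slices.

    slices is a list of (elapsed_seconds, token_cost) pairs, one per renewal.
    Returns (remaining_balance, slices_metered).
    """
    metered = 0
    for elapsed_seconds, token_cost in slices:
        if balance <= 0:
            break  # an exhausted run stops at the next renewal
        charge = run_fee(elapsed_seconds) + Decimal(token_cost)
        charge = min(charge, balance)  # assumption: never charge below zero
        balance -= charge
        metered += 1
    return balance, metered
```

One hour is 3600 seconds, so run_fee(3600) gives 0.0001 × 3600 = 0.36, the hourly figure in the text.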

API reference

  • GET /v1/tenants/me/billing/charges/{event_id}/metering-periods — the per-slice breakdown behind a single run charge: one row per renewal plus the final settle, each carrying the run milliseconds, token deltas, run fee, token cost, and the amount actually charged. Scoped to the calling tenant; a foreign event_id returns no rows.
Jun 01, 2026
What's new, CLI

Host skills move to their own repo — install with npx skills add usezombie/skills

The /usezombie-* host-agent skills now live in the public usezombie/skills repository and install with npx skills add usezombie/skills. The npm package ships the zombiectl CLI and agent samples only; it no longer bundles the skills. Installing the CLI and adding the skills are now two commands. The curl -fsSL https://usezombie.sh | bash one-shot installer does both for you and is unchanged.

What’s new

  • npx skills add usezombie/skills — the install path on index.mdx, quickstart.mdx, cli/install.mdx, and zombies/install.mdx. Pass --host=<claude|amp|codex|opencode> to pin a specific host.
  • Skills iterate independently. A skill update ships from the skills repo without a CLI release, so skill fixes reach you without waiting on an npm publish.
  • npm install -g @usezombie/zombiectl installs the CLI and samples only. Older instructions that said the npm install also set up the skills no longer apply — add them with the command above.
Jun 01, 2026
What's new, Internal

Long single events keep running — lease renewal lands

The follow-up promised in the May 27, 2026 runner-fleet release is here: an event that runs longer than the lease window is no longer reclaimed and replayed. While a runner is executing an event, it renews the lease in the background, pushing the kill deadline forward so a single long-running agent runs to completion instead of being cut off after 30 seconds. A hard 12-hour ceiling still caps a runaway event.

What’s new

  • Background lease renewal. A runner renews mid-execution and keeps ownership of the event while it is still working. A runner that loses its lease cannot reclaim it.
  • Fail-safe renewal. A renewal that hits a transient database fault is retried on the next tick rather than ending the event; only a genuinely missing or superseded lease stops it, so a long event survives a brief datastore blip.
  • Bounded runtime. Renewal cannot extend a single event past a 12-hour maximum, so a stuck agent is still reclaimed.
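The interaction between renewal and the 12-hour ceiling can be sketched as a deadline clamp. The 30-second window and the 12-hour maximum come from the text; the function name, the use of plain seconds, and the assumption that each renewal extends the deadline by one lease window from the current time are illustrative.

```python
LEASE_WINDOW_S = 30            # lease window mentioned above
MAX_RUNTIME_S = 12 * 60 * 60   # hard 12-hour ceiling


def renewed_deadline(started_at, now):
    """Return the new kill deadline, or None once the ceiling is reached."""
    ceiling = started_at + MAX_RUNTIME_S
    if now >= ceiling:
        return None  # renewal refused: the event has used its maximum runtime
    # Push the deadline forward, but never past the ceiling.
    return min(now + LEASE_WINDOW_S, ceiling)
```

Clamping with min is what keeps a stuck agent reclaimable: however often it renews, the deadline cannot move past the start time plus 12 hours.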
May 27, 2026
What's new, Internal

Execution moves to a host-resident runner fleet

Behind an unchanged user surface, an event’s agent now runs in a separate zombie-runner daemon instead of inside the API server. zombied became the control plane — it owns Postgres, Redis, and the Vault, and hands work to runners over an authenticated HTTPS protocol — while the runner is the execution plane that runs each event in an isolated sandbox holding no datastore credentials. Steering, webhooks, cron, the live event tail, and history behave exactly as before; what changed is where the work runs.

What’s new

  • Control plane / execution plane split. A runner leases an event, runs it, and reports the result; zombied does the durable writes. Work can run on hosts that never see a database credential.
  • Lease-based ownership. Each lease carries a deadline. A runner that dies mid-event has its work reclaimed and re-run by another runner; a late report from the dead runner is rejected, so state is never double-written.
  • Sandbox is mandatory. Every event runs in a sandbox; a sandbox that fails to start fails closed rather than running unprotected.
Host-resident runners are not enabled in production yet — this release lands the architecture; turning them on follows in a later update.
One known limit: an agent that runs longer than the 30-second lease window is reclaimed and re-run, so long single events wait on a follow-up that adds lease renewal.
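The lease-based ownership rule, where a dead runner's work is reclaimed and its late report is rejected, can be illustrated with a small in-memory model. The class, its method names, and the use of the runner id as the ownership check are hypothetical simplifications; the real protocol between zombied and the runners is not described in this entry.

```python
class LeaseTable:
    """In-memory stand-in for the control plane's lease bookkeeping."""

    def __init__(self):
        self._owner = {}    # event_id -> (runner_id, deadline)
        self._results = {}  # event_id -> accepted result

    def lease(self, event_id, runner_id, now, window=30):
        """Grant the event unless it is done or held by an unexpired lease."""
        if event_id in self._results:
            return False
        owner = self._owner.get(event_id)
        if owner is not None and owner[1] > now:
            return False  # still held by a live lease
        self._owner[event_id] = (runner_id, now + window)
        return True

    def report(self, event_id, runner_id, result):
        """Accept a result only from the runner that currently owns the lease."""
        owner = self._owner.get(event_id)
        if owner is None or owner[0] != runner_id:
            return False  # late report from a runner that lost the lease
        self._results[event_id] = result
        return True
```

Because only the current owner's report is written, a reclaimed event ends up with exactly one stored result even if the original runner comes back late.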
May 22, 2026
What's new, CLI, Security

zombiectl login — verification-code device flow + non-interactive token auth

Logging in opens a browser approval page that shows a 6-digit verification code; you type it back into the terminal to finish. The code binds the browser approver to the terminal that started the flow, so phishing the approval URL alone can’t mint a credential. CI and scripts can skip the browser entirely and supply a token directly.

What’s new

  • Verification-code device flow. zombiectl login opens the approval page; after you Approve, the page shows a 6-digit code to type back into the terminal. The 6-digit shape is checked locally before it’s sent, so a typo just re-prompts; wrong codes cap at 5 per session.
  • Non-interactive auth. Supply a token without the browser via --token <token>, the ZOMBIE_TOKEN environment variable, or by piping it on stdin — resolved in that order. Prefer the env var or stdin to keep the token out of shell history. A non-interactive shell with no token exits with an error instead of waiting.
  • Device labels. --token-name <label> names the session on the approval page and in zombiectl auth status; it defaults to your platform family (macos-cli, linux-cli, …).
  • One auth-token env var. ZOMBIE_TOKEN is the single environment variable the CLI reads for a user token, everywhere a token is resolved.
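The resolution order for non-interactive auth and the local 6-digit shape check can be sketched as below. The order (flag, then ZOMBIE_TOKEN, then stdin) and the 6-digit shape come from the text; the function names, return values, and whitespace trimming are assumptions for illustration.

```python
import re


def resolve_token(flag_token=None, env=None, stdin_text=None):
    """Return (token, source) using the documented order: flag, env, stdin."""
    env = env or {}
    if flag_token:
        return flag_token, "flag"
    if env.get("ZOMBIE_TOKEN"):
        return env["ZOMBIE_TOKEN"], "env"
    if stdin_text and stdin_text.strip():
        return stdin_text.strip(), "stdin"
    return None, "none"  # a non-interactive caller would exit with an error here


def looks_like_code(text):
    """Local shape check: exactly six ASCII digits."""
    return re.fullmatch(r"[0-9]{6}", text.strip()) is not None
```

Checking the shape locally means a typo such as a five-digit entry can re-prompt without spending one of the five attempts allowed per session.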
May 21, 2026
What's new, UI, Security

Create and manage API keys from the dashboard

Tenant API keys (zmb_t_…) — the credentials that authenticate service-to-service callers like CI, cron, or an integration — can now be created, viewed, revoked, and deleted from Settings → API keys, with no curl required. Creating a key reveals the raw value exactly once in a copy-and-store panel; closing the dialog discards it for good. The same release adds a light/dark theme toggle and a themed fallback avatar.

What’s new

  • API keys in the dashboard — mint a key with a name and optional description, see each key’s status (active or revoked) with created and last-used timestamps, and revoke or delete it from the list. Available to operator-role users and above; others are redirected to Settings.
  • One-time key reveal — the raw zmb_t_… value appears once at creation behind a copy button and is never rendered again. A key can be deleted only after it has been revoked, so the audit trail stays intact.
  • Light / dark theme toggle — switch the dashboard theme from the header; the choice persists across reloads and renders correctly on first paint, with no flash of the wrong theme.
  • Themed avatar fallback — with no profile photo set, your initials avatar now uses the dashboard palette instead of a stock fill.
May 21, 2026
What's new, CLI

zombiectl steer opens a terminal prompt when no message is supplied

zombiectl steer <zombie_id> now opens a read-eval-print loop (REPL) when you invoke it from a terminal (TTY) without a message. Automation stays single-shot: explicit messages and piped input still post one steer message, stream one reply, and exit.

CLI

  • Terminal prompt. zombiectl steer <zombie_id> now prompts for one steer message at a time, streams the reply, then prompts again. Ctrl-D exits cleanly; Ctrl-C cancels the active stream and exits.
  • Single-shot automation. zombiectl steer <zombie_id> "message" and echo "message" | zombiectl steer <zombie_id> keep the existing one-turn behavior for scripts and agents.
  • Forced prompt mode. --tty forces the REPL even when stdin is piped, then exits when the pipe reaches end-of-file.
May 21, 2026
Security, UI

The dashboard no longer puts an access token in the page source

An agent’s detail page used to hand its live-activity panel a short-lived API token as a component property, which placed the raw token in the page’s HTML source — readable by anyone with access to the open tab (a shared screen, a cached page, a browser extension) without any exploit. The dashboard now hands no token to the browser at all: steering an agent goes through a server-side action, and the activity stream loads its recent history on the server. This completes the dashboard token-handling work that began on May 19, 2026.

What’s new

  • Steer failures are visible — when a steer message can’t be delivered it shows a failed badge instead of staying on queued; a delivered steer reconciles to its live event as before.
  • Quieter retries — a failed steer is retried on the server automatically, so the dashboard no longer shows a per-attempt retry counter while steering or while loading recent activity.
May 20, 2026
Dashboard, Bug fixes, UI

Create a workspace inline, plus a readable sign-in and one loading mark

The workspace switcher can now create a workspace without leaving the dashboard, and a sign-in-onward pass fixed the sign-in input contrast, the install-toast fade, and a stray development route.

What’s new

  • Inline workspace creation — a New workspace item in the switcher opens a dialog, calls POST /v1/workspaces, and switches you onto the new workspace on success; a blank name lets the server generate one. The switcher now renders even when you have zero workspaces.
  • One loading mark — loaders that used a generic spinning icon now use the brand wake-pulse: a label beside the dot for page-level waits, the dot alone inside buttons.

Bug fixes

  • Sign-in input contrast — the Clerk sign-in and sign-up fields rendered the same color as the card behind them, so the box only showed on focus; they now sit on an inset surface with a stronger border.
  • Install-toast fade — the Hero install confirmation cleared its text the same frame it began to fade, so the 240 ms fade ran on a blank line; the message and a warning toast’s color now hold through the window.
  • Stray development route — the /ds-button-rsc component-preview page is removed from the build.
May 19, 2026
Internal, Security

Dashboard tightens its token-handling posture

Authenticated reads from the dashboard no longer carry the API bearer token in browser memory. Every request from a client component routes through a same-origin proxy that mints the JWT server-side on each hop; the browser bundle never sees the raw value. The token type also moves to an opaque wrapper that masks itself in console logs, JSON serialization, and React Server Component prop boundaries, so accidental leaks via console.log(token) or error-tracker capture print <redacted> instead of the raw bearer.

No user-facing behavior change. The retry layer’s per-attempt timeout also stops leaking pending setTimeout handles when the request wins the race.
May 18, 2026
Breaking, CLI

zombiectl telemetry — Supabase-aligned env vars, on by default, agent attribution

zombiectl now follows the same telemetry contract as the Supabase CLI: anonymous usage data is on by default, three env-var knobs control opt-out and host overrides, and every event carries the detected AI agent (Claude Code, Cursor, Cline, etc.) when one is wrapping the invocation. The opt-out names are public, documented in --help, and respect the industry-standard DO_NOT_TRACK signal. Agent attribution helps prioritise CLI ergonomics for the hosts where most usezombie traffic actually originates.

Upgrading

The opt-out env vars were renamed and the default flipped. If you previously set DISABLE_TELEMETRY=0 to opt in, you can remove that line, because telemetry is on by default now. To stay opted out, switch to one of the new names:
  • DISABLE_TELEMETRY → ZOMBIE_TELEMETRY_DISABLED (set to 1 to opt out; any other value is treated as unset)
  • ZOMBIE_POSTHOG_KEY → ZOMBIE_TELEMETRY_POSTHOG_KEY
  • ZOMBIE_POSTHOG_HOST → ZOMBIE_TELEMETRY_POSTHOG_HOST
  • DO_NOT_TRACK=1 is honored unchanged (industry-standard signal)
  • ZOMBIE_TELEMETRY_DEBUG=1 for local span debug output (unchanged)
CLI binary upgrade is sufficient; no server-side change.

What’s new

  • On by default. Fresh installs bootstrap $ZOMBIE_STATE_DIR/telemetry.json (default ~/.config/zombiectl/telemetry.json) with consent: "granted" on the first invocation. No interactive prompt — the contract is the --help env-var section + the Configuration → Telemetry docs page.
  • AI agent attribution. Every event now carries an ai_tool property when an agent is wrapping the CLI (Claude Code, Cursor, Cline, Aider, Continue, Windsurf, Copilot, Replit, etc.). Detected via @vercel/detect-agent. Unknown non-interactive contexts fall back to unknown_non_interactive; CI contexts to ci.
  • Two opt-out signals, one persisted state. ZOMBIE_TELEMETRY_DISABLED=1 or DO_NOT_TRACK=1 both force consent to denied regardless of the persisted file. Either works; pick the one your team’s tooling already standardises on.
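How the two opt-out signals override the persisted consent can be sketched as one function. The variable names and the "granted" / "denied" values come from the text; the function name and the treatment of a missing telemetry.json as a fresh install are assumptions.

```python
def telemetry_enabled(env, persisted_consent=None):
    """Resolve the effective telemetry state from env vars and persisted consent."""
    # Either opt-out signal forces consent to denied, whatever the file says.
    if env.get("ZOMBIE_TELEMETRY_DISABLED") == "1":
        return False
    if env.get("DO_NOT_TRACK") == "1":
        return False
    if persisted_consent is None:
        return True  # a fresh install bootstraps consent "granted"
    return persisted_consent == "granted"
```

Note that ZOMBIE_TELEMETRY_DISABLED=0 is treated as unset here, matching the rule that only the value 1 opts out.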

CLI

  • zombiectl --help now advertises all five telemetry env vars under “Environment variables” so the opt-out path is discoverable without leaving the terminal.
  • ZOMBIE_STATE_DIR is now documented as the override for local CLI state, including credentials, telemetry consent, and session files. Default remains ~/.config/zombiectl.
May 18, 2026
UI, Website

Trigger panel multi-card + website OnboardingFlow + Hero CTA

The dashboard’s agent-detail page now renders one card per declared trigger — a guided registration card with rendered gh api / linear webhook create / jira webhook add snippets for known providers (GitHub, Linear, Jira, Grafana, Slack, AgentMail), a copy-URL fallback for unknown sources, and a CronCard showing the schedule + next-fire computed client-side from the host’s IANA timezone. The website Home page replaces the FeatureFlow evidence section with a 4-step OnboardingFlow pictorial (install → add skill → wire webhook → steer), and the Hero primary CTA becomes a terminal-style $ npm install -g … button that copies the install command, surfaces an inline <output aria-live> toast, and smooth-scrolls to the #onboarding-flow anchor (honoring prefers-reduced-motion).
  • TriggerPanel.tsx switches from Tabs to Radix Accordion. One AccordionItem per zombie.triggers[] entry; the first trigger with no recorded delivery auto-expands on mount (set-up state takes precedence). Inner router dispatches webhook to GuidedTriggerCard (known provider) or CopyUrlFallback (unknown), cron to CronCard, api to CopyUrlFallback with the legacy bare webhook URL.
  • provider-guidance.ts ships six provider entries: github, linear, jira, grafana, slack, agentmail. Each exposes a 3-arg command(vars, webhookUrl, events) template so per-provider snippets vary by trigger.events, a webUiDeepLink(vars) for the “Open <provider>” affordance, and a variables list driving the variable-input form. Snapshot-tested per provider.
  • webhookUrlFor(zombieId, source?) — dashboard helper reconstructs the server’s webhook_urls: { <source>: <url> } projection client-side. Called with a source, returns the per-trigger URL; called without, returns the legacy bare ingress at /v1/webhooks/{zombie_id}.
  • OnboardingFlow.tsx replaces FeatureFlow.tsx. Four numbered cards with arrow connectors on lg: breakpoint; mobile stacks vertically. Each card carries a real shell snippet, not a screenshot. Anchored at id="onboarding-flow" for Hero’s smooth-scroll target.
  • Hero.tsx primary CTA is now a <button> — copies npm install -g @usezombie/zombiectl && npx skills add usezombie/usezombie to navigator.clipboard, shows the toast for 2 s, scrolls to #onboarding-flow. Clipboard-blocked path surfaces a “Clipboard blocked — select the command above and copy manually” toast. <output aria-live="polite"> element used in place of a design-system Toast primitive (there isn’t one yet).
  • cron-parser@^5.5.0 added to ui/packages/app for the CronCard next-fire computation. Landing-js bundle stays at 132.6 kB gz under the 140 kB .size-limit.json ceiling (7.4 kB headroom).
  • Coverage gate — app package thresholds (statements 95 / branches 90 / functions 95 / lines 95) all green with three new targeted tests covering the setCopiedKey reset updater both branches, the Intl timezone fallback to "UTC", and the cron / api auto-expand TriggerBody paths.
  • Timer-leak fixes: GuidedTriggerCard copy-reset, CopyUrlFallback copy-reset, and the existing Hero toast all now store their setTimeout handle in a useRef and clear it in a useEffect cleanup function on unmount; unmount-cleanup tests added at each call site.
  • Landing-page promo pill — small mono pill between the LIVE eyebrow and the headline links to /pricing and surfaces the “Free until July 31, 2026” free-trial posture that already lived on the pricing component but was invisible above the fold. Click emits trackNavigationClicked({ source: "hero_promo_pill", target: "pricing" }).
Operator note — TRIGGER.md is still the source of truth; the dashboard cards are read-only. Edit the markdown and reinstall to change triggers.
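The URL reconstruction that the webhookUrlFor bullet describes can be sketched in a few lines. This is a Python analogue for illustration, not the dashboard's TypeScript helper; the base URL is the one used for webhook ingress elsewhere in this changelog, and the function name is hypothetical.

```python
API_BASE = "https://api.usezombie.com"


def webhook_url_for(zombie_id, source=None):
    """Per-trigger URL when a source is given, legacy bare ingress otherwise."""
    base = f"{API_BASE}/v1/webhooks/{zombie_id}"
    return base if source is None else f"{base}/{source}"
```

For example, webhook_url_for("zmb_2041", "github") yields the source-suffixed URL, while webhook_url_for("zmb_2041") yields the bare /v1/webhooks/{zombie_id} form.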
May 17, 2026
Docs

Docs sweep — npx skills add install path, full doctor table, tenant-provider commands

Cold-start install no longer routes through mkdir + curl. The host-agent skill ships via npx skills add usezombie/usezombie (one-liner — symlinks /usezombie-* into your host’s skill directory); the curl path stays as a fallback for environments without a node toolchain. The reconciliation pass also fixes a handful of doc claims that had drifted past what zombiectl actually returns and adds the tenant provider subcommands that were already shipped but undocumented.
  • npx skills add usezombie/usezombie — primary install path on index.mdx, quickstart.mdx, cli/install.mdx, zombies/install.mdx. Pass --host=<claude|amp|codex|opencode> to pin a specific host. https://usezombie.sh/skills.md remains as the curl fallback.
  • zombiectl doctor table now lists five checks (was three) — auth_token_present and tenant_provider were already returned by the CLI but missing from cli/install.mdx. The tenant_provider row carries { mode, model, context_cap_tokens, free_trial } so the launch trial window surfaces in the same place users check before installing.
  • zombiectl tenant provider show / set / reset — added to cli/zombiectl.mdx. Flips the tenant between platform-managed inference (default) and a self-managed provider credential from the workspace vault. Architecture canon for the flow lives in docs/architecture/user_flow.md §8.7 and billing_and_provider_keys.md; the CLI page now points readers there instead of recreating the narrative.
  • concepts.mdx — “Skill” card replaced with a “Tool” card that names the binding distinction (TRIGGER.md is sandbox-enforced; SKILL.md is advisory prose). Trigger accordion’s webhook URL updated to /v1/webhooks/{zombie_id}/{source} to match what M43 / M68 shipped.
  • zombies/overview.mdx lifecycle: mermaid diagram expanded from three states to five (install → Alive → Stopped, Alive → Killed → Deleted) so stop / resume show up in the diagram users find first.
May 17, 2026
Server, Performance

Redis pool — lifts the single-mutex throughput cap

Previously, every server-side Redis command serialised through a single mutex, hard-capping per-process throughput at roughly 40 ops/sec/connection regardless of how many producers were issuing writes. The mutex is gone; concurrent producers now each take their own pooled connection. Long-lived blocking consumers — the per-agent worker stream reader, the watcher’s control-stream reader, the dashboard’s activity subscriber — hold dedicated connections outside the pool by design (pooling those would exhaust the pool at the 9th agent).
  • Pool sizing knobs (env vars): REDIS_POOL_MAX_IDLE (default 8) caps the idle pool size; REDIS_POOL_EAGER_MIN (default 2) preconnects on boot so the first burst doesn’t pay dial latency.
  • Request-path timeout: REDIS_REQUEST_TIMEOUT_MS (default 5000 ms). A command that doesn’t return within the budget surfaces as a transport timeout and closes the connection; the next acquire dials fresh.
  • Boot-time validation — a misparsed REDIS_REQUEST_TIMEOUT_MS value fails the process boot with error code UZ-STARTUP-ENV-CHECK instead of silently falling back to a default.
  • Server error replies surface in logs: READONLY, BUSYGROUP, WRONGTYPE (and other Redis-side error frames) now emit a redis_command_err_reply warn line carrying the server’s text before being mapped to the internal command error. The underlying cause is visible in operator logs, not flattened into a generic operation failure.
  • 8 new Prometheus pool series under the zombie_redis_pool_* namespace: dials_total, overflow_dials_total, poisoned_connections_total, reconnects_total, forced_closes_total, acquire_timeouts_total, idle, acquire_wait_ns_p99. Two of them are stubs today: acquire_timeouts_total reports 0 (there is no acquire-timeout path yet, because Pool.acquire never blocks), and acquire_wait_ns_p99 reports 0 (the per-acquire wait histogram wires in alongside the timeout-aware acquire). Both stay in the series list so dashboards don’t break when the wiring lands.
  • Pub/sub subscriber consolidation — one subscriber type with a configurable read-timeout; production passes none (block indefinitely), test harnesses pass a budget.
Operator note — env knobs take effect at process start; rotate the server to pick up changes.
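The defaults and the fail-fast behavior described above can be sketched as a boot-time loader. The variable names, defaults, and error code come from the text; the exception class and function name are hypothetical, and applying the same strict parse to the two pool knobs (the text only states it for the timeout) is an assumption.

```python
class StartupEnvError(Exception):
    """Raised when an env knob cannot be parsed; carries the documented code."""
    code = "UZ-STARTUP-ENV-CHECK"


DEFAULTS = {
    "REDIS_POOL_MAX_IDLE": 8,
    "REDIS_POOL_EAGER_MIN": 2,
    "REDIS_REQUEST_TIMEOUT_MS": 5000,
}


def load_redis_config(env):
    """Read the Redis knobs once at process start, failing fast on bad input."""
    config = {}
    for name, default in DEFAULTS.items():
        raw = env.get(name)
        if raw is None:
            config[name] = default
            continue
        try:
            config[name] = int(raw)
        except ValueError:
            # Refuse to boot rather than silently fall back to the default.
            raise StartupEnvError(f"{name}={raw!r} is not an integer") from None
    return config
```

Failing the boot makes a typo such as REDIS_REQUEST_TIMEOUT_MS=5s visible immediately, instead of leaving the process running on a value the operator did not intend.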
May 17, 2026
What's new, API, CLI, UI, Integrations

Trigger DX overhaul — gh-driven webhook registration, free-trial pricing, dashboard chat

Installing the platform-ops agent no longer ends with a paste-into-GitHub step. The host-agent install skill (/usezombie-install-platform-ops) registers each declared webhook via your own gh CLI, HMAC-self-verifies the registration, and reports the result inline. Through end of July 2026, every customer-visible rate string reads “Try for free” — the stage-execution rate is set to zero nanos, and the website’s pricing component renders the historical rate with a strike-through plus a “Free until July 2026” banner. The dashboard’s /zombies/{zombie_id} page replaces the bespoke “Live activity” panel with a chat surface, so you can steer an agent from the same screen that shows its events.

Upgrading

  • TRIGGER.md trigger: → triggers: (array). An agent declares 1–8 trigger entries under triggers:. The singular trigger: shape is rejected at install with ERR_ZOMBIE_INVALID_CONFIG: use "triggers:" (array), with no compat shim and no rewrite-on-load. Convert every TRIGGER.md you maintain:
    # before
    x-usezombie:
      trigger:
        type: webhook
        source: github
    
    # after
    x-usezombie:
      triggers:
        - type: webhook
          source: github
          events: [workflow_run]   # optional whitelist, 1–16 entries
    
    Entries are unique on (type, source); at most one cron entry per agent.
  • Install response shape: webhook_url → webhook_urls. POST /v1/workspaces/{workspace_id}/zombies 201 returns webhook_urls: { <source>: <url> } keyed by triggers[].source. The old webhook_url scalar field is gone. The CLI’s --json install output emits the same map; the human-readable output prints one URL per declared webhook trigger.
  • Webhook URLs are source-suffixed. External systems POST to https://api.usezombie.com/v1/webhooks/{zombie_id}/{source} — the {source} segment matches triggers[].source in TRIGGER.md. Update any hand-maintained upstream hooks.
  • Mission Control → Dashboard. Every “Mission Control” string in the website, app, and CLI output is now “Dashboard”. Bookmarks unaffected; only display copy changed.
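Before converting a TRIGGER.md, the constraints on triggers: from the first bullet above can be checked with a short validator. The limits (1–8 entries, unique on (type, source), at most one cron entry, 1–16 events) come from the text; the function name, the dict representation of an entry, and the message wording are hypothetical and do not reproduce the server's error strings.

```python
def validate_triggers(triggers):
    """Return a list of problems; an empty list means the entries pass."""
    problems = []
    if not 1 <= len(triggers) <= 8:
        problems.append("declare between 1 and 8 trigger entries")
    seen = set()
    cron_count = 0
    for entry in triggers:
        key = (entry.get("type"), entry.get("source"))
        if key in seen:
            problems.append(f"duplicate (type, source): {key}")
        seen.add(key)
        if entry.get("type") == "cron":
            cron_count += 1
        events = entry.get("events")
        # events is an optional whitelist; when present it holds 1-16 entries.
        if events is not None and not 1 <= len(events) <= 16:
            problems.append(f"events for {key} must list 1 to 16 entries")
    if cron_count > 1:
        problems.append("at most one cron entry per agent")
    return problems
```

Running this against the "after" shape shown above, a single webhook entry for github with events: [workflow_run], returns no problems.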

What’s new

  • zombiectl auth status — inspect the stored credential without re-authenticating. Resolves the token source (file vs ZOMBIE_TOKEN env vs none), decodes JWT claims (iss, aud, sub, metadata.tenant_id, metadata.role, exp), and probes GET /v1/tenants/me/billing to verify the token is still accepted. Exit 0 on valid or unreachable (transient API issues don’t poison local state); exit 1 on UZ-AUTH-001 / UZ-AUTH-002 / TOKEN_EXPIRED.
  • Trigger panel goes multi-card. Each declared trigger renders its own card on the dashboard’s agent-detail page — a guided card with the upstream registration command for known providers (GitHub, Linear, Jira, Grafana, Slack, AgentMail, Clerk), a copy-URL card for unknown sources, a schedule + next-fire card for cron, and a catch-all for api.
  • Dashboard chat surface. The /zombies/{zombie_id} page mounts an @assistant-ui/react thread in place of the prior LiveEventsPanel. Webhook / cron / continuation events render as system chips; the agent’s reasoning streams as assistant bubbles; the composer at the bottom turns user input into a steer (POST /v1/workspaces/{workspace_id}/zombies/{zombie_id}/messages — existing endpoint).
  • Install skill is platform-neutral. /usezombie-install-platform-ops runs in any host with an AskUserQuestion-equivalent — Claude Code, Amp, Codex CLI, OpenCode all drive the same skill body. The skill loops gh api repos/<owner>/<repo>/hooks per declared webhook trigger and substitutes the workspace github credential’s webhook_secret into each registration. Re-running on a repo with an existing hook at the same URL is idempotent (matched on config.url, advanced).
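For readers curious what the claim-decoding step of zombiectl auth status involves, the following is a minimal sketch of reading JWT claims locally. It is not the CLI's implementation: decode_jwt_claims and is_expired are hypothetical names, the signature is deliberately not verified, and treating a missing exp as expired is an assumption of this sketch.

```python
import base64
import json


def decode_jwt_claims(token):
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected three dot-separated JWT segments")
    payload = parts[1]
    # base64url segments omit padding; restore it before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def is_expired(claims, now):
    """True when `exp` (seconds since epoch) is at or before `now`.

    Assumption: a token without `exp` is treated as expired.
    """
    return claims.get("exp", 0) <= now
```

Decoding only shows what the token says about itself; the probe against the API described above is what confirms the token is still accepted.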

API reference

POST /v1/workspaces/{workspace_id}/zombies (201):
{
  "zombie_id": "zmb_2041",
  "name": "platform-ops-agent",
  "webhook_urls": {
    "github": "https://api.usezombie.com/v1/webhooks/zmb_2041/github"
  }
}
GET /v1/workspaces/{workspace_id}/zombies list rows gain a triggers projection:
{
  "zombie_id": "zmb_2041",
  "triggers": [
    { "type": "webhook", "source": "github", "events": ["workflow_run"] }
  ]
}
GET /v1/workspaces/{workspace_id}/zombies/{zombie_id}/events accepts a new actor_prefix query parameter (e.g. actor_prefix=webhook:) for server-side filtering by event source. No client-side fallback.
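A client that previously read the scalar webhook_url needs to iterate the new map instead. The sketch below uses the 201 response body shown above as sample data; webhook_targets is a hypothetical helper, and failing loudly when the map is absent is a design choice of this example rather than documented client behavior.

```python
import json

INSTALL_RESPONSE = json.loads("""
{
  "zombie_id": "zmb_2041",
  "name": "platform-ops-agent",
  "webhook_urls": {
    "github": "https://api.usezombie.com/v1/webhooks/zmb_2041/github"
  }
}
""")


def webhook_targets(response):
    """Return sorted (source, url) pairs from an install response.

    Raises KeyError when the map is absent, for example when a caller
    still expects the retired scalar `webhook_url` shape.
    """
    if "webhook_urls" not in response:
        raise KeyError("expected 'webhook_urls' map in install response")
    return sorted(response["webhook_urls"].items())


for source, url in webhook_targets(INSTALL_RESPONSE):
    # Each URL ends in /{zombie_id}/{source}, per the Upgrading notes.
    assert url.endswith(f"/{INSTALL_RESPONSE['zombie_id']}/{source}")
    print(f"{source}: {url}")
```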

CLI

  • zombiectl install --from <path> prints a per-trigger URL block. Output switches from a single webhook_url: line to Webhook URLs (register on the upstream provider): followed by one <source>: <url> line per declared webhook trigger. The --json shape returns webhook_urls as a { <source>: <url> } map.
  • zombiectl auth status — new subcommand documented above.
May 17, 2026
Performance · Observability

Redis request path: concurrent by default, configurable, observable

Every short-lived Redis command (XADD, PUBLISH, XACK from HTTP handlers and per-step worker publishes) now flows through a connection pool instead of serializing behind a single client-wide mutex. Concurrent requests from worker threads and HTTP handlers each grab their own pooled connection, complete their round-trip without contending, and release. The prior ~40 ops/sec-per-connection ceiling that didn’t scale with CPU cores is gone.

What’s new

  • REDIS_POOL_MAX_IDLE (default 8) — maximum connections held in the idle pool per process. Operators raise it for sustained high-concurrency workloads to reduce overflow dials; the request-path completes in single-digit ms even over Upstash TLS, so > 16 is unusual.
  • REDIS_POOL_EAGER_MIN (default 2) — connections pre-warmed at boot. Covers cold-boot latency (Upstash TLS handshake is tens of ms per dial); raise if boot is followed by an immediate burst of short-lived commands.
  • REDIS_REQUEST_TIMEOUT_MS (default 5000) — request-path read timeout. A frozen Upstash proxy can no longer pin a worker thread indefinitely. Don’t raise above 5000 — Upstash regional p99 is single-digit-ms; >5s is failure, not slowness.
  • Long-lived blocking consumers stay on dedicated connections. The watcher’s XREADGROUP on zombie:control, per-agent workers’ XREADGROUP on zombie:{id}:events, and SSE subscribers each hold one connection for their lifetime — they don’t compete with the request-path pool, so a customer with 100 dashboard tabs open can’t exhaust the pool.
  • Prometheus metrics for pool state. New /metrics lines for active, idle, dials_total, overflow_dials_total, poisoned_connections_total, reconnects_total, forced_closes_total, acquire_timeouts_total — operator visibility into connection churn, dial pressure under bursts, and transport-layer flakiness.
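To make the roles of REDIS_POOL_MAX_IDLE and REDIS_POOL_EAGER_MIN concrete, here is a minimal idle-pool sketch. It is an illustration under assumptions, not the server's implementation: IdlePool and its dial callable are hypothetical, and counting every dial made while the idle list is empty as an "overflow dial" is this example's reading of the metric name.

```python
import threading


class IdlePool:
    """Minimal idle-connection pool: reuse idle connections, dial on overflow."""

    def __init__(self, dial, max_idle=8, eager_min=2):
        self._dial = dial            # callable that opens a new connection
        self._max_idle = max_idle
        self._lock = threading.Lock()
        # Pre-warm `eager_min` connections so the first requests skip the dial.
        self._idle = [dial() for _ in range(eager_min)]
        self.dials_total = eager_min
        self.overflow_dials_total = 0

    def acquire(self):
        with self._lock:
            if self._idle:
                return self._idle.pop()
            self.overflow_dials_total += 1
            self.dials_total += 1
        # Dial outside the lock so a slow handshake does not block other callers.
        return self._dial()

    def release(self, conn):
        with self._lock:
            if len(self._idle) < self._max_idle:
                self._idle.append(conn)
                return
        # Idle list is full: close the surplus connection instead of keeping it.
        conn.close()
```

Raising max_idle keeps more connections around after a burst, which trades memory and open sockets for fewer dials on the next burst.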
May 12, 2026
CLI · Testing

zombiectl runs end-to-end against the live API on every deploy

Every backend deploy now exercises zombiectl against api-dev.usezombie.com, and every release re-runs the same suite against api.usezombie.com (plus a daily run at 13:00 UTC). Parse, auth-guard, install, lifecycle, read sweep, and zombiectl login are all hit on real infrastructure — CLI regressions in the network path fail in CI before they reach you.

CLI

  • uuidv7 validation on every positional id. zombiectl kill|stop|resume|delete|logs <id>, workspace use|delete <id>, agent delete <id>, and grant delete <id> reject malformed ids before the network call with invalid <name>: expected uuidv7 format (e.g. 0192a3b4-c5d6-7e8f-9012-345678901234).
  • zombiectl agent add|list|delete default to the current workspace when --workspace <id> is omitted — matches the agent commands.
  • SIGINT (Ctrl-C) during zombiectl login exits 130 cleanly without leaving a partial credentials.json behind. Re-run zombiectl login from a fresh slate.
  • -v short alias for --version.
  • zombiectl help is a first-class command — used to fall through to the “did you mean?” suggester. Behaves identically to bare zombiectl / -h / --help.
  • credentials.json lands at mode 0600 after a successful login. Verify with ls -l ~/.config/zombiectl/credentials.json.
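The uuidv7 check on positional ids can be approximated with a regular expression. This is a sketch, not the CLI's validator: is_uuidv7 is a hypothetical name, and whether zombiectl also checks the variant nibble (as done here, following RFC 9562) is an assumption.

```python
import re

# Version nibble must be 7; the variant nibble must be 8, 9, a, or b (RFC 9562).
UUIDV7_RE = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-7[0-9a-f]{3}-[89ab][0-9a-f]{3}-[0-9a-f]{12}",
    re.IGNORECASE,
)


def is_uuidv7(value):
    """Return True when `value` has the textual shape of a UUIDv7."""
    return UUIDV7_RE.fullmatch(value) is not None
```

The example id from the error message above, 0192a3b4-c5d6-7e8f-9012-345678901234, passes this check; the same id with a 4 in the version position does not.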
May 12, 2026
UI · Bug fixes · Testing

Dashboard error voice, sign-in card lifted, install/save races fixed

Every “Failed to X” fallback in the dashboard is replaced with operator-first language keyed on backend error codes. The sign-in card no longer disappears into the page background. Two install/save races that left you on the wrong URL after a router.push are fixed.

What’s new

  • presentError({errorCode, message, action}) is the single entry point for dashboard error rendering. Curated UZ-XXX-NNN codes (registry: api-reference/error-codes) map to a title + body the operator can act on — UZ-AUTH-003 reads “Your session expired. Sign in again to keep going.” instead of “Not authenticated”; UZ-ZMB-009 reads “We couldn’t find that agent. It may have already been deleted — refresh the list.” instead of “Internal Server Error”. Eight codes ship today and the helper grows organically as the dashboard surfaces new ones. Useless server "Failed to …" messages are detected and replaced rather than concatenated.
  • Sign-in card lifted from --surface-1 to --surface-2 on the auth route, with --border-strong on the edge. At --surface-1 the luminance delta against the page background was 3 units — close to invisible. The card now reads as a card.

Bug fixes

  • /zombies/new install: InstallZombieForm.tsx no longer issues router.refresh() after router.push(/zombies/{id}). The force-dynamic detail route re-resolves on commit; the manual refresh was racing the URL commit and intermittently leaving you on /zombies/new with a stale form state. Same fix applied to ZombieConfig.tsx for the save-then-navigate path on the detail page.
  • tooltip test flake: restoring vitest’s default exclude patterns stops the runner from following the @usezombie/design-system workspace symlink and executing its tests without their test-setup.ts. The intermittent “Invalid Chai property: toBeInTheDocument” failure on bun run test is gone.

CLI

No zombiectl shape changes.
May 11, 2026
Internal · Testing · CI

Authenticated dashboard e2e ungated, runs on every dev + prod deploy

The Playwright authenticated suite covers the eight dashboard lifecycles M64_005 deferred behind test.fixme — every signed-in flow that touched a client-side useClientToken().getToken() call. Closing the gap means every KillSwitch, ZombieConfig, and provider-settings mutation now goes through a Next.js Server Action that mints the api-template JWT server-side. CI runs the suite against api-dev after every push to main and against api.usezombie.com after every production app deploy.
  • useClientToken retired — every dashboard mutation that previously called useClientToken().getToken() (six routes, eleven call sites) now invokes a per-route app/(dashboard)/<route>/actions.ts Server Action. Shared wrapper at lib/actions/with-token.ts exposes withToken<T> returning the discriminated ActionResult<T> ({ ok: true, data } or { ok: false, error, status? }). Token A → Token B handoff stays server-side; the browser never sees the api-template JWT.
  • Three test.fixme blocks gone: lifecycle.spec.ts, kill.spec.ts, signup.spec.ts exercise the stop / kill / signup flows end-to-end. Lifecycle and kill assert on the dashboard listing’s data-state after the Server Action completes (no more waitForResponse(... PATCH) since the PATCH is server-internal). Signup drives Clerk DEV’s verification screen with the documented testing One-Time Password (OTP) 424242.
  • Five new test files land: multi-zombie pins the 5-simultaneous pulse cap (six active agents → exactly five data-live="true" + one static glow + header "6 live · capped at 5"). multi-workspace switches workspaces via the header dropdown and asserts the cookie-write Server Action keeps the URL on /zombies. settings-billing asserts the tabular-nums balance headline + the disabled Purchase Credits trigger. events and logs-detail assert SSR + WakePulse render on /events and /zombies/[id] respectively (the EventDetail dialog is still unbuilt; tracked in test Discovery).
  • <RadioGroup> + <RadioGroupItem> ship in @usezombie/design-system — Radix-backed wrapper with the shared focus-ring + token map. ModeRadio.tsx and ProviderSelector.tsx consume it; the last raw <input type="radio"> in ui/packages/app/** is gone.
  • zombiectl coverage threshold + new validation tests: bunfig.toml enforces coverageThreshold = { line = 0.95, function = 0.95 }. New test/workspace-helpers.unit.test.js covers the previously-uncovered VALIDATION_ERROR branches in commandWorkspace for use and delete (the existing tests handed in alphanumeric input that passes isValidId and routed through UNKNOWN_WORKSPACE instead). Suite-wide coverage lifts to 95.76% function / 95.64% line.
  • auth-e2e-dev job in .github/workflows/deploy-dev.yml — runs after verify-dev against https://usezombie-app.vercel.app, injects CLERK_SECRET_KEY + CLERK_WEBHOOK_SECRET from the project’s dev secret vault, gates the notify step, uploads playwright-auth-report/ as auth-e2e-dev-<sha> artifact.
  • auth-e2e-prod job in .github/workflows/smoke-post-deploy.yml — fires on Vercel deployment_status: success for the usezombie-app PROD environment. Same Playwright suite, Clerk credentials supplied from the project’s prod secret vault, https://api.usezombie.com for the API base.
May 11, 2026
Breaking · API · UI · Internal

Nanos billing unit, posture-dispatched stage rates, BYOK term retired

Pricing splits into a two-rate gradient: per stage on platform default, per stage when you bring your own provider key; self-managed is 10× cheaper to scale. Event receipts are free in both postures. The starter grant stays at $5, now denominated in nanos (1 USD = 1_000_000_000 nanos) so sub-cent rates have nine decimal places of precision. “BYOK” is retired everywhere in user-facing surfaces; the canonical term is self-managed provider key.
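The unit change is plain integer arithmetic. The sketch below converts between cents, nanos, and dollars using the stated ratio of 1 USD = 1_000_000_000 nanos; the helper names are hypothetical, and the sample values are the balance figures that appear in the API examples of this changelog (471 cents and 4710000000 nanos).

```python
from decimal import Decimal

NANOS_PER_USD = 1_000_000_000            # ratio stated in the changelog
NANOS_PER_CENT = NANOS_PER_USD // 100    # 10_000_000


def cents_to_nanos(cents):
    """Convert an integer cent amount to integer nanos."""
    return cents * NANOS_PER_CENT


def nanos_to_usd(nanos):
    """Convert integer nanos to an exact Decimal dollar amount."""
    return Decimal(nanos) / Decimal(NANOS_PER_USD)


print(cents_to_nanos(471))            # 4710000000
print(nanos_to_usd(4_710_000_000))    # 4.71
```

Keeping balances as integers and converting to Decimal only for display avoids the rounding drift that floats would introduce at nine decimal places.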

Upgrading

  • Billing column rename: tenant_billing.balance_cents → tenant_billing.balance_nanos (BIGINT NOT NULL CHECK (balance_nanos >= 0)). Same row, same UPDATE shape, new unit. SDK + dashboard already read the new column.
  • PUT /v1/tenants/me/provider: mode: "byok" now returns 400. Pass mode: "self_managed" instead. There is no compat shim, no 410, and no legacy alias; this is a clean break, pre-v2.0.
  • zombiectl tenant provider set — flag value --mode byok removed. Use --mode self_managed.
  • Schema reseed required for local dev: make down && make up. There is no migration; the column rename is forward-only.
  • Internal constants renamed: STARTER_CREDIT_CENTS → STARTER_CREDIT_NANOS, STAGE_CENTS split into STAGE_PLATFORM_NANOS + STAGE_SELF_MANAGED_NANOS, EVENT_PLATFORM_CENTS collapsed into EVENT_NANOS = 0. The names are identical across Zig + website TS + app TS + zombiectl JS (cross-tier parity rule).

What’s new

  • Single canonical contact email: [email protected] resolves through a SUPPORT_EMAIL constant per repo (Zig, website TS, app TS, CLI JS) + this Mintlify snippet at /snippets/contact.mdx. [email protected] / [email protected] literals are gone from active source and copy.
  • Per-model token rates now in nanos: model-caps.json ships input_nanos_per_mtok / output_nanos_per_mtok (was _cents_per_mtok). Same response shape otherwise.
  • docs/architecture/billing_and_provider_keys.md rewritten shape-first: the architecture doc names the rate constants by identifier and points at three authoritative sources (tenant_billing.zig, snippets/rates.mdx, the model-caps.json endpoint) instead of pinning specific dollar amounts. Future rate ratchets no longer require a doc rewrite.

API reference

GET /v1/tenants/me/billing:
{
  "balance_nanos": 4710000000,
  "updated_at": 1778330263985,
  "is_exhausted": false,
  "exhausted_at": null
}
GET /v1/tenants/me/billing/charges: charge_type values are receive () and stage ( platform / self-managed). Row amounts are now in credit_deducted_nanos (was credit_deducted_cents).
PUT /v1/tenants/me/provider:
{ "mode": "self_managed", "credential_ref": "account-fireworks-key", "model": "accounts/fireworks/models/kimi-k2.6" }
Sending mode: "byok" returns 400 mode_not_recognized with the generic “mode must be one of: platform, self_managed” message — no special-case retired-mode branch.

Tests

Website vitest 129/129 · app vitest 357/357 · zombiectl bun test 567/567 · Zig 29/29 · integration 1508/0. New paired pin tests on each tier locking the rate constants and the SUPPORT_EMAIL literal.
May 9, 2026
Breaking · What's new · API · UI

Single-rate pricing — tier ladder retired

One number per surface: per event receipt, per stage, drawn from a starter credit on signup. You pick the model and pay your provider directly — zero markup on tokens. Hobby/Scale tiers are gone from the site, API, and dashboard.

Upgrading

  • GET /v1/tenants/me/billing: plan_tier and plan_sku removed. Other fields unchanged. Upgrade server + client together.
  • UZ-WORKSPACE-003 — message now Credit pool exhausted (was tier-flavoured). Code unchanged; only string-matchers need to update.
  • /pricing route 404s — content lives at /#pricing. Topbar and footer already updated.
  • PostHog pricing_hobby_start_free / pricing_scale_upgrade events removed. Pricing-page intent now signup_completed with source = pricing_install. Rebuild affected funnels.

What’s new

  • One billing flow on usezombie.com/#pricing — horizontal diagram: event cell () → N stage cells () → separate LLM stratum proving your model bill is not ours.
  • Operational extras turn on per workspace, never as a paywall — multi-workspace, approval gating, workspace credentials, higher concurrency, longer windows, priority support.
  • Marketing site is now a single page — pricing, features, FAQ all on /. Only /agents and external /docs route away.
  • Marketing headings use design-system primitives<DisplayXL>, <DisplayLG>, <SectionLabel> back every hero, section, and eyebrow.

API reference

GET /v1/tenants/me/billing:
{
  "balance_cents": 471,
  "updated_at": 1778330263985,
  "is_exhausted": false,
  "exhausted_at": null
}
GET /v1/tenants/me/billing/charges: unchanged. charge_type values are receive () and stage ().
UZ-WORKSPACE-003:
Credit pool exhausted
Tenant credit pool exhausted. Top up your balance to resume execution.
May 9, 2026
What's new · UI

Operational Restraint — one design system across every surface

Dashboard, marketing site, docs, and zombiectl now share one visual language: dark-first chrome, Commit Mono headings on Instrument Sans body, and a single cyan-mint wake-pulse used as currency. If something pulses, it’s alive.

What’s new

  • One token set across app.usezombie.com, usezombie.com, docs.usezombie.com, and the CLI — same surface, border, text, accent values. Visual handoff between surfaces is seamless.
  • Pulse cyan is currency, never decoration — only live signals: status dots, primary CTAs, focus rings, the brand mark. No decorative gradients or glows.
  • Light mode is first-class — full WCAG AA contrast (body 7:1, inline code 4.5:1, visible focus rings either way).
  • prefers-reduced-motion honoured — wake-pulse swaps to a static halo. Metaphor survives.
  • Docs site landed last — Commit Mono headings, Instrument Sans body, 68-char measure. Heritage orange is gone.

What’s next

Accessibility scores, performance budgets, and dashboard live-state instrumentation will land in a future entry.
May 8, 2026
What's new · CLI

zombiectl reads as part of the brand

CLI now renders against the same design system as the web surfaces. Same cyan-mint pulse, same status glyphs ( live, parked, warn, failed) across every command.

What’s new

  • One palette end-to-end — pulse cyan, evidence amber, success green, warn amber, error red, muted/subtle greys. 256-color, mirrors the web tokens.
  • zombiectl --version — single line: pulse dot, binary name, version. No more agent emoji or box border.
  • Pulse cyan is currency — only on the live-status glyph, the --version mark, and help section headings. Dividers and table headers use bold default text.
  • Quiet machines stay quiet: NO_COLOR=1, piped output, and --json emit zero ANSI escapes.
  • Old terminals fall back — <256-color terminals get a 16-color palette automatically (one stderr notice, then silent). TERM=dumb and non-TTY pipes go plain ASCII.
  • zombiectl --help fits 80 cols.

CLI

No new commands or flags. zombiectl install swaps 🎉 <name> is live. for ✓ <name> is live. via the shared glyph helpers (plain under NO_COLOR).
May 8, 2026
Breaking · CLI · Observability

Telemetry off by default; consistent error shape

Fresh installs send zero analytics events until you opt in. Every command also renders errors through one shared boundary now — same format and code/message pairing across login, doctor, steer, and the rest.

Upgrading

  • ZOMBIE_POSTHOG_ENABLED removed — replaced by DISABLE_TELEMETRY with inverted default:
    • Unset / DISABLE_TELEMETRY=1|true|on|yes → telemetry off.
    • DISABLE_TELEMETRY=0|false|off|no → opt-in.
    • Migration: ZOMBIE_POSTHOG_ENABLED=false users can drop the var. To keep sending events, set DISABLE_TELEMETRY=0.
  • CLI-only release — server untouched; upgrade either side independently.
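The inverted default is easy to get wrong in wrapper scripts, so here is a small sketch of the DISABLE_TELEMETRY rules listed above. telemetry_enabled is a hypothetical helper; case-insensitive matching and treating unrecognised values as "off" are assumptions, since the changelog does not say how the CLI handles them.

```python
_OPT_IN_VALUES = {"0", "false", "off", "no"}


def telemetry_enabled(env):
    """Return True only when DISABLE_TELEMETRY explicitly opts in.

    Unset, 1/true/on/yes, and (by assumption) any unrecognised value
    all leave telemetry off.
    """
    raw = env.get("DISABLE_TELEMETRY")
    if raw is None:
        return False
    return raw.strip().lower() in _OPT_IN_VALUES
```

Passing the environment as a mapping (for example os.environ) keeps the function easy to test without mutating process state.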

What’s new

  • Stable error code + friendly message: error: UZ-AUTH-003 Token expired — run `zombiectl login` to refresh. JSON mode mirrors: {"error": {"code": "UZ-AUTH-003", "message": "…", "status": 401, "request_id": "…"}}.
  • One renderer for every command — login, doctor, workspace, agent, grant, tenant, billing, every agent subcommand share one error/exit-code path. No per-command drift.

CLI

  • Per-HTTP-request observability — when opted in, each request emits a span (status, duration, attempts, retry reason). When off (the default), silent.
May 7, 2026
CLI · Schema · Cleanup

zombiectl login auto-selects the signup workspace

Fresh installs no longer need a follow-up zombiectl workspace add. After credentials persist, login fetches /v1/tenants/me/workspaces and writes the signup-provisioned default into local state. npm install -g @usezombie/zombiectl && zombiectl login is enough to reach zombiectl doctor green.
Failure-tolerant: an unreachable endpoint exits 0 with credentials saved; an empty items[] is a no-op (never overwrites local state).

Schema teardown

Pre-v2.0 cleanup of legacy structure with zero production reads:
  • workspace_integrations table (schema/012) — never shipped.
  • workspace_entitlements table (schema/004) — plan-tier scoring config superseded by credit-pool billing.
  • core.workspaces columns: repo_url, default_branch, paused*, version, monthly_token_budget, updated_at. From the 1:1 workspace-to-repo era; production INSERTs had already stopped writing them.
  • integration_grants.scopes — defaulted to ARRAY['*'], never read.
Doc cross-effect: billing_and_provider_keys.md (then named billing_and_byok.md) corrected to $5 / 500¢ starter (was $10 / 1000¢; doc-vs-implementation drift).

Tests

bun test 361/361. New cases in login.unit.test.js (happy path, empty items, preserved state, hydration failure) plus three integration cases for the fresh-state workflow.
May 7, 2026
CLI · Bug fixes

zombiectl first-install hardening

Three CLI bugs fixed at the customer’s first command.

Bug fixes

  • Default API URL is now https://api.usezombie.com (was http://localhost:3000). Fresh installs hit production; --api / ZOMBIE_API_URL still override.
  • Sticky --api per-install now works. zombiectl login --api https://api-dev.usezombie.com was writing to credentials.json but subsequent calls silently fell back to default. Root cause: parseGlobalArgs was eager-resolving DEFAULT_API_URL and short-circuiting the precedence chain. Override order, highest first: --api → ZOMBIE_API_URL → API_URL → credentials.json → default. Pinned by a 16-case integration matrix.
  • zombiectl agent add (non-JSON mode) no longer crashes. Was calling ui.bold which the theme doesn’t export; replaced with ui.warn.
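The override order for the API URL (the --api flag, then ZOMBIE_API_URL, then API_URL, then the value stored in credentials.json, then the default) amounts to a first-non-empty lookup with the default resolved last. The sketch below illustrates that chain; resolve_api_url is a hypothetical function and does not reproduce the CLI's parseGlobalArgs.

```python
DEFAULT_API_URL = "https://api.usezombie.com"


def resolve_api_url(flag=None, env=None, stored=None):
    """Pick the API URL by precedence, highest first."""
    env = env or {}
    candidates = (
        flag,                        # --api
        env.get("ZOMBIE_API_URL"),
        env.get("API_URL"),
        stored,                      # value persisted in credentials.json
    )
    for candidate in candidates:
        if candidate:
            return candidate
    # The default is consulted only after every override is exhausted;
    # resolving it eagerly is what short-circuited the chain in the bug.
    return DEFAULT_API_URL
```

For example, resolve_api_url(stored="https://api-dev.usezombie.com") returns the stored URL, which is the sticky behavior the fix restores.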

Tests

300 → 354 tests. New scaffolding spins up a Bun.serve loopback per integration test so the full request lifecycle is exercised end-to-end. Cross-cutting failure-mode tests use the actual Zig error codes (UZ-AUTH-003, UZ-AUTH-004, UZ-WORKSPACE-002, UZ-ZMB-006, UZ-EXEC-013, UZ-INTERNAL-001).

Deferred

Opt-in telemetry consent prompt at first login — deferred. The bundled PostHog write key is a placeholder (no real ingest), so a first-run prompt is friction without benefit. Implementation lives at commit fe748ee9 for cherry-pick when the key becomes valid.
May 6, 2026
Bug fixes · Security · Internal

Secrets no longer leak into the activity stream

Streaming agent replies and final agent replies were emitting raw secret bytes alongside scrubbed tool-call arguments. Both now apply the same placeholder substitution (${secrets.llm.api_key}, ${secrets.github.token}) before reaching the activity stream or pub/sub.
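As an illustration of the placeholder substitution described above, the sketch below replaces raw secret values in a complete string with ${secrets.<name>} placeholders. It is not the executor's code: redact is a hypothetical helper, and it only handles whole strings, whereas a streamed reply can split a secret across chunk boundaries and needs buffering that this example does not attempt.

```python
def redact(text, secrets):
    """Replace each raw secret value in `text` with its placeholder.

    `secrets` maps a dotted name such as "llm.api_key" to the raw value.
    """
    # Replace longer values first so a secret containing another is not split.
    for name, value in sorted(secrets.items(), key=lambda kv: -len(kv[1])):
        if value:
            text = text.replace(value, "${secrets." + name + "}")
    return text
```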

Tests

Regression harness asserts the bytes the executor emits against a deterministic LLM stub. Covers four invariants: tool-arg redaction, streaming-reply redaction, final-reply redaction, and pub/sub no-leak. Multi-secret coverage (LLM key + GitHub installation token in one execution).

Bug fixes

  • Closed an executor memory leak where the final-reply buffer was duplicated and never freed.
May 4, 2026
Breaking · API · CLI

Dormant API + stale CLI teardown; agent lifecycle FSM unified

Breaking — API removals

  • PATCH /v1/workspaces/{workspace_id} (workspace pause/unpause) — never called by first-party UI/CLI.
  • GET /v1/tenants/me/diagnostics — server-side tenant doctor block; zombiectl doctor runs local probes instead.
  • GET /v1/workspaces/{workspace_id}/zombies/{zombie_id}/telemetry — per-agent wrapper. Underlying store and tenant-scoped reader (/v1/tenants/me/billing/charges) unchanged.

Breaking — Agent lifecycle FSM

Every state transition now flows through PATCH /v1/workspaces/{ws}/zombies/{id} with {status: "active"|"stopped"|"killed"}. The FSM is encoded as SQL gates inside the UPDATE so parallel writes can’t bypass it:
  • active → stopped | killed
  • paused → stopped | active | killed (resume from auto-pause)
  • stopped → active | killed
  • killed → terminal (404 on further PATCH)
  • paused is platform-only (anomaly gate); operators can’t set it.
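The transition rules above fit in a small lookup table. The server encodes them as SQL gates inside the UPDATE; the sketch below is only a client-side mirror for reasoning about which PATCH bodies can succeed, and can_transition is a hypothetical name.

```python
# Allowed transitions, keyed by current status.
TRANSITIONS = {
    "active": {"stopped", "killed"},
    "paused": {"stopped", "active", "killed"},
    "stopped": {"active", "killed"},
    "killed": set(),   # terminal
}


def can_transition(current, target):
    """True when an operator may move an agent from `current` to `target`."""
    if target == "paused":
        return False   # platform-only; operators cannot set it
    return target in TRANSITIONS.get(current, set())
```

A local check like this can save a round-trip, but concurrent writers mean the server's answer is the one that counts.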
Plus:
  • DELETE /v1/workspaces/{ws}/zombies/{id} added — hard purge. Precondition status=killed; returns 204 / 409. Cascades across events, telemetry, sessions, approval gates, memory. Historical billing debits not reversed.
  • DELETE /v1/.../current-run removed. Replaced by PATCH {status: "stopped"}, which emits a zombie_status_changed control-stream signal.
  • Operator role required for every status transition (was only on the retired current-run). Pure config_json patches still permit workspace-member.

CLI

Removed (called endpoints that weren’t in the route manifest): zombiectl admin config add scoring_context_max_tokens, zombiectl workspace upgrade-scale, zombiectl workspace billing, the OPERATOR COMMANDS help section, and the ZOMBIE_OPERATOR=1 help-toggle.
New lifecycle subcommands:
zombiectl zombie stop    <zombie_id>   → PATCH {status: "stopped"}
zombiectl zombie resume  <zombie_id>   → PATCH {status: "active"}
zombiectl zombie kill    <zombie_id>   → PATCH {status: "killed"}
zombiectl zombie delete  <zombie_id>   → DELETE (precondition: killed)
Two-step delete is intentional — no --force flag.

Dashboard

KillSwitch panel is now state-aware: Stop / Resume / Kill render based on current status, panel disables once killed.

Internal

Admin platform-keys + tenant API-keys handlers gained playbook header references (load-bearing for admin bootstrap, not dormant). delete_zombie policy class corrected to critical in agent-manifest.json + skill.md.
May 4, 2026
Breaking · API · Pricing · Marketing

M51 follow-up — route teardown, starter-credit cut, marketing honesty

Breaking — API removals

  • POST /v1/execute — orphan handler from the M10 pipeline-v1 removal. Gone from binary, OpenAPI, public/llms.txt, public/skill.md, public/agent-manifest.json. External integrators (LangGraph, CrewAI, Composio) hardcoded on the execute_tool operationId now get HTTP 404. Replacement: per-agent webhook + agent-key flow. Pre-v2.0 carve-out — no graceful 410.
  • GET /internal/v1/telemetry — operator endpoint, never wired to an admin tool. Data collection continues; customer-facing telemetry unchanged.
  • Three execute-path-only error codes removed (UZ-CRED-004, UZ-PROXY-001, UZ-GATE-005).

Pricing

  • Starter credit halved to $5 (was $10). STARTER_GRANT_CENTS in src/state/tenant_billing.zig is the only constant that changed.
  • Marketing copy names the two debit points wherever pricing is described — hosted execution drains on event receipt and per-stage execution.

Marketing

  • Hero rewritten — “Operational knowledge isn’t executable. When a deploy fails, teams guess.” Tighter lead-ins on markdown-defined agents and the GitHub Actions → Slack flow.
  • FAQ context-window answer rewritten to match runtime reality: three signals (tool-result window, memory checkpoints, stage-chunk threshold), agent enforces via memory_store(category='conversation', ...), worker re-enqueues on a 10-stage continuation chain. See concepts/context-lifecycle.
  • Vendor name-drops (Fly, Upstash) genericized to “your infrastructure and run logs”.

Dashboard

  • Unauthenticated hits now redirect("/sign-in") (was notFound() 404). notFound() reserved for legitimate missing resources.
  • Install form shows zombiectl install --from (was the removed zombiectl up).

Design system

Marketing pricing cards + feature-flow rows now use the Card primitive via asChild. Visually identical, semantically aligned with the dashboard.
May 3, 2026
Feature · CLI · Integrations

One-command platform-ops install

/usezombie-install-platform-ops is a slash-command skill that installs the platform-ops agent on any repo. Runs in Claude Code, Amp, Codex CLI, and OpenCode: same skill, same screenshot.
Two-command bootstrap:
npm install -g @usezombie/zombiectl
npx skills add usezombie/usezombie

What’s new

  • Slash-command skill — twelve steps: zombiectl doctor preflight, repo detection, three operator inputs (Slack channel, prod branch glob, optional cron), credential resolution, GitHub webhook secret generation, install, in-flow HMAC-SHA256 self-test, smoke-test steer.
  • Bundled agent templates: npm install -g @usezombie/zombiectl copies canonical templates to ~/.config/usezombie/samples/. Package version = template version (no URL fetch, no cache).
  • TRIGGER.md frontmatter overrides take effect: x-usezombie.model and the four x-usezombie.context knobs (context_cap_tokens, tool_window, memory_checkpoint_every, stage_chunk_threshold) are now honoured by the worker. Previous releases parsed and dropped them.
  • tool_window: auto — string sentinel accepted alongside integer 0.
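The in-flow HMAC-SHA256 self-test mentioned in the skill's steps can be pictured with the signature scheme GitHub itself uses for webhooks: an X-Hub-Signature-256 header carrying sha256= followed by the hex HMAC of the raw body. The sketch below assumes that scheme; how the skill and the usezombie receiver actually perform the check is not described here, and sign / verify are hypothetical names.

```python
import hashlib
import hmac


def sign(secret, body):
    """Compute a GitHub-style signature header value for a raw request body."""
    digest = hmac.new(secret.encode(), body, hashlib.sha256).hexdigest()
    return "sha256=" + digest


def verify(secret, body, header_value):
    """Compare expected and received signatures in constant time."""
    return hmac.compare_digest(sign(secret, body), header_value)
```

A self-test in this style signs a sample payload with the generated webhook secret and confirms the receiving side accepts it, which catches a mismatched secret before the first real delivery.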
May 3, 2026
Feature · Breaking · API · CLI · UI

Bring your own key (BYOK) + credit-pool billing

Tenants can run events against their own LLM provider (“BYOK”) or the platform-managed default. Both modes share one gate, one metering path, and one credit pool — they differ in drain rate, not eligibility. Every new tenant gets a $10 starter grant; gate trips on the next event after exhaustion (no in-flight kill).

What changed

  • Provider posture is tenant-scoped. New core.tenant_providers row pins platform or byok per tenant. Legacy PUT|GET|DELETE /v1/workspaces/{ws}/credentials/llm removed (404, no 410, no compat shim — pre-v2.0 carve-out).
  • Two-debit metering. Each event yields up to two core.zombie_execution_telemetry rows: receive (committed at gate-pass) and stage (committed pre-execution, updated post-run with token counts).
  • Per-token rates. Public _um/<key>/model-caps.json now carries input_cents_per_mtok and output_cents_per_mtok per model. API server caches from core.model_caps at boot.
  • Starter grant on signup. tenant_billing.insert_starter_grant runs in the tenant-create transaction; once per tenant, never re-applied.

API

Tenant provider:
  • GET /v1/tenants/me/provider — resolved config; api_key never returned.
  • PUT /v1/tenants/me/provider — flip to BYOK with { "mode": "byok", "credential_ref": "<vault-name>", "model"?: "<override>" }. Tenant-admin only (403 otherwise).
  • DELETE /v1/tenants/me/provider — equivalent to PUT mode=platform. Surfaces a low-balance warning if applicable.
Tenant billing:
  • GET /v1/tenants/me/billing — balance snapshot (unchanged).
  • GET /v1/tenants/me/billing/charges?limit= — newest-first credit-pool rows, one per (event_id, charge_type). Backs the Billing Usage tab.
Removed: PUT|GET|DELETE /v1/workspaces/{ws}/credentials/llm (never wired to a runtime resolver). Use /v1/tenants/me/provider plus a workspace-vault credential.

CLI

  • zombiectl tenant provider {get|set|reset} — manage the tenant’s LLM posture. set --credential <name> [--model <override>]. reset warns if balance < 100¢.
  • zombiectl billing show [--limit N] [--json] — read-only balance + last N events (receive / stage / total cents). No purchase/topup subcommands; Stripe lands in v2.1.

Dashboard

  • Settings → LLM Provider (/settings/provider) — mode toggle + BYOK form. Credential dropdown sources from the active workspace vault.
  • Settings → Billing (/settings/billing) — read-only summary. Headline balance + disabled “Purchase Credits” (tooltip: “Coming in v2.1”). Usage tab grouped by event; Invoices and Payment Method tabs are v2.1 placeholders.

Upgrading

  • CLI: drop direct calls to /workspaces/{ws}/credentials/llm. Store the credential in the workspace vault, then zombiectl tenant provider set --credential <name>.
  • Dashboard: existing tenants stay on platform-managed by default. Switch via Settings → LLM Provider.
  • Custom integrations: model-caps.json is additive — old fields preserved, plus the new per-model rate fields.

Notes

  • Pricing visibility — per-model rates are in the public-but-unguessable model-caps.json. Trade-off accepted: cacheable, unauthenticated, low-latency tenant provider set resolution.
  • No plan tiers — “Free” is just “starter grant not yet exhausted.” Platform and BYOK share processEvent and compute_*_charge; they differ in drain rate, not eligibility.
May 1, 2026
Breaking · API

URL hygiene — verb routes become resource collections

Two URL families lose their verb-shaped URLs. Pre-v2.0 carve-out: retired URLs return 404, no 410 shim.

Upgrading

CLI and server upgrade together.
  1. Steering an agent: POST /v1/.../zombies/{zid}/steerPOST /v1/.../zombies/{zid}/messages. Body unchanged. CLI subcommand stays zombiectl zombie steer (verb on CLI, noun on wire).
  2. Memory: four verb endpoints collapse into one resource:
    • POST /v1/memory/store → POST /v1/.../zombies/{zid}/memories
    • GET /v1/memory/recall?... → GET /v1/.../zombies/{zid}/memories?query=...
    • GET /v1/memory/list?... → GET /v1/.../zombies/{zid}/memories (no ?query=)
    • POST /v1/memory/forget → DELETE /v1/.../zombies/{zid}/memories/{memory_key}
    DELETE is idempotent — missing key returns 204 (was {"deleted": true|false}). zombie_id moves from query string to path segment. memory_store / memory_recall agent-tool names unchanged.
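The four-into-one collapse can be sketched as a small mapping from the retired operation to the new request. This is an illustration only: new_memory_request is a hypothetical helper, and zombie_url is whatever full URL addresses the agent in your deployment (the changelog elides the middle of that path, so the sketch never builds it).

```python
from urllib.parse import quote, urlencode


def new_memory_request(zombie_url, op, *, query=None, memory_key=None):
    """Map a retired verb-style memory call to (method, url) on the new resource.

    zombie_url is supplied by the caller; this sketch only appends the
    /memories collection and its sub-paths.
    """
    memories = zombie_url.rstrip("/") + "/memories"
    if op == "store":
        return "POST", memories
    if op == "recall":
        return "GET", memories + "?" + urlencode({"query": query})
    if op == "list":
        return "GET", memories  # same collection, just without ?query=
    if op == "forget":
        # memory_key moves into the path, so it has to be percent-encoded.
        return "DELETE", memories + "/" + quote(memory_key, safe="")
    raise ValueError(f"unknown memory operation: {op}")
```

A caller migrating from the old forget endpoint should treat a 204 as "the key is gone" whether or not it existed, since the new DELETE no longer reports {"deleted": true|false}.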

What’s new

  • Stricter routing — dispatcher parses paths into segments once at the boundary; // and trailing slashes no longer match wrong handlers. Malformed paths return deterministic 404.
  • Single source of truth for v1 — version literal in one place.
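A minimal sketch of the segment-once idea, written in Python for illustration (the server itself is not Python, and split_path / match are hypothetical names). The path is split a single time; an empty segment from // or a trailing slash makes the whole path malformed, which a caller would answer with a 404.

```python
def split_path(path):
    """Split a request path into segments once; return None if malformed."""
    if not path.startswith("/"):
        return None
    if path == "/":
        return []
    segments = path[1:].split("/")
    # "//" and a trailing "/" both show up as an empty segment.
    if any(segment == "" for segment in segments):
        return None
    return segments


def match(segments, pattern):
    """Match segments against a pattern like ("v1", "things", "{id}").

    Returns the captured parameters, or None when the route does not match.
    """
    if segments is None or len(segments) != len(pattern):
        return None
    params = {}
    for segment, expected in zip(segments, pattern):
        if expected.startswith("{") and expected.endswith("}"):
            params[expected[1:-1]] = segment
        elif segment != expected:
            return None
    return params
```

Because matching works on whole segments rather than string prefixes, a malformed path can never fall through to a neighbouring handler.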
Apr 30, 2026
BreakingAPICLI

REST cleanup — /complete + /kill move to PATCH; config hot-reload lands

Two verb-suffix endpoints retire to PATCH on the resource. Agent config edits now hot-reload mid-loop: edit in the dashboard or via PATCH config_json and the worker swaps tools, network policy, and context budget on the next event boundary. Old config freed in the same step.

Upgrading

CLI commands (zombiectl kill, zombiectl login) are unchanged — they wrap these URLs internally. Direct API consumers:
  • POST /v1/.../zombies/{id}/kill → PATCH /v1/.../zombies/{id} with { "status": "killed" }.
  • POST /v1/auth/sessions/{id}/complete → PATCH /v1/auth/sessions/{id} with { "status": "complete", "token": "<user-jwt>" }. Response now matches the GET poll shape.
  • POST /v1/.../zombies/{id}/steer unchanged this release; the rename to POST /events lands in a future URL pass.
Retired URLs return 404. CLI and server upgrade independently — the CLI was already issuing the new shapes.

What’s new

  • Config hot-reload — tools list, network allowlist, secrets map, and tool_window / memory_checkpoint_every / stage_chunk_threshold all swap mid-loop. No worker restart, no memory leak on swap.
  • One PATCH for combined updates: { config_json, status } in one request is atomic; one SQL UPDATE + one control-stream signal per dirty surface.
  • Cleaner OpenAPI — three verb-suffix paths gone; Slack and GitHub OAuth callbacks moved to a vendor-immortal classification (pinned, but distinguished from internal cleanup debt).

API

  • PATCH /v1/.../zombies/{id} — partial body { config_json?, status? }. Both optional; empty body = 200 no-op. When status is set it must equal "killed". Response includes config_revision.
  • PATCH /v1/auth/sessions/{id} — body { status: "complete", token }. Bearer auth (depositor proves it can mint a user-jwt). Response: { status, token, request_id }.
  • Validation message for invalid status: status must be "killed" (UZ-VAL-001).
Retired (404, no 410 stub): POST /v1/.../zombies/{id}/kill, POST /v1/auth/sessions/{id}/complete.
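For direct API consumers, the partial PATCH body can be assembled along these lines. build_zombie_patch is a hypothetical client-side helper, not part of zombiectl; it only mirrors the rules listed above (both fields optional, status limited to "killed") so that a bad value fails before the request is sent.

```python
import json


def build_zombie_patch(config_json=None, status=None):
    """Build the JSON body for PATCH on an agent; both fields are optional."""
    if status is not None and status != "killed":
        # Same wording as the server-side validation message quoted above.
        raise ValueError('status must be "killed"')
    body = {}
    if config_json is not None:
        body["config_json"] = config_json
    if status is not None:
        body["status"] = status
    # An empty body is allowed; the changelog describes it as a 200 no-op.
    return json.dumps(body)
```

Sending config_json and status together keeps the update atomic, per the combined-update note above.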
Apr 29, 2026
BreakingAPICLI

Frontmatter cleanup — runtime config moves under x-usezombie:

TRIGGER.md no longer carries runtime keys at the top level. tools, credentials, network, budget, trigger all live under one x-usezombie: block. SKILL.md now requires name, description, version; install rejects bundles where SKILL.md and TRIGGER.md name: disagree.

Upgrading

Every agent bundle. Migration is mechanical:
  1. TRIGGER.md — add x-usezombie: at top level and indent the existing blocks under it. Keep top-level name:.
  2. SKILL.md — frontmatter needs name:, description:, version:. Match name: to TRIGGER.md.
  3. zombiectl install --from <dir> — re-run until field-level errors clear.
See Authoring skills for the canonical shape.
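The mechanical part of the migration can be sketched on already-parsed frontmatter. This assumes the frontmatter has been loaded into a plain mapping by whatever YAML parser you use; migrate_trigger_frontmatter and names_agree are hypothetical helpers, and the key list is taken from the paragraph above.

```python
RUNTIME_KEYS = ("tools", "credentials", "network", "budget", "trigger")


def migrate_trigger_frontmatter(frontmatter):
    """Return a copy with runtime keys nested under x-usezombie.

    Top-level name and vendor blocks (for example x-amp) stay where they are.
    """
    migrated = {k: v for k, v in frontmatter.items() if k not in RUNTIME_KEYS}
    runtime = dict(migrated.get("x-usezombie", {}))
    for key in RUNTIME_KEYS:
        if key in frontmatter:
            runtime[key] = frontmatter[key]
    migrated["x-usezombie"] = runtime
    return migrated


def names_agree(skill_frontmatter, trigger_frontmatter):
    """Pre-check for the cross-file rule: both files must carry the same name."""
    return skill_frontmatter.get("name") == trigger_frontmatter.get("name")
```

Running a check like names_agree before zombiectl install catches the mismatch locally; the server remains the authority on what is valid.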

What’s new

  • Disciplined parser — unknown subkeys under x-usezombie: fail loud (UnknownRuntimeKey). Top-level stays permissive — x-amp: and other vendor blocks pass through.
  • Cross-file identity: the name: key must match across both files; enforced at install.
  • Real YAML — bespoke converter replaced with kubkon/zig-yaml 0.2.0. Multi-line strings, escapes, standard scalar tags, arbitrary nesting all work.

API

Two new error codes from POST /v1/workspaces/{ws}/zombies:
  • UZ-ZMB-008 (MSG_ZOMBIE_INVALID_CONFIG) — now also fires for malformed SKILL.md frontmatter.
  • UZ-ZMB-011 (MSG_ZOMBIE_NAME_MISMATCH) — when SKILL.md and TRIGGER.md name: disagree.
Internal SQL path (for operators reading raw rows): config_json->'trigger'->... becomes config_json->'x-usezombie'->'trigger'->...
Apr 29, 2026
What's newAPIUI

Approval inbox — pending gates surface in the dashboard

Approvals used to flow only through Slack DMs. Now every pending gate surfaces in a workspace-wide /approvals list and on each agent’s detail page, with proposed action, blast-radius, evidence, and a timeout countdown rendered next to Approve and Deny buttons. Slack callbacks and dashboard clicks share one resolve core — whichever lands first wins, the other channel’s stale button no-ops with the original outcome and resolver attribution.

What’s new

  • /approvals page — workspace-wide list, oldest-first. Row shows agent, gate kind, proposed-action one-liner, blast radius, age, timeout countdown, inline Approve/Deny. Refreshes every 5s; empty state renders clean.
  • /approvals/{gate_id} detail page — full proposed-action prose, evidence as expandable JSON, context grid, Resolve panel with optional reason. Once resolved, flips to Resolved as <outcome> by <who> at <when>.
  • Per-agent Pending approvals panel — on each agent’s detail page, plus a destructive-variant badge in the header (N pending approval(s) or 50+).
  • Sidebar nav — new “Approvals” entry between Credentials and Events.
  • Auto-timeout sweeper — background thread scans core.zombie_approval_gates every 60s; transitions pending rows past timeout_at to timed_out (worker treats as denied for destructive ops). Default 24h.

API

  • GET /v1/workspaces/{ws}/approvals?status=&zombie_id=&gate_kind=&cursor=&limit= — paginated. Default status=pending, limit=50, max 200. Cursor encodes (requested_at, gate_id).
  • GET /v1/workspaces/{ws}/approvals/{gate_id} — single read; 404 on missing or cross-workspace.
  • POST /v1/workspaces/{ws}/approvals/{gate_id}:approve — body {reason?} (≤4096). 200 / 409 (UZ-APPROVAL-006 — original outcome + resolver returned).
  • POST /v1/workspaces/{ws}/approvals/{gate_id}:deny — same shape.
  • ApprovalGate shape gains gate_kind, proposed_action, evidence (JSONB), blast_radius, timeout_at, resolved_by.

Bug fixes

  • Slack/dashboard race fixed — both paths now go through UPDATE … WHERE status='pending'. Loser sees 409 with the original outcome, never a silent overwrite.
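The first-writer-wins behaviour of UPDATE … WHERE status='pending' can be demonstrated with an in-memory SQLite table. The table and column names here are simplified stand-ins, not the real core.zombie_approval_gates schema, and resolve_gate is a hypothetical function.

```python
import sqlite3


def resolve_gate(conn, gate_id, outcome, resolver):
    """Try to resolve a gate; return (won, outcome, resolved_by)."""
    cur = conn.execute(
        "UPDATE gates SET status = ?, resolved_by = ? "
        "WHERE gate_id = ? AND status = 'pending'",
        (outcome, resolver, gate_id),
    )
    # rowcount is 1 only for the caller whose UPDATE matched a pending row.
    won = cur.rowcount == 1
    conn.commit()
    status, resolved_by = conn.execute(
        "SELECT status, resolved_by FROM gates WHERE gate_id = ?", (gate_id,)
    ).fetchone()
    return won, status, resolved_by


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gates (gate_id TEXT PRIMARY KEY, status TEXT, resolved_by TEXT)"
)
conn.execute("INSERT INTO gates VALUES ('g1', 'pending', NULL)")

first = resolve_gate(conn, "g1", "approved", "dashboard:alice")
second = resolve_gate(conn, "g1", "denied", "slack:bob")
```

The second caller loses and gets back the original outcome and resolver, which is the information the 409 response carries.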
Apr 29, 2026
InternalPerformance

Streaming substrate hot-path cleanup

Worker → live-tail performance pass:
  • Activity-frame JSON encoding reuses a per-event scratch buffer — per-frame heap alloc gone (~43µs → ~2µs on chunk-heavy responses).
  • Executor transport parses each progress frame once (was twice, ~46% faster).
  • Workers open a dedicated Redis client for activity PUBLISH — no contention with stream commands on the queue client’s mutex.
  • Per-agent events index leads with (zombie_id, created_at DESC, event_id DESC) — covers the dashboard view + keyset pagination directly.
No user-visible change; steadier live-tail latency under concurrent dashboard tabs.
Apr 28, 2026
BreakingWhat's newAPICLIUI

Streaming substrate — every event has provenance; live activity tails the dashboard

Every event (steer, webhook, cron, chunked continuation, gate-resolved continuation) lands on one Redis stream with a normalized envelope and an actor field carrying provenance forward. Every event start/end is durably persisted in core.zombie_events with payload, response, tokens, wall time, failure label. Dashboard ships a live SSE activity panel with sub-200ms publish-to-receive.

Upgrading

  • POST /steer shape changed — now does a direct XADD and returns {event_id}. Legacy zombie:{id}:steer Redis key gone. Scripts reading the steer key directly: switch to the SSE stream or events history.
  • GET /v1/.../zombies/{id}/activity removed (per-agent + workspace-aggregate). Replace with GET /v1/.../zombies/{id}/events or GET /v1/.../events?zombie_id=. Response carries actor, status, response_text, tokens, wall_ms. zombiectl logs migrated automatically.
  • core.activity_events table dropped. Pre-v2.0 teardown — no migration. Switch to core.zombie_events; primary key is (zombie_id, event_id) for idempotent replay under XAUTOCLAIM.
  • Executor RPC bumped to v2. Worker + executor must upgrade together (HELLO handshake on connect; aborts on executor.rpc_version_mismatch). Roll executor first.

What’s new

  • One ingress, one durable record per event — each event produces one core.zombie_events row, one zombie_execution_telemetry row, one core.zombie_sessions mutation. Same event_id joins narrative, billing, session state. Replays idempotent via composite-key ON CONFLICT.
  • Continuation actors stay flat — chunked or gate-resolved continuations re-enter as actor=continuation:<original_actor>, never nested. actor LIKE '%steer:kishore' finds origin + every continuation; resumes_event_id walks back via recursive CTE.
  • gate_blocked events visible but unresolvable until the Approval Inbox ships. Row enters terminal state with failure_label populated + XACK. Admin-resume fallback dropped.
  • Dashboard live panel: <LiveEventsPanel /> above the history table. Native EventSource → same-origin Next Route Handler that mints an API-audience JWT server-side. Browser never holds the JWT; backend never sees a cookie. Exponential backoff capped at 15s, rolling 20-frame buffer.
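The flat continuation rule from the list above can be sketched in a few lines. continuation_actor and matches_origin are hypothetical names; matches_origin stands in for the SQL suffix match actor LIKE '%steer:kishore', assuming the origin string contains no LIKE wildcards.

```python
PREFIX = "continuation:"


def continuation_actor(actor):
    """Actor for a continuation event; stays flat however long the chain gets."""
    return actor if actor.startswith(PREFIX) else PREFIX + actor


def matches_origin(actor, origin):
    """Suffix match, the Python counterpart of actor LIKE '%' || origin."""
    return actor.endswith(origin)
```

Because the prefix is applied at most once, a single suffix query finds the original event and every continuation of it.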

API

  • POST /v1/workspaces/{ws}/zombies/{id}/steer — body {message} (≤8192). 202 with {status: "accepted", event_id}.
  • GET /v1/workspaces/{ws}/zombies/{id}/events?cursor=&actor=&since=&limit= — paginated. actor accepts globs (steer:*, webhook:*). since accepts Go durations (15s, 2h) or RFC 3339. Default 50, max 200. since and cursor mutually exclusive.
  • GET /v1/workspaces/{ws}/events?cursor=&actor=&zombie_id=&since=&limit= — workspace-aggregate; items carry zombie_id.
  • GET /v1/workspaces/{ws}/zombies/{id}/events/stream — SSE. Frame kinds: event_received, tool_call_started, tool_call_progress (~2s heartbeat), chunk, tool_call_completed, event_complete. Per-connection seq ids reset on SUBSCRIBE; Last-Event-ID ignored. Disconnect → backfill via GET /events?since=<last_seen> then reopen.
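A client-side sketch of the two since forms and the since/cursor rule. This is an assumption-laden illustration: parse_since and events_query are hypothetical helpers, the duration grammar covers only the common Go units, and the server's own parser may accept more or less than this.

```python
import re
from datetime import datetime, timedelta

_UNITS = {"ns": 1e-9, "us": 1e-6, "µs": 1e-6, "ms": 1e-3, "s": 1.0, "m": 60.0, "h": 3600.0}
_TERM = re.compile(r"(\d+(?:\.\d+)?)(ns|us|µs|ms|s|m|h)")


def parse_since(value):
    """Return a timedelta for a Go-style duration, else a datetime for RFC 3339."""
    pos, seconds = 0, 0.0
    for term in _TERM.finditer(value):
        if term.start() != pos:
            break  # a gap means this is not a pure duration string
        seconds += float(term.group(1)) * _UNITS[term.group(2)]
        pos = term.end()
    if pos > 0 and pos == len(value):
        return timedelta(seconds=seconds)
    # datetime.fromisoformat rejects a trailing "Z" before Python 3.11.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def events_query(*, since=None, cursor=None, limit=50):
    """Build query parameters; since and cursor cannot be combined."""
    if since is not None and cursor is not None:
        raise ValueError("since and cursor are mutually exclusive")
    params = {"limit": max(1, min(limit, 200))}  # default 50, max 200
    if since is not None:
        params["since"] = since
    if cursor is not None:
        params["cursor"] = cursor
    return params
```

After an SSE disconnect, a client would call the history endpoint with since set to the last frame it saw, then reopen the stream.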

CLI

  • zombiectl steer {id} "<message>" — batch mode. POSTs, opens SSE, prints [claw] <chunk> as chunks arrive, exits 0 on event_complete. Polls GET /events?since= for 60s if SSE drops. Interactive REPL deferred.
  • zombiectl events {id} — paginated history. --actor=, --since=, --json, --cursor=. Default 50/page.
  • zombiectl logs {id} — repointed at events endpoint; row format now actor + response_text summary.
Apr 27, 2026
Bug fixesAPICLISecurity

Install actually works — contract aligned, parser fixed, doctor tightened

Three bugs that made zombiectl install --from <path> unusable on a fresh workspace, all fixed in one pass.

Upgrading

  • Install POST shape changed: POST /v1/workspaces/{ws}/zombies now accepts {trigger_markdown, source_markdown}. Server is the single parser of TRIGGER.md frontmatter; name + config_json derived server-side. Pre-v1.0, no compat shim.
  • TRIGGER.md key skills: → tools: — sample already used tools:; parser now matches. Older specs need the rename; ERR_ZOMBIE_INVALID_CONFIG with hint when missing.
  • zombiectl install + doctor require zombiectl login — were previously exempt and produced opaque 401s. Now fail locally with AUTH_REQUIRED before any HTTP call.

What’s new

  • Doctor checks the three things that matter: server_reachable (GET /healthz, 5s timeout), workspace_selected, workspace_binding_valid. Old healthz/readyz/credentials checks folded in or dropped.
  • Doctor --json schema: {ok, api_url, checks: [{name, ok, detail}]}. Each failed check carries a one-line detail pointing at the next action.
  • Install response: {zombie_id, name, status}. CLI displays the server-derived name; copy/paste matches what the server stored.

API

  • POST /v1/workspaces/{ws}/zombies — body {trigger_markdown, source_markdown} (≤64KB each). 201 with {zombie_id, name, status}. 400 ERR_ZOMBIE_INVALID_CONFIG on frontmatter parse failure; 400 ERR_INVALID_REQUEST (MSG_ZOMBIE_TRIGGER_REQUIRED) on empty/oversized trigger.

CLI

  • zombiectl install --from <path> — sends the new shape; success line uses the server’s name.
  • zombiectl doctor — three checks, per-check 5s timeout, exit 0/1.
Apr 26, 2026
What's newAPICLI

Worker substrate — install an agent, see it work in seconds

Agents installed via POST /v1/workspaces/{ws}/zombies are claimed by a worker thread within ~1s of the 201. No worker restart needed. A new POST .../kill aborts in-flight agents cleanly; PATCH .../zombies/{id} hot-reloads config; SIGTERM triggers graceful drain.

What’s new

  • Atomic install — INSERT into core.zombies + XGROUP CREATE MKSTREAM + XADD zombie:control * type=zombie_created happen synchronously before the 201 returns. Webhooks arriving 1ms later find the consumer group ready.
  • Fleet-wide control plane — Redis stream zombie:control carries created / status_changed / config_changed / drain_request. One watcher thread per worker dispatches to spawn / cancel / reconfigure handlers.
  • Per-agent cancel flag — atomic flag at top of every loop iteration. POST /kill flips it; thread exits within ~100ms.
  • zombiectl kill <zombie_id> — now POSTs to /kill, requires explicit agent id (was a DELETE that defaulted to “kill all in workspace” — footgun gone).
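The per-agent cancel flag pattern, sketched with Python threads. This is only an analogy for the worker's behaviour: threading.Event stands in for the atomic flag, agent_loop is a hypothetical toy loop, and the sleep stands in for handling one event.

```python
import threading
import time


def agent_loop(cancel, handled):
    """Toy event loop: the cancel flag is checked before every iteration."""
    while not cancel.is_set():
        handled.append("event")  # stand-in for processing one event
        time.sleep(0.01)


cancel = threading.Event()
handled = []
worker = threading.Thread(target=agent_loop, args=(cancel, handled))
worker.start()

time.sleep(0.05)
cancel.set()  # what a kill request does to the flag in this sketch
worker.join(timeout=1.0)
```

Exit latency is bounded by how long one iteration takes, which is why the check sits at the top of the loop rather than around it.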

API

  • POST /v1/workspaces/{ws}/zombies/{id}/kill — 200 {zombie_id, status: "killed", queued_at}; 404 on missing/already-killed (idempotent).
  • PATCH /v1/workspaces/{ws}/zombies/{id} — body {config_json?}. 200 {zombie_id, config_revision} (revision = monotonic updated_at).
  • DELETE /v1/workspaces/{ws}/zombies/{id} — removed.
Apr 24, 2026
What's newIntegrations

platform-ops — flagship agent for GitHub Actions deploy failures

New sample at samples/platform-ops/. Wakes on a GitHub Actions workflow_run.conclusion=failure webhook, gathers evidence from the failed workflow logs, your hosting provider, and your data-plane, then posts an evidenced diagnosis to Slack. Reachable manually via zombiectl steer {id}. Read-only against GitHub, Fly, Upstash; only write path is the Slack post.

What’s new

  • Sample bundle: SKILL.md (diagnosis prompt + evidence flow), TRIGGER.md (webhook trigger, network allowlist for the four hosts, $1/day + $8/month caps), README.md (operator walkthrough including the GitHub webhook setup).
  • Four credential shapes: github, fly, upstash use {host, api_token}; slack uses {host, bot_token}. Add via zombiectl credential add <name> --host <host> --api-token <token> (or --bot-token).
  • Install: zombiectl install --from samples/platform-ops. Webhook URL printed at install time; paste into your GitHub repo’s webhook settings filtered to workflow_run.
  • Sandbox — bwrap + landlock + cgroups (Linux); network deny-by-default; only network.allow hosts reachable.
  • Provenance — events land with actor=webhook:github or actor=steer:<operator>.
Credential bytes are substituted into outbound HTTPS at the credential firewall, after the sandbox closes around the agent. They never reach the LLM context, logs, or database.
Apr 22, 2026
What's newBreakingUIAPICLI

Dashboard — full lifecycle in the browser

app.usezombie.com reaches its first “I can run my day from here” shape: overview tiles + recent activity, agents list with cursor pagination + search, install form, per-agent detail page (webhook URL, config, one-click kill). Workspace switcher in the header. Credit-exhaustion banner driven by is_exhausted / exhausted_at on GET /v1/tenants/me/billing.

Upgrading

  • Kill switch path renamed: POST /v1/.../zombies/{id}/stop → DELETE /v1/.../zombies/{id}/current-run. Same behavior, same shape, same 200/409/404 semantics. Old path returns 404 (pre-v1.0 alpha, no deprecation window). REST hygiene: current-run is a singleton sub-resource; DELETE is the idiomatic verb.

What’s new

  • Overview (/) — status tiles + tenant credit balance + live recent-activity feed. Server Components with independent Suspense boundaries.
  • Agents list (/zombies) — cursor pagination, in-view search across name/id/status.
  • Install form (/zombies/new) — design-system Form primitive (react-hook-form + zod). Toast on duplicate name.
  • Agent detail (/zombies/[id]) — webhook copy, trigger + firewall panels, rename/describe/delete-with-confirm, React-19 useOptimistic kill switch with 409 auto-recovery.
  • Workspace switcher: GET /v1/tenants/me/workspaces + Server Action writing active_workspace_id cookie. No session reissue.
  • Placeholder pages at /firewall, /credentials, /settings.
  • Credit-exhaustion banner + per-agent badge — automatic from is_exhausted.
  • Auth abstraction: @clerk/nextjs flows through lib/auth/{server,client}.ts. Switching auth provider is a two-file edit.
  • Same-origin /backend proxy — browser fetches go through /backend/:path* (Next rewrites to API_BACKEND_URL). No CORS surprises.

API

  • New GET /v1/tenants/me/workspaces → { items: [{id, name, created_at}], total }.
  • Changed GET /v1/workspaces/{ws}/zombies?cursor={ts}:{id}&limit=N — default 20, max 100. Response adds nullable cursor.
  • Renamed (breaking) DELETE /v1/workspaces/{ws}/zombies/{id}/current-run — transitions to stopped, returns {zombie_id, workspace_id, status: "stopped", request_id}. 409 UZ-ZMB-010 on already-stopped/killed; 404 UZ-ZMB-009 on cross-workspace. Operator role required.

CLI

  • zombiectl --help surfaces full lifecycle: install | up | status | kill | logs | credential.
  • zombiectl list [--workspace-id] [--cursor] [--limit] [--json] — mirrors the dashboard’s agents list (≤100 limit clamp).
  • zombiectl workspace show — mirrors /settings (workspace id, name, active status).
  • Active workspace is persistent: zombiectl workspace use <id> writes ~/.config/zombiectl/workspaces.json; subsequent commands default to it. Independent of the dashboard’s cookie.
  • zombiectl kill unchanged (full delete, not current-run kill).
Apr 22, 2026
What's newBreakingAPISecurityBilling

Admin-by-env-var removed; credit exhaustion observable

The API_KEY env-var bypass (which minted an admin with no tenant or audit identity) is gone. Admin auth now flows exclusively through Clerk sessions with publicMetadata.role=admin. Programmatic admin access: tenant-minted zmb_t_… key from POST /v1/api-keys. Tenant billing now surfaces credit exhaustion explicitly.

Upgrading

  • Drop API_KEY from your server env — silently ignored. Server refuses to start without OIDC (OIDC_JWKS_URL, OIDC_ISSUER, OIDC_AUDIENCE).
  • Promote your admin in Clerk — Dashboard → Users → Metadata → Public → {"role": "admin"}. See playbooks/012_usezombie_admin_bootstrap/001_playbook.md for the dev + prod walkthrough that ends with a zmb_t_… key in op://ZMB_CD_<env>/usezombie-admin/api_key.
  • If you read balance_cents == 0 — switch to is_exhausted / exhausted_at.

What’s new

  • BALANCE_EXHAUSTED_POLICY={continue|warn|stop} (default warn).
    • stop — pre-empts delivery, XACK so it doesn’t retry, emits balance_gate_blocked.
    • warn — logs + emits rate-limited balance_exhausted (1/workspace/24h).
    • continue — old behavior, made explicit.
  • First-exhausting debit atomically stamps balance_exhausted_at and emits a one-shot balance_exhausted_first_debit. Replays don’t double-emit.
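The three policy values can be read as a small decision function. This is one interpretation of the bullets above, not the worker's code: on_exhausted_balance is a hypothetical name, and it assumes warn and continue still deliver the event while only stop pre-empts it.

```python
DAY_SECONDS = 24 * 60 * 60


def on_exhausted_balance(policy, workspace_id, now, last_warned):
    """Return (deliver, events_to_emit) for an event arriving at zero balance.

    last_warned maps workspace_id to the time of the last balance_exhausted
    emission; it is updated in place to keep warnings to one per 24h.
    """
    if policy == "stop":
        # Delivery is pre-empted; the caller would also XACK so it is not retried.
        return False, ["balance_gate_blocked"]
    if policy == "warn":
        previous = last_warned.get(workspace_id)
        if previous is None or now - previous >= DAY_SECONDS:
            last_warned[workspace_id] = now
            return True, ["balance_exhausted"]
        return True, []  # rate-limited: deliver without a second warning
    if policy == "continue":
        return True, []
    raise ValueError(f"unknown BALANCE_EXHAUSTED_POLICY: {policy}")
```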

API

GET /v1/tenants/me/billing gains two fields:
  • is_exhausted (boolean) — true once balance hits zero on a worker debit.
  • exhausted_at (integer epoch ms or null) — non-null only when is_exhausted is true.
Apr 22, 2026
Bug fixesWhat's newObservability

Observability — per-agent tokens wired, OTLP histograms exported

Two observability paths that looked live but weren’t.

What’s new

  • zombie_agent_tokens_by_workspace_total carries both workspace_id and zombie_id labels; reports real data on every completed delivery. Useful for top-N spend dashboards at either granularity.
  • zombie_workspace_metrics_overflow_total exposed — saturation indicator for the 4096-slot (workspace_id, zombie_id) table; overflow falls back to _other aggregation.

Bug fixes

  • Per-workspace token counter was a no-op (helper existed, never called). Now fires from the same spot as zombie_tokens_total.
  • OTLP JSON exporter silently dropped _bucket/_sum/_count — histograms (zombie_execution_seconds, zombie_agent_duration_seconds, zombie_executor_agent_duration_seconds) never reached collectors. Exporter now emits OTLP histogram data points with cumulative-to-delta conversion, explicitBounds, aggregationTemporality: 2.
  • Removed zombie_gate_repair_loops_* counters — pipeline-era concept with no agent-era call site, always read zero, misled operators.
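A sketch of the bucket conversion, assuming "cumulative-to-delta" refers to turning Prometheus-style cumulative le buckets into the per-bucket counts that an OTLP histogram data point carries. That reading is an assumption, to_otlp_histogram is a hypothetical helper, and real OTLP JSON encodes 64-bit counts as strings, which this sketch skips for clarity.

```python
def to_otlp_histogram(cumulative_buckets, total_sum):
    """Convert cumulative (upper_bound, count) pairs into OTLP-style fields.

    cumulative_buckets must be sorted by upper bound and end with +Inf.
    """
    bounds = [le for le, _ in cumulative_buckets if le != float("inf")]
    counts, previous = [], 0
    for _, cumulative in cumulative_buckets:
        counts.append(cumulative - previous)  # observations in this bucket only
        previous = cumulative
    return {
        "count": previous,           # the +Inf bucket holds the total count
        "sum": total_sum,
        "explicitBounds": bounds,
        "bucketCounts": counts,      # always one more entry than explicitBounds
    }
```

For example, cumulative counts of 2, 5, 6 across the bounds 0.1, 1.0, +Inf become per-bucket counts of 2, 3, 1.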
Apr 22, 2026
What's newUICLIIntegrations

Docs follow-up — rewritten for the v2 MVP

docs.usezombie.com rewritten end-to-end against the current product. Quickstart walks a fresh operator from Clerk sign-up to a live agent firing webhook events in under ten minutes. Stale pre-Clerk vocabulary cleared from every page outside the historical changelog.

What’s new

  • New quickstart — sign up → dashboard → create agent → copy webhook → curl trigger → verify credit debit. One page, end-to-end.
  • New CLI reference at /cli/zombiectl.
  • Self-hosting section under /operator (removed in M51 prep when self-host was deferred to v3; see usezombie/usezombie:docs/architecture/ for the canonical reference).
  • Concepts page — four nouns (tenant, workspace, agent, skill) + tenant-scoped credit model.
  • Billing pages — rewritten around single-wallet, multi-workspace.
Apr 21, 2026
What's newAPIBilling

Tenant-scoped billing

Billing moves from workspace to tenant. Every signup gets one billing.tenant_billing row (plan_tier=free, plan_sku=free_default, 1000¢ balance). All workspaces under a tenant share that balance — no more per-workspace credit grants on workspace creation. Workspace-scoped billing endpoints removed.

Removed

  • POST /v1/workspaces/{ws}/billing/events
  • POST /v1/workspaces/{ws}/billing/scale
  • GET /v1/workspaces/{ws}/billing/summary
  • GET /v1/workspaces/{ws}/zombies/{id}/billing/summary
  • POST /v1/workspaces/{ws}/scoring/config

What’s new

  • One tenant, one billing row: billing.tenant_billing(plan_tier, plan_sku, balance_cents, grant_source, updated_at) with tenant_id as PK.
  • Atomic worker debit — conditional UPDATE … WHERE balance_cents >= $cents RETURNING. Exhausted balance returns UZ-BILLING-005 CreditExhausted (no partial debits).
  • Schema slots resequenced to contiguous 001..018 (tidy pre-v2.0 baseline).
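The conditional-UPDATE debit can be tried against an in-memory SQLite table. This is a simplified stand-in: the table has only two columns, CreditExhausted is a local exception standing in for UZ-BILLING-005, and the sketch checks rowcount instead of using RETURNING so it runs on older SQLite builds.

```python
import sqlite3


class CreditExhausted(Exception):
    """Local stand-in for the UZ-BILLING-005 error."""


def debit(conn, tenant_id, cents):
    """Debit cents atomically, or raise without touching the balance."""
    cur = conn.execute(
        "UPDATE tenant_billing SET balance_cents = balance_cents - ? "
        "WHERE tenant_id = ? AND balance_cents >= ?",
        (cents, tenant_id, cents),
    )
    conn.commit()
    if cur.rowcount == 0:
        # The guard in WHERE failed, so no partial debit happened.
        raise CreditExhausted(tenant_id)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tenant_billing (tenant_id TEXT PRIMARY KEY, balance_cents INTEGER)"
)
conn.execute("INSERT INTO tenant_billing VALUES ('t1', 1000)")

debit(conn, "t1", 400)  # balance is now 600
```

Putting the balance check inside the UPDATE means two concurrent debits cannot both pass a stale read of the balance.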

API

GET /v1/tenants/me/billing — caller’s tenant snapshot:
{
  "plan_tier": "free",
  "plan_sku": "free_default",
  "balance_cents": 1000,
  "updated_at": 1713700000000
}
Auth: Bearer Clerk JWT (operator or admin). 401 UZ-AUTH-001 without a valid token.
Apr 21, 2026
What's newAPISecurityIntegrationsObservability

Clerk-powered signup

Users sign up through Clerk and get auto-provisioned. A Clerk user.created webhook to POST /v1/webhooks/clerk atomically creates tenant + user (bound to Clerk OIDC subject) + owner membership + default workspace (Heroku-style name) + 0-cent credit state. Idempotent on replay.

What’s new

  • Signup webhook — Svix signature verified inline against CLERK_WEBHOOK_SECRET; stale timestamps (>5 min drift) rejected.
  • Heroku-style names — 1,024,000-combo namespace (32 adjectives × 32 nouns × 1000 suffixes); per-tenant uniqueness via partial index.
  • Identity model — new core.users (indexed by Clerk OIDC subject) + core.memberships (user→tenant with role). Ready for team accounts later.

API

POST /v1/webhooks/clerk — body is a Clerk user.created envelope; headers svix-id, svix-timestamp, svix-signature required. Responses:
  • 200 {workspace_id, workspace_name, created}
  • 400 UZ-REQ-001 (malformed / missing email)
  • 401 UZ-WH-010 (bad sig) / UZ-WH-011 (stale ts)
  • 413 UZ-REQ-002 (body > 2 MB)
  • 500 UZ-INTERNAL-*
Non-user.created events are 200-ignored so Clerk stops retrying.
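For readers testing their own Clerk webhook receiver, the check can be sketched with the standard library. This follows the Svix signing scheme as publicly documented (HMAC-SHA256 over id.timestamp.body, keyed by the base64 part of a whsec_ secret); verify_svix is a hypothetical function and not the server's implementation, and the 300-second tolerance mirrors the 5-minute drift rule above.

```python
import base64
import hashlib
import hmac


def verify_svix(secret, svix_id, svix_timestamp, svix_signature, body, now, tolerance=300):
    """Return True if the timestamp is fresh and any v1 signature matches."""
    try:
        timestamp = int(svix_timestamp)
    except ValueError:
        return False
    if abs(now - timestamp) > tolerance:
        return False  # stale timestamp
    key = base64.b64decode(secret.removeprefix("whsec_"))
    signed = f"{svix_id}.{svix_timestamp}.".encode() + body
    expected = base64.b64encode(hmac.new(key, signed, hashlib.sha256).digest())
    # The header may carry several space-separated signatures during rotation.
    for candidate in svix_signature.split(" "):
        version, _, signature = candidate.partition(",")
        if version == "v1" and hmac.compare_digest(signature.encode(), expected):
            return True
    return False
```

The body must be the raw request bytes; re-serialising parsed JSON before verifying will change the digest.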

Observability

  • Three Prometheus counters: zombie_signup_bootstrapped_total, zombie_signup_replayed_total, zombie_signup_failed_total (with reason label).
  • PostHog event signup_bootstrapped (distinct_id = oidc_subject); email domain only, never full email.
  • Log scopes: clerk.bad_sig, clerk.stale_ts, clerk.bad_request.
Apr 19, 2026
ImprovementsUIPerformance

Unified design system across the dashboard and marketing site

Buttons, cards, dialogs, inputs, and other UI primitives now come from a single @usezombie/design-system package. The dashboard and marketing site share one source of truth — tweak a variant once, both surfaces update.

The new /agents page adds an interactive hero and animated terminal. Landing JS is under 90 kB gzipped, with a size-limit CI gate guarding bundle size. PostHog loads on idle so first paint is no longer blocked.
Apr 19, 2026
ImprovementsCLIAPI

One credential surface for agents

Workspace credentials now flow through a single path: zombiectl credential add writes to the workspace vault, and that’s what every agent reads at runtime. No parallel surfaces, no guessing which command owns a given secret.
Apr 19, 2026
ImprovementsCLI

Agent lifecycle is the unified product model

The CLI and API now speak one language: zombiectl install → up → status → logs → kill. zombiectl --help is shorter, the API surface is tighter, and the docs, product, and code all describe the same thing. See Agents.
Apr 18, 2026
Docs

Docs reshaped around the agent lifecycle

A new Agents section walks through installing a template, adding credentials, running, observing, and killing an agent. Pages describing the legacy v1 pipeline have been retired; the old /specs/* and /runs/* URLs now 404.

New pages: overview, install, running, credentials, webhooks, skills, templates.
Apr 18, 2026
NewAPISecurity

Tenant API keys

Tenant admins can now mint named, rotatable API keys via POST /v1/api-keys — scoped to the tenant, revocable, and audited. Raw keys (zmb_t_…) are shown once on creation; only the hash is stored. The legacy API_KEY env var still works as a bootstrap fallback.

Workspace-scoped external agent keys were renamed to agent keys: /v1/workspaces/{ws}/external-agents → /v1/workspaces/{ws}/agent-keys.
Apr 18, 2026
NewSecurityIntegrations

Unified webhook authentication — seven first-class providers

Every per-agent webhook flows through one fail-closed middleware that handles URL-embedded secrets, Bearer tokens, HMAC signatures, and Svix multi-signature rotation with constant-time comparisons.

Seven providers ship first-class: agentmail, Grafana, Slack, GitHub, Linear, Jira, and Clerk (via Svix). Onboarding takes one field in TRIGGER.md; secrets are workspace-vaulted and rotate without an agent redeploy. See Webhooks.
Apr 16, 2026
NewAPIUI

Operator dashboard foundation

Workspace-wide activity feed, operator kill switch for runaway agents, and per-agent billing summary that mirrors the workspace view. Billing numbers now come from real execution telemetry (previously zeroed since v0.10).

Ships with six accessible React primitives — StatusCard, EmptyState, Pagination, DataTable, ConfirmDialog, ActivityFeed — and Tailwind v4 semantic design tokens.
Apr 16, 2026
ImprovementsAPI

Consistent pagination and full OpenAPI coverage

Every list endpoint returns the same { items, total, cursor? } envelope so SDK generators can emit a single Paginated<T> type. Memory reads moved to GET, and openapi.json now documents every route the server exposes — 26 previously undocumented operations are authored in.
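With one envelope everywhere, a single helper can walk any list endpoint. A minimal sketch: iter_items and fetch_page are hypothetical names, fetch_page is whatever function performs the HTTP call in your client, and treating an absent or null cursor as the last page is an assumption.

```python
from typing import Callable, Iterator, Optional


def iter_items(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item across pages of the { items, total, cursor? } envelope.

    fetch_page receives the previous cursor (None for the first page)
    and returns one envelope as a dict.
    """
    cursor = None
    while True:
        page = fetch_page(cursor)
        yield from page["items"]
        cursor = page.get("cursor")
        if not cursor:  # absent or null cursor: assumed to mean last page
            return
```

The cursor is passed back untouched; the helper never inspects its contents.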
Apr 16, 2026
ImprovementsAPISecurity

Workspace-scoped REST paths

Identity — workspace, agent, grant — is now always in the URL path (/v1/workspaces/{ws}/zombies/{id}), and query parameters are reserved for pagination and search. Every handler authorizes workspace membership after authentication; cross-workspace lookups return 404, so the API does not leak the existence of resources you cannot see.
Apr 15, 2026
NewAPI

Live agent steering

Redirect a running agent mid-execution without killing it. POST /v1/workspaces/{ws}/zombies/{id}/steer injects a message into the agent’s event stream — delivered mid-execution if the agent is running, queued otherwise (300-second TTL).
Apr 12, 2026
NewMemoryAPI

Persistent agent memory

Agents remember facts across executions. Memory is row-scoped per agent and persists in Postgres — a lead-collector agent doesn’t re-research the same lead, a support agent doesn’t re-ask customers their plan. Tools: memory_store, memory_recall, memory_list, memory_forget.
Apr 12, 2026
NewIntegrationsAPI

Integration grants + credentialed proxy

Agents — internal or external (LangGraph, CrewAI) — call external services through usezombie’s credentialed proxy. Credentials never leave the platform: injected server-side, stripped from response echoes, and logged to the activity stream.

An agent requests a grant, humans approve once via Slack/Discord/dashboard, and the grant is reusable until revoked. Launch providers: Slack, Gmail/AgentMail, Discord, Grafana. New CLI: zombiectl agent create|list|delete, zombiectl grant list|revoke.
Apr 12, 2026
NewObservabilityAPI

Agent execution telemetry

Every event delivery records token_count, time_to_first_token_ms, wall_seconds, and credit_deducted_cents, queryable per-agent via GET /v1/workspaces/{ws}/zombies/{id}/telemetry. Each delivery also emits an OpenTelemetry zombie.delivery span that lines up correctly in Grafana Tempo.
Apr 12, 2026
NewIntegrations

Slack plugin

Connect Slack via “Add to Slack” OAuth or zombiectl credential add slack. Bot tokens live in the vault; events and interactions are HMAC-verified with constant-time comparison. Any agent with a slack_event trigger fires automatically on matching messages.
Apr 12, 2026
NewObservability

Agent observability

Every trigger and delivery shows up in Grafana and PostHog. Prometheus exposes zombies_triggered_total, zombies_completed_total, zombies_failed_total, zombie_tokens_total, and a zombie_execution_seconds histogram; PostHog fires zombie_triggered and zombie_completed with tokens, wall-time, and exit status.
Apr 12, 2026
NewBilling

Agent credit metering

Free-plan agents deduct from consumed_credit_cents after each successful delivery at 1 cent per agent-second; Scale is unlimited and short-circuits without a DB write. Crash replay is idempotent on event_id, and a DB hiccup never drops or double-charges an event.
April 11, 2026
New releasesAgentsAPI

Agent directory format, AI Firewall, error standardization, pipeline v1 removal

Agent directory format

Agents are now two-file directories (SKILL.md + TRIGGER.md). SKILL.md follows the ClaHub registry format — same file uploads to the CLI and publishes to the skill registry. TRIGGER.md carries deployment config (trigger, chain, budget, network, credentials). zombiectl install scaffolds both; zombiectl up sends them raw.

Dynamic skills

Skills are config-driven. The NullCraw executor reads SKILL.md and uses built-in tools (shell, http, file_read) to call external APIs. Adding a new skill = new directory; no server rebuild.

AI Firewall — 4-layer outbound inspection

  • Domain allowlist — only network.allow domains reachable.
  • Endpoint policy — per-endpoint rules in firewall: (e.g., allow GET, deny POST).
  • Prompt-injection detection — outbound bodies scanned for instruction override / role hijacking / jailbreaks.
  • Content scanning — response bodies scanned for credential and PII leakage.
All decisions logged as activity events. Fails closed.

API error format (RFC 7807)

All errors now use application/problem+json with UZ- prefixed codes. Every code has a stable HTTP status — callers no longer parse status codes independently.

Pipeline v1 removed

All /v1/runs/* and /v1/specs return 410 Gone with ERR_PIPELINE_V1_REMOVED. Use agent-native SSE stream + chat-inject instead.

Webhook auth — URL-embedded secret

Preferred: POST /v1/webhooks/{zombie_id}/{secret}. Bearer token still supported as fallback.
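A sender could choose between the two auth styles roughly as follows. The sketch only builds the request and does not send it; the base URL, ids, and payload are placeholders, and the helper name is hypothetical.

```python
import json
import urllib.request
from urllib.parse import quote


def build_webhook_request(base_url, zombie_id, payload, secret=None, bearer_token=None):
    """Build a POST to the webhook, preferring the URL-embedded secret."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    path = f"/v1/webhooks/{quote(zombie_id, safe='')}"
    if secret is not None:
        # Preferred form: the secret is the last path segment.
        path += f"/{quote(secret, safe='')}"
    elif bearer_token is not None:
        # Fallback form: Bearer token in the Authorization header.
        headers["Authorization"] = f"Bearer {bearer_token}"
    else:
        raise ValueError("either secret or bearer_token is required")
    return urllib.request.Request(
        base_url.rstrip("/") + path, data=body, headers=headers, method="POST"
    )


req = build_webhook_request(
    "https://api.example.test", "z_123", {"hello": "world"}, secret="s3cret"
)
```

Because the secret travels in the URL, it is worth treating the full webhook URL as a credential, for example by keeping it out of logs.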

Internal

All handler boilerplate (arena, request id, Bearer auth) moves to a shared hx.zig wrapper. Handlers contain only business logic.
April 11, 2026
New releases · Agents

Lead Agent — v2 core ships

usezombie is now a runtime for always-on agents. Two commands, running agent:
zombiectl install lead-collector
zombiectl up

What’s new

  • Agent config format — YAML frontmatter (trigger, skills, credentials, budget) + markdown body. CLI compiles YAML → JSON before upload; server sees JSON only. Voice-transcribed instructions supported as the body.
  • Webhook ingestion — every agent gets POST /v1/webhooks/{zombie_id}. Routing by primary key (no name collisions). Bearer auth per agent. Idempotent via Redis SET NX (24h TTL). Returns 202 / 200.
  • Activity stream — append-only core.activity_events (UPDATE/DELETE blocked by trigger). zombiectl logs streams it; cursor-paginated replay.
  • Credential injection — vault → sandbox at runtime. No credentials in config files. zombiectl credential add to register.
  • Session checkpoint — conversation context upserted to Postgres after each event. Resume from last checkpoint after crash.
  • CLI — zombiectl install | up | status | kill | logs | credential add | credential list.
  • Schema additions — core.zombies (JSONB config), core.zombie_sessions (checkpoint), core.activity_events. Applied automatically by zombied migrate.
  • API — 16 v1 endpoints removed from OpenAPI; POST /v1/webhooks/{zombie_id} added.
  • Version tooling — make sync-version / make check-version prevent drift across build.zig.zon and zombiectl/package.json.
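The webhook idempotency item in the list above relies on Redis SET NX with a 24-hour TTL. The sketch below imitates that behaviour with an in-memory dictionary so it runs without a Redis server; the key layout and the "accepted" / "duplicate" labels are assumptions, not the server's actual responses.

```python
import time

DEDUPE_TTL_SECONDS = 24 * 60 * 60  # 24h TTL, as stated in the changelog


class DedupeStore:
    """In-memory stand-in for Redis `SET key value NX EX ttl`."""

    def __init__(self, clock=time.monotonic):
        self._expiry = {}
        self._clock = clock

    def set_nx(self, key, ttl_seconds):
        """Set the key only if it is absent or expired; return True if set."""
        now = self._clock()
        expires_at = self._expiry.get(key)
        if expires_at is not None and expires_at > now:
            return False
        self._expiry[key] = now + ttl_seconds
        return True


def accept_webhook(store, zombie_id, event_id):
    key = f"webhook:{zombie_id}:{event_id}"  # hypothetical key layout
    return "accepted" if store.set_nx(key, DEDUPE_TTL_SECONDS) else "duplicate"


store = DedupeStore()
first = accept_webhook(store, "z_123", "evt-1")
second = accept_webhook(store, "z_123", "evt-1")
```

The single set-if-absent call is what makes the check race-free in Redis: two concurrent deliveries of the same event cannot both win.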

Bug fixes

  • YAML parser was silently dropping array items in CLI config upload.
  • UTF-8 truncation was splitting multi-byte characters in session context.
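For readers unfamiliar with the second bug class: cutting a UTF-8 byte string at an arbitrary offset can leave half of a multi-byte character at the end. The sketch below shows one common way to truncate safely in Python; it illustrates the general technique and is not the fix that shipped in the server.

```python
def truncate_utf8(text, max_bytes):
    """Truncate to at most max_bytes of UTF-8 without splitting a character."""
    data = text.encode("utf-8")
    if len(data) <= max_bytes:
        return text
    # The input was valid UTF-8, so only the trailing bytes can be an
    # incomplete sequence; errors="ignore" drops exactly that partial tail.
    return data[:max_bytes].decode("utf-8", errors="ignore")


short = truncate_utf8("héllo", 2)  # "é" needs two bytes, so it is dropped whole
```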
April 6, 2026
New releases · Improvements

Steer running agents mid-run

Interrupt a running agent without aborting it. zombiectl runs interrupt <run_id> <message> or POST /v1/runs/{id}:interrupt. Picked up at the next gate checkpoint. Two modes: queued (next checkpoint) and instant (IPC delivery).

What’s new

  • Live run streaming (CLI) — zombiectl run --spec <file> --watch streams gate results in real time. Last-Event-ID reconnect replays only missed events. Ctrl+C clean exit.
  • Run replay (CLI) — zombiectl runs replay <run_id> prints a per-gate narrative for completed runs (exit codes, stdout/stderr, wall time).
  • Workspace billing breakdown — zombiectl workspace billing --workspace-id <id> shows completed / non-billable / score-gated runs. --period, --json. Backed by GET /v1/workspaces/{id}/billing/summary.
  • Run observability — full trace tree in Grafana Tempo ({run.id="<id>"} waterfall). Per-workspace Prometheus metrics: tokens, run outcomes, gate-repair loop distribution.
  • Resource efficiency scoring v2 — runs now scored on actual memory + CPU usage; agents staying within limits score higher.

Breaking

SSE id: on live events changed from sequential counter to created_at Unix milliseconds. Clients parsing Last-Event-ID as a sequence must update.
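A client that used to treat the id as a counter could switch to something like the sketch below, which reads the id as Unix milliseconds. The helper names are hypothetical, and the strict greater-than comparison for "missed" events is an assumption; two events created in the same millisecond would share an id, so check how the server handles ties before relying on it.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def parse_event_id(event_id):
    """Interpret an SSE id as created_at in Unix milliseconds (UTC)."""
    return EPOCH + timedelta(milliseconds=int(event_id))


def is_missed(last_event_id, event_id):
    """True if event_id is strictly newer than the last id the client saw."""
    return int(event_id) > int(last_event_id)


created_at = parse_event_id("1500")  # 1.5 seconds after the epoch
```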
March 30, 2026
New releases

Live run streaming (API)

The SSE stream endpoint is live: GET /v1/runs/{id}:stream emits gate results in real time as the agent works. CLI support (--watch) is coming in a future release.
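For clients consuming the stream directly, a minimal parser for the SSE wire format might look like this. It handles only the id, event, and data fields of the standard text/event-stream format; the "gate_result" event name and the JSON payload in the sample are invented for illustration and are not taken from the API.

```python
def parse_sse(lines):
    """Yield one dict per event from an iterable of text/event-stream lines."""
    last_id = None
    event_type = "message"
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the event; dispatch only if data was seen.
            if data:
                yield {"id": last_id, "event": event_type, "data": "\n".join(data)}
            event_type = "message"
            data = []
            continue
        if line.startswith(":"):
            continue  # comment / keep-alive line
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is not part of the value
        if field == "data":
            data.append(value)
        elif field == "id":
            last_id = value
        elif field == "event":
            event_type = value


sample_stream = [
    ": keep-alive\n",
    "id: 1775000000000\n",
    "event: gate_result\n",
    'data: {"gate": "lint", "exit_code": 0}\n',
    "\n",
]
events = list(parse_sse(sample_stream))
```

The id of the last dispatched event is the value a client would send back as Last-Event-ID when reconnecting.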

Run replay (API)

Replay any finished run step by step via the API: GET /v1/runs/{id}:replay returns a structured gate narrative with exit codes, stdout/stderr, and wall time. CLI support (zombiectl runs replay) is coming in a future release.
March 28, 2026
New releases

Per-run cost control

Set token budgets, wall-time limits, and repair loop caps on each run. Runs that exceed limits are cancelled automatically.
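Conceptually, the three limits reduce to a check like the one sketched here. The RunLimits fields, the reason strings, and the order in which limits are checked are all hypothetical; the actual request fields for setting limits are defined by the API and are not listed in this entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunLimits:
    max_tokens: int
    max_wall_seconds: float
    max_repair_loops: int


def exceeded_limit(limits, tokens_used, wall_seconds, repair_loops):
    """Return the name of the first limit exceeded, or None if within limits."""
    if tokens_used > limits.max_tokens:
        return "token_budget"
    if wall_seconds > limits.max_wall_seconds:
        return "wall_time"
    if repair_loops > limits.max_repair_loops:
        return "repair_loops"
    return None


limits = RunLimits(max_tokens=50_000, max_wall_seconds=600, max_repair_loops=3)
reason = exceeded_limit(limits, tokens_used=51_200, wall_seconds=120, repair_loops=1)
```

A run for which this returns a reason would be the one cancelled automatically.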
March 25, 2026
New releases

OpenAPI spec

A complete OpenAPI 3.1 specification covering all 43 API endpoints is now published.

@usezombie/zombiectl on npm

The CLI is now available as a scoped npm package.