platform-ops agent on one of your repositories. Total time: about ten minutes from a cold machine. At the end, a deploy failure on the repo lands an evidenced diagnosis in your Slack channel.
Prerequisite: a host agent
You’ll drive the install via a markdown-skill host. Any of these work — the install skill ships identical prompts in each:
- Claude Code: claude.ai/code
- Amp: ampcode.com
- Codex CLI: npm install -g @openai/codex
- OpenCode: opencode.ai
Install zombiectl and skills
zombiectl is the only binary you install on your machine; it ships the skill samples under ~/.config/usezombie/samples/ so the install skill can drive zombiectl install --from against a known-good source. npx skills add usezombie/skills then symlinks the /usezombie-* host skills from the public usezombie/skills repo into every supported agent it detects, so /usezombie-install-platform-ops is ready to invoke in the next step.
Sign in
workspace add required.
No terminal (CI, or an agent driving the CLI)? Skip the browser entirely: supply a token via the --token <token> flag, the ZOMBIE_TOKEN environment variable, or piped on stdin, resolved in that order (prefer the env var or stdin to keep the token out of shell history). Add --no-input to fail fast instead of waiting for the code; a no-TTY shell with no token exits with an error. Full precedence in zombiectl login.
(Optional) Create a named workspace
zombiectl workspace use <id>. See Command reference → workspaces.
Run the install skill
Running npx skills add usezombie/skills in Step 2 already added the /usezombie-install-platform-ops skill to your agent. Invoke it by name (/usezombie-install-platform-ops) in your host agent.
The skill drives zombiectl install --from ~/.config/usezombie/samples/platform-ops under the hood. Power users can run that directly; everyone else gets the guided flow.
The skill asks three gating questions:
- Slack channel: where diagnoses post (e.g. #platform-ops).
- Production branch glob: which branches count as production (default main).
- Cron schedule (optional): for periodic health checks (e.g. */30 * * * *). Leave blank for webhook-only.
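To make the production branch glob concrete, here is a minimal sketch of how such a glob could gate incoming events. It is an illustration under stated assumptions, not the agent's implementation: the agent's actual rule is prose in the generated SKILL.md. The helper name should_diagnose is hypothetical, and using fnmatch-style glob semantics is an assumption; action, conclusion, and head_branch are fields of GitHub's workflow_run webhook payload.

```python
from fnmatch import fnmatchcase


def should_diagnose(payload: dict, branch_glob: str = "main") -> bool:
    """Return True for a completed, failed workflow run on a production branch.

    Hypothetical helper: the payload shape follows GitHub's workflow_run
    webhook event; fnmatch-style glob matching is an assumption.
    """
    run = payload.get("workflow_run", {})
    return (
        payload.get("action") == "completed"
        and run.get("conclusion") == "failure"
        # head_branch may be null in some payloads, so fall back to "".
        and fnmatchcase(run.get("head_branch") or "", branch_glob)
    )


failed_main = {
    "action": "completed",
    "workflow_run": {"conclusion": "failure", "head_branch": "main"},
}
failed_feature = {
    "action": "completed",
    "workflow_run": {"conclusion": "failure", "head_branch": "feature/login"},
}

print(should_diagnose(failed_main))                  # True
print(should_diagnose(failed_feature))               # False
print(should_diagnose(failed_feature, "feature/*"))  # True
```

Note that with fnmatch a * also matches /, so release/* would cover release/1.2/hotfix; whether the agent reads globs the same way is not stated in this guide.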
.usezombie/platform-ops/SKILL.md and .usezombie/platform-ops/TRIGGER.md into the current repo. Re-running against the same workspace is idempotent; existing files are diffed and updated.
Verify the registered webhook
The install skill registers the GitHub webhook automatically. It reads the triggers[] block in the generated TRIGGER.md, then calls gh api repos/<owner>/<repo>/hooks per webhook trigger, using the user’s existing gh authentication and the webhook_secret field from the workspace github credential. There is no paste-into-GitHub step.
After install, the skill prints a summary line per registered hook.
The agent filters incoming payloads for conclusion=failure on a production-branch workflow; the rule lives in the generated .usezombie/platform-ops/SKILL.md and is editable. See Authoring to widen or narrow it.
If gh is not authenticated with the admin:repo_hook scope, the skill stops with the exact recovery command (gh auth refresh -s admin:repo_hook). If the hook already exists at the same URL, the skill matches on config.url and advances, so re-running the install is idempotent.
Trigger a real diagnosis
Cause a deploy failure on the production branch: push a known-broken commit, fail a test on purpose, whatever you have handy. Within seconds:
- GitHub fires workflow_run.completed with conclusion: failure to the webhook URL.
- The agent wakes, calls the tools TRIGGER.md allow-lists (http_request against your hosting provider, memory_store for findings), and gathers evidence from the failed workflow.
- An evidenced diagnosis posts to your Slack channel.
Replace <zombie_id> with the value the install step printed (e.g. zmb_2041).
Need to slice by source or time window? zombiectl events zmb_2041 --actor 'webhook:*' --since 30m. Lost track of the ID? zombiectl list prints every agent in the active workspace. Or watch the same stream in the dashboard at app.usezombie.com.
What just happened
The platform-ops agent installed SKILL.md (the prose system prompt: what to investigate, how to phrase a diagnosis) and TRIGGER.md (which tools the model can call) into your repo. From here, behaviour iterates on prose: edit the .usezombie/platform-ops/SKILL.md file, push, and the next trigger runs the new behaviour. No redeploys, no DAG editor.
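The split between the two files can be pictured as a gate in front of every tool call. The following is a minimal sketch, assuming the allow-list has already been parsed into a set of tool names. ALLOWED_TOOLS, ToolNotAllowed, call_tool, and the stand-in registry are hypothetical names for illustration; the real TRIGGER.md format and runtime are not shown in this guide. The tool names http_request and memory_store are the ones this guide mentions.

```python
from typing import Any, Callable

# Assumption: tool names taken from this guide; in practice they would be
# parsed from TRIGGER.md, whose exact format is not shown here.
ALLOWED_TOOLS = {"http_request", "memory_store"}


class ToolNotAllowed(Exception):
    """Raised when the model requests a tool outside the allow-list."""


def call_tool(name: str, registry: dict[str, Callable[..., Any]], **kwargs: Any) -> Any:
    """Run a tool only if it is allow-listed (hypothetical dispatcher)."""
    if name not in ALLOWED_TOOLS:
        raise ToolNotAllowed(name)
    return registry[name](**kwargs)


# Stand-in implementations so the sketch runs on its own.
registry = {
    "memory_store": lambda key, value: f"stored {key}",
    "shell_exec": lambda cmd: "should never run",
}

print(call_tool("memory_store", registry, key="finding", value="OOM in build"))
try:
    call_tool("shell_exec", registry, cmd="echo hi")
except ToolNotAllowed as blocked:
    print(f"blocked: {blocked}")
```

The point of the design is that widening what the agent can do means editing the allow-list, while changing how it reasons means editing the prose.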
Next steps
Steer the agent manually
zombiectl steer <zombie_id> "morning health check" runs the same reasoning loop on a manual trigger. Useful before you have a webhook firing.
Read SKILL.md / TRIGGER.md
The two markdown files in .usezombie/platform-ops/ are the entire behaviour. Edit them like any other source file.
Context lifecycle
Why deep incidents keep reasoning past the model’s context cap. Defaults work for 95% of cases.
Authoring an agent
The SKILL.md + TRIGGER.md reference for writing your own.