The ER for software projects. A guild of señor engineers who go into the dark and bring failing systems back: stabilizing infrastructure, rescuing stalled builds, and keeping production alive at scale.
This is where the Guild opens its tools. The operators, libraries, and infrastructure we forge in the field, hardened on real rescues and released for you to run.
Open-source infrastructure tooling built by people who carry pagers. Not toy projects: these are the instruments we reach for when a system is on fire. We open them because infra depth is something you verify, not something you're told.
Everything here is Apache-2.0. Contributions welcome via DCO (git commit -s).
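If you have not used the DCO before, the sketch below shows what the sign-off does. It runs in a throwaway repository; the identity `Example Dev <dev@example.com>`, the file, and the commit message are placeholders for illustration, not anything taken from our repos.

```shell
# Throwaway repository so the example is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q

# Placeholder identity (assumption); -s builds the trailer from your git identity.
git config user.name "Example Dev"
git config user.email "dev@example.com"

# Make a change and commit it with a DCO sign-off.
echo "hello" > README.md
git add README.md
git commit -q -s -m "Add README"

# The commit message now ends with a Signed-off-by trailer.
trailer="$(git log -1 --format=%B | grep '^Signed-off-by:')"
echo "$trailer"
```

If you forget the flag on your latest commit, `git commit --amend -s --no-edit` adds the trailer without changing the message.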
- failsafe — The fail-safe for AI coding agents. Stops the irreversible command before your agent deletes prod.
- hermes-operator — Kubernetes operator for running Hermes agents securely and at scale.
More experiments from The Deep Levels → undermountain.cc/labs
Servers collapsing under load. A launch that won't ship. The tech lead who's gone and took the architecture with them. That's a CODE/911 call. We arrive in 48 hours and stabilize in 14 days.
Field reports from the depths: Kubernetes, GitOps, chaos engineering, and infrastructure at scale.
- Site — undermountain.cc
- Lab Reports (blog) — undermountain.cc/blog
- Emergency response — undermountain.cc/911
- Contact — contact@undermountain.cc