Wachd is a self-hosted incident response platform that adds AI-driven root cause analysis to on-call alerting. Instead of just telling engineers that an alert fired, Wachd diagnoses why it fired and suggests a next action. It's built for DevOps engineers, SREs, and on-call teams that want to reduce Mean Time To Resolution (MTTR) and ease on-call burnout.
Imagine an on-call engineer receiving a generic alert like “HighErrorRate firing.” Traditionally, this would trigger a lengthy manual diagnosis. Wachd shortens that step by running a diagnosis automatically while the alert is still routing. It collects context such as recent GitHub commits, error logs from Loki or Datadog, and metric history around the alert window, and arranges these into a causal timeline. The AI then analyzes that timeline to provide a plain-English root cause and a suggested action, so the engineer can resolve the issue faster.
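The timeline step described above can be pictured with a minimal Python sketch. This is an illustration only, not Wachd's implementation: the `Event` type, the `build_timeline` function, and the 30-minute/5-minute window are all hypothetical names and values chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Event:
    """One piece of context around an alert (hypothetical shape)."""
    at: datetime
    source: str   # e.g. "commit", "log", "metric"
    summary: str


def build_timeline(
    alert_at: datetime,
    events: list[Event],
    before: timedelta = timedelta(minutes=30),  # assumed look-back window
    after: timedelta = timedelta(minutes=5),    # assumed look-ahead window
) -> list[Event]:
    """Keep events inside the alert window and order them by time."""
    start, end = alert_at - before, alert_at + after
    in_window = [e for e in events if start <= e.at <= end]
    return sorted(in_window, key=lambda e: e.at)


alert_at = datetime(2024, 1, 1, 12, 0)
events = [
    Event(alert_at - timedelta(minutes=2), "log", "500s spike in checkout"),
    Event(alert_at - timedelta(minutes=10), "commit", "change retry policy"),
    Event(alert_at - timedelta(hours=3), "commit", "update README"),
]
for event in build_timeline(alert_at, events):
    print(event.at.isoformat(), event.source, event.summary)
```

An ordered, windowed list like this is the kind of input a model could be asked to reason over: the commit ten minutes before the alert appears ahead of the error spike, while the unrelated commit from three hours earlier is dropped.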
Wachd is particularly useful for organizations with strict data privacy requirements or those operating in regulated environments. It can run fully air-gapped, and its synchronous PII sanitization strips sensitive information before any data leaves your cluster or reaches an external AI service. This suits teams that need incident management without compromising security or compliance.
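To make the idea of synchronous sanitization concrete, here is a minimal Python sketch that redacts two common PII patterns from a log line before it would be passed on. The patterns, placeholder tokens, and the `sanitize` function are assumptions for illustration; the rules Wachd actually applies are not described here.

```python
import re

# Assumed patterns for illustration; a real sanitizer would cover far more cases.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def sanitize(text: str) -> str:
    """Replace email addresses and IPv4 addresses with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    text = IPV4.sub("[IP]", text)
    return text


print(sanitize("login failed for bob@example.com from 10.0.0.7"))
# prints: login failed for [EMAIL] from [IP]
```

Running the redaction inline, before the text is handed to any model client, is what makes it synchronous: the unsanitized string never reaches the outbound call.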
Wachd operates on a freemium model. The "Open Source" tier is free, self-hosted, and licensed under Apache 2.0. It includes core AI alert analysis (Ollama), on-call scheduling, per-user notification rules, Slack/email/SMS/voice integration, CVE breach intelligence, and unlimited teams and users. "SMB" and "Enterprise" tiers are coming soon and will add features such as cloud AI options (Claude, OpenAI, Gemini), analytics dashboards, SSO, compliance reports, and dedicated support.
Wachd is deployed on Kubernetes with a Helm chart that provides sane defaults. Each engineer can configure personal notification preferences, so alerts arrive through the channel that works best for them. The core product is open source; the planned paid tiers will add priority email support for SMB customers and dedicated support with SLAs for Enterprise clients.
Wachd is built for Kubernetes environments and supports both external (RDS/ElastiCache) and in-cluster (Postgres/Redis) data stores. It integrates with existing monitoring stacks including Grafana, Datadog, Prometheus, and Loki, and receives alerts from them via webhooks. For context collection, it connects to GitHub or GitLab for commits and to log and metric sources such as Loki and Datadog. AI analysis uses Ollama (for air-gapped deployments), Claude, OpenAI, or Gemini. Notification channels include Slack, Microsoft Teams, Twilio (for SMS/voice), and email. It's compatible with major Kubernetes platforms such as AWS EKS, Azure AKS, and GKE.
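As a rough picture of webhook-based alert intake, the Python sketch below normalizes a Prometheus Alertmanager webhook body into a flat list of alerts. The field names read from the payload (`alerts`, `labels`, `alertname`, `status`, `startsAt`) follow Alertmanager's webhook format; the `parse_alertmanager_webhook` function and its output shape are hypothetical and do not represent Wachd's ingest API.

```python
import json


def parse_alertmanager_webhook(body: str) -> list[dict]:
    """Flatten an Alertmanager webhook payload into simple alert records."""
    payload = json.loads(body)
    alerts = []
    for alert in payload.get("alerts", []):
        labels = alert.get("labels", {})
        alerts.append({
            "name": labels.get("alertname", "unknown"),
            # Fall back to the group-level status if the alert has none.
            "status": alert.get("status", payload.get("status", "unknown")),
            "severity": labels.get("severity", "unknown"),
            "started_at": alert.get("startsAt"),
        })
    return alerts


sample_body = json.dumps({
    "status": "firing",
    "alerts": [{
        "status": "firing",
        "labels": {"alertname": "HighErrorRate", "severity": "critical"},
        "startsAt": "2024-01-01T12:00:00Z",
    }],
})
print(parse_alertmanager_webhook(sample_body))
```

Normalizing each source's payload into one record shape early keeps the later steps (context collection, analysis, notification) independent of which monitoring tool raised the alert.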
Wachd is a privacy-focused option for modern incident management. By automating the diagnosis of on-call alerts with AI, it helps engineering teams respond faster, reduce operational overhead, and maintain higher service reliability. Because it is self-hosted and open source, you keep control over where it runs and what data it touches. Explore Wachd today to give your on-call engineers answers, not just alerts.