glitch logo

Automated CI failure remediation for charm development — turn flaky tests and obscure failures into actionable fixes.

╱╲ the pipeline

Three phases, one goal: reduce MTTR. Discovery scores flakiness locally. Collection captures telemetry on failure. Analysis classifies root causes and generates patches — all with human-in-the-loop approval.

Phase 1
🔍 DISCOVERY
local · heuristic
Phase 2
📦 COLLECTION
CI-side · telemetry
Phase 3
🧠 ANALYSIS
classify + auto-fix
📊
Phase 1

Discovery

Local CLI that scores and ranks flaky tests using pass/fail volatility, retry rates, timing variance, change-independence, and recency-weighting. Zero telemetry artifacts required.

uv run glitch discover --workflow "CI" --output table
📦
Phase 2

Collection

Captures comprehensive telemetry when tests fail: runner logs, charm logs, Kubernetes events, Ceph status, LXD state, and test artifacts — bundled as a single artifact for analysis.

uv run glitch collect
🧠
Phase 3

Analysis

Ingests flakiness scores + telemetry bundle to classify failures by root cause (flaky, charm-bug, test-bug, infrastructure, environment) with confidence scores, then generates patches.

uv run glitch analyze --artifact bundle.tar.gz

╱╲ see it in action

📺

[ asciinema demo coming soon ]

A terminal recording of glitch discovering flaky tests, collecting failure telemetry, and generating an automated patch — all in under 60 seconds.

╱╲ getting started

Prerequisites

Python 3.11+, uv, and a GitHub token with repo and actions:read scopes.

Installation

git clone https://github.com/MichaelThamm/glitch.git && cd glitch
uv sync

Quick Start

Set your GitHub token
export GITHUB_TOKEN=ghp_your_token_here
Discover flaky tests

Score and rank tests by flakiness using CI run history:

uv run glitch discover --repo owner/repo --workflow "CI"
Collect telemetry on failure

When tests fail in CI, capture everything needed for diagnosis:

uv run glitch collect
Analyze and get patches

Feed scores and artifacts into the analysis engine:

uv run glitch analyze --artifact bundle.tar.gz --scores discover.json

Configuration

glitch works out of the box, but you can customize collectors, scoring weights, and analysis thresholds. See the README for details.

╱╲ classification taxonomy

🎲 flaky

Non-deterministic; likely timing or environment sensitivity. Proposes retry policies and ordering guards.

🐛 charm-bug

Defect in the charm's logic or configuration. Generates a concrete patch — human approval required before landing.

🧪 test-bug

Defect in the test itself: bad assertion, wrong assumption. Patch generated with re-run verification.

🏗️ infrastructure

CI runner, Kubernetes, Ceph, or LXD-level issue. Files an annotated issue with reproduction steps.

🌐 environment

Transient external dependency: network, registry, upstream package. Logged for trend analysis.

❓ unknown

Insufficient signal to classify with confidence. Queued for human review with all available context.