feat(autopilot): Staleness Sentinel — automated PR aging & escalation pipeline #177

Draft
labgadget015-dotcom with Copilot wants to merge 2 commits into main from copilot/fix-import-errors-bughunteragent

Conversation

Copilot AI commented Jul 1, 2026

Contributor

The daily autopilot summary surfaces stale PRs (108d, 97d open in ai-automation-engine) but takes no action on them. This adds a StalenessEngine that classifies open PRs by age, tracks tier transitions in a local state file, and posts escalation comments when a PR moves to a higher-severity tier.

Core: autopilot/staleness_engine.py

  • Four-tier classification: T0 fresh (0–6d) → T1 aging (7–29d) → T2 stale (30–89d) → T3 critical (90+d)
  • Dry-run ON by default — no GitHub writes unless ENABLE_LIVE_MODE=true in env
  • init_state_from_prs() — first-run seeding of staleness_state.json without posting any comments; prevents spam on already-stale PRs at deployment time
  • process_prs() — escalates only on tier increases; stable or downgrading PRs are skipped
  • Claude-generated comments via core/llm_provider.py with a hard cap (MAX_CLAUDE_CALLS_PER_RUN, default 20, overridable via env); falls back to static templates when cap is hit or llm_config is absent
  • Cost logging delegated to LLMClient.get_session_cost()
  • Standalone CLI: python -m autopilot.staleness_engine [--init-only]

# Recommended first-run sequence (safe: seeds state, posts nothing)
python -m autopilot.staleness_engine --init-only

# Subsequent runs in dry-run (default)
python -m autopilot.staleness_engine
# [dry-run] Would escalate acme/repo/30 to T3 (critical, 108d): fix: resolve import errors…

# Go live
ENABLE_LIVE_MODE=true python -m autopilot.staleness_engine
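
The tier boundaries and the "escalate only on tier increase" rule described above can be sketched as follows. This is a minimal illustration rather than the code in autopilot/staleness_engine.py: the function signatures, the plain-dict state shape, and the handling of unseen or downgraded PRs are assumptions.

```python
# Tier lower bounds in days, from the PR description:
# T0 fresh (0-6d), T1 aging (7-29d), T2 stale (30-89d), T3 critical (90+d).
TIER_THRESHOLDS_DAYS = (7, 30, 90)
TIER_NAMES = ("fresh", "aging", "stale", "critical")


def classify_tier(age_days: int) -> int:
    """Return the tier index (0-3) for a PR open for age_days days."""
    if age_days < 0:
        raise ValueError("age_days must be non-negative")
    # Count how many thresholds the age has reached.
    return sum(age_days >= bound for bound in TIER_THRESHOLDS_DAYS)


def init_state_from_prs(prs: dict[str, int]) -> dict[str, int]:
    """First-run seeding: record each PR's current tier, escalate nothing."""
    return {key: classify_tier(age) for key, age in prs.items()}


def process_prs(prs: dict[str, int], state: dict[str, int]) -> list[tuple[str, int, int]]:
    """Return (key, old_tier, new_tier) for every PR whose tier increased."""
    escalations = []
    for key, age_days in prs.items():
        old_tier = state.get(key, 0)  # assumption: unseen PRs start at T0
        new_tier = classify_tier(age_days)
        if new_tier > old_tier:
            escalations.append((key, old_tier, new_tier))
        state[key] = new_tier  # assumption: state always tracks the latest tier
    return escalations
```

Seeding first means a PR that is already at T3 when the engine is deployed produces no escalation on the next run, which matches the stated purpose of --init-only.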

Integration & config

  • autopilot/config.yaml — new staleness: block with enabled, max_claude_calls_per_run, tier day-thresholds, and state_file path
  • autopilot/autopilot.py: _run_staleness_check() is invoked at the end of each daily run() when staleness.enabled: true; it auto-seeds on first run (state file absent)
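
A rough sketch of the config gate and first-run auto-seed described above. The function name, the callback parameters, the default state path, and the choice to return right after seeding are illustrative assumptions; the real _run_staleness_check() may be structured differently.

```python
import os
from typing import Callable


def run_staleness_check(
    config: dict,
    seed: Callable[[], None],      # hypothetical hook: seeds state, posts nothing
    process: Callable[[], None],   # hypothetical hook: classifies and escalates
) -> str:
    """Run the staleness step only when enabled; seed if no state file exists."""
    staleness = config.get("staleness", {})
    if not staleness.get("enabled", False):
        return "skipped"

    # Assumed default path; the PR only says the path is configurable.
    state_file = staleness.get("state_file", "autopilot/staleness_state.json")
    if not os.path.exists(state_file):
        # First run: record current tiers so already-stale PRs are not spammed.
        seed()
        return "seeded"

    process()
    return "processed"
```

Returning after the seed keeps the first run free of comments; processing straight after seeding would find no tier increases anyway.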

Safety

  • autopilot/staleness_state.json added to .gitignore (runtime state, not source)
  • 40 new unit tests covering tier boundaries, dry-run enforcement, call-cap behaviour, state persistence, init/process logic, and GitHub API posting
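
The dry-run and call-cap behaviours that these tests cover might look roughly like the sketch below. Only ENABLE_LIVE_MODE, MAX_CLAUDE_CALLS_PER_RUN, and the default of 20 come from the PR description; the helper names, the CommentWriter class, and the template wording are hypothetical.

```python
import os
from typing import Callable, Mapping, Optional

# Hypothetical static fallback text; the PR does not show its templates.
STATIC_TEMPLATE = (
    "This PR has been open for {age_days} days and is now tier T{tier}. "
    "Please review, update, or close it."
)


def live_mode_enabled(env: Mapping[str, str] = os.environ) -> bool:
    """Dry-run unless ENABLE_LIVE_MODE is set to 'true'."""
    return env.get("ENABLE_LIVE_MODE", "").strip().lower() == "true"


def max_claude_calls(env: Mapping[str, str] = os.environ, default: int = 20) -> int:
    """Read the per-run cap from the environment, falling back to the default."""
    raw = env.get("MAX_CLAUDE_CALLS_PER_RUN")
    if raw is None:
        return default
    try:
        return max(0, int(raw))
    except ValueError:
        return default


class CommentWriter:
    """Uses the LLM until the cap is hit, then the static template."""

    def __init__(self, cap: int, generate: Optional[Callable[[str, int, int], str]] = None):
        self.cap = cap
        self.generate = generate  # None models an absent llm_config
        self.calls = 0

    def comment_for(self, pr_key: str, tier: int, age_days: int) -> str:
        if self.generate is not None and self.calls < self.cap:
            self.calls += 1
            return self.generate(pr_key, tier, age_days)
        return STATIC_TEMPLATE.format(tier=tier, age_days=age_days)
```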

Copilot AI left a comment

Copilot wasn't able to review any files in this pull request.

@labgadget015-dotcom

Owner

🤖 DRC Agent Analysis

Recommendation: 🟠 P1 IMPORTANT

Summary: Staleness Sentinel: Tiered PR Escalation Pipeline (Realist #1 recommendation) vs BugHunterAgent-First (Dreamer #2 top pick)

Next steps:

  1. Step 1: Audit all 7 repo names and ownership to confirm GITHUB_TOKEN scope covers cross-repo API reads — block implementation if any repo requires separate auth
  2. Step 2: Implement staleness_engine.py with StalenessEngine class, tiered classification logic, and dry-run guard (ENABLE_LIVE_MODE=false default) — target 6 hours
  3. Step 3: Create .github/workflows/staleness-sentinel.yml with nightly cron, artifact download/upload for state persistence, and environment variable injection for GITHUB_TOKEN and ANTHROPIC_API_KEY
  4. Step 4: Pre-populate staleness_state.json with PR #30 (108d, ZOMBIE, notified=true) and PR #31 (97d, ZOMBIE, notified=true) to prevent spam on first live run
  5. Step 5: Run 48-hour dry-run, review logs for correct tier assignments and Claude comment quality — do not enable live mode until output is manually verified
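
Step 4 could be done with a small one-off script along these lines. The state-file schema (keys by PR number, fields age_days, tier, notified) is a guess made for illustration; the agent's comment does not define it, and the merged engine seeds state through init_state_from_prs() instead.

```python
import json
from pathlib import Path

# Values taken from Step 4; the schema itself is hypothetical.
SEED_ENTRIES = {
    "30": {"age_days": 108, "tier": "ZOMBIE", "notified": True},
    "31": {"age_days": 97, "tier": "ZOMBIE", "notified": True},
}


def seed_state(path: Path, entries: dict = SEED_ENTRIES) -> dict:
    """Merge seed entries into the state file without overwriting existing keys."""
    state = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    for key, entry in entries.items():
        state.setdefault(key, dict(entry))
    path.write_text(json.dumps(state, indent=2, sort_keys=True), encoding="utf-8")
    return state
```

Using setdefault keeps any entry the engine has already written, so re-running the script is harmless.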

Strategic fit: Consulting: high · Product: high · Tech debt: reduces


Analysed by GadgetLab DRC Agent (Dreamer → Realist → Critic) · Run run_1782893961096

@labgadget015-dotcom

Owner

🤖 DRC Agent Analysis

Recommendation: 🟠 P1 IMPORTANT

Summary: State-Aware StalenessEngine with Action History (Dreamer ID 2, Realist Recommended)

Next steps:

  1. Step 1: Create data/ directory in repo root and add to docker-compose.yml as a named volume mount at container path /app/data — do this before any code to unblock state persistence
  2. Step 2: Implement autopilot/staleness_engine.py following Realist implementation path Steps 1-4, with StateStore inner class, atomic write, tier logic, and Claude call cap
  3. Step 3: Add staleness config block to autopilot/config.yaml with dry_run: true, max_claude_calls_per_run: 10, tiers at 14/30/60 days, cooldown_days: 7 for warn and nudge
  4. Step 4: Add startup assertion validating state_path parent is writable; add cost projection warning if daily run rate would exceed $5/month
  5. Step 5: Integrate into autopilot/autopilot.py with a config-gated optional block (Step 6 of the Realist path)
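
The atomic write (Step 2) and the writable-parent assertion (Step 4) could be combined in a store like the one below. This is one common pattern (temp file in the same directory, then os.replace), offered as an assumption about what "atomic write" means here; the class is shown at module level rather than as an inner class for brevity.

```python
import json
import os
import tempfile
from pathlib import Path


class StateStore:
    """JSON state file with a startup writability check and atomic saves."""

    def __init__(self, state_path: str):
        self.path = Path(state_path)
        parent = self.path.parent
        # Startup assertion: fail fast if the state directory is unusable.
        if not (parent.is_dir() and os.access(parent, os.W_OK)):
            raise RuntimeError(f"state directory is not writable: {parent}")

    def load(self) -> dict:
        if not self.path.exists():
            return {}
        return json.loads(self.path.read_text(encoding="utf-8"))

    def save(self, state: dict) -> None:
        # Write to a temp file in the same directory, then rename over the
        # target so readers never observe a half-written file.
        fd, tmp = tempfile.mkstemp(dir=self.path.parent, suffix=".tmp")
        try:
            with os.fdopen(fd, "w", encoding="utf-8") as fh:
                json.dump(state, fh, indent=2)
                fh.flush()
                os.fsync(fh.fileno())
            os.replace(tmp, self.path)
        except BaseException:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise
```

The temp file must live on the same filesystem as the target for the rename to be atomic, which is why it is created next to the state file rather than in the system temp directory.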

Strategic fit: Consulting: high · Product: high · Tech debt: reduces


Analysed by GadgetLab DRC Agent (Dreamer → Realist → Critic) · Run run_1782894133980

Copilot AI changed the title from "[WIP] Fix import errors and implement BugHunterAgent" to "feat(autopilot): Staleness Sentinel — automated PR aging & escalation pipeline" Jul 1, 2026
Copilot AI requested a review from labgadget015-dotcom July 1, 2026 08:28
@labgadget015-dotcom

Owner

🤖 DRC Agent Analysis

Recommendation: 🟠 P1 IMPORTANT

Summary: Autonomous BugHunterAgent with Claude-Powered Triage Pipeline (Dreamer ID 3, Realist recommended)

Next steps:

  1. Step 1: Before any new code, complete the import error audit using py_compile and pylint --errors-only across all autopilot modules and commit fixes as a standalone PR to unblock CI — this is a hard prerequisite
  2. Step 2: Add bandit, ruff, anthropic>=0.25.0 to requirements.txt and verify clean install in Docker environment; pin versions explicitly
  3. Step 3: Implement BugHunterAgent class with BugFinding dataclass and scan() method (Bandit + Ruff JSON normalization, 50-finding cap) — no Claude calls yet; validate false-positive rate with dry run against repo
  4. Step 4: Implement ClaudeTriage helper mirroring StalenessEngine pattern with max_claude_calls=10 default, cost logging to bug_hunter_state.json, and BudgetExhaustedError; reduce unit test target to 25 focused tests covering cap enforcement, cost accumulation, and graceful degradation
  5. Step 5: Add bug_hunter config section to config.yaml with enabled=false, dry_run=true defaults; integrate into autopilot.py with full try/except isolation so BugHunterAgent failure never blocks StalenessEngine
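
The py_compile half of the Step 1 audit can be scripted as below; the function name and return shape are illustrative. Note that py_compile only catches syntax-level failures. It does not execute imports, so unresolved modules still need pylint --errors-only (or an actual import) to surface.

```python
import py_compile
from pathlib import Path


def find_uncompilable(root: str) -> list[tuple[str, str]]:
    """Return (path, error message) for every .py file that fails to compile."""
    failures = []
    for path in sorted(Path(root).rglob("*.py")):
        try:
            # doraise=True turns compile failures into PyCompileError.
            # Side effect: successful files get a .pyc under __pycache__.
            py_compile.compile(str(path), doraise=True)
        except py_compile.PyCompileError as exc:
            failures.append((str(path), str(exc)))
    return failures
```

Run against the autopilot package, an empty result means every module at least parses, which is the precondition for the pylint pass.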

Strategic fit: Consulting: high · Product: high · Tech debt: neutral


Analysed by GadgetLab DRC Agent (Dreamer → Realist → Critic) · Run run_1782894462613

@labgadget015-dotcom

Owner

🤖 DRC Agent Analysis

Recommendation: 🟠 P1 IMPORTANT

Summary: Sentinel-as-Designed: Faithful StalenessEngine Implementation (Solution 1)

Next steps:

  1. Step 1: Create autopilot/staleness_engine.py with TierState dataclass, StalenessConfig dataclass, classify_tier(), load_state(), save_state(), init_state_from_prs(), process_prs(), execute_escalation_actions(), and CLI entrypoint with --init-only and --dry-run flags
  2. Step 2: Add generate_escalation_comment() wired to core/llm_provider.py with max_claude_calls_per_run counter enforcement and static Jinja2-style fallback template
  3. Step 3: Add staleness: config block to autopilot/config.yaml with enabled: false, dry_run: true, tier_thresholds_days: [7, 30, 90], max_claude_calls_per_run: 5, state_file: autopilot/staleness_state.json
  4. Step 4: Add _run_staleness_check() to autopilot/autopilot.py wrapped in try/except with structured error logging, called at end of run() guarded by config.staleness.enabled
  5. Step 5: Add staleness_state.json to .gitignore and document state recovery procedure in README to mitigate state-loss risk
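
For the state-loss risk named in Step 5, one option is to make load_state() tolerate a missing or corrupt file and let the caller re-seed. The TierState fields below are hypothetical, since the comment names the dataclass without listing its fields.

```python
import json
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class TierState:
    # Hypothetical fields for illustration only.
    tier: int
    age_days: int
    notified: bool = False


def load_state(path: Path) -> dict[str, TierState]:
    """Load state; a missing or corrupt file yields an empty state."""
    try:
        raw = json.loads(path.read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return {key: TierState(**value) for key, value in raw.items()}


def save_state(path: Path, state: dict[str, TierState]) -> None:
    """Serialise each TierState to a plain dict and write the file."""
    payload = {key: asdict(value) for key, value in state.items()}
    path.write_text(json.dumps(payload, indent=2), encoding="utf-8")
```

If load_state() returns an empty dict, the recovery procedure is then the same as a first deployment: run with --init-only to re-seed without posting comments.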

Strategic fit: Consulting: high · Product: high · Tech debt: reduces


Analysed by GadgetLab DRC Agent (Dreamer → Realist → Critic) · Run run_1782894492264

@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

📊 Code Complexity Analysis

Summary:

  • Total Functions Analyzed: 788
  • Average Complexity: 3.48
  • High Complexity Functions: 25
  • Low Maintainability Files: 57

⚠️ High Complexity Functions

These functions exceed the complexity threshold and should be refactored:

| File | Function | Complexity | Line |
|---|---|---|---|
| core/risk_scorer.py | score_pull_request | 35 | 141 |
| autopilot/autopilot.py | generate_summary | 24 | 195 |
| autopilot/ai_optimization/performance_monitor.py | get_benchmark_stats | 15 | 184 |
| .github/scripts/weekly_digest.py | build_blocks | 15 | 38 |
| .github/scripts/metrics_collector.py | parse_workflow_metrics | 14 | 148 |
| .github/scripts/setup_branch_protection.py | main | 14 | 240 |
| .github/scripts/self_healing_system.py | analyze_failure_patterns | 14 | 256 |
| .github/scripts/ai_code_suggestor.py | _check_import_organization | 14 | 113 |
| .github/scripts/prometheus_exporter.py | collect_metrics | 14 | 99 |
| .github/scripts/workflow_monitor.py | get_workflow_statistics | 14 | 216 |

... and 15 more

Recommendations:

  • Break down large functions into smaller, focused units
  • Extract complex conditional logic into separate functions
  • Use early returns to reduce nesting

🔧 Low Maintainability Files

These files have low maintainability scores and may need refactoring:

| File | Score | Status |
|---|---|---|
| .github/scripts/health_dashboard_generator.py | 28.14 | 🔴 |
| .github/scripts/workflow_monitor.py | 33.73 | 🔴 |
| .github/scripts/ai_code_suggestor.py | 33.76 | 🔴 |
| .github/scripts/ai_workflow_optimizer.py | 35.51 | 🔴 |
| .github/scripts/performance_benchmark.py | 39.46 | 🔴 |
| autopilot/autopilot.py | 40.18 | 🔴 |
| .github/scripts/self_healing_system.py | 40.27 | 🔴 |
| .github/scripts/threshold_monitor.py | 41.13 | 🔴 |
| .github/scripts/parallel_code_analyzer_optimized.py | 41.16 | 🔴 |
| autopilot/ai_optimization/anomaly_detector.py | 42.56 | 🔴 |
| agents/triage_agent.py | 42.79 | 🔴 |
| .github/scripts/refactoring_assistant.py | 43.03 | 🔴 |
| autopilot/ai_optimization/intelligent_cache.py | 43.28 | 🔴 |
| autopilot/ai_optimization/commit_summarizer.py | 44.05 | 🔴 |
| .github/scripts/async_parallel_analyzer.py | 44.47 | 🔴 |
| autopilot/ai_optimization/performance_monitor.py | 44.69 | 🔴 |
| .github/scripts/badge_generator.py | 45.28 | 🔴 |
| .github/scripts/copilot_integration.py | 45.37 | 🔴 |
| .github/scripts/distributed_monitoring.py | 45.53 | 🔴 |
| .github/scripts/elite_copilot.py | 45.69 | 🔴 |
| agents/dependency_agent.py | 45.76 | 🔴 |
| .github/scripts/cost_calculator.py | 46.4 | 🔴 |
| .github/scripts/inline_pr_commenter.py | 46.63 | 🔴 |
| .github/scripts/complexity_reporter.py | 46.78 | 🔴 |
| .github/scripts/pr_triage.py | 47.13 | 🔴 |
| autopilot/staleness_engine.py | 47.39 | 🔴 |
| core/risk_scorer.py | 48.15 | 🔴 |
| autopilot/ai_optimization/nlp_relevance_filter.py | 48.43 | 🔴 |
| .github/scripts/pr_inline_commenter.py | 48.47 | 🔴 |
| .github/scripts/metrics_collector.py | 48.91 | 🔴 |
| .github/scripts/dependency_updater.py | 48.91 | 🔴 |
| autopilot/ai_optimization/ml_priority_scorer.py | 49.53 | 🔴 |
| .github/scripts/changelog_generator.py | 49.75 | 🔴 |
| .github/scripts/parallel_code_analyzer.py | 49.96 | 🔴 |
| autopilot/ai_optimization/api_optimizer.py | 50.46 | 🟡 |
| .github/scripts/issue_auto_creator.py | 50.89 | 🟡 |
| agents/security_scan_agent.py | 51.04 | 🟡 |
| .github/scripts/workflow_optimizer.py | 51.67 | 🟡 |
| .github/scripts/cot_selector.py | 51.73 | 🟡 |
| .github/scripts/release_manager.py | 51.92 | 🟡 |
| .github/scripts/llm_router.py | 52.55 | 🟡 |
| .github/scripts/auto_pr.py | 52.72 | 🟡 |
| .github/scripts/notification_manager.py | 53.58 | 🟡 |
| .github/scripts/prometheus_exporter.py | 54.96 | 🟡 |
| .github/scripts/weekly_digest.py | 55.02 | 🟡 |
| core/audit_logger.py | 55.6 | 🟡 |
| .github/scripts/gather_context.py | 56.0 | 🟡 |
| core/llm_provider.py | 56.32 | 🟡 |
| .github/scripts/streaming_results.py | 56.64 | 🟡 |
| .github/scripts/setup_branch_protection.py | 57.0 | 🟡 |
| .github/scripts/optimized_github_client.py | 58.27 | 🟡 |
| agents/orchestrator_agent.py | 59.02 | 🟡 |
| agents/code_review_agent.py | 60.45 | 🟡 |
| core/github_client.py | 61.96 | 🟡 |
| core/message_queue.py | 63.22 | 🟡 |
| core/agent_config.py | 63.86 | 🟡 |
| core/idempotency.py | 64.45 | 🟡 |

Maintainability Index Guide:

  • 🟢 85-100: Excellent maintainability
  • 🟡 65-84: Good maintainability
  • 🟠 50-64: Moderate maintainability (consider refactoring)
  • 🔴 0-49: Poor maintainability (needs refactoring)
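
For reference, the guide above maps to a simple banding function. Treating each range's lower end as an inclusive threshold (so 49.96 counts as poor) is an assumption, since the guide lists whole numbers only. The status column in the file table appears to mark the 50-64 range differently from this guide; the sketch follows the guide.

```python
def maintainability_band(score: float) -> str:
    """Map a maintainability index (0-100) to the guide's band name."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    if score >= 85:
        return "excellent"
    if score >= 65:
        return "good"
    if score >= 50:
        return "moderate"
    return "poor"
```

Under this reading, autopilot/staleness_engine.py at 47.39 falls in the poor band and core/idempotency.py at 64.45 in the moderate band.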

github-actions Bot added the testing and autopilot (Changes to autopilot/) labels Jul 1, 2026
@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

🟢 Risk Assessment: LOW (2.1/10)

Analysed 5 files, 1059+ / 0− lines. Test coverage unchanged or improved.

Scoring breakdown

| Factor | Score |
|---|---|
| Change volume: 1059 lines changed | +1.1 |
| Risky extensions: 1 config/script files | +0.5 |
| Draft PR: marked as draft | +0.5 |

✅ Eligible for auto-merge (subject to CI passing).
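
The 2.1/10 total is the sum of the three factors in the breakdown. A minimal check, assuming the score is a plain sum rounded to one decimal (how each factor is derived is not shown in the report):

```python
# Factor values copied from the scoring breakdown above.
FACTORS = {
    "Change volume": 1.1,
    "Risky extensions": 0.5,
    "Draft PR": 0.5,
}


def total_risk(factors: dict[str, float]) -> float:
    """Sum the factor scores, rounded to one decimal place."""
    return round(sum(factors.values()), 1)
```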

@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

🤖 Elite AI Copilot Analysis

Elite AI Copilot Analysis Report

Generated: 2026-07-01 10:26:50
Session ID: copilot_1782901610
Repository: .

🎯 Health Score: 100.0/100

🚀 Top Recommendations

  1. ✅ Repository is in excellent shape - continue current practices

📊 Detailed Insights

Code Quality Baseline Established

  • Category: code_quality
  • Severity: info
  • Description: Repository code quality metrics captured
  • Suggested Action: Continue monitoring for regressions
  • Confidence: 90%

Security Scan Initiated

  • Category: security
  • Severity: info
  • Description: No critical vulnerabilities detected in initial scan
  • Suggested Action: Enable continuous security monitoring
  • Confidence: 85%

Repository Structure Analyzed

  • Category: architecture
  • Severity: info
  • Description: Well-organized modular structure detected
  • Suggested Action: Maintain separation of concerns
  • Confidence: 80%

Performance Baseline Captured

  • Category: performance
  • Severity: info
  • Description: Repository performance metrics recorded
  • Suggested Action: Monitor for performance regressions
  • Confidence: 75%

Documentation Structure Good

  • Category: documentation
  • Severity: info
  • Description: Comprehensive documentation files present
  • Suggested Action: Keep documentation in sync with code changes
  • Confidence: 90%

Powered by Elite AI Copilot v1.0

@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

Code Quality Analysis ❌ FAILED

Duration: 0.01s
Total Issues: 10

Tool Results

  • pylint: ❌
  • flake8: ❌
  • bandit: ❌
  • radon_cc: ❌
  • radon_mi: ❌
View detailed results
{
  "timestamp": "2026-07-01 10:26:56",
  "elapsed_seconds": 0.01,
  "summary": {
    "total_issues": 10,
    "critical": 0,
    "high": 0,
    "medium": 0,
    "low": 0
  },
  "tools": {
    "pylint": {
      "status": "failed",
      "output": "",
      "errors": "Pylint error: [Errno 2] No such file or directory: 'pylint'"
    },
    "flake8": {
      "status": "failed",
      "output": "",
      "errors": "Flake8 error: [Errno 2] No such file or directory: 'flake8'"
    },
    "bandit": {
      "status": "failed",
      "output": "",
      "errors": "Bandit error: [Errno 2] No such file or directory: 'bandit'"
    },
    "radon_cc": {
      "status": "failed",
      "output": "",
      "errors": "Radon error: [Errno 2] No such file or directory: 'radon'"
    },
    "radon_mi": {
      "status": "failed",
      "output": "",
      "errors": "Radon MI error: [Errno 2] No such file or directory: 'radon'"
    }
  },
  "passed": false
}
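
Every tool above failed with Errno 2, which means the executable was not found on the runner, not that the code failed the check. A preflight along these lines would separate "tool missing" from "tool reported issues". The function name and tool list are illustrative and not part of the existing workflow.

```python
import shutil
from typing import Callable, Iterable, Optional

# Executables named in the error messages above.
REQUIRED_TOOLS = ("pylint", "flake8", "bandit", "radon")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

If missing_tools() is non-empty, the job could report a setup error and skip the quality verdict instead of marking the analysis as failed.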

@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

🔒 Security Scan Results

🛡️ Bandit Security Scan

  • 🔴 HIGH: 0
  • 🟡 MEDIUM: 9
  • 🟢 LOW: 77

📦 Dependency Vulnerabilities

  • Total vulnerable dependencies: 61

Vulnerable Dependencies:

  • pygithub 2.9.1
  • aiohttp 3.14.1
  • multidict 6.7.1
  • yarl 1.24.2
  • pyyaml 6.0.3
  • ... and 56 more

Security scans run automatically on every PR. View detailed reports in the Actions tab.

@github-actions

github-actions Bot commented Jul 1, 2026

Contributor

🔍 Pre-commit Checks

⚠️ Pre-commit checks found issues that could not be auto-fixed.

Please run the following locally to fix them:

pre-commit run --all-files

Or install pre-commit hooks to automatically check on commit:

pre-commit install

Pre-commit hooks help maintain code quality and consistency.

Labels

autopilot (Changes to autopilot/), testing

Projects

None yet

Development

Successfully merging this pull request may close these issues.

📅 Daily Repository Summary - 2026-06-30

3 participants