Refactor PR code quality reviewer to use grumpy sub-agent and strict A2A triage #34555

Merged
pelikhan merged 5 commits into main from copilot/refactor-pr-code-quality-reviewer
May 25, 2026
Contributor

Copilot AI commented May 25, 2026

The PR code quality reviewer was too friendly and low-signal. This change rewires it around a dedicated grumpy sub-agent and explicit agent-to-agent adjudication so reviews are materially more critical and merge-gating is more deterministic.

  • Prompt architecture

    • Added inline sub-agent grumpy-coder to perform a hostile first-pass over changed lines.
    • Main reviewer now consumes sub-agent findings as advisory input, then performs a second-pass review before publishing comments.
  • A2A adjudication model

    • Introduced explicit triage states: KEEP, HARDEN, DROP.
    • Clarified that pseudo-encoding is allowed for private reasoning only; published PR output must remain plain, actionable language.
  • Decision policy hardening

    • Replaced “friendly/constructive” posture with skeptical, risk-first guidance.
    • Added deterministic escalation rules for REQUEST_CHANGES (severity/count/impact based), including fallback behavior when sub-agent output is malformed.
  • Sub-agent contract tightening

    • Defined strict JSONL output contract for grumpy-coder findings.
    • Added normalization rules (line coercion, invalid-path/severity drop, concise truncation) to reduce parsing fragility.
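
The normalization rules above can be sketched in Go as follows. This is a minimal sketch, not the workflow's actual parser: the field names (`path`, `line`, `severity`, `message`), the `low` severity level, the 200-character cap, and the fallback to line 1 are all assumptions made for illustration, since the PR description does not spell out the schema.

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"strconv"
	"strings"
)

// Finding is a hypothetical shape for one grumpy-coder JSONL record.
type Finding struct {
	Path     string
	Line     int
	Severity string
	Message  string
}

// maxMessageLen is an assumed cap for "concise truncation".
const maxMessageLen = 200

// validSeverities is an assumed severity vocabulary.
var validSeverities = map[string]bool{"critical": true, "high": true, "medium": true, "low": true}

// coerceLine accepts a JSON number or a numeric string and falls back to 1.
func coerceLine(v any) int {
	switch n := v.(type) {
	case float64:
		if n >= 1 {
			return int(n)
		}
	case string:
		if i, err := strconv.Atoi(strings.TrimSpace(n)); err == nil && i >= 1 {
			return i
		}
	}
	return 1
}

// NormalizeFindings parses JSONL text, drops malformed records and records
// with an unknown path or severity, coerces line numbers, and truncates
// long messages.
func NormalizeFindings(raw string, changedPaths map[string]bool) []Finding {
	var out []Finding
	sc := bufio.NewScanner(strings.NewReader(raw))
	for sc.Scan() {
		text := strings.TrimSpace(sc.Text())
		if text == "" {
			continue
		}
		var rec map[string]any
		if err := json.Unmarshal([]byte(text), &rec); err != nil {
			continue // malformed line: skip instead of failing the whole review
		}
		path, _ := rec["path"].(string)
		sev, _ := rec["severity"].(string)
		sev = strings.ToLower(strings.TrimSpace(sev))
		if !changedPaths[path] || !validSeverities[sev] {
			continue // invalid path or severity: drop
		}
		msg, _ := rec["message"].(string)
		msg = strings.TrimSpace(msg)
		if r := []rune(msg); len(r) > maxMessageLen {
			msg = string(r[:maxMessageLen])
		}
		out = append(out, Finding{Path: path, Line: coerceLine(rec["line"]), Severity: sev, Message: msg})
	}
	return out
}

func main() {
	raw := `{"path":"pkg/cli/forecast_test.go","line":"129","severity":"HIGH","message":"magic numbers"}
this line is not JSON
{"path":"unknown.go","line":3,"severity":"high","message":"outside the diff"}`
	changed := map[string]bool{"pkg/cli/forecast_test.go": true}
	for _, f := range NormalizeFindings(raw, changed) {
		fmt.Printf("%s:%d [%s] %s\n", f.Path, f.Line, f.Severity, f.Message)
	}
}
```

Decoding each line into a generic map rather than a typed struct keeps one bad field from rejecting an otherwise usable record, which is one way to reduce parsing fragility.
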
### Step 3: Judge Agent-to-Agent Findings
- `KEEP` — valid issue, comment on PR
- `HARDEN` — valid but underexplained; strengthen impact/rationale
- `DROP` — incorrect, non-actionable, or outside changed lines

Use `REQUEST_CHANGES` when:
- any `critical`/`high` issue is valid, or
- 3+ `medium` issues are valid, or
- issue can cause data loss/auth bypass/panic/broken CI.
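
As a rough illustration, the escalation rule above could be expressed as a small Go predicate. The `JudgedFinding` type and its fields are hypothetical, and the sketch assumes that both `KEEP` and `HARDEN` findings count as valid (the prompt calls `HARDEN` "valid but underexplained"); the fallback for malformed sub-agent output is not modeled because the excerpt does not describe it.

```go
package main

import "fmt"

// Triage is the adjudication state assigned to a sub-agent finding.
type Triage string

const (
	Keep   Triage = "KEEP"
	Harden Triage = "HARDEN"
	Drop   Triage = "DROP"
)

// JudgedFinding is a hypothetical record for a finding after triage.
type JudgedFinding struct {
	Severity     string // "critical", "high", "medium", ...
	Triage       Triage
	SevereImpact bool // can cause data loss, auth bypass, panic, or broken CI
}

// ShouldRequestChanges applies the three escalation conditions to the
// findings that survived triage (assumed here to be KEEP and HARDEN).
func ShouldRequestChanges(findings []JudgedFinding) bool {
	medium := 0
	for _, f := range findings {
		if f.Triage != Keep && f.Triage != Harden {
			continue // DROP or unknown state: not a valid issue
		}
		if f.SevereImpact {
			return true
		}
		switch f.Severity {
		case "critical", "high":
			return true
		case "medium":
			medium++
		}
	}
	return medium >= 3
}

func main() {
	findings := []JudgedFinding{
		{Severity: "medium", Triage: Keep},
		{Severity: "medium", Triage: Harden},
		{Severity: "high", Triage: Drop},
	}
	fmt.Println(ShouldRequestChanges(findings))
}
```

Because the dropped `high` finding is ignored and only two `medium` findings remain, this sample input does not escalate.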

@pelikhan pelikhan marked this pull request as ready for review May 25, 2026 02:59
Copilot AI review requested due to automatic review settings May 25, 2026 02:59
@pelikhan
Collaborator

@copilot merge main and recompile

Contributor

Copilot AI left a comment

Pull request overview

Refactors the Copilot-driven “PR Code Quality Reviewer” workflow prompt to use an inline grumpy-coder sub-agent and a stricter triage/escalation model aimed at producing more merge-gating reviews.

Changes:

  • Updated reviewer workflow prompt to invoke a grumpy-coder sub-agent, then adjudicate findings via KEEP/HARDEN/DROP and stricter REQUEST_CHANGES rules.
  • Added an inline grumpy-coder agent block with a strict JSONL findings contract.
  • Applied gofmt/indentation-only changes in two Go test files.

| File | Description |
| --- | --- |
| `pkg/cli/forecast_test.go` | Whitespace/gofmt-only reformatting within a test. |
| `pkg/cli/compile_schedule_calendar_test.go` | Whitespace/gofmt-only reformatting within an end-to-end fuzzy schedule test. |
| `.github/workflows/pr-code-quality-reviewer.md` | Reworked prompt to use a sub-agent + strict triage/escalation, and added the grumpy-coder agent definition. |

Copilot's findings

  • Files reviewed: 1/3 changed files
  • Comments generated: 2

Comment thread pkg/cli/forecast_test.go
Comment on lines +129 to +135

    // Reproduce the λ calculation from forecastWorkflow.
    const (
        historyDays   = 30
        sampledRuns   = 15
        projectedDays = 30 // "month" period
    )
    observedRunsPerPeriod := float64(sampledRuns) / float64(historyDays) * float64(projectedDays)

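
For readers unfamiliar with the calculation in this snippet, here is a standalone sketch of the same arithmetic. The function name `ProjectRuns` and the guard for a non-positive history window are assumptions for illustration; they are not taken from `forecastWorkflow`.

```go
package main

import "fmt"

// ProjectRuns scales the runs observed over historyDays to a period of
// projectedDays, mirroring the expression in the test above.
// The zero return for a non-positive window is an assumed guard.
func ProjectRuns(sampledRuns, historyDays, projectedDays int) float64 {
	if historyDays <= 0 {
		return 0
	}
	return float64(sampledRuns) / float64(historyDays) * float64(projectedDays)
}

func main() {
	// 15 runs in 30 days is 0.5 runs per day, so a 30-day period projects to 15.
	fmt.Println(ProjectRuns(15, 30, 30)) // prints 15
}
```
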
Comment on lines +320 to +332

    fuzzyExpressions := []struct {
        fuzzyCron     string
        workflowID    string
        expectedHours int // how many distinct hour values we expect (1 for DAILY patterns)
    }{
        {"FUZZY:DAILY * * *", "ci-doctor", 1},
        {"FUZZY:DAILY_WEEKDAYS * * *", "daily-planner", 1},
        {"FUZZY:DAILY_AROUND:14:0 * * *", "weekly-audit", 1},
    }

            totalSlots := 0
            for _, day := range grid {
                for _, count := range day {
                    totalSlots += count
                }
            }
            assert.Greater(t, totalSlots, 0,
                "grid should contain at least one scheduled slot for %s", scatteredCron)

            // Step 4: displayScheduleCalendar should produce output referencing the hour.
            oldStderr := os.Stderr
            r, w, pipeErr := os.Pipe()
            require.NoError(t, pipeErr)
            os.Stderr = w

            displayScheduleCalendar(statsList)

            w.Close()
            os.Stderr = oldStderr

            var buf bytes.Buffer
            _, _ = buf.ReadFrom(r)
            output := buf.String()

            assert.Contains(t, output, "Schedule Heatmap",
                "output should contain Schedule Heatmap header")
            // The hour from the scattered cron should appear in the output.
            for _, h := range hours {
                hourStr := fmt.Sprintf("%02d", h)
                assert.Contains(t, output, hourStr,
                    "output should contain hour %s from scattered cron %s", hourStr, scatteredCron)
            }
        })
    }
    for _, tt := range fuzzyExpressions {
        t.Run(fmt.Sprintf("%s/%s", tt.fuzzyCron, tt.workflowID), func(t *testing.T) {
            // Step 1: scatter the fuzzy expression to a real cron string.

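
The stderr-capture pattern in Step 4 of this snippet could be factored into a helper along the following lines. This is only a sketch: `CaptureStderr` and `captureDemo` are hypothetical names, not functions from the repository. Unlike the inline version, it drains the pipe in a goroutine while the callback runs, so output larger than the OS pipe buffer would not block the writer.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"os"
)

// CaptureStderr runs fn with os.Stderr redirected to a pipe and returns
// everything fn wrote. It swaps a process-wide variable, so it is not
// safe to call from tests that run in parallel.
func CaptureStderr(fn func()) (string, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return "", err
	}
	old := os.Stderr
	os.Stderr = w

	// Drain the read end concurrently so large outputs cannot fill the pipe.
	done := make(chan string, 1)
	go func() {
		var buf bytes.Buffer
		_, _ = io.Copy(&buf, r)
		done <- buf.String()
	}()

	func() {
		// Restore stderr and close the write end even if fn panics.
		defer func() {
			os.Stderr = old
			_ = w.Close()
		}()
		fn()
	}()

	out := <-done
	_ = r.Close()
	return out, nil
}

// captureDemo is a hypothetical usage example.
func captureDemo() string {
	out, err := CaptureStderr(func() {
		fmt.Fprintln(os.Stderr, "Schedule Heatmap")
	})
	if err != nil {
		return ""
	}
	return out
}

func main() {
	fmt.Printf("%q\n", captureDemo())
}
```
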
@pelikhan pelikhan merged commit 5d05a5d into main May 25, 2026
@pelikhan pelikhan deleted the copilot/refactor-pr-code-quality-reviewer branch May 25, 2026 03:08
Copilot stopped work on behalf of pelikhan due to an error May 25, 2026 03:08
Copilot AI requested a review from pelikhan May 25, 2026 03:08
3 participants