Add `sub_agent_strategy` A/B experiment to `smoke-gemini` workflow by Copilot · Pull Request #33540 · github/gh-aw

Copilot · 2026-05-20T12:55:31Z

This updates smoke-gemini to run as an explicit A/B experiment comparing the current single-agent smoke flow against a sub-agent decomposition strategy. The goal is to measure the impact on token efficiency while preserving reliability guardrails.

Experiment frontmatter
- Added experiments.sub_agent_strategy in rich-object form with:
  - variants: single_agent, sub_agents
  - primary metric: effective_tokens
  - secondary metrics: run_duration_seconds, success_rate
  - guardrail: success_rate >= 0.95
  - sampling/weights/analysis metadata (min_samples, weight, start_date, analysis_type, tags)

Prompt branching by variant
- Replaced the single static requirements block with explicit conditional branches:
  - single_agent: keep baseline sequential execution in the main agent
  - sub_agents: instruct the main agent to launch 5 parallel background task sub-agents (one per smoke check), then aggregate their results via read_agent
- Kept shared output/reporting behavior outside variant blocks.

Compiled workflow artifact
- Regenerated smoke-gemini.lock.yml from updated source to include experiment selection/runtime wiring and branch-specific prompt content.
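
To make the difference between the two variants under "Prompt branching by variant" concrete, here is a minimal Python sketch of sequential execution versus parallel fan-out followed by aggregation. It is only an illustration under stated assumptions: the real workflow expresses this as prompt text for the agent, not as Python, and `run_single_agent`, `run_sub_agents`, and `render_report` are hypothetical names that do not appear in the PR.

```python
from concurrent.futures import ThreadPoolExecutor


def run_single_agent(checks):
    """Baseline variant: run every smoke check sequentially in one place."""
    return {name: check() for name, check in checks.items()}


def run_sub_agents(checks):
    """Sub-agent variant: start one background task per check, then aggregate.

    Threads stand in for sub-agents here; this is an assumption made purely
    for illustration.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(checks))) as pool:
        futures = {name: pool.submit(check) for name, check in checks.items()}
        # Collecting results plays the role of the aggregation step.
        return {name: future.result() for name, future in futures.items()}


def render_report(results):
    """Shared reporting, kept outside the variant-specific code paths."""
    return "\n".join(
        f"{'✅' if ok else '❌'} {name}" for name, ok in results.items()
    )
```

Both functions return the same mapping for the same checks, which mirrors the PR's choice to keep output and reporting behavior shared across variants so that only the execution strategy differs.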

experiments:
  sub_agent_strategy:
    variants: [single_agent, sub_agents]
    metric: effective_tokens
    secondary_metrics: [run_duration_seconds, success_rate]
    guardrail_metrics:
      - name: success_rate
        threshold: ">=0.95"

Co-authored-by: pelikhan <[email protected]>

Copilot

Pull request overview

Adds an explicit A/B experiment (sub_agent_strategy) to the smoke-gemini agentic workflow to compare the baseline single-agent smoke flow against a sub-agent decomposition strategy, with metrics/guardrails wired through the compiled lock workflow.

Changes:

- Added experiments.sub_agent_strategy frontmatter (variants/metrics/guardrails/weights/analysis metadata).
- Introduced variant-conditional prompt branches for single_agent vs sub_agents.
- Regenerated smoke-gemini.lock.yml to include experiment selection, state restore/upload/push, and updated runtime wiring.

| File | Description |
| --- | --- |
| .github/workflows/smoke-gemini.md | Defines the experiment and branches the smoke instructions by variant. |
| .github/workflows/smoke-gemini.lock.yml | Compiled workflow updates to pick variants, persist experiment state, and render the variant-specific prompt. |

Copilot's findings

Files reviewed: 2/2 changed files
Comments generated: 3

+1. **Agent: github-mcp-test** — Use GitHub MCP tools to fetch details of exactly 2 merged pull requests from ${{ github.repository }} (title and number only). Return ✅ if successful.
+2. **Agent: web-fetch-test** — Use the web-fetch MCP tool to fetch https://github.com and verify the response contains "GitHub". Return ✅ if successful.
+3. **Agent: file-write-test** — Create a test file `/tmp/gh-aw/agent/smoke-test-gemini-${{ github.run_id }}.txt` with content "Smoke test passed for Gemini at $(date)". Return ✅ if successful.
+4. **Agent: bash-test** — Execute bash commands to verify file creation was successful (use `cat` to read the file back). Return ✅ if successful.


@@ -1473,25 +1500,16 @@ jobs:
      (github.event_name != 'pull_request' || github.event.pull_request.head.repo.id == github.repository_id) &&
      (github.event_name != 'pull_request' || github.event.action != 'labeled' || github.event.label.name == 'smoke')
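
The two clauses visible in this hunk can be read as a pair of implications: "if this is a pull request, it must come from the same repository" and "if this is a pull request being labeled, the label must be `smoke`". The Python sketch below restates only those two clauses; any earlier clauses of the same `if:` expression are not shown in the hunk and are not modeled, and `should_run` is a hypothetical name.

```python
def should_run(event_name, action=None, head_repo_id=None,
               repository_id=None, label_name=None):
    """Mirror the two visible clauses of the job condition."""
    is_pull_request = event_name == "pull_request"

    # Clause 1: pull requests must originate from the same repository (no forks).
    same_repo = (not is_pull_request) or head_repo_id == repository_id

    # Clause 2: a 'labeled' pull request event only counts for the 'smoke' label.
    label_ok = (
        (not is_pull_request)
        or action != "labeled"
        or label_name == "smoke"
    )

    return same_repo and label_ok
```

Under this reading, non-pull-request events always pass both clauses, and a pull request from a fork is rejected regardless of its label.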


          node-version: '24'
          package-manager-cache: false
      - name: Install AWF binary
-        run: bash "${RUNNER_TEMP}/gh-aw/actions/install_awf_binary.sh" v0.25.49
+        run: bash "${RUNNER_TEMP}/gh-aw/actions/install_awf_binary.sh" v0.25.46
      - name: Install Gemini CLI


Initial plan

cfd2f03

Copilot AI assigned Copilot and pelikhan May 20, 2026

Copilot started work on behalf of pelikhan May 20, 2026 12:55

Copilot AI linked an issue May 20, 2026 that may be closed by this pull request

[ab-advisor] Experiment campaign for smoke-gemini: A/B test sub_agent_strategy #33521

Closed

7 tasks

Add sub_agent_strategy experiment to smoke-gemini workflow

a5d1acb

Co-authored-by: pelikhan <[email protected]>

Copilot AI changed the title ~~[WIP] Add A/B test for sub agent strategy in smoke-gemini workflow~~ Add sub_agent_strategy A/B experiment to smoke-gemini workflow May 20, 2026

Copilot AI requested a review from pelikhan May 20, 2026 13:11

Copilot finished work on behalf of pelikhan May 20, 2026 13:11

pelikhan marked this pull request as ready for review May 20, 2026 13:12

Copilot AI review requested due to automatic review settings May 20, 2026 13:12

pelikhan merged commit 91ec205 into main May 20, 2026
1 check passed

pelikhan deleted the copilot/experiment-smoke-gemini-sub-agent-strategy branch May 20, 2026 13:13

Copilot started reviewing on behalf of pelikhan May 20, 2026 13:13

Copilot AI reviewed May 20, 2026

pelikhan merged 2 commits into main from copilot/experiment-smoke-gemini-sub-agent-strategy

3 participants
