Add sub_agent_strategy A/B experiment to smoke-gemini workflow#33540

Merged
pelikhan merged 2 commits into main from copilot/experiment-smoke-gemini-sub-agent-strategy on May 20, 2026

Copilot AI commented May 20, 2026

This updates smoke-gemini to run as an explicit A/B experiment comparing the current single-agent smoke flow vs a sub-agent decomposition strategy. The goal is to measure token-efficiency impact while preserving reliability guardrails.

  • Experiment frontmatter

    • Added experiments.sub_agent_strategy in rich-object form with:
      • variants: single_agent, sub_agents
      • primary metric: effective_tokens
      • secondary metrics: run_duration_seconds, success_rate
      • guardrail: success_rate >= 0.95
      • sampling/weights/analysis metadata (min_samples, weight, start_date, analysis_type, tags)
  • Prompt branching by variant

    • Replaced the single static requirements block with explicit conditional branches:
      • single_agent: keep baseline sequential execution in the main agent
      • sub_agents: instruct launching 5 parallel background task sub-agents (one per smoke check), then aggregate via read_agent
    • Kept shared output/reporting behavior outside variant blocks.
  • Compiled workflow artifact

    • Regenerated smoke-gemini.lock.yml from updated source to include experiment selection/runtime wiring and branch-specific prompt content.
experiments:
  sub_agent_strategy:
    variants: [single_agent, sub_agents]
    metric: effective_tokens
    secondary_metrics: [run_duration_seconds, success_rate]
    guardrail_metrics:
      - name: success_rate
        threshold: ">=0.95"
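
As a rough illustration of how a guardrail such as `success_rate >= 0.95` might be checked against recorded runs, here is a minimal Python sketch. It is not the gh-aw implementation: the function names (`parse_threshold`, `guardrail_holds`, `success_rate`) and the set of accepted comparison operators are assumptions made for illustration. Only the `">=0.95"` threshold string comes from the frontmatter above.

```python
import operator
import re

# Comparison operators a threshold string may start with (assumed syntax).
_OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_threshold(spec):
    """Split a string such as '>=0.95' into (comparison function, float bound)."""
    match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*([0-9]*\.?[0-9]+)\s*", spec)
    if match is None:
        raise ValueError(f"unrecognized threshold: {spec!r}")
    return _OPS[match.group(1)], float(match.group(2))


def guardrail_holds(spec, observed):
    """Return True when the observed metric value satisfies the threshold."""
    compare, bound = parse_threshold(spec)
    return compare(observed, bound)


def success_rate(outcomes):
    """Fraction of runs that succeeded; `outcomes` is a list of booleans."""
    if not outcomes:
        raise ValueError("no runs recorded")
    return sum(outcomes) / len(outcomes)
```

Under these assumptions, `guardrail_holds(">=0.95", success_rate(runs))` would be evaluated per variant. With 20 recorded runs, 2 failures give a rate of 0.90 and violate the guardrail.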

Copilot AI changed the title from "[WIP] Add A/B test for sub agent strategy in smoke-gemini workflow" to "Add sub_agent_strategy A/B experiment to smoke-gemini workflow" May 20, 2026
Copilot AI requested a review from pelikhan May 20, 2026 13:11
pelikhan marked this pull request as ready for review May 20, 2026 13:12
Copilot AI review requested due to automatic review settings May 20, 2026 13:12
pelikhan merged commit 91ec205 into main May 20, 2026
1 check passed
pelikhan deleted the copilot/experiment-smoke-gemini-sub-agent-strategy branch May 20, 2026 13:13

Copilot AI left a comment

Pull request overview

Adds an explicit A/B experiment (sub_agent_strategy) to the smoke-gemini agentic workflow to compare the baseline single-agent smoke flow against a sub-agent decomposition strategy, with metrics/guardrails wired through the compiled lock workflow.

Changes:

  • Added experiments.sub_agent_strategy frontmatter (variants/metrics/guardrails/weights/analysis metadata).
  • Introduced variant-conditional prompt branches for single_agent vs sub_agents.
  • Regenerated smoke-gemini.lock.yml to include experiment selection, state restore/upload/push, and updated runtime wiring.

| File | Description |
| --- | --- |
| .github/workflows/smoke-gemini.md | Defines the experiment and branches the smoke instructions by variant. |
| .github/workflows/smoke-gemini.lock.yml | Compiled workflow updates to pick variants, persist experiment state, and render the variant-specific prompt. |
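
To make the "pick a variant, then render the variant-specific prompt" flow concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the logic in the compiled lock file: `pick_variant`, `render_prompt`, the use of a run identifier as the random seed, and the prompt strings are all hypothetical. Only the variant names `single_agent` and `sub_agents` and the existence of per-variant weights come from this PR.

```python
import random

# Hypothetical prompt text per variant; the real branches live in smoke-gemini.md.
PROMPT_BRANCHES = {
    "single_agent": "Run each smoke check sequentially in the main agent.",
    "sub_agents": "Launch one background sub-agent per smoke check, then aggregate the results.",
}

# Hypothetical shared text that stays outside the variant branches.
SHARED_REPORTING = "Report the outcome of every check."


def pick_variant(weights, seed):
    """Pick one variant name with probability proportional to its weight.

    Seeding the generator (assumed here to be a run identifier) makes the
    choice reproducible for a given run.
    """
    names = sorted(weights)  # stable order so a seed always maps to the same variant
    rng = random.Random(seed)
    return rng.choices(names, weights=[weights[n] for n in names], k=1)[0]


def render_prompt(variant):
    """Combine the branch for the chosen variant with the shared reporting text."""
    return PROMPT_BRANCHES[variant] + "\n" + SHARED_REPORTING


if __name__ == "__main__":
    chosen = pick_variant({"single_agent": 1, "sub_agents": 1}, seed=12345)
    print(chosen)
    print(render_prompt(chosen))
```

Keeping the shared reporting text outside `PROMPT_BRANCHES` mirrors the PR's choice to leave output and reporting behavior outside the variant blocks, so both arms are judged on the same report format.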

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 3

1. **Agent: github-mcp-test** — Use GitHub MCP tools to fetch details of exactly 2 merged pull requests from ${{ github.repository }} (title and number only). Return ✅ if successful.
2. **Agent: web-fetch-test** — Use the web-fetch MCP tool to fetch https://github.com and verify the response contains "GitHub". Return ✅ if successful.
3. **Agent: file-write-test** — Create a test file `/tmp/gh-aw/agent/smoke-test-gemini-${{ github.run_id }}.txt` with content "Smoke test passed for Gemini at $(date)". Return ✅ if successful.
4. **Agent: bash-test** — Execute bash commands to verify file creation was successful (use `cat` to read the file back). Return ✅ if successful.
@@ -1473,25 +1500,16 @@ jobs:
(github.event_name != 'pull_request' || github.event.pull_request.head.repo.id == github.repository_id) &&
(github.event_name != 'pull_request' || github.event.action != 'labeled' || github.event.label.name == 'smoke')
Comment on lines 548 to 552
node-version: '24'
package-manager-cache: false
- name: Install AWF binary
run: bash "${RUNNER_TEMP}/gh-aw/actions/install_awf_binary.sh" v0.25.49
run: bash "${RUNNER_TEMP}/gh-aw/actions/install_awf_binary.sh" v0.25.46
- name: Install Gemini CLI

Development

Successfully merging this pull request may close these issues.

[ab-advisor] Experiment campaign for smoke-gemini: A/B test sub_agent_strategy

3 participants