Skip to content

feat(dataframe): expose the executed physical plan with per-operator metrics#97

Open
LantaoJin wants to merge 1 commit into
apache:mainfrom
LantaoJin:feat/dataframe-executed-plan
Open

feat(dataframe): expose the executed physical plan with per-operator metrics#97
LantaoJin wants to merge 1 commit into
apache:mainfrom
LantaoJin:feat/dataframe-executed-plan

Conversation

@LantaoJin
Copy link
Copy Markdown
Contributor

Which issue does this PR close?

Rationale for this change

df.explain(true, true) already runs the plan and attaches per-operator metrics, but the result is a DataFrame of text rows. Programmatic consumers — query-shape regression tests, operational audit feeds, build-time benchmarks — have to scrape output_rows=12345, elapsed_compute=4.2ms strings out of those rows. Brittle to upstream wording, ergonomically painful, and the typed metric values (Count, Time, Gauge) lose their type along the way.

This PR adds a typed accessor df.executedPlan() that returns an immutable ExecutedPlan tree once the DataFrame has been executed via collect() / executeStream(). Each node carries the operator name, a one-line display rendering, child nodes, and an OperatorMetrics POJO with OptionalLong fields for the well-known metric variants plus a Map<String, Long> for custom counters.

try (DataFrame df = ctx.sql("SELECT count(*) FROM events");
     ArrowReader r = df.collect(allocator)) {
    while (r.loadNextBatch()) { /* drain */ }
}
ExecutedPlan plan = df.executedPlan();
long rows = plan.metrics().outputRows().orElse(-1L);

The contract is post-mortem: executedPlan() requires a prior collect / executeStream and rejects with IllegalStateException("call collect() or executeStream() first") if called pre-execution. A future PR can extend the surface to make pre-execution structure inspection available too — that follow-up is intentionally out of scope here to keep this PR focused on the metric-snapshot surface.

What changes are included in this PR?

  • New public records ExecutedPlan and OperatorMetrics.
  • New DataFrame.executedPlan() method.
  • New proto/executed_plan.proto (ExecutedPlanNodeProto).
  • Native side: executed_plan.rs.
  • Java-side: one new final long planId field assigned at construction.

Out of scope (deferred to follow-up PRs):

  • Per-partition metric breakdown.
  • Time/Gauge-shaped custom metrics; v1 surfaces Count-shaped customs only.

Are these changes tested?

Yes. 10 new tests in the ExecutedPlanTest.

Are there any user-facing changes?

Yes, additive only -- no behavior changes for existing callers.

  • New public types ExecutedPlan and OperatorMetrics (records).
  • New DataFrame.executedPlan() method.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(dataframe): expose the executed physical plan with per-operator metrics

1 participant