pass baseExperimentId to summarize by barrettpyke · Pull Request #2124 · braintrustdata/braintrust-sdk-javascript

Barrett Pyke (barrettpyke) · 2026-06-15T22:10:41Z

Description

Eval stores baseExperimentId correctly on the experiment, but the final summary does not pass it as the explicit comparison ID. As a result, the summary comparison can fall back to project/default baseline resolution and show diffs against the wrong experiment.

Reproduction

1. Navigate to Experiments in the Braintrust web app.
2. Select an Experiment and click Set as default baseline.
3. Run the following script:

import { Eval } from "braintrust";

const projectName = "<project-name>";
const suffix = Date.now();

console.log("creating baseline...");
const baseline = await Eval(projectName, {
  experimentName: `baseline-${suffix}`,
  data: [{ input: "hello", expected: "hello" }],
  task: (input) => input,
  scores: [
    ({ output, expected }) => ({
      name: "exact_match",
      score: output === expected ? 1 : 0,
    }),
  ],
  summarizeScores: false,
});

const baseExperimentId = baseline.summary.experimentId;
console.log("baseline experiment id:", baseExperimentId);

console.log("creating comparison with baseExperimentId...");
const comparison = await Eval(projectName, {
  experimentName: `comparison-${suffix}`,
  baseExperimentId,
  data: [{ input: "hello", expected: "hello" }],
  task: (input) => input,
  scores: [
    ({ output, expected }) => ({
      name: "exact_match",
      score: output === expected ? 1 : 0,
    }),
  ],
});

console.log("comparison experiment id:", comparison.summary.experimentId);
console.log("comparison baseline name:", comparison.summary.comparisonExperimentName);
console.log(JSON.stringify(comparison.summary, null, 2));

Expected
The comparison-${suffix} experiment should be compared to the baseline-${suffix} experiment.

Observed
The comparison-${suffix} experiment is instead compared to the default baseline experiment selected in step 2.

Fix

Use the explicit baseExperimentId in the summarize call. If it is undefined, fall back to the experiment's persisted base_exp_id, which may have been resolved during experiment init from baseExperimentName, BaseExperiment(...) data, or backend baseline resolution.
When experiment.summarize() receives an explicit comparisonExperimentId, fetch v1/experiment/${comparisonExperimentId} to resolve the comparison experiment name for display.
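
The two rules above can be sketched roughly as follows. This is a minimal illustration, not the SDK's actual code: the EvalOptions and ExperimentRecord shapes, both function names, and the injected getJson helper are hypothetical stand-ins, and only baseExperimentId, base_exp_id, comparisonExperimentId, and the v1/experiment/${comparisonExperimentId} path come from this PR.

```typescript
// Hypothetical shapes for illustration; these are not the SDK's actual types.
interface EvalOptions {
  baseExperimentId?: string;
}

interface ExperimentRecord {
  // Persisted baseline ID, possibly resolved at experiment init.
  base_exp_id?: string;
}

// Prefer the explicit ID; otherwise fall back to the persisted one.
function resolveComparisonExperimentId(
  options: EvalOptions,
  experiment: ExperimentRecord,
): string | undefined {
  return options.baseExperimentId ?? experiment.base_exp_id;
}

// Hypothetical injected helper standing in for an authenticated API GET.
type GetJson = (path: string) => Promise<{ name?: string }>;

// Look up the display name only when an explicit comparison ID is present.
async function resolveComparisonExperimentName(
  comparisonExperimentId: string | undefined,
  getJson: GetJson,
): Promise<string | undefined> {
  if (comparisonExperimentId === undefined) {
    return undefined; // nothing explicit to look up
  }
  const experiment = await getJson(`v1/experiment/${comparisonExperimentId}`);
  return experiment.name;
}
```

Injecting the request helper keeps the lookup easy to stub in unit tests, which is one plausible way to cover the "resolves explicit comparison experiment name" case listed under Testing.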

Testing

Tested the changes manually using the repro steps, for both baseExperimentId and baseExperimentName.
Added unit tests for:
- runEvaluator forwards baseExperimentId to summary
- runEvaluator forwards persisted baseExperimentName id to summary
- experiment.summarize resolves explicit comparison experiment name

pass baseExperimentId to summarize

f78aceb

Barrett Pyke (barrettpyke) requested a review from Abhijeet Prasad (AbhiPrasad) June 15, 2026 22:11

add changeset

c7311ba

Barrett Pyke (barrettpyke) wants to merge 2 commits into main from barrettpyke/fix-comparison-exp
