extend command line --test to trigger speedup measure by xadupre · Pull Request #2536 · microsoft/Olive

xadupre · 2026-06-22T14:37:51Z

Describe your changes

By default, --test only enables the discrepancy measure. It needs to be extended so the user can also request speedup measures.

xadupre · 2026-06-22T14:38:56Z

@copilot can you document the option somewhere

Copilot · 2026-06-22T14:47:15Z

@copilot can you document the option somewhere

Added documentation for the new --test_metrics option (including speedup) in docs/source/how-to/cli/cli-fast-test.md in commit 15287d8.

Copilot

Pull request overview

This PR extends Olive’s CLI --test workflow by allowing callers to choose which discrepancy-check metrics are evaluated (accuracy via MAE and/or speedup via latency timing), and adds coverage + documentation for the new option.

Changes:

Add --test_metrics CLI option (choices: mae, speedup) and plumb it through olive run and the base CLI workflow runner.
Update discrepancy-check pass injection to enable/disable MAE threshold enforcement and speedup timing based on selected metrics.
Add unit tests for metric selection behavior and warning when --test_metrics is provided without --test, plus docs describing usage.
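The option and warning described in the list above can be sketched with `argparse`. This is a minimal illustration, not Olive's actual code: the parser name, the helper name `warn_if_test_metrics_unused`, the warning text, and modeling `--test` as a plain boolean flag are all assumptions. Only the option name `--test_metrics` and its choices (`mae`, `speedup`) come from the PR overview.

```python
import argparse
import logging

logger = logging.getLogger("olive_cli_sketch")  # hypothetical logger name

# Metric names come from the PR overview; the tuple itself is an assumption.
TEST_METRICS = ("mae", "speedup")


def build_parser() -> argparse.ArgumentParser:
    """Build a toy parser exposing --test and --test_metrics."""
    parser = argparse.ArgumentParser(prog="olive-sketch")
    # Assumption: --test is modeled as a simple on/off flag here.
    parser.add_argument("--test", action="store_true", help="Enable fast test mode.")
    parser.add_argument(
        "--test_metrics",
        nargs="+",
        choices=TEST_METRICS,
        default=None,
        help="Metrics to evaluate in test mode.",
    )
    return parser


def warn_if_test_metrics_unused(args: argparse.Namespace) -> bool:
    """Warn when --test_metrics is given without --test; return True if warned."""
    if args.test_metrics and not args.test:
        logger.warning("--test_metrics is ignored because --test was not provided.")
        return True
    return False
```

Returning a boolean from the warning helper is a choice made here to keep the sketch easy to unit test; the real helper may only log.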

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| test/cli/test_base.py | Adds tests for metric-selection behavior and unused-flag warning behavior. |
| olive/cli/run.py | Passes `test_metrics` into discrepancy-check injection and warns when `--test_metrics` is unused. |
| olive/cli/base.py | Introduces `TEST_METRICS`, `--test_metrics` arg, warning helper, and metric-aware discrepancy pass injection. |
| docs/source/how-to/cli/cli-fast-test.md | Documents `--test_metrics` and provides an example for speedup-only runs. |

xadupre · 2026-06-22T15:57:03Z

@copilot test/cli/test_base.py:360: in test_add_discrepancy_check_pass_default_enables_all_metrics
assert "timing_iterations" not in pass_config
E AssertionError: assert 'timing_iterations' not in {'type': 'OnnxDiscrepancyCheck', 'reference_model_path': 'ref_model', 'report_output_dir': 'out_dir', 'max_mae': 0.1, ...}

Copilot · 2026-06-22T15:58:46Z

@copilot test/cli/test_base.py:360: in test_add_discrepancy_check_pass_default_enables_all_metrics
assert "timing_iterations" not in pass_config
E AssertionError: assert 'timing_iterations' no...

Fixed in the latest commit. The test was asserting the old behavior (default = all metrics, so timing_iterations absent). After the default changed to {"mae"} only, timing_iterations is set to 0. Updated the test name to test_add_discrepancy_check_pass_default_enables_mae_only and changed the assertion to assert pass_config["timing_iterations"] == 0.
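The behavior described above (mae-only by default, with `timing_iterations` forced to 0 unless speedup is requested) can be sketched as follows. The function name `build_discrepancy_pass_config` and the constant `DEFAULT_TEST_METRICS` are hypothetical. The config keys and the `0.1` threshold are taken from the test output quoted in this thread. Omitting `max_mae` when `mae` is deselected is an assumption, since the thread does not show that case.

```python
from typing import Any, Dict, Iterable, Optional

# Default reported in the thread: MAE only, speedup is opt-in.
DEFAULT_TEST_METRICS = frozenset({"mae"})


def build_discrepancy_pass_config(
    reference_model_path: str,
    report_output_dir: str,
    test_metrics: Optional[Iterable[str]] = None,
) -> Dict[str, Any]:
    """Return a pass config shaped like the one shown in the failing test."""
    metrics = set(test_metrics) if test_metrics else set(DEFAULT_TEST_METRICS)
    config: Dict[str, Any] = {
        "type": "OnnxDiscrepancyCheck",
        "reference_model_path": reference_model_path,
        "report_output_dir": report_output_dir,
    }
    if "mae" in metrics:
        # Threshold value as it appears in the quoted assertion output.
        config["max_mae"] = 0.1
    if "speedup" not in metrics:
        # Zero timing iterations disables the latency measurement.
        config["timing_iterations"] = 0
    # When speedup is selected, timing_iterations is left unset so the
    # pass's own default applies (inferred from the old all-metrics behavior).
    return config
```

With no metrics passed, the result matches the renamed test's expectation: `config["timing_iterations"] == 0` and `config["max_mae"] == 0.1`.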

xadupre · 2026-06-22T17:14:37Z

@copilot fix test/cli/test_cli.py:178: in test_workflow_run_command_with_test_override
mock_run.assert_called_once_with(
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:961: in assert_called_once_with
return self.assert_called_with(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:949: in assert_called_with
raise AssertionError(_error_message()) from cause
E AssertionError: expected call not found.
E Expected: run({'input_model': {'type': 'HfModel', 'model_path': 'hf-internal-testing/tiny-random-LlamaForCausalLM', 'load_kwargs': {'attn_implementation': 'eager', 'trust_remote_code': False}, 'test_model_config': {'hidden_layers': 2}, 'test_model_path': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output/test_model'}, 'output_dir': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output', 'passes': {'discrepancy_check': {'type': 'OnnxDiscrepancyCheck', 'reference_model_path': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output/test_model', 'max_mae': 0.1, 'report_output_dir': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output'}}}, list_required_packages=False, package_config=None, tempdir=None)
E Actual: run({'input_model': {'type': 'HfModel', 'model_path': 'hf-internal-testing/tiny-random-LlamaForCausalLM', 'load_kwargs': {'attn_implementation': 'eager', 'trust_remote_code': False}, 'test_model_config': {'hidden_layers': 2}, 'test_model_path': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output/test_model'}, 'output_dir': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output', 'passes': {'discrepancy_check': {'type': 'OnnxDiscrepancyCheck', 'reference_model_path': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output/test_model', 'report_output_dir': '/mnt/vss/_work/1/.pytest_basetemp/test_workflow_run_command_with1/output', 'max_mae': 0.1, 'timing_iterations': 0}}}, list_required_packages=False, tempdir=None, package_config=None)

Copilot · 2026-06-22T17:16:03Z

@copilot fix test/cli/test_cli.py:178: in test_workflow_run_command_with_test_override
mock_run.assert_called_once_with(
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:9...

Fixed in the latest commit. Added "timing_iterations": 0 to the expected discrepancy_check pass config in test_workflow_run_command_with_test_override, matching the new default mae-only behavior.
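In the Expected/Actual output above, the keyword arguments and dict keys appear in a different order, but that is not what made the assertion fail: `unittest.mock` compares calls by value, so only the missing `timing_iterations` entry mattered. A small self-contained sketch, with a shortened and purely illustrative config, shows this:

```python
from unittest import mock

run = mock.Mock()

# Simulated "actual" call, mirroring the shape of the real one in miniature.
run(
    {"passes": {"discrepancy_check": {"type": "OnnxDiscrepancyCheck", "max_mae": 0.1, "timing_iterations": 0}}},
    tempdir=None,
    package_config=None,
)

# Passes: dict key order and keyword-argument order are irrelevant.
run.assert_called_once_with(
    {"passes": {"discrepancy_check": {"timing_iterations": 0, "max_mae": 0.1, "type": "OnnxDiscrepancyCheck"}}},
    package_config=None,
    tempdir=None,
)

# Fails: the stale expectation lacks the new timing_iterations entry.
try:
    run.assert_called_once_with(
        {"passes": {"discrepancy_check": {"type": "OnnxDiscrepancyCheck", "max_mae": 0.1}}},
        tempdir=None,
        package_config=None,
    )
    stale_expectation_matches = True
except AssertionError:
    stale_expectation_matches = False
```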

… (with dedicated GGUF conversion pass) (#2548)

## Describe your changes

Merges #2536, #2535, #2534. Additionally adds llama.cpp integration and other improvements to `OnnxDiscrepancyCheck` and test-mode workflow handling:

- **New `llama_cpp` flag** (`bool`, default `False`) on `OnnxDiscrepancyCheck`: when enabled, compares inference with llama.cpp.
- **New `llama_cpp_env_path` parameter** (`Optional[str]`): path to the `llama_env` virtual environment where `llama-cpp-python` and `convert_hf_to_gguf.py` are installed (defaults to `"llama_env"` relative to cwd).
- **New `--test_llama_path` CLI option**: specifies the path to the `llama_env` virtual environment when running with `--test`. Using `--test_llama_path` without `--test` emits a warning.
- **New `ConvertHfToGGUF` pass** (`olive/passes/pytorch/convert_hf_to_gguf.py`): injected when `--test_llama_path` is provided. This pass converts the test HF model to GGUF ahead of discrepancy checking and stores the GGUF path in model attributes for downstream reuse.
- **`compare_llama_cpp()` updates**: now reuses a preconverted GGUF when available; otherwise it falls back to in-method HF→GGUF conversion. llama.cpp comparison failures are captured in discrepancy results (status/failures) instead of aborting the whole run, so ONNX generation can still complete.
- **Improved `--test_metrics` parsing**: now accepts both space-separated (`--test_metrics mae speedup`) and comma-separated (`--test_metrics mae,speedup`) forms.
- **Fixed `add_discrepancy_check_pass` update-in-place**: existing discrepancy-pass config generated by dry-run is updated in-place so current `--test_metrics`, `--output_path`, and llama settings are applied.
- **Fixed test model persistence across engine cache hits**: `ModelBuilder` stores a reference HF copy (`reference_hf_model/`) alongside cached ONNX outputs; discrepancy check falls back to this copy if the original test model path is missing.
- **New `SaveTestModelConfig` pass** (`olive/passes/pytorch/save_test_model_config.py`): injected at the start of passes for `--test`; ensures test model config/marker (and random test model persistence path usage) is set up before downstream passes.
- **CI workflow** (`test-model-fast.yml`): includes setup of a llama environment and llama.cpp conversion script dependencies.
- **Updated documentation** (`cli-fast-test.md`): clarifies where layer reduction happens, when test-model directories are created, cache fallback behavior, and llama.cpp test flow including the dedicated GGUF conversion pass.

## Checklist before requesting a review

- [ ] Add unit tests for this change.
- [ ] Make sure all tests can pass.
- [ ] Update documents if necessary.
- [ ] Lint and apply fixes to your code by running `lintrunner -a`
- [ ] Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

## (Optional) Issue link

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
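The commit message above says `--test_metrics` accepts both space-separated and comma-separated values. One way this could work is to let `argparse` collect raw tokens and normalize them afterwards. This is a sketch under assumptions, not the merged implementation: the helper name `normalize_test_metrics`, the de-duplication, and the error message are hypothetical.

```python
import argparse
from typing import List, Optional, Sequence

VALID_TEST_METRICS = ("mae", "speedup")  # metric names from the PR


def normalize_test_metrics(raw_values: Optional[Sequence[str]]) -> List[str]:
    """Flatten tokens such as ["mae,speedup"] or ["mae", "speedup"] into a clean list."""
    metrics: List[str] = []
    for raw in raw_values or []:
        for name in raw.split(","):
            name = name.strip()
            if not name:
                continue  # tolerate stray commas such as "mae,"
            if name not in VALID_TEST_METRICS:
                raise ValueError(f"Unknown test metric: {name!r}")
            if name not in metrics:
                metrics.append(name)  # keep first-seen order, drop duplicates
    return metrics


# `choices` is not used here: argparse would reject the token "mae,speedup"
# before it could be split, so validation happens in the helper instead.
parser = argparse.ArgumentParser(prog="olive-sketch")
parser.add_argument("--test_metrics", nargs="+", default=None)
```

Both `--test_metrics mae speedup` and `--test_metrics mae,speedup` then normalize to the same list.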

extend command line --test to trigger speedup measure

6b5b652

Copilot started work on behalf of xadupre June 22, 2026 14:41

Document --test_metrics speedup usage

15287d8

Copilot finished work on behalf of xadupre June 22, 2026 14:47

xadupre requested a review from Copilot June 22, 2026 15:08

Copilot started reviewing on behalf of xadupre June 22, 2026 15:08

Copilot AI reviewed Jun 22, 2026

Comment thread olive/cli/base.py

Copilot started work on behalf of xadupre June 22, 2026 15:23

Fix default test metrics to be mae-only, make speedup opt-in

89d98c4

Copilot finished work on behalf of xadupre June 22, 2026 15:24

xadupre marked this pull request as ready for review June 22, 2026 15:29

Copilot started work on behalf of xadupre June 22, 2026 15:57

Fix test to match new default mae-only behavior

7490e55

Copilot finished work on behalf of xadupre June 22, 2026 15:59

Copilot started work on behalf of xadupre June 22, 2026 17:15

Fix test_cli.py expected pass config to include timing_iterations=0

bf96e3f

Copilot finished work on behalf of xadupre June 22, 2026 17:16

xadupre added 2 commits June 25, 2026 09:45

Merge branch 'main' into xadupre/cmd

1e8c020

Merge branch 'main' into xadupre/cmd

3370626

xadupre mentioned this pull request Jun 29, 2026

Merge 3 existing PR related to OnnxDiscrepancyCheck + llama.cpp integration #2546

Open

5 tasks

Copilot AI mentioned this pull request Jul 1, 2026

Merge 3 existing PRs for OnnxDiscrepancyCheck + llama.cpp integration (with dedicated GGUF conversion pass) #2548

Merged

5 tasks

extend command line --test to trigger speedup measure #2536
xadupre wants to merge 7 commits into main from xadupre/cmd


3 participants
