Skip to content

refactor(issue-607): extract _state_to_dict and _load_state_dict methods#609

Open
quuger wants to merge 1 commit into
bayesian-optimization:masterfrom
quuger:refactor/extract-state-serialization-methods
Open

refactor(issue-607): extract _state_to_dict and _load_state_dict methods#609
quuger wants to merge 1 commit into
bayesian-optimization:masterfrom
quuger:refactor/extract-state-serialization-methods

Conversation

@quuger
Copy link
Copy Markdown

@quuger quuger commented May 26, 2026

Extracted state serialization logic into separate methods to enable saving/loading optimizer state without using the local file system.

  • Add _state_to_dict() to serialize optimizer state to dictionary
  • Add _load_state_dict() to load optimizer state from dictionary
  • Update save_state() to use _state_to_dict()
  • Update load_state() to use _load_state_dict()
  • Enables JSON-based state manipulation without file I/O

Summary by CodeRabbit

  • Refactor
    • Internal improvements to state serialization and deserialization for optimization save/load functionality, with existing behavior maintained.

Review Change Stack

- Add _state_to_dict() to serialize optimizer state to dictionary
- Add _load_state_dict() to load optimizer state from dictionary
- Update save_state() to use _state_to_dict()
- Update load_state() to use _load_state_dict()
- Enables JSON-based state manipulation without file I/O
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 26, 2026

📝 Walkthrough

Walkthrough

BayesianOptimization refactors state serialization by extracting two internal helpers: _state_to_dict() builds the complete state dictionary, and _load_state_dict() reconstructs the optimizer from that dictionary. The public save_state() and load_state() methods delegate to these helpers while preserving JSON file I/O.

Changes

State Serialization Refactoring

Layer / File(s) Summary
State serialization helper extraction
bayes_opt/bayesian_optimization.py
New internal _state_to_dict() method encapsulates construction of the complete state dictionary, including RNG state, bounds (original and transformed), registered observations, acquisition parameters, GP parameters, and configuration fields.
State deserialization helper extraction
bayes_opt/bayesian_optimization.py
New internal _load_state_dict(state) method reconstructs the optimizer by registering saved observations, restoring acquisition parameters and bounds transformation, recreating the GP kernel and parameters, fitting the model, and restoring RNG state.
Public API refactoring
bayes_opt/bayesian_optimization.py
Public save_state() and load_state() methods refactored to call the new internal helpers while preserving JSON file read/write behavior and maintaining the public interface.

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 State, split and sorted clean,
Two helpers now in-between,
Save and load dance light as air,
Refactored with the utmost care!
The rabbit's joy: less code duplication here! ✨

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main refactoring work: extracting state serialization into two new internal methods (_state_to_dict and _load_state_dict).
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
bayes_opt/bayesian_optimization.py (1)

431-449: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Validate the receiving optimizer before replaying saved observations.

_state_to_dict() serializes keys, pbounds, allow_duplicate_points, and transformed_bounds, but _load_state_dict() replays state["params"] straight into the current TargetSpace without checking that this instance was built with matching parameter order, bounds, duplicate-point policy, or transformer configuration. That makes the new dict-based round-trip fragile: a different key order can silently remap coordinates, a stricter duplicate policy can fail during register(), and missing self._bounds_transformer just drops saved transformed bounds.

Suggested guardrail
     def _load_state_dict(self, state: dict[str, Any]) -> None:
+        saved_keys = list(state["keys"])
+        current_keys = list(self._space.keys)
+        if saved_keys != current_keys:
+            raise ValueError("Saved state keys do not match optimizer keys")
+
+        saved_bounds = {k: tuple(v) for k, v in state["pbounds"].items()}
+        current_bounds = {k: tuple(self._space._bounds[i]) for i, k in enumerate(current_keys)}
+        if saved_bounds != current_bounds:
+            raise ValueError("Saved state bounds do not match optimizer bounds")
+
+        if (state["constraint_values"] is not None) != self.is_constrained:
+            raise ValueError("Saved state constraint mode does not match optimizer configuration")
+
+        if state["allow_duplicate_points"] != self._allow_duplicate_points:
+            raise ValueError(
+                "Saved state duplicate-point policy does not match optimizer configuration"
+            )
+
+        if state["transformed_bounds"] is not None and self._bounds_transformer is None:
+            raise ValueError("Saved state requires a bounds transformer")
+
         params_array = np.asarray(state["params"], dtype=np.float64)
         target_array = np.asarray(state["target"], dtype=np.float64)

Also applies to: 451-479

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@bayes_opt/bayesian_optimization.py` around lines 431 - 449, When loading a
saved state in _load_state_dict, first validate that the current
optimizer/TargetSpace matches the saved metadata from _state_to_dict: check that
state["keys"] equals self._space.keys (same order), state["pbounds"] matches
self._space._bounds (or transformed_bounds vs self._bounds_transformer), and
state["allow_duplicate_points"] equals self._allow_duplicate_points; if any
mismatch is detected, raise a clear ValueError (or provide a documented
migration path) instead of blindly calling self._space.register on
state["params"]; also handle the case where state["transformed_bounds"] exists
but self._bounds_transformer is absent by rejecting the load or requiring a
compatible transformer, and include state["random_state"]/acquisition metadata
only after validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Outside diff comments:
In `@bayes_opt/bayesian_optimization.py`:
- Around line 431-449: When loading a saved state in _load_state_dict, first
validate that the current optimizer/TargetSpace matches the saved metadata from
_state_to_dict: check that state["keys"] equals self._space.keys (same order),
state["pbounds"] matches self._space._bounds (or transformed_bounds vs
self._bounds_transformer), and state["allow_duplicate_points"] equals
self._allow_duplicate_points; if any mismatch is detected, raise a clear
ValueError (or provide a documented migration path) instead of blindly calling
self._space.register on state["params"]; also handle the case where
state["transformed_bounds"] exists but self._bounds_transformer is absent by
rejecting the load or requiring a compatible transformer, and include
state["random_state"]/acquisition metadata only after validation.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7aa478e3-7948-4f39-9744-bc633291d475

📥 Commits

Reviewing files that changed from the base of the PR and between b2258c2 and 1a8143e.

📒 Files selected for processing (1)
  • bayes_opt/bayesian_optimization.py

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant