Skip to content

<fix>[storage]: decouple alive-chain membership from vmState in VolumeTree#4061

Open
MatheMatrix wants to merge 1 commit into
zsv_5.1.0from
sync/tao.gan/fix/ZSV-10538-alive-chain-decouple@@2
Open

<fix>[storage]: decouple alive-chain membership from vmState in VolumeTree#4061
MatheMatrix wants to merge 1 commit into
zsv_5.1.0from
sync/tao.gan/fix/ZSV-10538-alive-chain-decouple@@2

Conversation

@MatheMatrix
Copy link
Copy Markdown
Owner

VolumeTree previously conflated two independent concepts in isOnline /
resolveDirection:
(a) is the snapshot on the volume's live backing chain (a structural
property of the tree)
(b) is the VM currently routed through the hypervisor (Running/Paused)

Two consequences:

  1. In VolumeSnapshotTreeBase.stepDelete, the multi-children "defer
    the alive-chain child to the final round" guard used isOnline,
    which returned false for every child when the VM was Stopped.
    The guard silently disappeared and child selection fell back to
    children.get(0), which could land on the volume's backing chain
    and corrupt the live chain on rebase failure.
  2. resolveDirection's shouldUseCommitStrategy was likewise gated on
    vmState, so direction=Auto + Stopped degraded to Pull (writing N
    copies of the (target - parent) delta into each child file)
    instead of a single-file Commit.

Split into two predicates:

  • isOnAliveChain(snapshotUuid): pure tree-structure query, used by
    stepDelete to identify the alive child regardless of vmState.
  • isHypervisorOperation(vmState): pure run-state predicate, used to
    pick libvirt vs primary-storage agent.

isOnline is rewritten as the compound (treeIsCurrent &&
isHypervisorOperation(vmState) && both endpoints on alive chain),
preserving its existing semantics for current callers.

resolveDirection now derives shouldUseCommitStrategy purely from tree
structure, so Auto + Stopped + alive-chain target now picks Commit.

stepDelete renames onlineChild to aliveChild and queries
isOnAliveChain, so the alive-child deferral protection now applies to
Stopped VMs as well.

Resolves: ZSV-10538

Change-Id: I6e63786d6d6b616f7a766b6c7463617572786969

sync from gitlab !9959

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: ef5e6dd1-bc6c-4b13-87e2-674dfdf80b57

📥 Commits

Reviewing files that changed from the base of the PR and between 19701c4 and 978db42.

📒 Files selected for processing (2)
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
  • storage/src/main/java/org/zstack/storage/snapshot/VolumeTree.java

Walkthrough

将活跃链判定从 VM 运行态判定中拆出(新增 VolumeTree.isOnAliveChain 与 isHypervisorOperation),并在单快照删除流程中基于活跃链选择待删子节点、调整多子节点处理顺序以保护活跃链。

变更

快照删除与活跃链判定

层 / 文件 摘要
VolumeTree中新方法的引入与逻辑拆分
storage/src/main/java/org/zstack/storage/snapshot/VolumeTree.java
新增isOnAliveChain(String snapshotUuid)判定快照是否在活跃链上(与VM状态无关),新增静态方法isHypervisorOperation(VmInstanceState vmState)判定是否为Running/Paused状态;resolveDirection(...)中的shouldUseCommitStrategy从混合条件改为通过这两个方法分别判定;isOnline(...)的实现显式拆分VM状态和活跃链检查。
快照删除流程中子节点选择的适配
storage/src/main/java/org/zstack/storage/snapshot/VolumeSnapshotTreeBase.java
将单个快照删除中的待删除子节点选择从基于 isOnline 的 onlineChild 改为使用 isOnAliveChain() 判定的 aliveChild;在多子节点场景中若首个待处理节点为活跃子节点则先处理另一个子节点再继续 pull(...),其余场景保持原有 commit(...)/pull(...) 处理路径。

代码审查工作量估算

🎯 3 (中等复杂度) | ⏱️ ~22 分钟

诗句

兔兔裁剪了纠缠之线,
活跃链与运态分明各站,
删除之路更显清晰智慧,
快照的守护在细节中绽放。 🐰✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (4 passed)
Check name Status Explanation
Title check ✅ Passed 标题清晰准确地总结了此次变更的主要目的:将 VolumeTree 中的 alive-chain 成员判定从 vmState 中解耦。
Description check ✅ Passed 描述详细阐明了背景问题、变更目的和具体实现方案,与代码变更内容完全相关并涵盖两个关键修复场景。
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/tao.gan/fix/ZSV-10538-alive-chain-decouple@@2

Comment @coderabbitai help to get the list of available commands and usage tips.

…eTree

VolumeTree previously conflated two independent concepts in isOnline /
resolveDirection:
  (a) is the snapshot on the volume's live backing chain (a structural
      property of the tree)
  (b) is the VM currently routed through the hypervisor (Running/Paused)

Two consequences:
  1. In VolumeSnapshotTreeBase.stepDelete, the multi-children "defer
     the alive-chain child to the final round" guard used isOnline,
     which returned false for every child when the VM was Stopped.
     The guard silently disappeared and child selection fell back to
     children.get(0), which could land on the volume's backing chain
     and corrupt the live chain on rebase failure.
  2. resolveDirection's shouldUseCommitStrategy was likewise gated on
     vmState, so direction=Auto + Stopped degraded to Pull (writing N
     copies of the (target - parent) delta into each child file)
     instead of a single-file Commit.

Split into two predicates:
  - isOnAliveChain(snapshotUuid): pure tree-structure query, used by
    stepDelete to identify the alive child regardless of vmState.
  - isHypervisorOperation(vmState): pure run-state predicate, used to
    pick libvirt vs primary-storage agent.

isOnline is rewritten as the compound (treeIsCurrent &&
isHypervisorOperation(vmState) && both endpoints on alive chain),
preserving its existing semantics for current callers.

resolveDirection now derives shouldUseCommitStrategy purely from tree
structure, so Auto + Stopped + alive-chain target now picks Commit.

stepDelete renames onlineChild to aliveChild and queries
isOnAliveChain, so the alive-child deferral protection now applies to
Stopped VMs as well.

Resolves: ZSV-10538

Change-Id: I6e63786d6d6b616f7a766b6c7463617572786969
@MatheMatrix MatheMatrix force-pushed the sync/tao.gan/fix/ZSV-10538-alive-chain-decouple@@2 branch from 19701c4 to 978db42 Compare May 25, 2026 03:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants