Skip to content

<fix>[header]: add VM migrate host-constraint extension point#4062

Open
zstack-robot-1 wants to merge 1 commit into
5.5.22from
sync/yaohua.wu/bugfix/ZSTAC-85208@@2
Open

<fix>[header]: add VM migrate host-constraint extension point#4062
zstack-robot-1 wants to merge 1 commit into
5.5.22from
sync/yaohua.wu/bugfix/ZSTAC-85208@@2

Conversation

@zstack-robot-1
Copy link
Copy Markdown
Collaborator

ZSTAC-85208

Root Cause

DRS automatic load balancing kept generating migration tasks for VMs that
hold passthrough devices (physical PCI / dGPU / mdev). Such VMs cannot be
live-migrated, so every advised migration was bound to fail. DRS had no
way to learn which VMs are pinned to their current host.

Changes

Module Change
header Add VmMigrateHostConstraintExtensionPoint interface — lets components report the subset of candidate VMs pinned to their host by passthrough devices

Note

This MR adds the extension point interface only. The DRS query logic and
the device-side implementations live in the related premium MR (same
branch bugfix/ZSTAC-85208@@2).

Related: ZSTAC-85208

sync from gitlab !9960

Introduce VmMigrateHostConstraintExtensionPoint so that components
can report VMs pinned to their current host by passthrough devices.

1. Why is this change necessary?
DRS automatic load balancing kept generating migration tasks for VMs
that hold passthrough devices (physical PCI / dGPU / mdev). Such VMs
cannot be live-migrated, so every advised migration was bound to
fail. DRS had no way to learn which VMs are pinned to their host.

2. How does it address the problem?
A new extension point, VmMigrateHostConstraintExtensionPoint, lets
device-owning components report the subset of candidate VMs that
cannot be migrated off their current host. DRS queries it before
building migration advice. VF / vDPA NIC devices are intentionally
excluded since they support migration.

3. Are there any side effects?
None. The interface only adds a new query extension point and does
not change existing behavior on its own.

# Summary of changes (by module):
- header: add VmMigrateHostConstraintExtensionPoint interface

Related: ZSTAC-85208
Change-Id: Ia303bf54f47fdd54a2814d7e6b298c541f4cf4bf
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented May 22, 2026

Review Change Stack

总体说明

此 PR 新增了一个公共扩展点接口 VmMigrateHostConstraintExtensionPoint,用于 DRS 调度在虚拟机迁移时查询候选 VM 中哪些因宿主机上存在直通设备而被固定、无法进行在线迁移。

变更清单

VM 迁移主机约束接口

文件 / 层级 说明
VmMigrateHostConstraintExtensionPoint 接口
header/src/main/java/org/zstack/header/vm/VmMigrateHostConstraintExtensionPoint.java
新增接口定义 getHostBoundVmUuids(List<String> candidateVmUuids) 方法,返回无法迁移的 VM UUID 子集。接口文档明确说明语义用途,并排除 VF/vDPA NIC 等不应在此约束中报告的设备场景。

代码审查工作量估算

🎯 1 (Trivial) | ⏱️ ~3 分钟

庆祝诗

🐰 新接口轻轻降落,
约束查询从此有迹可循,
DRS 调度更聪慧,
主机约束清晰可见,
直通设备不再困扰!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Check name Status Explanation
Title check ✅ Passed 标题完全相关于变更集,清晰地总结了主要变更:添加VM迁移主机约束扩展点接口。
Description check ✅ Passed 描述与变更集相关,包括问题背景、解决方案和变更内容,与新增的扩展点接口直接相关。
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch sync/yaohua.wu/bugfix/ZSTAC-85208@@2

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
header/src/main/java/org/zstack/header/vm/VmMigrateHostConstraintExtensionPoint.java (1)

17-22: ⚡ Quick win

可选建议:增强方法契约文档的完整性

当前的 Javadoc 已经清晰描述了方法的主要用途,但可以考虑补充以下边界情况的说明以提高实现者的清晰度:

  • candidateVmUuids 参数为 null 或包含 null 元素时的预期行为
  • 返回空列表 [] 的明确语义(即没有任何候选 VM 被宿主机约束)
  • 实现是否允许返回不在输入列表中的 UUID
📝 建议的文档增强示例
     /**
      * `@param` candidateVmUuids VM uuids being considered for migration
+     *                         (must not be null; null elements are ignored)
      * `@return` the subset of candidateVmUuids that cannot be migrated off
      *         their current host because of attached passthrough devices
+     *         (returns empty list if none are host-bound; must not return
+     *         UUIDs that are not present in the input list)
      */
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In
`@header/src/main/java/org/zstack/header/vm/VmMigrateHostConstraintExtensionPoint.java`
around lines 17 - 22, Update the Javadoc for
VmMigrateHostConstraintExtensionPoint.getHostBoundVmUuids(List<String>
candidateVmUuids) to explicitly document null/empty handling and return
semantics: state what happens if candidateVmUuids is null or contains null
elements (e.g., throw NPE or treat as empty), clarify that returning an empty
list means no VMs are host-bound, and explicitly require that implementations
only return UUIDs from the provided candidateVmUuids (or document if returning
other UUIDs is allowed); ensure the contract mentions whether the returned list
may be mutable or must be immutable.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In
`@header/src/main/java/org/zstack/header/vm/VmMigrateHostConstraintExtensionPoint.java`:
- Around line 17-22: Update the Javadoc for
VmMigrateHostConstraintExtensionPoint.getHostBoundVmUuids(List<String>
candidateVmUuids) to explicitly document null/empty handling and return
semantics: state what happens if candidateVmUuids is null or contains null
elements (e.g., throw NPE or treat as empty), clarify that returning an empty
list means no VMs are host-bound, and explicitly require that implementations
only return UUIDs from the provided candidateVmUuids (or document if returning
other UUIDs is allowed); ensure the contract mentions whether the returned list
may be mutable or must be immutable.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: http://open.zstack.ai:20001/code-reviews/zstack-cloud.yaml (via .coderabbit.yaml)

Review profile: CHILL

Plan: Pro

Run ID: a5cf15cf-a49d-439c-89f1-2490bcd85b36

📥 Commits

Reviewing files that changed from the base of the PR and between c85d122 and 24acbe0.

📒 Files selected for processing (1)
  • header/src/main/java/org/zstack/header/vm/VmMigrateHostConstraintExtensionPoint.java

@ZStack-Robot
Copy link
Copy Markdown
Collaborator

Comment from yaohua.wu:

Review: MR !9960 + !14039 — ZSTAC-85208

本次 review 覆盖同一分支 bugfix/ZSTAC-85208@@2 的两个关联 MR:

  • zstack !9960 — 新增 VmMigrateHostConstraintExtensionPoint 接口
  • premium !14039 — DRS 候选过滤 + PCI/mdev 实现 + 集成测试

Background (preserved across rounds)

  • Jira: ZSTAC-85208 — [DRS][PCI] DRS 调度带 PCI/GPU 设备云主机迁移失败后错误释放 PCI 设备绑定(P0)
  • Bug summary: 开启 Automatic DRS 的 KVM 集群中,DRS 调度不感知 PCI/GPU/MDEV 直通设备绑定,会对这类无法在线迁移的云主机发起迁移;迁移必然失败,回滚时 PciDeviceManager.failedToMigrateVm() 无条件解绑所有 PCI 设备,导致 ZStack DB 与 QEMU/Guest 实际状态不一致(设备可能被二次分配 → 物理 GPU 双重分配)。
  • Intent & scope: 采用分析建议的「方案 A(预防层)」——让 DRS 在生成迁移建议前感知并跳过被直通设备固定在宿主的云主机。新增 header 扩展点接口,drs 查询扩展点过滤候选池,mevoco 的 PciDeviceManager / MdevDeviceManagerImpl 实现该接口(显式排除 VF/vDPA NIC)。
  • Round 1 initial findings: 0 × P0, 1 × P1, 2 × P2
  • Suggested fix direction: 确认「方案 B(加固 failedToMigrateVm)」的后续跟踪;对扩展点返回值做空值保护;补齐 mdev / VF-NIC 测试覆盖。

总体评价

实现思路与 Jira 技术分析的「方案 A」一致,质量良好:

  • 扩展点接口正确置于 header,javadoc 清晰,明确说明 VF/vDPA NIC 不应上报。
  • getHostBoundVmUuids 的判定语义与 validate(APIMigrateVmMsg) 逐项对齐(dGPU 走 DGpuDeviceVO;物理 PCI 走 PciDeviceVO 并扣除 VmVfNicVO.pciDeviceUuid / secondaryPciDeviceUuid / VmVdpaNicVO.pciDeviceUuid 三类 NIC 背书设备)——已逐行核对,一致。
  • excludeVmsBoundToHostByPassthroughDevice 的调用位置经核实安全:HostNode 负载来自 host 级指标(HostLoad.getUsedCPUPercent/UsedPhysicalMemoryBit),与 vmList 无关,移除 VM 不影响主机负载判定与其余 VM 的均衡,注释「Host load metrics are left untouched」属实。
  • pciDevice.xml 扩展点注册到了正确的两个 bean(PciDeviceManager / MdevDeviceManagerImpl),两类均已 implements 该接口,import 与编译无问题。
  • Upstream freshness:两仓分支 merge-base 均等于各自 upstream/5.5.22 HEAD,无冲突、无需 rebase。

P1 — High

# File:Line Issue Fix Lens Conf Route
1 premium PciDeviceManager.failedToMigrateVm() / MdevDeviceManagerImpl.failedToMigrateVm()(diff 外) 本 MR 只实现了「方案 A(预防层)」,未实现 Jira 分析中标注为「必做」的「方案 B(加固层)」。根因二——迁移失败回滚无条件解绑 PCI/mdev 设备——仍然存在。方案 A 关闭了 DRS 这一触发入口,但回滚缺陷本身仍可达:DRS 在 T0 完成过滤、doMigrateVm 于 T1 真正发出迁移消息之间,若有云主机被挂载 PCI 设备(TOCTOU 窗口),该云主机仍会被迁移→失败→错误解绑;HA 等其他内部 MigrateVmMsg 路径同理需评估。后果是控制面/数据面不一致、物理 GPU 双重分配。 确认方案 B 是否已作为后续 issue 跟踪;建议加固 failedToMigrateVm:仅当 destHostUuid != null && destHostUuid.equals(inv.getHostUuid())(云主机确已落到目标 host)时才执行 detachPciDevicesFromVm(...)MdevDeviceManagerImpl.failedToMigrateVm() 同步处理。 correctness / robustness / adversarial 0.80 advisory → human

P2 — Moderate

# File:Line Issue Fix Lens Conf Route
1 premium DRSBase.java:746-748;接口 !9960 hostBoundVmUuids.addAll(ext.getHostBoundVmUuids(candidateVmUuids)) 未对扩展点返回值做空值保护。当前两个实现不会返回 null,但 VmMigrateHostConstraintExtensionPointheader 中的公共扩展点,javadoc 未约定「不得返回 null」,未来任意实现返回 null 都会让 DRS 调度流程抛 NPE 并整体失败。 addAll 前判空:List<String> r = ext.getHostBoundVmUuids(...); if (r != null) hostBoundVmUuids.addAll(r);;并在本 MR 接口 javadoc 中明确「实现不得返回 null,无匹配时返回空列表」。 robustness 0.85 gated_auto → downstream-resolver
2 premium DRSSkipMigratingPciDeviceVmCase.groovy 集成测试只覆盖「物理 PCI 直通 GPU 云主机被排除 + 普通云主机仍被调度」一条分支。未覆盖:(a) mdev 设备云主机被排除——MdevDeviceManagerImpl.getHostBoundVmUuids 零测试覆盖;(b) 带 VF/vDPA NIC 的云主机不被误过滤——这是分析明确标注的回归风险点,而 getHostBoundVmUuids 中专门的 NIC 排除逻辑目前零覆盖,一旦回归会导致合法可迁移云主机被 DRS 静默跳过。 增补断言/用例:mdev 云主机被排除;带 VF NIC 的云主机仍出现在 DRS 迁移建议中。 testing 0.90 gated_auto → downstream-resolver

Coverage

  • 文件覆盖:!9960 1 文件 + !14039 5 文件(conf / drs / mevoco×2 / test),全部 deep-review;无生成文件,无 diff 截断。
  • 交叉校验:MCP diff 与本地 git diff base..head 的文件数/行数一致(zstack 1 文件 23 行;premium 5 文件 441 行)。
  • 编译检查:VmMigrateHostConstraintExtensionPointorg.zstack.header.vm.* 通配导入)、Collections/HashSet/Setjava.util.*)、DRSBase.pluginRgty 字段均已确认存在,无编译问题。
  • 注释扫描:新增注释无代码内 Jira-key 引用(测试类 javadoc 中的 ZSTAC-85208 属测试用例命名惯例,不计)。

Verdict: APPROVED

当前 MR 在其声明范围(DRS 预防层 / 方案 A)内实现正确、逻辑清晰,无阻塞性(P0)问题,可以合并。但强烈建议:

  1. 确认方案 B 的去向(P1)——根因二的回滚缺陷仍在代码中且仍可达,建议作为后续 issue 跟踪或一并修复;
  2. 合并前补齐 mdev / VF-NIC 测试覆盖(P2-2),并对扩展点返回值加空值保护(P2-1)。

🤖 Robot Reviewer

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants