Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
8 changes: 2 additions & 6 deletions compute/src/main/java/org/zstack/compute/host/HostBase.java
Original file line number Diff line number Diff line change
Expand Up @@ -955,7 +955,7 @@ private void handle(final PingHostMsg msg) {
thdf.chainSubmit(new ChainTask(msg) {
@Override
public String getSyncSignature() {
return "do-ping-host";
return String.format("do-ping-host-%s", msg.getHostUuid());
}

@Override
Expand All @@ -979,13 +979,9 @@ public void fail(ErrorCode errorCode) {

@Override
public String getName() {
return String.format("do-ping-host-%s", msg.getHostUuid());
return getSyncSignature();
}

@Override
protected int getSyncLevel() {
return HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.value(Integer.class);
}
});
}

Expand Down
2 changes: 1 addition & 1 deletion conf/globalConfig/host.xml
Original file line number Diff line number Diff line change
Expand Up @@ -24,7 +24,7 @@
<config>
<category>host</category>
<name>ping.parallelismDegree</name>
<description>The max hosts management server sends ping command to host in parallel</description>
<description>The max hosts management server sends ping command to host in parallel. With per-host sync signature, this now controls per-host parallelism (one host does not block another).</description>
<defaultValue>100</defaultValue>
<type>java.lang.Integer</type>
</config>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -24,6 +24,7 @@ import org.zstack.testlib.SubCase
import org.zstack.utils.FieldUtils
import org.zstack.utils.gson.JSONObjectUtil

import java.util.concurrent.CountDownLatch
import java.util.concurrent.TimeUnit

class KVMPingCase extends SubCase {
Expand Down Expand Up @@ -460,6 +461,50 @@ class KVMPingCase extends SubCase {
}
}

void testDifferentHostPingsNotBlocked() {
canDoReconnectFunc = { HostReconnectTask.CanDoAnswer.Ready }

HostInventory kvm1 = env.inventoryByName("kvm1")
HostInventory kvm2 = env.inventoryByName("kvm2")

waitHostConnected(kvm1.uuid)
waitHostConnected(kvm2.uuid)

// Set parallelism to 1 so the test proves per-host isolation:
// old code (shared do-ping-host + level=1) → kvm2 blocked by kvm1
// new code (per-host do-ping-host-<uuid> + level=1) → kvm2 unblocked
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(1)

CountDownLatch blockKvm1 = new CountDownLatch(1)
CountDownLatch kvm2Pinged = new CountDownLatch(1)

env.simulator(KVMConstant.KVM_PING_PATH) { HttpEntity<String> e, EnvSpec espec ->
KVMAgentCommands.PingCmd cmd = JSONObjectUtil.toObject(e.getBody(), KVMAgentCommands.PingCmd.class)
def rsp = new KVMAgentCommands.PingResponse()

if (cmd.hostUuid == kvm1.uuid) {
blockKvm1.await(30, TimeUnit.SECONDS)
}

if (cmd.hostUuid == kvm2.uuid) {
kvm2Pinged.countDown()
}

rsp.hostUuid = cmd.hostUuid
return rsp
}

try {
assert kvm2Pinged.await(30, TimeUnit.SECONDS)
} finally {
Comment on lines +478 to +499
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

补充对 kvm1 已进入阻塞分支的断言,避免假阳性。

当前只在 Line 498 断言 kvm2 收到 ping;若 kvm1 分支未实际进入阻塞,这个用例仍可能通过,无法证明“一个主机阻塞时另一个主机不受影响”。建议增加一个 kvm1Entered latch,并先断言其触发。

建议修改
         CountDownLatch blockKvm1 = new CountDownLatch(1)
+        CountDownLatch kvm1Entered = new CountDownLatch(1)
         CountDownLatch kvm2Pinged = new CountDownLatch(1)
@@
             if (cmd.hostUuid == kvm1.uuid) {
+                kvm1Entered.countDown()
                 blockKvm1.await(30, TimeUnit.SECONDS)
             }
@@
         try {
+            assert kvm1Entered.await(30, TimeUnit.SECONDS)
             assert kvm2Pinged.await(30, TimeUnit.SECONDS)
         } finally {
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/src/test/groovy/org/zstack/test/integration/kvm/host/KVMPingCase.groovy`
around lines 478 - 499, The test can produce false positives because it never
asserts that kvm1 actually entered the blocking branch; add a new CountDownLatch
(e.g., kvm1Entered) and trigger kvm1Entered.countDown() inside the env.simulator
closure when cmd.hostUuid == kvm1.uuid before calling blockKvm1.await(...), then
assert kvm1Entered.await(30, TimeUnit.SECONDS) prior to asserting kvm2Pinged to
ensure kvm1 truly reached the blocked state; update references to the existing
blockKvm1, kvm2Pinged, kvm1 and the simulator for KVMConstant.KVM_PING_PATH and
KVMAgentCommands.PingCmd so the new latch is created and awaited in the same
test scope.

blockKvm1.countDown()
env.cleanSimulatorHandlers()
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(100)
recoverHostToConnected(kvm1.uuid)
Comment on lines +476 to +503
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

避免硬编码恢复 HOST_TRACK_PARALLELISM_DEGREE

Line 502 把并行度固定恢复为 100,会覆盖测试前真实配置,增加后续用例被污染和顺序相关失败的风险。建议在 Line 476 前保存原值,并在 finally 中恢复原值。

建议修改
+        Integer originalParallelism = HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.value(Integer.class)
         HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(1)
@@
-            HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(100)
+            HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(originalParallelism)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(1)
CountDownLatch blockKvm1 = new CountDownLatch(1)
CountDownLatch kvm2Pinged = new CountDownLatch(1)
env.simulator(KVMConstant.KVM_PING_PATH) { HttpEntity<String> e, EnvSpec espec ->
KVMAgentCommands.PingCmd cmd = JSONObjectUtil.toObject(e.getBody(), KVMAgentCommands.PingCmd.class)
def rsp = new KVMAgentCommands.PingResponse()
if (cmd.hostUuid == kvm1.uuid) {
blockKvm1.await(30, TimeUnit.SECONDS)
}
if (cmd.hostUuid == kvm2.uuid) {
kvm2Pinged.countDown()
}
rsp.hostUuid = cmd.hostUuid
return rsp
}
try {
assert kvm2Pinged.await(30, TimeUnit.SECONDS)
} finally {
blockKvm1.countDown()
env.cleanSimulatorHandlers()
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(100)
recoverHostToConnected(kvm1.uuid)
Integer originalParallelism = HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.value(Integer.class)
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(1)
CountDownLatch blockKvm1 = new CountDownLatch(1)
CountDownLatch kvm2Pinged = new CountDownLatch(1)
env.simulator(KVMConstant.KVM_PING_PATH) { HttpEntity<String> e, EnvSpec espec ->
KVMAgentCommands.PingCmd cmd = JSONObjectUtil.toObject(e.getBody(), KVMAgentCommands.PingCmd.class)
def rsp = new KVMAgentCommands.PingResponse()
if (cmd.hostUuid == kvm1.uuid) {
blockKvm1.await(30, TimeUnit.SECONDS)
}
if (cmd.hostUuid == kvm2.uuid) {
kvm2Pinged.countDown()
}
rsp.hostUuid = cmd.hostUuid
return rsp
}
try {
assert kvm2Pinged.await(30, TimeUnit.SECONDS)
} finally {
blockKvm1.countDown()
env.cleanSimulatorHandlers()
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(originalParallelism)
recoverHostToConnected(kvm1.uuid)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/src/test/groovy/org/zstack/test/integration/kvm/host/KVMPingCase.groovy`
around lines 476 - 503, The test currently hardcodes restoring
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE to 100 in the finally block,
which can pollute other tests; capture the original value before calling
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(1) (e.g. store it in
a local variable like originalDegree) and in the finally block call
HostGlobalConfig.HOST_TRACK_PARALLELISM_DEGREE.updateValue(originalDegree)
instead of updateValue(100), leaving the rest of the test (blockKvm1,
kvm2Pinged, simulators and recoverHostToConnected) unchanged.

recoverHostToConnected(kvm2.uuid)
}
}

@Override
void test() {
bus = bean(CloudBus.class)
Expand Down Expand Up @@ -490,6 +535,7 @@ class KVMPingCase extends SubCase {
testHostReconnectAfterPingFailure()
testContinuePingIfHostNoReconnect()
testNoPingIfHostNotReadyToReconnect()
testDifferentHostPingsNotBlocked()
testManagementNodeReadyConnectAllHost()
testPingConnectingHost()

Expand Down