[client] Fix stale metadata on readOnlyGateway by adding RetryableGatewayClientProxy #3390
loserwang1024 wants to merge 2 commits into
Conversation
fresh-borzoni
left a comment
@loserwang1024 Thank you for this very important PR. I left some comments, PTAL.
    private final AdminReadOnlyGateway readOnlyGateway;
    private final MetadataUpdater metadataUpdater;

    private static final int READ_ONLY_GATEWAY_MAX_RETRIES = 3;
With maxRetries=3, bootstrap reinit needs 4 refreshes. You only get 3 per request.
Shall we loop inside updateMetadata until either success or null-triggered bootstrap?
    cause);
    // Run metadata refresh and retry on a separate thread to avoid
    // blocking Netty IO threads that may complete the failed future.
    CompletableFuture.runAsync(
Do we want some backoff?
As written, all 3 retries fire within milliseconds, which seems wasteful when DNS is slow or pods are restarting.
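For illustration, the backoff being asked about could look roughly like the sketch below. It is not the PR's code: `BackoffRetrySketch`, `callWithBackoff`, `backoffMs`, and all parameters are hypothetical names, and the gateway call is modelled as a plain `Supplier` of a future. Each retry is scheduled on a separate `ScheduledExecutorService`, so the thread that completes the failed future (for example a Netty IO thread) is never blocked while waiting.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch of retry with exponential backoff; not the PR's implementation. */
final class BackoffRetrySketch {

    /** Delay before the retry that follows attempt attemptNo (0-based): baseMs * 2^attemptNo, capped at maxMs. */
    static long backoffMs(int attemptNo, long baseMs, long maxMs) {
        // Bound the shift so that small base delays cannot overflow a long.
        long delay = baseMs << Math.min(attemptNo, 20);
        return Math.min(delay, maxMs);
    }

    /** Runs the call, retrying up to maxRetries times with growing delays between attempts. */
    static <T> CompletableFuture<T> callWithBackoff(
            Supplier<CompletableFuture<T>> call,
            int maxRetries,
            long baseMs,
            long maxMs,
            ScheduledExecutorService scheduler) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(call, 0, maxRetries, baseMs, maxMs, scheduler, result);
        return result;
    }

    private static <T> void attempt(
            Supplier<CompletableFuture<T>> call,
            int attemptNo,
            int maxRetries,
            long baseMs,
            long maxMs,
            ScheduledExecutorService scheduler,
            CompletableFuture<T> result) {
        CompletableFuture<T> future;
        try {
            future = call.get();
        } catch (RuntimeException e) {
            // Treat a synchronous failure like an asynchronously failed future.
            future = new CompletableFuture<>();
            future.completeExceptionally(e);
        }
        future.whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (attemptNo >= maxRetries) {
                // Retries exhausted: surface the last error to the caller.
                result.completeExceptionally(error);
            } else {
                // Schedule the next attempt instead of sleeping on the completing thread.
                scheduler.schedule(
                        () -> attempt(call, attemptNo + 1, maxRetries, baseMs, maxMs, scheduler, result),
                        backoffMs(attemptNo, baseMs, maxMs),
                        TimeUnit.MILLISECONDS);
            }
        });
    }
}
```

In a real client, a metadata refresh would presumably run inside the scheduled task before the next attempt; that step is left out here because its API is not shown in this conversation.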
@fresh-borzoni, I've revised the design: instead of retrying 3 times, it now rebuilds metadata via refreshClusterUntilAvailable, which loops until either some IP becomes available or it falls back to bootstrap. No backoff for now: I'd rather keep refreshClusterUntilAvailable purely "loop until available or bootstrap" to avoid over-engineering. The two existing layers (connection timeout + bootstrap exponential backoff) already provide sufficient throttling.
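A minimal sketch of the "loop until available or bootstrap" idea described above, under stated assumptions: servers are modelled as plain strings, `fetchFrom` is a hypothetical function that returns the refreshed server list or `null` when a server is unreachable, and `bootstrap` is a hypothetical supplier that re-resolves the bootstrap addresses. It only illustrates the control flow and is not the actual refreshClusterUntilAvailable from the PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical sketch of "loop until available or bootstrap"; not the PR's implementation. */
final class RefreshUntilAvailableSketch {

    /**
     * Tries each known server in turn. The first one that answers supplies the
     * refreshed metadata. If every known server is unreachable, falls back to bootstrap.
     */
    static List<String> refreshUntilAvailable(
            List<String> knownServers,
            Function<String, List<String>> fetchFrom,
            Supplier<List<String>> bootstrap) {
        // Work on a copy so the caller's list is not modified.
        List<String> candidates = new ArrayList<>(knownServers);
        while (!candidates.isEmpty()) {
            // Each server is tried once and then dropped from the candidates.
            String server = candidates.remove(0);
            List<String> refreshed = fetchFrom.apply(server);
            if (refreshed != null) {
                return refreshed;
            }
        }
        // No known server answered: rebuild the cluster view from bootstrap.
        return bootstrap.get();
    }
}
```

Because the loop is bounded by the number of known servers rather than by a fixed retry count, it avoids the situation raised in the review where 3 retries are not enough to reach the bootstrap fallback.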
Purpose
Linked issue: close #3389
Brief change log
Tests
API and Format
Documentation