Skip to content

[server]: remove orphan files and directories when tablet starts up#3388

Open
gyang94 wants to merge 1 commit into
apache:mainfrom
gyang94:fix-ts-orphan-files
Open

[server]: remove orphan files and directories when tablet starts up#3388
gyang94 wants to merge 1 commit into
apache:mainfrom
gyang94:fix-ts-orphan-files

Conversation

@gyang94
Copy link
Copy Markdown
Contributor

@gyang94 gyang94 commented May 27, 2026

Purpose

Linked issue: close #3387

Brief change log

Two changes, one in each cleanup path:

1. ReplicaManager: handle NoneReplica with deleteLocal=true

When the NoneReplica branch receives deleteLocal=true, look up the bucket in LogManager.currentLogs. If present, the log tablet was loaded at startup but never registered in allReplicas — it is an orphan. Drop the log via logManager.dropLog(), delete the sibling KV tablet directory (via kvManager.dropKv() if loaded, otherwise direct FileUtils.deleteDirectory), update disk usage accounting, and record the parent directory for empty-dir cleanup.

This handles the partition deletion scenario.

2. LogManager: clean up empty parent directories in SchemaNotExistException handler

After the existing handler deletes log-N/ and kv-N/, check whether the parent directory is empty and delete it. For partitioned tables, also check the grandparent (table directory). For non-partitioned tables, the grandparent is the database directory and must NOT be deleted. This is safe under parallel loadAllLogs execution — deleteDirectoryQuietly tolerates races.

This handles the table deletion scenario.

Tests

API and Format

Documentation

@gyang94 gyang94 changed the title fix: remote orphan files and directories when tablet starts up [server]: remove orphan files and directories when tablet starts up May 27, 2026
@wuchong wuchong requested a review from Copilot May 31, 2026 09:24
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes orphan replica directory leaks on TabletServer restart during table/partition deletion (issue #3387). Adds two cleanup paths corresponding to the two scenarios in the issue: a startup-time empty parent/table-dir sweep in LogManager's SchemaNotExistException handler (table-deletion case) and a new sweepOrphanTabletDirs invoked from ReplicaManager.stopReplicas when a NoneReplica receives deleteLocal=true (partition-deletion case).

Changes:

  • ReplicaManager: new sweepOrphanTabletDirs drops the orphan log via LogManager, removes the sibling KV tablet (via KvManager.dropKv or direct FileUtils.deleteDirectory), updates LocalDiskManager accounting, and registers the parent dir for empty-dir cleanup.
  • LogManager: after deleting log-N/ and kv-N/ on SchemaNotExistException, also remove the now-empty parent dir, and for partitioned tables additionally remove the empty grandparent (table) dir.
  • StopReplicaITCase: two new IT tests covering table-drop (startup cleanup) and partition-drop (orphan-sweep on stopReplica) while a TabletServer is offline.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.

File Description
fluss-server/src/main/java/org/apache/fluss/server/replica/ReplicaManager.java Handles NoneReplica + deleteLocal=true by sweeping orphan log/KV dirs loaded at startup.
fluss-server/src/main/java/org/apache/fluss/server/log/LogManager.java Cleans up empty partition/table parent directories after residual tablet dirs are deleted.
fluss-server/src/test/java/org/apache/fluss/server/coordinator/StopReplicaITCase.java New IT tests for the two orphan-cleanup paths during offline drop.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread fluss-server/src/main/java/org/apache/fluss/server/log/LogManager.java Outdated
Comment thread fluss-server/src/main/java/org/apache/fluss/server/replica/ReplicaManager.java Outdated
@gyang94 gyang94 force-pushed the fix-ts-orphan-files branch from 7757a66 to 1bf4582 Compare June 1, 2026 02:22
Copy link
Copy Markdown
Contributor

@swuferhong swuferhong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @gyang94. Thanks for your great work. I left some comments, pls take a look.

The commit msg has a typo: "remote orphan files" — should be "remove orphan files".

// database dir — must NOT remove. Safe under parallel
// execution: the last job to finish finds the dir
// empty and removes it; deleteDirectoryQuietly
// tolerates races.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

spotless the docs.

Comment on lines +577 to +578
if (parentDir != null && FileUtils.isDirectoryEmpty(parentDir)) {
FileUtils.deleteDirectoryQuietly(parentDir);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a TOCTOU race: between the isDirectoryEmpty check and deleteDirectoryQuietly, another thread/loadLog task could create a file. Consider using Files.delete(path) which fails on non-empty dirs, making this truly safe under concurrency.

FileUtils.deleteDirectoryQuietly(parentDir);
if (isPartitioned) {
File tableDir = parentDir.getParentFile();
if (tableDir != null && FileUtils.isDirectoryEmpty(tableDir)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For the table drop case, the startup cleanup via SchemaNotExistException works well.

However, for the partition drop case (table still exists, only partition dropped while TS was offline), the orphan files won't be auto-cleaned. When the TS restarts, loadLog() succeeds because the table schema still exists in ZK, so the log is loaded normally as a NoneReplica. The sweepOrphanTabletDirs mechanism requires a stopReplica(delete=true) to trigger, but the Coordinator won't re-send it — by the time the TS reconnects, the Coordinator has already marked the deletion as "successful" (after retry exhaustion) and removed all tracking metadata for that partition's replicas.

I think you need consider adding a startupcheck for partition existence (similar to the schema check) to cover this gap.

@gyang94 gyang94 force-pushed the fix-ts-orphan-files branch from 1bf4582 to ef8f194 Compare June 1, 2026 08:24
@gyang94 gyang94 closed this Jun 1, 2026
@gyang94 gyang94 reopened this Jun 1, 2026
Copy link
Copy Markdown
Contributor

@swuferhong swuferhong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gyang94 . LGTM allover, I Left some comments.

} catch (Exception kvDeleteException) {

boolean isPartitioned = pathAndBucket.f0.getPartitionName() != null;
File parentDir = tabletDir.getParentFile();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recommend named to partitionDir.

// do nothing fort this case.
if (data.isDeleteLocal()) {
try {
sweepOrphanTabletDirs(tb, deletedTableIds, deletedPartitionIds);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No test to cover this method. Both new tests drop before the TS restart, so ZK metadata is already absent at startup — this only exercises path A (SchemaNotExistException handler). The newly added ReplicaManager.sweepOrphanTabletDirs (path B) has no test coverage.

Consider adding a case that hits the NoneReplica + deleteLocal branch: restart the TS first (metadata still present → log loaded but no replica), then drop the table/partition.

// Table schema exists, but the partition may have been dropped while
// this TS was offline. Check partition registration in ZK.
String partitionName = physicalTablePath.getPartitionName();
if (partitionName != null && !zkClient.getPartition(tablePath, partitionName).isPresent()) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Validates only partitionName against ZK, not partitionId.

If a partition is dropped and recreated under the same name (new partitionId), the stale on-disk directory for the old partitionId passes the check and gets loaded as valid data, while its TableBucket still carries the old partitionId. Validating partitionId (e.g. comparing the ZK PartitionRegistration.getPartitionId() with the one parsed from the dir) would be more robust.

@gyang94 gyang94 force-pushed the fix-ts-orphan-files branch from ef8f194 to 2193cdf Compare June 2, 2026 08:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[server] TabletServer leaves orphan replica directories after restart during table/partition deletion

3 participants