db replicator race leaves temporary database in tmp dir

Bug #1691566 reported by clayg
This bug affects 2 people
Affects: OpenStack Object Storage (swift)
Status: Confirmed
Importance: Medium
Assigned to: Unassigned
Milestone: (none)

Bug Description

There's probably more than one reason a remote container-replicator might fail to call the REPLICATE RPC (either the ReplicatorRpc complete_rsync or rsync_then_merge) after rsyncing a database over, which would cause temporary dbs to stack up (and waste disk space) in the temporary dir.

But one common way seems to be a race when two (or more) remote container-replicators are both trying to complete_rsync a remote db onto a new node after rebalance.

If the database is large and the network is busy, the race window is wide and it's not uncommon to hit it.

When that happens, the loser will miss some cleanup code:

https://github.com/openstack/swift/blob/6e893e228840bc42cfd13546245438832bc2bb46/swift/common/db_replicator.py#L820

While it's probably reasonable to avoid some sort of sync/merge and return the 404 error, the local container server should first clean up the temporary db which the remote is trying to tell us about.
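
A minimal sketch of the proposed behaviour, not the actual Swift code: the function name complete_rsync_sketch, its arguments, and the plain integer status codes are all hypothetical stand-ins for the real ReplicatorRpc handler. It assumes the handler is given the path of the rsynced temporary db and the path of the final db, and shows the loser of the race removing its temporary file before returning 404.

```python
import os


def complete_rsync_sketch(tmp_db_path, db_path):
    """Hypothetical stand-in for a complete_rsync style handler.

    Returns 204 if the temporary db was moved into place, 404 otherwise.
    """
    if os.path.exists(db_path):
        # Lost the race: another replicator already put a db in place.
        # Remove our temporary copy before reporting the 404 so it does
        # not pile up in the tmp dir.
        try:
            os.unlink(tmp_db_path)
        except FileNotFoundError:
            pass  # nothing to clean up
        return 404
    if not os.path.exists(tmp_db_path):
        # The remote told us about a temporary db that is not there.
        return 404
    os.makedirs(os.path.dirname(db_path), exist_ok=True)
    os.rename(tmp_db_path, db_path)
    return 204
```

In this sketch, if two remotes rsync temporary copies and call the handler one after the other, the first call moves its copy into place and the second call deletes its own copy and returns 404. The check and the rename are not atomic here, so a real fix would still need to handle two calls overlapping in time.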

Otherwise it *will* get reaped if it's older than a reclaim age (lp bug #1691565)

Tim Burke (1-tim-z)
description: updated
clayg (clay-gerrard)
Changed in swift:
importance: Undecided → Medium
status: New → Confirmed