OpenStack Object Storage (swift)

db replicator race leaves temporary database in tmp dir

Bug #1691566 reported by clayg on 2017-05-17

This bug affects 2 people

Affects		Status	Importance	Assigned to	Milestone
	OpenStack Object Storage (swift)	Confirmed	Medium	Unassigned

Bug Description

There's probably more than one reason a remote container-replicator might fail to call the REPLICATE RPC after rsyncing a database over (either the ReplicatorRpc complete_rsync or rsync_then_merge). Any such failure can cause temporary dbs to stack up in the temporary dir and waste disk space.

One common cause seems to be a race in which two (or more) remote container-replicators both try to complete_rsync a db onto a new node after a rebalance.

If the database is large and the network is busy, the race window is wide, so it's not uncommon to hit it.

When that happens, the loser of the race misses some cleanup code:

https://github.com/openstack/swift/blob/6e893e228840bc42cfd13546245438832bc2bb46/swift/common/db_replicator.py#L820

While it's probably reasonable to skip any sort of sync/merge and return the 404 error, the local container server should first clean up the temporary db that the remote is trying to tell us about.
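To make the proposed behavior concrete, here is a minimal sketch of the idea. It is not Swift's actual code: the function names, the flat file layout, and the 204 success status are assumptions for illustration only, and the real handler does more work (it operates on a database broker rather than plain files). The point is only that the loser of the race removes its temporary db before returning the 404.

```python
import os
import tempfile


def complete_rsync(tmp_dir, db_file, tmp_name):
    """Hypothetical stand-in for a complete_rsync-style handler.

    Returns an HTTP-like status code. Simplified for illustration.
    """
    tmp_path = os.path.join(tmp_dir, tmp_name)
    if os.path.exists(db_file):
        # Lost the race: another replicator already put a db in place.
        # Remove the temporary db we were told about before returning
        # the 404, so it does not linger in the tmp dir.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        return 404
    if not os.path.exists(tmp_path):
        return 404
    os.makedirs(os.path.dirname(db_file), exist_ok=True)
    os.rename(tmp_path, db_file)
    return 204  # assumed success status for this sketch


def simulate_race():
    """Two replicators rsync a temp db each, then both call the handler."""
    with tempfile.TemporaryDirectory() as root:
        tmp_dir = os.path.join(root, "tmp")
        os.makedirs(tmp_dir)
        db_file = os.path.join(root, "containers", "example.db")
        for name in ("rsync-a", "rsync-b"):
            with open(os.path.join(tmp_dir, name), "w") as f:
                f.write("db")
        first = complete_rsync(tmp_dir, db_file, "rsync-a")
        second = complete_rsync(tmp_dir, db_file, "rsync-b")
        return first, second, os.listdir(tmp_dir)
```

In this sketch the second caller still gets the 404, but the tmp dir ends up empty. Without the unlink in the first branch, "rsync-b" would be left behind.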

Otherwise it *will* get reaped if it's older than a reclaim age (lp bug #1691565)


Tim Burke (1-tim-z) on 2017-05-17

description:	updated

clayg (clay-gerrard) on 2017-12-28

Changed in swift:
importance:	Undecided → Medium
status:	New → Confirmed
