empty 5GB container DB
Affects | Status | Importance | Assigned to | Milestone |
---|---|---|---|---|
OpenStack Object Storage (swift) | Confirmed | Medium | Unassigned | |
Bug Description
A container DB that sees frequent PUT/DELETE traffic keeps growing. The file stays large even after all objects have been DELETED (empty container). Should the replicator vacuum it under specific conditions?
[root@prdd1slzs
SQLite version 3.6.20
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> select * from object;
sqlite>
[root@prdd1slzs
HTTP/1.1 204 No Content
X-Backend-
X-Container-
X-Put-Timestamp: 1414598357.76549
X-Backend-
### Original empty DB size ###
-rw------- 1 root root 5.7G May 17 21:42 94a91c0b60f945b
### Vacuum Test ###
-rw------- 1 root root 19K May 17 21:44 94a91c0b60f945b
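The before/after sizes above can be reproduced on a scratch database. The following is a minimal sketch using Python's standard `sqlite3` module; the `demo_vacuum` function, the two-column `object` table, and the row count are illustrative assumptions and do not match Swift's real container schema.

```python
import os
import sqlite3
import tempfile


def demo_vacuum(rows=20000):
    """Return (size_after_delete, size_after_vacuum) in bytes for a scratch DB.

    Hypothetical helper for illustration only; the table layout is a
    simplified stand-in, not Swift's container schema.
    """
    path = os.path.join(tempfile.mkdtemp(), "container.db")
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE object (name TEXT, size INTEGER)")
    # Simulate a burst of object PUTs.
    conn.executemany(
        "INSERT INTO object VALUES (?, ?)",
        (("obj-%08d" % i + "x" * 100, i) for i in range(rows)),
    )
    conn.commit()
    # Simulate deleting every object: the pages move to the freelist,
    # but the file itself is not truncated.
    conn.execute("DELETE FROM object")
    conn.commit()
    size_after_delete = os.path.getsize(path)
    # VACUUM rebuilds the file and drops the free pages.
    conn.execute("VACUUM")
    conn.close()
    size_after_vacuum = os.path.getsize(path)
    return size_after_delete, size_after_vacuum


if __name__ == "__main__":
    before, after = demo_vacuum()
    print("empty DB before vacuum: %d bytes, after vacuum: %d bytes" % (before, after))
```

The exact numbers depend on the SQLite page size and version, so only the relative drop is meaningful.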
* Running without vacuum brings several potential issues, including https:/
* Wasting network bandwidth.
* Longer DB lock time.
Hugo
Changed in swift:
importance: Undecided → Medium
status: New → Confirmed
The idea of vacuuming has come up before. In the past we haven't bothered with vacuum because the normal life cycle of a Swift container in most clusters is heavily PUT focused, and vacuuming was an overhead that wasn't worth it.
I don't think anyone is against it per se, but it would be nice to find out how much overhead we get and see whether the benefits are worth it.
Having said that, if we were to do it, doing it only when we already need to do a bunch of I/O, rather than ticking on some timer, would be better in my opinion. Like you say, it would be nice to send a vacuumed database when we need to push the whole database to a new node, say on a rebalance or on an rsync_then_merge. If we do it before we send, however, I could imagine an I/O spike on the node replicating the DB (vacuuming before we send) and then again when writing the DB on the recipient's end, so that is the overhead I'm speaking of.
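One way to keep that overhead bounded is to vacuum only when a large share of the file is reclaimable, and only at the point where the whole DB is about to be shipped anyway. The sketch below is an assumption about how such a check could look, not the attached patch or Swift's replicator code; `free_page_ratio`, `vacuum_if_worthwhile`, and the `min_free_ratio` threshold are hypothetical names. It relies only on SQLite's `PRAGMA page_count` and `PRAGMA freelist_count`.

```python
import sqlite3


def free_page_ratio(conn):
    """Fraction of the DB file's pages that sit unused on the freelist."""
    page_count = conn.execute("PRAGMA page_count").fetchone()[0]
    freelist_count = conn.execute("PRAGMA freelist_count").fetchone()[0]
    return float(freelist_count) / page_count if page_count else 0.0


def vacuum_if_worthwhile(db_path, min_free_ratio=0.5):
    """Vacuum db_path only if at least min_free_ratio of it is free pages.

    Hypothetical helper: intended to be called just before a whole-DB
    transfer, so the extra I/O is paid only when the file is copied anyway.
    Returns True if a VACUUM was run.
    """
    conn = sqlite3.connect(db_path)
    try:
        if free_page_ratio(conn) >= min_free_ratio:
            conn.execute("VACUUM")
            return True
        return False
    finally:
        conn.close()
```

With a threshold like this, a PUT-heavy container that rarely deletes never pays for a vacuum, while a mostly empty multi-gigabyte file is shrunk before it crosses the network. The right threshold would have to come from measurements on a real cluster.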
I would think it would look something like the attached patch.
NOTE: the patch is untested. It seems to run, but it is only a demonstration. It could be a starting point for testing the effects of vacuuming on a cluster (you can turn it on/off on the DB replicators).