pack-on-the-fly heuristic probably inaccurate for chk groups

Bug #476118 reported by John A Meinel
This bug affects 1 person
Affects   Status     Importance  Assigned to  Milestone
Bazaar    Confirmed  High        Unassigned
Breezy    Triaged    Medium      Unassigned

Bug Description

Somewhat related to bug #402662.

The current 'insert_record_stream' code monitors the incoming groups and tries to repack any group that doesn't seem 'full enough'. However, for chk groups we currently split them early on purpose, so that we only cluster similar nodes together. (We don't expect to get much, if any, cross-group compression, and splitting early helps ensure that offset references stay small.)

The heuristic is a bit weak, though, as it assumes that any group that isn't larger than X should be repacked. I *think* this causes us to often repack all chk pages during fetch. At least, I think I've seen a lot of 'bzr branch' times that I thought would be bandwidth bound, but showed up as CPU bound.
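To illustrate the concern, here is a minimal sketch of a size-only heuristic. It is not the bzrlib code: the function names, the threshold value standing in for "X", and the group sizes are all made up for the example. It only shows why groups that were split early on purpose would all fall under such a threshold.

```python
# Hypothetical sketch, not bzrlib code. MIN_FULL_BYTES stands in for the
# "X" mentioned above; its value here is an assumption.
MIN_FULL_BYTES = 2 * 1024 * 1024


def should_repack(group_size, threshold=MIN_FULL_BYTES):
    """Size-only heuristic: any group not larger than the threshold is repacked."""
    return group_size <= threshold


def count_repacks(group_sizes, threshold=MIN_FULL_BYTES):
    """Count how many of the incoming groups the heuristic would repack."""
    return sum(1 for size in group_sizes if should_repack(size, threshold))


# Made-up sizes: text groups filled close to capacity, and chk groups
# that were deliberately split early so only similar nodes share a group.
text_groups = [4 * 1024 * 1024, 3 * 1024 * 1024]
chk_groups = [64 * 1024, 48 * 1024, 80 * 1024]

print(count_repacks(text_groups), "of", len(text_groups), "text groups repacked")
print(count_repacks(chk_groups), "of", len(chk_groups), "chk groups repacked")
```

Under these assumed numbers every chk group is repacked and no text group is, which is the pattern that would turn a fetch from bandwidth bound into CPU bound.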

I haven't dug deep enough to fully confirm this bug, but I wanted to make sure not to forget about it.

There are probably a few possible fixes:
1) Work on bug #402662 to pack more chk nodes in a single group, thus working with the current heuristic.
2) Postpone the recompression until the *next* group is read. This will help when we have a sub-stream with only a single group in it (for all related chk pages). We don't need to recompress that group because we don't have any other data to *put* into it. However, this is slightly at odds with the "don't buffer" desire for 'get_record_stream'.
3) Come up with some other heuristic for chk nodes.
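Option 2 could look roughly like the sketch below. This is only an illustration under stated assumptions: 'plan_repacks', the (name, size) tuple shape, and the threshold are hypothetical and do not correspond to existing bzrlib functions. The idea is to hold back one group and decide about it only once another group has arrived, so a sub-stream with a single group is never recompressed.

```python
# Hypothetical sketch, not bzrlib code. Groups are modelled as
# (name, size) tuples purely for illustration.
def plan_repacks(groups, threshold):
    """Yield (group, repack) pairs, deciding on a group only after the next one is read."""
    pending = None
    for group in groups:
        if pending is not None:
            # A following group exists, so there is other data that could
            # be combined with the pending group if it is too small.
            yield pending, pending[1] <= threshold
        pending = group
    if pending is not None:
        # End of the stream: nothing else could be put into this group,
        # so recompressing it would be wasted CPU.
        yield pending, False


single = [("chk-pages", 10)]
several = [("a", 10), ("b", 200), ("c", 5)]

print(list(plan_repacks(single, threshold=100)))
print(list(plan_repacks(several, threshold=100)))
```

The cost is that one group is always held in memory before a decision is made, which is the buffering trade-off mentioned in option 2.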

Jelmer Vernooij (jelmer)
tags: added: packs
Jelmer Vernooij (jelmer)
tags: added: check-for-breezy
Jelmer Vernooij (jelmer)
tags: added: performance
removed: check-for-breezy
Changed in brz:
status: New → Triaged
importance: Undecided → Medium