Batch fine generator benefits from low-hanging optimization
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Evergreen | New | Wishlist | Unassigned |
Bug Description
Evergreen 2.12 / Wishlist
The batch fine generator could be smarter about caching org unit settings. It makes 3 to 5 org unit setting lookup calls per transaction. On my concerto-based server, I see 1830 org setting lookups for 394 circs. On a large data set, this can have a significant impact on performance. With effective caching, this number can be dropped to 3 to 5 org setting lookups per circ lib instead of per transaction.
To leverage the caching, the fine generator has to collect and process parallel batches within the API (a la hold targeter v2) instead of having the caller collect the batches and run a single API call per transaction.
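To illustrate the caching idea, here is a minimal sketch in Python (not the actual Evergreen Perl code). The `SettingCache` class, `fake_fetch`, and the setting names are hypothetical stand-ins. It assumes 3 settings are consulted per transaction and shows the backend lookup count dropping from per-transaction to per-circ-lib.

```python
class SettingCache:
    """Caches org unit setting values for the life of one generator instance."""

    def __init__(self, fetch):
        self._fetch = fetch      # hypothetical backend lookup: (org_id, name) -> value
        self._values = {}
        self.backend_calls = 0   # counts lookups that actually reach the backend

    def get(self, org_id, name):
        key = (org_id, name)
        if key not in self._values:
            self.backend_calls += 1
            self._values[key] = self._fetch(org_id, name)
        return self._values[key]


def fake_fetch(org_id, name):
    # Stand-in for the real org unit setting lookup (an assumption for this sketch).
    return f"{name}@{org_id}"


# Placeholder setting names; the real generator consults 3 to 5 settings.
SETTINGS = ["setting_a", "setting_b", "setting_c"]


def generate_fines(circ_libs, cache):
    # One pass per transaction; each transaction consults every setting.
    for org_id in circ_libs:
        for name in SETTINGS:
            cache.get(org_id, name)


circs = [4, 4, 5, 4, 5, 6, 4]   # circ lib of each of 7 transactions
cache = SettingCache(fake_fetch)
generate_fines(circs, cache)
print(cache.backend_calls)       # 9: 3 circ libs x 3 settings, versus 21 uncached
```

The cache only pays off when one generator instance handles many transactions, which is why the batches have to be collected inside the API.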
Patch en route. Running this patch in a test cluster with a sizable data set shaved 25%-30% off the batch generator run time and, perhaps more importantly, avoids several million duplicate API/database calls.
Note the proposed changes have no impact on single-transaction fine generation.
http://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/berick/lp1705728-fine-gen-cache-parallel
From the commit:
* Fine generator caches org unit setting values per instance. Once cached, the number of cstore calls per transaction is reduced by 3 to 5 calls, depending on context / settings.
* Fine generator disconnects from cstore after processing each transaction, giving cstores a chance to recycle and avoiding memory gobbling on huge batches.
* Fine generator now collects parallel batches of transactions directly within the server-side generator API instead of requiring the caller to collect transactions up front for individual processing. This lets us take advantage of the org setting caching.
* Fine generator script improvements:
** Arguments are passed via GetOpt, with support for legacy-style opensrf config and lockfile name passing (with warnings).
** Supports a --parallel option to override the value from opensrf.xml.
** Sets OSRF_LOG_CLIENT for log tracing.
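The batch collection described in the commit can be sketched roughly as follows, again in Python rather than the project's Perl. The round-robin split, the `process_batch` helper, and the thread pool are assumptions made for illustration; the commit message does not say how the patch divides the work. The point is that each parallel worker handles a whole batch and so keeps one setting cache for many transactions.

```python
from concurrent.futures import ThreadPoolExecutor


def split_batches(txns, parallel):
    """Deal transactions round-robin into at most `parallel` batches."""
    batches = [[] for _ in range(parallel)]
    for i, txn in enumerate(txns):
        batches[i % parallel].append(txn)
    return [b for b in batches if b]


def process_batch(batch):
    # One cache per worker, shared by every transaction in the batch.
    cache = {}
    for txn_id, circ_lib in batch:
        # Stand-in for "look up settings once per circ lib, then bill the txn".
        cache.setdefault(circ_lib, f"settings-for-{circ_lib}")
    return len(batch), len(cache)   # (transactions handled, circ libs looked up)


def run(txns, parallel):
    batches = split_batches(txns, parallel)
    with ThreadPoolExecutor(max_workers=parallel) as pool:
        return list(pool.map(process_batch, batches))


# 10 hypothetical transactions spread across circ libs 4, 5 and 6.
txns = [(i, 4 + i % 3) for i in range(10)]
results = run(txns, parallel=2)
print(results)   # [(5, 3), (5, 3)]
```

Each worker here performs 3 setting lookups for 5 transactions; with one API call per transaction, every call would start with an empty cache.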
==
Unclear if release notes or tests are required, since there are no functionality changes to the generator itself and all fine generator script changes are backwards compatible. Testing essentially boils down to "does the fine generator still work, a little faster".
For further data, running the patches on my concerto-based VM shaves 48% off the run time.