Search suggestions can make searches very slow in 3.11

Bug #2038472 reported by Jeff Davis
This bug affects 1 person
Affects    Status     Importance  Assigned to  Milestone
Evergreen  Confirmed  High        Jeff Davis
3.12       New        Undecided   Unassigned

Bug Description

EG 3.11

Searches can be extremely slow on systems with many bib records, due to the time required to process a large number of potential search suggestions.

On our 3.11.1 test server with 4 million bib records, a keyword search for "rainbew" takes >23 seconds to complete. Almost all of that search time is due to the search.symspell_lookup call in search.symspell_suggest, which has to process 62,000 potential search suggestions in order to return the 3 suggestions we actually want.

This test environment has 7.2 million search.symspell_dictionary entries which were freshly generated using the sideload process. It uses the default values for internal flags (symspell.prefix_length = 6, symspell.max_edit_distance = 3), and the keyword metabib class has max_suggestions = 3.

Galen Charlton (gmc) wrote:

Confirmed in a test database of 1.9 million bibs, where the search.symspell_lookup() time is about 10 times that of a 3.9 system with the same dataset.

Changed in evergreen:
status: New → Confirmed
Galen Charlton (gmc) wrote:

This looks to be, in part, a regression of the fix for bug 1931162. As a workaround, dropping symspell.max_edit_distance down to 2 should reduce the number of short prefixes with large numbers of suggestions that get checked.
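
To illustrate why the edit distance matters, here is a minimal Python sketch of generic SymSpell-style delete generation. It is not Evergreen's search.symspell_lookup code; the function name delete_variants is hypothetical, and it simply assumes that lookup keys are produced by deleting up to max_edit_distance characters from the first prefix_length characters of the search term.

```python
from itertools import combinations


def delete_variants(term, prefix_length=6, max_edit_distance=3):
    """Group the delete variants of term's prefix by edit distance.

    Generic symmetric-delete illustration only; this is an assumption
    about the general technique, not Evergreen's implementation.
    """
    prefix = term[:prefix_length]
    variants = {}
    for distance in range(max_edit_distance + 1):
        keep = len(prefix) - distance
        if keep < 1:
            break  # nothing left to delete from
        # Every way of keeping `keep` characters, in their original order.
        variants[distance] = {
            "".join(chars) for chars in combinations(prefix, keep)
        }
    return variants


for limit in (3, 2):
    by_distance = delete_variants("rainbew", max_edit_distance=limit)
    keys = set().union(*by_distance.values())
    shortest = min(len(key) for key in keys)
    print(f"max_edit_distance={limit}: {len(keys)} keys, shortest has {shortest} chars")
```

For "rainbew" with the default prefix length of 6, this produces 42 lookup keys, some only three characters long, at a maximum edit distance of 3, versus 22 keys of at least four characters at a distance of 2. Shorter keys are shared by many more dictionary words, which is consistent with the large candidate counts described above.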

Galen Charlton (gmc)
Changed in evergreen:
importance: Undecided → High
tags: added: performance search
Mike Rylander (mrylander) wrote:

Galen is correct: the core "smarts" of the optimizations embodied in the 1282 upgrade script -- in-function caching and better operation ordering -- were lost during later development, either because that development started on top of the version of the code immediately prior to 1282 or during merge conflict cleanup later on.

A branch is available at https://git.evergreen-ils.org/?p=working/Evergreen.git;a=shortlog;h=refs/heads/user/miker/lp-2038472-DYM-optimization-regression with (another) drop-in replacement for search.symspell_lookup that reintroduces the optimizations that were lost.

tags: added: pullrequest
removed: performance
Changed in evergreen:
milestone: none → 3.11.2
tags: added: performance
Galen Charlton (gmc)
Changed in evergreen:
assignee: nobody → Galen Charlton (gmc)
Galen Charlton (gmc) wrote:

Tested, and a signoff branch has been pushed to working/user/gmcharlt/lp2038472_signoff

Jeff: I'm curious how this does on your test system.

Changed in evergreen:
assignee: Galen Charlton (gmc) → nobody
tags: added: signedoff
Changed in evergreen:
assignee: nobody → Jeff Davis (jdavis-sitka)
no longer affects: evergreen/3.2