memory leak when C apps respond to client

Bug #1974193 reported by Galen Charlton
This bug affects 1 person
Affects: OpenSRF
Status: Fix Released
Importance: High
Assigned to: Unassigned
Milestone: 3.2.3

Bug Description

When C apps prepare messages to send back to the client, the contents of jsonObjects are leaked during the process of converting an osrfMessage to a JSON string under three circumstances:

- sending a non-chunked response
- adding a response to a bundle to send later
- sending a request complete status message

This can add up for long-lived C servers.
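
For illustration, here is a minimal C sketch of the leak pattern described above. It is not the actual OpenSRF code or the committed fix; the headers and the functions osrfMessageToJSON, jsonObjectToJSON, and jsonObjectFree are assumptions about the C API, written to follow the usual jsonObject ownership rules (the caller frees what it receives):

#include <opensrf/osrf_message.h>
#include <opensrf/osrf_json.h>

/* Sketch only: the intermediate jsonObject built from the osrfMessage
 * has to be freed once it has been serialized to a string. */
static char* message_to_json_string(osrfMessage* msg) {
    jsonObject* obj = osrfMessageToJSON(msg); /* heap-allocated jsonObject */
    char* json = jsonObjectToJSON(obj);       /* heap-allocated JSON string */
    jsonObjectFree(obj); /* without this, obj leaks on every response sent */
    return json;         /* caller is responsible for free()ing the string */
}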

OpenSRF master

Galen Charlton (gmc) wrote :
tags: added: leaks pullrequest
Jason Stephenson (jstephenson) wrote :

I have seen C services use a lot of memory in the past, mostly cstore and pcrud from Evergreen. One workaround that I've used is to lower the number of connections before the service recycles itself.

Changed in opensrf:
status: New → Confirmed
importance: Undecided → High
milestone: none → 3.2.3
assignee: nobody → Jason Stephenson (jstephenson)
Jason Stephenson (jstephenson) wrote :

Testing with the following request in srfsh reveals the memory leak on an unpatched system:

request open-ils.pcrud open-ils.pcrud.search.bre "AUTHTOKEN" {"id":1}, {"flesh":1,"flesh_fields":{"bre":["call_numbers"]}}

Using a script to make that request about 100 times on a test VM running a Concerto dataset on Ubuntu 20.04 causes the pcrud drone's size to increase by roughly 3,000 to 5,000 bytes each time the script is run.

The query from https://bugs.launchpad.net/evergreen/+bug/1974195/comments/2 causes an increase of between 25K and 40K in the pcrud drone's size each time it is run.

After applying this patch and restarting services, etc., the memory increase is smaller, particularly after the first run. The srfsh script running the first query 100 times only increases the drone size by between 640 and 3,000 bytes, with the largest increase coming from the first run. The second query seems less affected by this patch: I saw increases as high as 25K, though the increase drops to about 6K after the first couple of runs.

My testing, both before and after the patch, went like this:

1. Start OpenSRF services.
2. pgrep -af pcrud to find the open-ils.pcrud listener and drones.
3. ps -o rsz,vsz $PID_of_pcrud_drone to get the initial size of the drone.
4. Login with srfsh.
5. Replace AUTHTOKEN with the authtoken from the login in a script that repeats the first query on 100+ lines.
6. Run the script.
7. Repeat 3 to get the new size of the pcrud drone.
8. Repeat 6 and 7 a few more times.
9. Run the search from bug 1974195 in srfsh.
10. Repeat 3.

At one point, a new pcrud drone was spawned, so I did step 3 on both drone PIDs when I became aware of the change. I mainly looked at the rsz number (resident set size) to track changes in the size of the process.
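
As a side note for anyone repeating this on Linux: the following is a small self-contained sketch of my own (not part of the original test procedure) that reads the same resident and virtual sizes directly from /proc/<pid>/statm instead of calling ps in step 3:

#include <stdio.h>
#include <unistd.h>

/* Print the virtual and resident set size of a process in KB.
 * /proc/<pid>/statm reports both values in pages. */
int main(int argc, char** argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s <pid>\n", argv[0]);
        return 1;
    }
    char path[64];
    snprintf(path, sizeof(path), "/proc/%s/statm", argv[1]);
    FILE* f = fopen(path, "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    long size_pages = 0, resident_pages = 0;
    if (fscanf(f, "%ld %ld", &size_pages, &resident_pages) != 2) {
        fprintf(stderr, "could not parse %s\n", path);
        fclose(f);
        return 1;
    }
    fclose(f);
    long page_kb = sysconf(_SC_PAGESIZE) / 1024;
    printf("vsz=%ld KB rss=%ld KB\n", size_pages * page_kb, resident_pages * page_kb);
    return 0;
}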

Since the patch works well enough for me, I've pushed a signoff branch to the working repository: user/dyrcona/lp1974193_fix_memory_leak-signoff.

https://git.evergreen-ils.org/?p=working/OpenSRF.git;a=shortlog;h=refs/heads/user/dyrcona/lp1974193_fix_memory_leak-signoff

tags: added: signedoff
Changed in opensrf:
assignee: Jason Stephenson (jstephenson) → nobody
Jason Stephenson (jstephenson) wrote :

I was hoping we would get more eyes on this, which is why I didn't push it right away.

I've been running this with OpenSRF master and 3.2 on all of my test VMs for the past week and everything looks good.

Unless anyone objects, or beats me to it, I'll push this on Tuesday after pushing the 3.2 branch from bug 1827055.

Jason Stephenson (jstephenson) wrote :

As promised, the patch was pushed to master and rel_3_2.

Changed in opensrf:
status: Confirmed → Fix Committed
Changed in opensrf:
status: Fix Committed → Fix Released