openssl: backport to jammy "clear method store / query cache confusion"

Bug #2033422 reported by Adrien Nader
This bug affects 1 person
Affects            Status         Importance   Assigned to    Milestone
openssl (Ubuntu)   New            Undecided    Unassigned
Jammy              In Progress    Medium       Adrien Nader
Lunar              Fix Released   Undecided    Unassigned

Bug Description

=== SRU information ===
[ATTENTION]
This SRU contains THREE changes which are listed in the section below.

[Meta]
This bug is part of a series of three bugs for a single SRU.
This ( #2033422 ) is the "central" bug with the global information and debdiff.

This SRU addresses three issues with Jammy's openssl version:
- http://pad.lv/1994165: ignored SMIME signature errors
- http://pad.lv/2023545: ibmca engine dumps core
- http://pad.lv/2033422: very high CPU usage for concurrent TLS connections (this one)

The SRU information has been added to all three bug reports. The single debdiff covering all three changes is attached to this bug only.

All the patches have been included in subsequent openssl 3.0.x releases which in turn have been included in subsequent Ubuntu releases. There has been no report of issues when updating to these Ubuntu releases.

I have rebuilt the openssl versions and used abi-compliance-checker to compare the ABIs of the libraries in jammy and the one for the SRU. Both matched completely (FYI, mantic's matched completely too).

I have also pushed the code to git (without any attempt to make it git-ubuntu friendly).

https://code.launchpad.net/~adrien-n/ubuntu/+source/openssl/+git/openssl/+ref/jammy-sru

I asked Brian Murray about phasing speed and he agrees that a slow roll-out is probably better for openssl. There is a small uncertainty because a security update could come before the phasing is over, effectively fast-forwarding the SRU. Still, unless there is already a pre-advisory pending, this is probably better than a 10% phasing, which is over after only a couple of days anyway.
NB: at the moment openssl doesn't phase slowly so this needs to be implemented.

[Impact]
Severely degraded performance for concurrent operations compared to openssl 1.1. The performance is so degraded that some workloads fail due to timeouts or insufficient resources (no one magically has five times more machines). As a consequence, a number of people use openssl 1.1 instead and do not get security updates.

[Test plan]
Rafael Lopez has shared a simple benchmark in http://pad.lv/2009544 : https://bugs.launchpad.net/ubuntu/+source/openssl/+bug/2009544/+attachment/5690224/+files/main.py .

To test, follow these steps:
- run "time python3 main.py" # using the aforementioned main.py script
- apt install -t jammy-proposed libssl3
- run "time python3 main.py"
- compare the runtimes for the two main.py runs
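
The two timed runs above can also be scripted. The following is a minimal sketch, not part of the SRU: the helper names `time_command` and `format_like_time` are made up for illustration, and it assumes a POSIX system where `os.times()` reports the CPU time of child processes that have been waited for.

```python
import os
import subprocess
import time


def time_command(argv):
    """Run argv once and return (real, user, sys) in seconds, like the shell's `time`."""
    before = os.times()
    start = time.perf_counter()
    subprocess.run(argv, check=True)
    real = time.perf_counter() - start
    after = os.times()
    # CPU time spent by child processes (here: the benchmark) between the two samples.
    user = after.children_user - before.children_user
    system = after.children_system - before.children_system
    return real, user, system


def format_like_time(seconds):
    """Format seconds in the 'XmY.YYYs' style printed by bash's `time`."""
    minutes, rest = divmod(seconds, 60)
    return f"{int(minutes)}m{rest:.3f}s"
```

Calling `time_command(["python3", "main.py"])` once before and once after installing libssl3 from jammy-proposed gives the two sets of numbers to compare. The path to main.py is an assumption about where the attachment was saved.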

You can run this on x86_64, a Raspberry Pi 4 or any other machine, and get a very large speed-up in all cases. The improvements are not architecture-dependent.

Using this changeset, I get the following numbers for ten runs on my laptop:

3.0.2:
    real 2m5.567s
    user 4m3.948s
    sys 2m0.233s

this SRU:
    real 0m23.966s
    user 2m35.687s
    sys 0m1.920s

As can be easily seen, the speed-up is massive: system time is divided by 60 and overall wall clock time is roughly five times lower.
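
For reference, the ratios can be recomputed from the figures quoted above. This is only an illustrative sketch: `parse_time` and `speedups` are hypothetical helper names, and the input values are copied from the two result blocks in this test plan.

```python
import re


def parse_time(value):
    """Convert a shell `time` value such as '2m5.567s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognised time value: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def speedups(before, after):
    """Return the before/after ratio for each of real, user and sys."""
    return {key: parse_time(before[key]) / parse_time(after[key]) for key in before}


# Figures copied from the test plan above.
BEFORE = {"real": "2m5.567s", "user": "4m3.948s", "sys": "2m0.233s"}  # 3.0.2
AFTER = {"real": "0m23.966s", "user": "2m35.687s", "sys": "0m1.920s"}  # this SRU

if __name__ == "__main__":
    for key, ratio in speedups(BEFORE, AFTER).items():
        print(f"{key}: {ratio:.1f}x lower")
```

With these inputs the ratio is a little above 5 for real time and a little above 60 for sys time, which matches the figures given in the text. User time drops by a much smaller factor.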

In http://pad.lv/2009544 , Rafael also shared his performance numbers and they are comparable to these. He used slightly different versions (upstream releases rather than a version patched with cherry-picks), but at least one of the versions he used does not include any other performance change. He also used different hardware, and this performance issue seems to depend on the number of CPUs available, but he too obtained performance several times better. Results on a given machine also vary very little across runs (less than 2% variation over batches of ten runs). They are also very similar on a Raspberry Pi 4 (8GB).

The benchmark uses https://www.google.com/humans.txt , which takes around 130ms to download on my machine, but I also modified the script to download something only 20ms away. The results are so close to the ones using humans.txt that they are within the error margin. This is consistent with the high concurrency in the benchmark, which both saturates the CPU and "hides" relatively low latencies.

Finally, there are positive reports on GitHub. Unfortunately, they do not always concern these patches alone, so I will not link directly to them, but they have also been encouraging.

[Where problems could occur]
The change is spread over several patches which touch the internals of openssl. As such, the engine and provider functionality could be broken by these changes. Fortunately, in addition to upstream's code review, these patches are included in openssl 3.0.4 (iirc) and therefore in kinetic. No issue related to these changes was reported on launchpad or upstream.

However, it is possible that there were more patch dependencies than these in either 3.0.3 or 3.0.4. In that case there could be problems.

[Patches]
The patches come directly from upstream and apply cleanly.

https://github.com/openssl/openssl/pull/18151#issuecomment-1118535602

* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0001-Drop-ossl_provider_clear_all_operation_bits-and-all-.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0002-Refactor-method-construction-pre-and-post-condition.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0003-Don-t-empty-the-method-store-when-flushing-the-query.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0004-Make-it-possible-to-remove-methods-by-the-provider-t.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0005-Complete-the-cleanup-of-an-algorithm-in-OSSL_METHOD_.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0006-For-child-libctx-provider-don-t-count-self-reference.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0
* https://git.launchpad.net/~adrien-n/ubuntu/+source/openssl/tree/debian/patches/jammy-sru-0007-Add-method-store-cache-flush-and-method-removal-to-n.patch?h=jammy-sru&id=04ef023920ab08fba214817523fba897527dfff0

=== Original description ===

This is about SRU'ing to Jammy the patches at https://github.com/openssl/openssl/pull/18151#issuecomment-1118535602 . They're purely performance but their impact is large. They have been released as part of openssl 3.0.4 (they're among the first after 3.0.3) which has been included in Kinetic.

Adrien Nader (adrien-n)
Changed in openssl (Ubuntu Lunar):
assignee: nobody → Adrien Nader (adrien-n)
Changed in openssl (Ubuntu Jammy):
milestone: none → ubuntu-22.04.4
milestone: ubuntu-22.04.4 → jammy-updates
Changed in openssl (Ubuntu Lunar):
assignee: Adrien Nader (adrien-n) → nobody
Changed in openssl (Ubuntu Jammy):
assignee: nobody → Adrien Nader (adrien-n)
importance: Undecided → Medium
status: New → In Progress
Changed in openssl (Ubuntu Lunar):
status: New → Fix Released
Adrien Nader (adrien-n) wrote :

I've created a PPA for Jammy that incorporates the fix mentioned. The details are available at https://launchpad.net/~adrien-n/+archive/ubuntu/openssl-jammy-sru . Could you test it and confirm your issue is solved?

Adrien Nader (adrien-n)
description: updated
Adrien Nader (adrien-n) wrote :

Attaching debdiff for openssl from 3.0.2-0ubuntu1.10 to 3.0.2-0ubuntu1.11

Adrien Nader (adrien-n)
description: updated
Sergio Durigan Junior (sergiodj) wrote :

Thanks for the contribution, Adrien.

I find the naming scheme you chose for the patches a bit confusing. For example, you're using the prefix "jammy-sru-0001-" on several patches that are actually not strictly related. You also don't mention any patch explicitly in the d/changelog entry, which forces the reader to open d/p/series and look at the comments there. Moreover, the patches are missing DEP-3 headers (which, in this case, would be very useful when trying to understand the context when looking at a single patch).

Could you please address the concerns above before we proceed with the upload?

Thanks.

Adrien Nader (adrien-n) wrote :

(Did my email reply from yesterday get eaten by Launchpad?)

Here's an updated debdiff that:
- renames files using the lpXXXX- prefix,
- reworks the changelog to a more typical format:
    * what (LP: #XXXX)
      - ${file}
- adds DEP-3 to the patches

I've pushed an updated build on LP at https://launchpad.net/~adrien-n/+archive/ubuntu/openssl-jammy-sru/+packages

Unfortunately it's still building. I also noticed typos in the changelog, which I corrected but didn't upload to the PPA because of how long the build takes. The differences are very minor (the first level of the list in d/changelog used - rather than *).

Adrien Nader (adrien-n) wrote :

Removed ~ubuntu-sponsors for a few days while a few things settle.

Adrien Nader (adrien-n)
description: updated
Simon Chopin (schopin) wrote :

I am not going to upload this, because I'm wildly uncomfortable with bug 1990216 (details there).

In addition, I have a few superficial suggestions for aesthetics:

* Use lpXXXXX subdirectories in d/patches rather than add a prefix to the patchname
* Add a Bug-Ubuntu field with a LP URL to each patch to make it even easier to go back to the relevant bug

Simon Chopin (schopin) wrote :

In addition, could the test plan maybe be edited to answer the following question?

"How can one validate that the package in -proposed addresses the issue?"

Adrien Nader (adrien-n)
description: updated
Adrien Nader (adrien-n) wrote :

Forgot to upload the latest debdiff.

Adrien Nader (adrien-n)
description: updated
Simon Chopin (schopin) wrote :

A version containing a fix for this has been uploaded to the Jammy queue to be processed by the SRU team. Thanks, Adrien :)

Adrien Nader (adrien-n)
description: updated