rsyslog FTBFS due to gzip different behavior with hw acceleration on s390x
Affects | Status | Importance | Assigned to | Milestone
---|---|---|---|---
Ubuntu on IBM z Systems | Confirmed | High | bugproxy |
gzip (Ubuntu) | Status tracked in Plucky | | |
Focal | New | Undecided | Unassigned |
Jammy | Confirmed | Undecided | Unassigned |
Noble | Fix Committed | Undecided | Andreas Hasenack |
Oracular | Fix Committed | Undecided | Andreas Hasenack |
Plucky | Fix Released | Undecided | Andreas Hasenack |
rsyslog (Ubuntu) | Status tracked in Plucky | | |
Focal | New | Undecided | Unassigned |
Jammy | New | Undecided | Unassigned |
Noble | Invalid | Undecided | Unassigned |
Oracular | Invalid | Undecided | Unassigned |
Plucky | Invalid | Undecided | Unassigned |
zlib (Ubuntu) | Status tracked in Plucky | | |
Focal | New | Undecided | Unassigned |
Jammy | New | Undecided | Unassigned |
Noble | Incomplete | Undecided | Unassigned |
Oracular | Incomplete | Undecided | Unassigned |
Plucky | Incomplete | Undecided | Unassigned |
Bug Description
[ Impact ]
This affects s390x only and causes rsyslog to fail to build from source (FTBFS) on that architecture.
For some time, the gzip package has carried a patch that uses s390x-specific hardware acceleration when it is available. It seems that such hardware was never available in the Ubuntu s390x infrastructure.
At some point in the past year, the Ubuntu s390x infrastructure was upgraded, and the new hardware supports the code path that had been dormant in gzip until then. This exposed a difference in behavior between the hardware-backed code and the software-only code, which was enough to trigger a test failure in the rsyslog package and make it FTBFS on s390x.
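The kind of mismatch described above can be probed outside the rsyslog testbench with a small round-trip check. The following is only a sketch under stated assumptions: it is not the rsyslog test and not a confirmed reproducer. The message format, the `flush_every` interval, and both function names are hypothetical; the compression level of 6 and the count of 2500 messages are taken from the test log quoted later in this report. It writes one gzip stream with periodic sync flushes using Python's `zlib` module, then asks the system `gzip` tool to decompress it.

```python
import gzip
import shutil
import subprocess
import zlib


def build_flushed_gzip_stream(lines, level=6, flush_every=100):
    """Compress lines into one gzip stream, sync-flushing periodically.

    flush_every is a hypothetical knob for this sketch, not an rsyslog setting.
    """
    # wbits=31 asks zlib for a gzip header and trailer.
    comp = zlib.compressobj(level, zlib.DEFLATED, 31)
    out = bytearray()
    for i, line in enumerate(lines, start=1):
        out += comp.compress(line)
        if i % flush_every == 0:
            # Z_SYNC_FLUSH emits all pending output in the middle of the stream.
            out += comp.flush(zlib.Z_SYNC_FLUSH)
    out += comp.flush()  # default Z_FINISH: final block plus gzip trailer
    return bytes(out)


def gzip_cli_roundtrip_ok(stream, expected):
    """Return True/False for the gzip CLI result, or None if gzip is not installed."""
    gzip_path = shutil.which("gzip")
    if gzip_path is None:
        return None
    proc = subprocess.run([gzip_path, "-dc"], input=stream, capture_output=True)
    return proc.returncode == 0 and proc.stdout == expected


if __name__ == "__main__":
    # Assumed message format; only the "msgnum:" prefix appears in the test config.
    messages = [b"msgnum:%08d:\n" % n for n in range(2500)]
    payload = b"".join(messages)
    stream = build_flushed_gzip_stream(messages)
    print("python gzip module:", gzip.decompress(stream) == payload)
    print("gzip CLI:", gzip_cli_roundtrip_ok(stream, payload))
```

If the two checks disagree on a given machine, that would point at the `gzip` tool's decompression path on that machine rather than at the stream itself. Whether this particular input triggers the hardware-specific behavior is not something the report confirms.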
[ Test Plan ]
The test plan consists of rebuilding rsyslog on s390x and asserting that its build-time tests pass.
Without the gzip fix, the rsyslog tests that fail are:
FAIL: gzipwr_
FAIL: gzipwr_flushOnTXEnd
The full log can be seen in the original description of this bug.
With the gzip fix, the rsyslog tests all pass.
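Checking a build log for the failures listed above can be automated. The sketch below is a hypothetical helper, not part of the SRU tooling: it scans log text for result lines of the form `FAIL: <name>` and reports whether any `gzipwr` test failed. The function names and the sample log are made up for illustration; only the `FAIL: gzipwr_flushOnTXEnd` line and the timestamped line come from this report.

```python
import re

# Matches result lines that start with "FAIL: ", e.g. "FAIL: gzipwr_flushOnTXEnd".
# Timestamped lines such as "08:47:05[1] FAIL: Test ./..." do not match.
FAIL_LINE = re.compile(r"^FAIL: (\S+)", re.MULTILINE)


def failed_tests(build_log):
    """Return the names of failed tests in order of appearance, without duplicates."""
    seen = []
    for name in FAIL_LINE.findall(build_log):
        if name not in seen:
            seen.append(name)
    return seen


def gzip_tests_pass(build_log):
    """Return True when no test whose name starts with 'gzipwr' failed."""
    return not any(name.startswith("gzipwr") for name in failed_tests(build_log))


if __name__ == "__main__":
    # Hypothetical excerpt; the PASS line is invented for the example.
    sample = (
        "PASS: some_other_test\n"
        "FAIL: gzipwr_flushOnTXEnd\n"
        "08:47:05[1] FAIL: Test ./gzipwr_\n"
    )
    print(failed_tests(sample))
    print(gzip_tests_pass(sample))
```

Run against the full s390x build log, `gzip_tests_pass` should return `False` without the gzip fix and `True` with it, assuming the log keeps the same result-line format.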
[ Where problems could occur ]
The fix only affects s390x-specific code, which is used only on s390x hardware that supports the relevant functions. A regression would therefore show up only on this specific hardware, but it could cause all sorts of problems there, as the gzip tool is used all over the place. For the same reason, we expect any regression to be found quickly. The gzip test suite still passes after these changes.
[ Other Info ]
Due to the interdependency of this SRU with the rsyslog ones, the gzip package must be accepted and published before rsyslog. Specifically, these rsyslog bugs must only be accepted after gzip is accepted and published:
- https:/
- https:/
- https:/
If this is not observed, then rsyslog will fail to build on s390x, until the new gzip package is in proposed and published.
Upstream gzip is still discussing[1] details of this bug, tests, RFCs, and other places where the hardware implementation differs from the software one. Depending on the outcome and impact, we may SRU follow-up fixes down the road.
1. https:/
[ Original Description ]
During an archive rebuild, rsyslog FTBFS on s390x only: https:/
The build fails due to two tests:
FAIL: gzipwr_
=======
testbench: TZ env var not set, setting it to UTC
-------
08:47:04[0] Test: ./gzipwr_
-------
config rstb_216690_
1 module(
2 global(
3 default.
4 default.
5 # use legacy-style for the following settings so that we can override if needed
6 $MainmsgQueueTi
7 $MainmsgQueueTi
8 $IMDiagListenPo
9 $IMDiagServerRun 0
10 $IMDiagAbortTimeout 580
11
12 :syslogtag, contains, "rsyslogd" ./rstb_
13 ###### end of testbench instrumentation part, test conf follows:
14
15 module(
16 input(type="imtcp" port="0" listenPortFileN
17
18 template(
19 :msg, contains, "msgnum:" action(
20 zipLevel="6" ioBufferSize="256k"
21 flushOnTXEnd="off" flushInterval="1"
22 asyncWriting="on"
23 file="rstb_
rsyslogd: NOTE: RSYSLOG_
main Q:Reg: worker start requested, num workers currently 0
main Q:Reg: wrkr start initiated with state 0, num workers now 1
rsyslog debug: main Q:Reg: worker 0x2aa0873c810 started
rsyslog debug: main Q:Reg: started with state 3, num workers now 1
08:47:04[0] rstb_216690_
08:47:04[0] rsyslogd startup msg seen, pid 158166
waiting for file rstb_216690_
imdiag port: 35391
waiting for file rstb_216690_
TCPFLOOD_PORT now: 32793
starting run 1
Sending 2500 messages.
00002500 messages sent
runtime: 0.005
End of tcpflood Run
gzip: rstb_216690_
scanf error in index i=0
gzip: rstb_216690_
sequence error detected in rstb_216690_
number of lines in file: 0 rstb_216690_
sorted data has been placed in error.log, first 10 lines are:
1 scanf error in index i=0
---last 10 lines are:
1 scanf error in index i=0
UNSORTED data, first 10 lines are:
1 scanf error in index i=0
---last 10 lines are:
1 scanf error in index i=0
not reporting failure as RSYSLOG_STATSURL is not set
rsyslog pid file still exists, trying to shutdown...
rsyslogd debug: info: trying to cooperatively stop input ../plugins/
rsyslogd debug: info: trying to cooperatively stop input imtcp, timeout 60000 ms
rsyslog debug: main Q:Reg/w0: enter WrkrExecCleanup
rsyslog debug: 0x2aa0873c990: worker exiting
rsyslog debug: main Q:Reg/w0: thread joined
08:47:09[5] FAIL: Test ./gzipwr_
FAIL gzipwr_
FAIL: gzipwr_flushOnTXEnd
=======
testbench: TZ env var not set, setting it to UTC
-------
08:47:04[0] Test: ./gzipwr_
-------
config rstb_586738_
1 module(
2 global(
3 default.
4 default.
5 # use legacy-style for the following settings so that we can override if needed
6 $MainmsgQueueTi
7 $MainmsgQueueTi
8 $IMDiagListenPo
9 $IMDiagServerRun 0
10 $IMDiagAbortTimeout 580
11
12 :syslogtag, contains, "rsyslogd" ./rstb_
13 ###### end of testbench instrumentation part, test conf follows:
14
15 module(
16 input(type="imtcp" port="0" listenPortFileN
17
18 template(
19 :msg, contains, "msgnum:" { action(
20 zipLevel="6" ioBufferSize="256k"
21 flushOnTXEnd="on"
22 asyncWriting="on"
23 file="rstb_
24 action(
25 }
rsyslogd: NOTE: RSYSLOG_
main Q:Reg: worker start requested, num workers currently 0
main Q:Reg: wrkr start initiated with state 0, num workers now 1
rsyslog debug: main Q:Reg: worker 0x2aa18a89a50 started
rsyslog debug: main Q:Reg: started with state 3, num workers now 1
08:47:04[0] rstb_586738_
08:47:04[0] rsyslogd startup msg seen, pid 158888
waiting for file rstb_586738_
imdiag port: 35511
waiting for file rstb_586738_
TCPFLOOD_PORT now: 39421
starting run 1
Sending 2500 messages.
00002500 messages sent
runtime: 0.001
End of tcpflood Run
imdiag: wait q_empty: qsize 1210 nempty 0
imdiag: wait q_empty: qsize 0 nempty 1
imdiag[35511]: mainqueue empty
test 1
wait_file_lines success, have 2500 lines, took 0 seconds, file rstb_586738_
-rw-r--r-- 1 buildd buildd 4841 Sep 29 08:47 rstb_586738_
gzip: stdin: invalid compressed data--format violated
chkseq: start 0, end 2499
scanf error in index i=0
sequence error detected
not reporting failure as RSYSLOG_STATSURL is not set
rsyslog pid file still exists, trying to shutdown...
rsyslogd debug: info: trying to cooperatively stop input ../plugins/
rsyslogd debug: info: trying to cooperatively stop input imtcp, timeout 60000 ms
rsyslog debug: main Q:Reg/w0: enter WrkrExecCleanup
rsyslog debug: 0x2aa18a89bd0: worker exiting
rsyslog debug: main Q:Reg/w0: thread joined
08:47:05[1] FAIL: Test ./gzipwr_
FAIL gzipwr_
--
Since these are both gzip related, I looked at zlib and noticed that there are s390x-specific optimization patches for that package: https:/
In a PPA build, I re-built zlib without these s390x patches, and re-built rsyslog against that version. In that case, the build succeeded: https:/
Therefore, I believe the cause of this FTBFS is related to the s390x-specific patches in zlib. This needs investigating by someone more familiar with s390x and/or these patches.
Related branches
- git-ubuntu bot: Approve
- Bryce Harrington (community): Approve
- Canonical Server Reporter: Pending requested
Diff: 90 lines (+68/-0), 3 files modified
debian/changelog (+8/-0)
debian/patches/0001-maint-fix-s390-buffer-flushes.patch (+59/-0)
debian/patches/series (+1/-0)
- git-ubuntu bot: Approve
- Bryce Harrington (community): Approve
- Canonical Server Reporter: Pending requested
Diff: 90 lines (+68/-0), 3 files modified
debian/changelog (+8/-0)
debian/patches/0001-maint-fix-s390-buffer-flushes.patch (+59/-0)
debian/patches/series (+1/-0)
- git-ubuntu bot: Approve
- Utkarsh Gupta (community): Approve
- Canonical Server Reporter: Pending requested
Diff: 91 lines (+69/-0), 3 files modified
debian/changelog (+8/-0)
debian/patches/0001-maint-fix-s390-buffer-flushes.patch (+60/-0)
debian/patches/series (+1/-0)
description: | updated |
Changed in ubuntu-z-systems: | |
status: | New → Confirmed |
assignee: | nobody → bugproxy (bugproxy) |
importance: | Undecided → High |
tags: | added: s390x |
tags: | added: architecture-s39064 bugnameltc-210346 severity-high targetmilestone-inin--- |
Changed in gzip (Ubuntu Plucky): | |
status: | New → Confirmed |
summary: |
- rsyslog FTBFS (s390x only) against zlib 1:1.3.dfsg+really1.3.1-1ubuntu1
+ rsyslog FTBFS (s390x only) due to gzip different behavior with hw acceleration on s390x
summary: |
- rsyslog FTBFS (s390x only) due to gzip different behavior with hw acceleration on s390x
+ rsyslog FTBFS due to gzip different behavior with hw acceleration on s390x
description: | updated |
description: | updated |
description: | updated |
description: | updated |
description: | updated |
Changed in gzip (Ubuntu Jammy): | |
status: | New → Confirmed |
I had previously filed https://bugs.launchpad.net/ubuntu/+source/rsyslog/+bug/2083526, and I couldn't reproduce the build/test failures on an s390x VM, only in a PPA.