snapd 2.26.8+17.10 ADT test failure with linux 4.12.0-6.7

Bug #1704158 reported by Seth Forshee
6
This bug affects 1 person
Affects Status Importance Assigned to Milestone
linux (Ubuntu)
Fix Released
High
Seth Forshee
Seth Forshee (sforshee)
tags: added: kernel-adt-failure
Revision history for this message
Seth Forshee (sforshee) wrote :

Both look like they hit timeouts.

On i386:

2017-07-12 20:41:37 Error executing autopkgtest:ubuntu-17.10-i386:tests/main/econnreset :
-----
+ echo 'Downloading a large snap in the background'
Downloading a large snap in the background
+ echo 'Wait until it actually got some data for the snap'
Wait until it actually got some data for the snap
+ grep -q 'Retrying.*download-snap/.*\.snap, attempt 1' snap-download.log
grep: + su -c '/usr/bin/env SNAPD_DEBUG=1 snap download --edge test-snapd-huge 2>snap-download.log' test
snap-download.log: No such file or directory
+ sleep 1
...
test-snapd-huge 511.30 MB / 512.00 MB 99.86% 8.13 MB/s 0s
+ grep -q 'Retrying.*download-snap/.*\.snap, attempt 1' snap-download.log
+ sleep 1
+ grep -q 'Retrying.*download-snap/.*\.snap, attempt 1' snap-download.log
+ sleep 1
<above 2 lines repeated many times>

<kill-timeout reached>
-----
...
2017-07-12 20:41:37 Restoring autopkgtest:ubuntu-17.10-i386:tests/main/econnreset...
2017-07-12 20:41:38 Error restoring autopkgtest:ubuntu-17.10-i386:tests/main/econnreset :
-----
+ echo 'Remove the firewall rule again'
Remove the firewall rule again
++ id -u test
+ iptables -D OUTPUT -m owner --uid-owner 12345 -j REJECT -p tcp --reject-with tcp-reset
iptables: No chain/target/match by that name.
-----

This error removing the firewall rule has been seen previously (bug #1691752) but was fixed, in this case I'm just assuming the rule wasn't actually added due to the test aborting.

On ppc64el:

2017-07-12 20:41:03 Error preparing autopkgtest:ubuntu-17.10-ppc64el:tests/main/classic-custom-device-reg :
-----
...
+ snap download --edge core
Fetching snap "core"

core 0 B / 77.89 MB 0.00%
core 47.41 KB / 77.89 MB 0.06% 236.15 KB/s 5m37s
...
core 18.84 MB / 77.89 MB 24.19% 111.07 KB/s 9m4s
<kill-timeout reached>
-----

This has been happening consistently with 4.12 kernels on ppc64el. I see one result where this happened with 4.11, but later tests passed, so possibly it's a kernel regression.

Revision history for this message
Seth Forshee (sforshee) wrote :

The i386 failure looks like a test problem. It's waiting for a string matching "Retrying.*download-snap/.*\.snap, attempt 1" to appear in snap-download.log to indicate that the download has begun. In this case it never sees the string despite the fact that the download has clearly started.

Revision history for this message
Seth Forshee (sforshee) wrote :

Subsequent i386 tests with 4.12 have shown timeouts similar to those seen with ppc, e.g. https://objectstorage.prodstack4-5.canonical.com/v1/AUTH_77e2ada1e7a84929a74ba3b87153c0ac/autopkgtest-artful-canonical-kernel-team-unstable/artful/i386/s/snapd/20170725_190404_e7461@/log.gz.

I'm wondering if there's some kind of network performance regression here. Some naive testing of 4.12 in a vm doesn't show any regressions over virtio. I'll test some more to see if I can find anything, any suggestions for things to test or other possible causes for the failures are welcome.

Revision history for this message
Seth Forshee (sforshee) wrote :

This does look to be a regression in virtio-net starting in 4.12 and still unfixed upstream (as of 4.13-rc2). I'll chase this down.

Revision history for this message
Seth Forshee (sforshee) wrote :

Oddly there were two performance regressions in virtio_net introduced in 4.12-rc1, one of which has been fixed. The other one doesn't affect everything, in fact the only thing I've found that it does affect so far is downloading snaps. After working out how to get the anonymous download url from the snap store I confirmed that I see the same regression using curl.

I've reported this to the virtio_net maintainers, waiting for a response.

Seth Forshee (sforshee)
affects: snapd (Ubuntu) → linux (Ubuntu)
Changed in linux (Ubuntu):
assignee: nobody → Seth Forshee (sforshee)
importance: Undecided → High
status: New → In Progress
Revision history for this message
Seth Forshee (sforshee) wrote :

Committed a revert to our 4.12 and 4.13 trees until we have a proper fix.

Changed in linux (Ubuntu):
status: In Progress → Fix Committed
Revision history for this message
Launchpad Janitor (janitor) wrote :

This bug was fixed in the package linux - 4.12.0-11.12

---------------
linux (4.12.0-11.12) artful; urgency=low

  * linux: 4.12.0-11.12 -proposed tracker (LP: #1709929)

  * CVE-2017-1000111
    - packet: fix tp_reserve race in packet_set_ring

  * CVE-2017-1000112
    - udp: consistently apply ufo or fragmentation

  * Please only recommend or suggest initramfs-tools | linux-initramfs-tool for
    kernels able to boot without initramfs (LP: #1700972)
    - Revert "UBUNTU: [Debian] Don't depend on initramfs-tools"
    - [Debian] Don't depend on initramfs-tools

  * Miscellaneous Ubuntu changes
    - SAUCE: (noup) Update spl to 0.6.5.11-ubuntu1, zfs to 0.6.5.11-1ubuntu3
    - SAUCE: powerpc: Always initialize input array when calling epapr_hypercall()

  * Miscellaneous upstream changes
    - selftests: typo correction for memory-hotplug test
    - selftests: check hot-pluggagble memory for memory-hotplug test
    - selftests: check percentage range for memory-hotplug test
    - selftests: add missing test name in memory-hotplug test
    - selftests: fix memory-hotplug test

 -- Seth Forshee <email address hidden> Thu, 10 Aug 2017 13:37:00 -0500

Changed in linux (Ubuntu):
status: Fix Committed → Fix Released
To post a comment you must log in.
This report contains Public information  
Everyone can see this information.

Other bug subscribers

Remote bug watches

Bug watches keep track of this bug in other bug trackers.