OpenSSH server config broken on unattended update

Bug #2087551 reported by Chris Leonard

This bug report was marked for expiration 26 days ago.

This bug affects 2 people
Affects: openssh (Ubuntu)
Status: Incomplete
Importance: Critical
Assigned to: Unassigned

Bug Description

My server performed an unattended update of openssh-server from 1:9.6p1-3ubuntu13.5 to 1:9.6p1-3ubuntu13.7, and after this I could not access ssh anymore: connection refused.

Following the steps at the bottom of this post to switch to non-socket-based activation has allowed me to connect to the server again:

https://discourse.ubuntu.com/t/sshd-now-uses-socket-based-activation-ubuntu-22-10-and-later/30189

I suspect this is related to using a non-default port, although before making the above change the systemd socket configuration appeared to exist with correct values, as did the custom port value in sshd_config.
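
For reference, with socket activation the effective listen addresses come from the generated ssh.socket drop-in rather than directly from sshd_config. A minimal way to inspect them, assuming the stock Ubuntu 24.04 units (the drop-in path and contents below are my assumption of what the generator produces and may differ):

$ systemctl show ssh.socket -p Listen
$ cat /run/systemd/generator/ssh.socket.d/addresses.conf
# Illustrative drop-in contents for a custom port (e.g. 2222):
# [Socket]
# ListenStream=
# ListenStream=2222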

ProblemType: Bug
DistroRelease: Ubuntu 24.04
Package: openssh-server 1:9.6p1-3ubuntu13.7
ProcVersionSignature: Ubuntu 6.8.0-48.48-generic 6.8.12
Uname: Linux 6.8.0-48-generic x86_64
ApportVersion: 2.28.1-0ubuntu3.1
Architecture: amd64
CasperMD5CheckResult: unknown
CloudArchitecture: x86_64
CloudBuildName: server
CloudID: configdrive
CloudName: configdrive
CloudPlatform: configdrive
CloudSerial: 20231014
CloudSubPlatform: config-disk (/dev/vdb)
Date: Fri Nov 8 13:13:51 2024
ProcEnviron:
 LANG=C.UTF-8
 PATH=(custom, no user)
 SHELL=/bin/bash
 TERM=tmux-256color
SourcePackage: openssh
UpgradeStatus: Upgraded to noble on 2024-06-04 (157 days ago)

Steve Langasek (vorlon)
tags: added: regression-update
Changed in openssh (Ubuntu):
importance: Undecided → Critical
Revision history for this message
Nick Rosbrook (enr0n) wrote :

Could you please share at least the relevant parts of your sshd_config? Anything that configures Port, ListenAddress, or AddressFamily? Feel free to modify the actual port and listen address values of course; I just need to know where you are using non-default configuration.

Changed in openssh (Ubuntu):
status: New → Incomplete
Revision history for this message
Chris Leonard (veltas) wrote :

# This is the sshd server system-wide configuration file. See
# sshd_config(5) for more information.

# This sshd was compiled with PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games

# The strategy used for options in the default sshd_config shipped with
# OpenSSH is to specify options with their default value where
# possible, but leave them commented. Uncommented options override the
# default value.

Include /etc/ssh/sshd_config.d/*.conf

# Port and ListenAddress options are not used when sshd is socket-activated,
# which is now the default in Ubuntu. See sshd_config(5) and
# /usr/share/doc/openssh-server/README.Debian.gz for details.
Port <...redacted...>
#AddressFamily any
#ListenAddress 0.0.0.0
#ListenAddress ::

#HostKey /etc/ssh/ssh_host_rsa_key
#HostKey /etc/ssh/ssh_host_ecdsa_key
#HostKey /etc/ssh/ssh_host_ed25519_key

# Ciphers and keying
#RekeyLimit default none

# Logging
#SyslogFacility AUTH
#LogLevel INFO

# Authentication:

Revision history for this message
Chris Leonard (veltas) wrote :

/etc/ssh/sshd_config.d/ is empty

As I've said, my config is working after following the linked steps, but unfortunately that means I don't have the systemd socket configuration files anymore.

I'm hoping the fact that this specific update seemed to stop my ssh service can help narrow this down.

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Okay - nothing in /etc/ssh/sshd_config.d/? You just have a non-default port configured?

Was ssh.socket failing to start after the upgrade? E.g. does journalctl -u ssh.socket show errors around that time (or ssh.service for that matter)?

Revision history for this message
Nick Rosbrook (enr0n) wrote :

So far I cannot reproduce anything like you are describing. Any other logs from ssh.service and ssh.socket would be helpful.

Since it appears you have another way to gain shell access besides ssh, are you able to test if the problem persists if you attempt to restore the socket activation configuration?

Revision history for this message
Chris Leonard (veltas) wrote :

Update was at 6:48 local time today.

journalctl -u ssh.socket shows the socket was deactivated and came back, with no more activity until I started trying to fix things at 8:13:

Nov 08 06:48:27 www-veltas systemd[1]: ssh.socket: Deactivated successfully.
Nov 08 06:48:27 www-veltas systemd[1]: Closed ssh.socket - OpenBSD Secure Shell server socket.
Nov 08 06:48:30 www-veltas systemd[1]: Listening on ssh.socket - OpenBSD Secure Shell server socket.

Revision history for this message
Chris Leonard (veltas) wrote :

This is what journalctl -u ssh.service shows corresponding to that unattended update:

Nov 08 06:48:27 www-veltas sshd[1102]: Received signal 15; terminating.
Nov 08 06:48:27 www-veltas systemd[1]: Stopping ssh.service - OpenBSD Secure Shell server...
Nov 08 06:48:27 www-veltas systemd[1]: ssh.service: Deactivated successfully.
Nov 08 06:48:27 www-veltas systemd[1]: Stopped ssh.service - OpenBSD Secure Shell server.
Nov 08 06:48:27 www-veltas systemd[1]: ssh.service: Consumed 2min 19.476s CPU time, 20.1M memory peak, 1.1M memory swap peak.

Revision history for this message
Chris Leonard (veltas) wrote :

And in case it was missed: /etc/ssh/sshd_config.d/ is empty.

I can try breaking ssh again to reproduce, but I don't know sockets very well. Could you point me to some info on how to re-enable this? I tried looking and could only find instructions for going the other way, apologies!

Revision history for this message
Nick Rosbrook (enr0n) wrote :

No problem. You should be able to restore socket activation with:

# This removes the symlink that masks the generator, if present.
$ rm -f /etc/systemd/system-generators/sshd-socket-generator
$ systemctl daemon-reload
$ systemctl disable --now ssh.service
$ systemctl enable --now ssh.socket
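
A quick way to confirm the socket took over and is listening on the expected port (unit names as above):

$ systemctl status ssh.socket
$ systemctl show ssh.socket -p Listen
$ sudo ss -tlnp   # the listener shows as systemd until the first connection starts sshd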

Revision history for this message
Chris Leonard (veltas) wrote :

Thanks.

This has restored socket-based activation, but with a different configuration than I had earlier, and has not reproduced the problem.

I've confirmed that this really is socket-based activation by stopping the ssh service in the recovery console, confirming sshd is dead, and reconnecting, which worked fine.

Unfortunately I didn't save the old socket configuration.

Revision history for this message
Timo Aaltonen (tjaalton) wrote :

I've reverted the update: 13.7 was removed from noble-updates and 13.5 was put back.

Revision history for this message
Nick Rosbrook (enr0n) wrote :

> but with a different configuration than I had earlier

Different in what way?

I am glad this restored socket-activation for you. I am not sure how to further investigate the bug. If you are able to find some error in the journal, or evidence that ssh.socket was listening on the wrong port etc., please share it.

Did you happen to check systemctl status ssh.socket ssh.service (or something) before making the change to non-socket activation?

Revision history for this message
Chris Leonard (veltas) wrote :

> If you are able to find some error in the journal, or evidence that ssh.socket was listening on the wrong port etc., please share it.

Sorry, I have nothing; I just know that it was refusing connections, and that trying to start sshd manually with `systemctl start ssh.service` didn't work, even though `systemctl status ssh.service` said it was listening on the port afterwards.

Sorry I can't be more help.

Revision history for this message
Nick Rosbrook (enr0n) wrote :

> Sorry I can't be more help.

No worries. Thanks for all the information you provided.

Revision history for this message
Kevin Butter (kbutter) wrote :

I just wanted to add to this thread that I experienced the same issue after our server performed the unattended update of openssh-server from 1:9.6p1-3ubuntu13.5 to 1:9.6p1-3ubuntu13.7. We had a custom port for SSH, and our attempts to switch back to the default have so far been fruitless.

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Kevin - can you provide any additional information about what's not working? If you otherwise have access to the server, are ssh.service or ssh.socket reporting errors?

Revision history for this message
Kevin Butter (kbutter) wrote :

Nick-
We had the same config as the OP described in his first report.
We did what the OP did and rolled back to non-socket-based SSH; we have our connections back on 22. There were zero errors in the logs to report, and when we originally tried to ssh with --verbose the only message shown was "connection refused". We dug around in the logs and nothing showed up as an issue, which led us to this thread.

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Okay, thanks for following up.

If anyone finds a way to reliably reproduce this, please let me know. So far I have tried:

- adding a custom port (both directly in /etc/ssh/sshd_config, and using /etc/ssh/sshd_config.d/port.conf)
- upgrading openssh-server (both manually, and by running unattended-upgrades)

In any case, I do not experience any connection refused errors, nor do I see any errors on the server side ssh units (ssh.service, ssh.socket).
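
For anyone else trying the unattended-upgrades path, a sketch of driving it by hand (standard unattended-upgrades CLI; the first pass makes no changes):

$ sudo unattended-upgrade --dry-run --debug   # preview which packages would be upgraded
$ sudo unattended-upgrade --debug             # run the upgrade the same way the timer would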

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I would also throw in some testing around address families, ipv4/ipv6, maybe there is a regression in https://bugs.launchpad.net/ubuntu/+source/openssh/+bug/2080216

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Thanks for the suggestion Andreas. I have tested some scenarios like that and still cannot reproduce.

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Chris and Kevin (or anyone else affected) - Can you please attach the logs in /var/log/unattended-upgrades/? Maybe they will give a clue to anything unusual that happened during the upgrade itself.

tags: added: rls-nn-incoming
Nick Rosbrook (enr0n)
tags: removed: rls-nn-incoming
Revision history for this message
John Anderson (janderson73) wrote :

This happened to me as well. Like the original poster, I disabled socket activation using the steps from the thread he linked in order to get SSH access to the server again. Here is the unattended-upgrades log:

Log started: 2024-11-09 06:52:08
Preconfiguring packages ...
Preconfiguring packages ...
(Reading database ... 121348 files and directories currently installed.)
Preparing to unpack .../openssh-sftp-server_1%3a9.6p1-3ubuntu13.7_amd64.deb ...
Unpacking openssh-sftp-server (1:9.6p1-3ubuntu13.7) over (1:9.6p1-3ubuntu13.5) ...
Preparing to unpack .../openssh-server_1%3a9.6p1-3ubuntu13.7_amd64.deb ...
Unpacking openssh-server (1:9.6p1-3ubuntu13.7) over (1:9.6p1-3ubuntu13.5) ...
Preparing to unpack .../openssh-client_1%3a9.6p1-3ubuntu13.7_amd64.deb ...
Unpacking openssh-client (1:9.6p1-3ubuntu13.7) over (1:9.6p1-3ubuntu13.5) ...
Setting up openssh-client (1:9.6p1-3ubuntu13.7) ...
Setting up openssh-sftp-server (1:9.6p1-3ubuntu13.7) ...
Setting up openssh-server (1:9.6p1-3ubuntu13.7) ...
Processing triggers for man-db (2.12.0-4build2) ...
Processing triggers for ufw (0.36.2-6) ...

Restarting services...

Service restarts being deferred:
 /etc/needrestart/restart.d/dbus.service
 systemctl restart <email address hidden>
 systemctl restart <email address hidden>
 systemctl restart systemd-logind.service
 systemctl restart unattended-upgrades.service

No containers need to be restarted.

No user sessions are running outdated binaries.

No VM guests are running outdated hypervisor (qemu) binaries on this host.
Log ended: 2024-11-09 06:52:14

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Thanks, John. That all looks normal.

Can you share the relevant parts of your configuration? I.e.:

sshd -T | grep -E '^port|^listenaddress|^addressfamily'

Again, the specific values are not so important, but I would like to know if they differ from the defaults. Is there anything else special about your configuration?

Can you also provide any relevant logs from ssh.service and ssh.socket during the time you could not connect? And, what was the exact error you saw on the client side?
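
One way to pull those logs together (journalctl accepts multiple -u flags and a time window; the timestamps below are just an example around the upgrade time):

$ journalctl -u ssh.service -u ssh.socket --since "2024-11-09 06:30" --until "2024-11-09 08:00"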

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Chris, Kevin, John, or anyone else affected:

Is it possible that you changed your sshd_config without actually reloading and restarting ssh.socket, a while (hours, days, etc.) before the upgrade occurred? I am wondering if it's possible that the configuration was already broken by a local change, and the upgrade (which will call systemctl daemon-reload and systemctl restart ssh.socket) just surfaced the problem.
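
For comparison, a sketch of the sequence that keeps the socket in sync with sshd_config after a local edit (assuming the stock units; daemon-reload re-runs the socket generator):

$ sudo editor /etc/ssh/sshd_config     # e.g. change Port
$ sudo systemctl daemon-reload         # regenerate the ssh.socket drop-in from sshd_config
$ sudo systemctl restart ssh.socket    # start listening on the new port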

Revision history for this message
Nick Rosbrook (enr0n) wrote :

Also, can you all try to restore socket activation, and see if the problem persists? You can restore socket activation by running:

# This removes the symlink that masks the generator, if present.
$ rm -f /etc/systemd/system-generators/sshd-socket-generator
$ systemctl daemon-reload
$ systemctl disable --now ssh.service
$ systemctl enable --now ssh.socket

Revision history for this message
Andreas Hasenack (ahasenack) wrote (last edit ):

Is there any chance that the people affected by this bug had an upgrade policy such that the new config file shipped with openssh-server would be taken as-is, instead of keeping the local changes? The telltale sign would be a message like "Replacing config file /etc/ssh/sshd_config" in the terminal output during the upgrade. I don't see that in comment #23, so that was not the case there.

If you changed the port, and then the upgrade was allowed to install the new config file (because there were changes in the new config file shipped in openssh-server), then the port would be restored to 22.

So when you guys experienced "connection refused", was that on the custom port you had, and you didn't try port 22? Or was port 22 also closed?

Were there any packaging-driven backup files created in /etc/ssh, like sshd_config.ucf-old, for example?
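
A couple of quick checks for that scenario (the backup filenames are examples; ucf and dpkg use several suffixes):

$ ls -l /etc/ssh/sshd_config*           # look for .ucf-old, .ucf-dist, .dpkg-old or .dpkg-dist leftovers
$ grep -i '^port' /etc/ssh/sshd_config  # confirm the custom Port line survived the upgrade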

Revision history for this message
Andreas Hasenack (ahasenack) wrote :

I tried many other things to reproduce this bug:

- it looks like the reporter had this happen in a Digital Ocean VM. I tried that too, going through the openssh upgrades all the way to 13.7 and changing the port to 2240, and it just worked
- tried ipv4 and ipv6
- then I noted I was doing all of this via ssh, which could interfere with the troubleshooting, so I went back to a local lxc container and used "lxc console" instead of an ssh connection. It also worked
- then I used unattended-upgrades itself. I configured the system to bump the priority of openssh in noble-proposed, and configured unattended-upgrades to also consider proposed. It upgraded openssh-server without issues on the different port, and I could ssh in afterwards
- finally, same as above, but I did not restart openssh (or the socket) after changing the port to 2240. I let unattended-upgrades do it, to the version in proposed. It also worked.

I'm out of ideas here. The only case where I could reproduce something similar to what was reported here is if I let the new configuration file from the package overwrite my local changes, but even then, all that would happen is ssh/systemd listening again on port 22 instead of my custom port. If you guys had a firewall on port 22 or something like that, it could explain the system no longer being reachable, but the log from comment #23 disproves that theory for that user at least.
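
If anyone wants to rule the firewall theory in or out on their own system, a quick check (ufw is just the common case; other firewalls apply too):

$ sudo ufw status verbose        # is port 22 (or the custom port) actually allowed?
$ sudo ss -tlnp | grep -w 22     # is anything listening on 22 after the upgrade?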

Revision history for this message
Launchpad Janitor (janitor) wrote :

[Expired for openssh (Ubuntu) because there has been no activity for 60 days.]

Changed in openssh (Ubuntu):
status: Incomplete → Expired
Bryce Harrington (bryce)
Changed in openssh (Ubuntu):
status: Expired → Incomplete