Comment 2 for bug 1956975

Heather Lanigan (hmlanigan) wrote: Re: Agent lost

From the unit and machine logs, it looks like the connectivity issues started on 11/14 and continued after the model was upgraded from 2.8.7 to 2.9.18 on 11/15. It is unknown when the controller was upgraded.

2021-11-14 23:47:09 ERROR juju.worker.dependency engine.go:671 "leadership-tracker" manifold worker returned unexpected error: error while sso-wsgi/3 waiting for sso-wsgi leadership release: error blocking on leadership release: lease manager stopped
2021-11-14 23:47:09 ERROR juju.worker.dependency engine.go:671 "log-sender" manifold worker returned unexpected error: cannot send log message: tls: use of closed connection
2021-11-14 23:47:09 ERROR juju.worker.dependency engine.go:671 "api-caller" manifold worker returned unexpected error: api connection broken unexpectedly
2021-11-14 23:47:09 ERROR juju.worker.uniter agent.go:31 resolver loop error: committing operation "accept leadership" for sso-wsgi/3: writing state: connection is shut down
2021-11-14 23:47:09 ERROR juju.worker.uniter agent.go:34 updating agent status: connection is shut down
2021-11-14 23:47:09 INFO juju.worker.uniter uniter.go:286 unit "sso-wsgi/3" shutting down: committing operation "accept leadership" for sso-wsgi/3: writing state: connection is shut down
2021-11-14 23:47:09 ERROR juju.worker.uniter.metrics listener.go:52 failed to close the collect-metrics listener: close unix /var/lib/juju/agents/unit-sso-wsgi-3/622713258/s: use of closed network connection
2021-11-14 23:47:09 INFO juju.worker.logger logger.go:136 logger worker stopped
2021-11-14 23:47:27 ERROR juju.worker.dependency engine.go:671 "api-caller" manifold worker returned unexpected error: [f2483f] "unit-sso-wsgi-3" cannot open api: unable to connect to API: read tcp x.x.x.x:36728->x.x.x.x:17070: read: connection reset by peer

According to agent.conf, the current version of Juju in the model is 2.9.22. The machine was upgraded to 2.9.21 on 12/8/2022.

The unit restarted a few times a day from 11/14 to 11/18, starting after:
2021-11-16 08:50:51 ERROR juju.worker.dependency engine.go:676 "leadership-tracker" manifold worker returned unexpected error: error while sso-wsgi/3 waiting for sso-wsgi leadership release: error blocking on leadership release: lease manager stopped

There are no further unit logs in the given data after:
2021-11-18 23:40:54 ERROR juju.worker.apicaller connect.go:204 Failed to connect to controller: invalid entity name or password (unauthorized access)

The machine agent has been trying to restart the unit ever since; however, the current logs do not mention what failed.

@laurant, can you please provide /var/log/juju/logsink.log from machine-1, if it exists, as well as the controller logs from 11/10 to 11/19?
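
If the controller logs are large, something like the following might help trim them to the requested window before attaching. This is only a rough sketch: it assumes each line carries a "YYYY-MM-DD HH:MM:SS" timestamp like the excerpts above, and that the dates are in 2021 (inferred from those excerpts). The function name lines_in_window and the third sample line are made up for illustration.

```python
import re
from datetime import datetime

# Matches the "YYYY-MM-DD HH:MM:SS" timestamp format seen in the excerpts above.
# search() is used rather than match() so a leading prefix on the line is tolerated.
TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def lines_in_window(lines, start, end):
    """Return the lines whose first timestamp t satisfies start <= t < end."""
    selected = []
    for line in lines:
        found = TIMESTAMP.search(line)
        if found is None:
            continue  # no timestamp on this line (e.g. a continuation line)
        try:
            stamp = datetime.strptime(found.group(0), "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # looked like a timestamp but is not a valid date
        if start <= stamp < end:
            selected.append(line)
    return selected


# Assumption: the year is 2021, taken from the log excerpts in this comment.
WINDOW_START = datetime(2021, 11, 10)
WINDOW_END = datetime(2021, 11, 20)  # exclusive, so all of 11/19 is kept

sample = [
    '2021-11-14 23:47:09 ERROR juju.worker.dependency engine.go:671 "api-caller" '
    "manifold worker returned unexpected error: api connection broken unexpectedly",
    "2021-11-18 23:40:54 ERROR juju.worker.apicaller connect.go:204 Failed to "
    "connect to controller: invalid entity name or password (unauthorized access)",
    # Hypothetical line, not from any real log, to show a line being dropped.
    "2021-12-08 00:00:00 INFO hypothetical line outside the window",
]

for line in lines_in_window(sample, WINDOW_START, WINDOW_END):
    print(line)
```

To run it against a real file, pass an open file object instead of the sample list, e.g. lines_in_window(open(path), WINDOW_START, WINDOW_END). Lines without a timestamp are dropped, so multi-line entries would lose their continuation lines.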