repeatedly opening bzr+ssh connections to LP instances hangs

Bug #1018477 reported by Aaron Bentley
This bug affects 1 person
Affects           Status      Importance  Assigned to  Milestone
Bazaar            Incomplete  Medium      Unassigned
Launchpad itself  Triaged     High        Unassigned

Bug Description

If transports are not reused, the third attempt to open an LP branch hangs. This happens with local dev branches and with qastaging, but does not happen with production, or local non-launchpad.dev branches.

The issue is that bzrlib.transport.ssh._close_ssh_proc hangs forever in subprocess.Popen.wait.

It may be worth noting that _close_ssh_proc does not send any signal to the process. Instead, it closes the process's stdin, stdout and socket.
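To illustrate the close-then-wait pattern described above, here is a minimal sketch of a variant that bounds the wait and escalates to signals if the child does not exit. It is not bzrlib's _close_ssh_proc; the helper name close_proc_with_timeout and the timeout value are assumptions made for this example, and it uses only the standard Python 3 subprocess module.

```python
import subprocess
import sys
import time


def close_proc_with_timeout(proc, timeout=2.0):
    """Close the child's pipes, then wait for it with a bounded timeout.

    Hypothetical helper for illustration only; not part of bzrlib.
    """
    # Close our ends of the pipes, as the report says _close_ssh_proc does.
    for stream in (proc.stdin, proc.stdout):
        if stream is not None:
            stream.close()
    try:
        # A plain proc.wait() here could block forever if the child
        # never notices that its pipes were closed.
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # The child did not exit on its own: ask it to terminate.
        proc.terminate()
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Last resort: force the child to exit, then reap it.
        proc.kill()
        return proc.wait()
```

With a child that ignores its closed pipes (for example, one that simply sleeps), the helper returns after roughly one timeout period instead of blocking indefinitely. Whether sending a signal is appropriate for a real SSH subprocess is a separate question that this sketch does not settle.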

For me, forking before opening the branch was an effective work-around. Supplying possible_transports to Branch.open may also be suitable.
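The report does not say how the fork was done or why it helps, so the following is only a sketch of one way to run a piece of work in a forked child on a POSIX system. The helper name run_in_child is hypothetical; the callable passed to it would be whatever code opens the branch.

```python
import os


def run_in_child(func):
    """Run func() in a forked child and return the child's exit code.

    Hypothetical helper for illustration; POSIX only.
    """
    pid = os.fork()
    if pid == 0:
        # Child process: do the work, then exit immediately without
        # running the parent's cleanup handlers.
        status = 1
        try:
            func()
            status = 0
        finally:
            os._exit(status)
    # Parent process: wait for the child and decode its exit status.
    _, wait_status = os.waitpid(pid, 0)
    if os.WIFEXITED(wait_status):
        return os.WEXITSTATUS(wait_status)
    # The child was killed by a signal: report it as a negative number.
    return -os.WTERMSIG(wait_status)
```

A caller would pass a function that performs the Branch.open call and check that the returned exit code is 0. Any result the child computes would have to be sent back to the parent explicitly, for example through a pipe, since the two processes do not share memory.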

Script to reproduce:

# Open the same branch three times without reusing transports;
# the third attempt hangs.
from bzrlib.branch import Branch
from bzrlib.errors import NotBranchError
for x in range(3):
    try:
        Branch.open('bzr+ssh://bazaar.launchpad.dev/moo')
    except NotBranchError:
        pass

Tags: launchpad
John A Meinel (jameinel) wrote : Re: [Bug 1018477] [NEW] repeatedly opening bzr+ssh connections to LP instances hangs

On 6/27/2012 6:16 PM, Aaron Bentley wrote:
> Public bug reported:
>
> If transports are not reused, the third attempt to open an LP
> branch hangs. This happens with local dev branches and with
> qastaging, but does not happen with production, or local
> non-launchpad.dev branches.
>
> The issue is that bzrlib.transport.ssh._close_ssh_proc hangs in
> subprocess.Popen.wait forever.
>
> It may be worth noting that _close_ssh_proc does not send any
> signal to the process. Instead, it closes the process's stdin,
> stdout and socket.
>
> For me, forking before opening the branch was an effective
> work-around. Supplying possible_transports to Branch.open may also
> be suitable.
>
> Script to reproduce:
>
> from bzrlib.branch import Branch
> from bzrlib.errors import NotBranchError
> for x in range(3):
>     try:
>         Branch.open('bzr+ssh://bazaar.launchpad.dev/moo')
>     except NotBranchError:
>         pass
>
> ** Affects: bzr Importance: Undecided Status: New
>

This may be related to the forking-server code. It is enabled on
staging, qastaging, and dev, but not in production. It may be
the reason we were seeing connection limits reached in the past.

As such, I'm guessing this is a Launchpad issue, and not a bzr one.

You can try looking in the Launchpad config file and changing the
'use_forking_server = True' setting to False (it has been a while
since I worked on it, though).

 affects: launchpad

John
=:->

Aaron Bentley (abentley) wrote :

On 12-06-27 04:17 PM, John A Meinel wrote:
> This may be related to the forking-server code. It is enabled on
> staging/qastaging/and dev.

My tests confirm that disabling the forking-server code avoids the
problem.

> As such, I'm guessing this is a Launchpad issue, and not a bzr
> one.

We need more information about what condition causes the hang. I'll
call it condition A.

I think there are two issues:
1. Under condition A, bzr(lib) cannot shut down an SSH connection, and
even interrupting it with ^C is painful.
2. The forking-server code creates condition A.

Until we know what condition A is, we don't know whether it's
legitimate for Launchpad to create it or for bzr to hang in it.

Aaron

tags: added: launchpad
Changed in launchpad:
status: New → Triaged
importance: Undecided → High
Martin Packman (gz) wrote :

I can't reproduce this hang with the script given against live launchpad with current bazaar. It's possible something's been fixed, or the symptom isn't being tickled, or I'm just doing something wrong.

Changed in bzr:
importance: Undecided → Medium
status: New → Incomplete