Activity log for bug #616693

Date Who What changed Old value New value Message
2010-08-12 10:31:36 Julian Edwards bug added bug
2010-08-12 10:35:16 Julian Edwards soyuz: status New Triaged
2010-08-12 10:35:18 Julian Edwards soyuz: importance Undecided High
2010-08-12 10:35:26 Julian Edwards tags buildd-manager
2010-08-12 11:51:04 Julian Edwards soyuz: assignee Julian Edwards (julian-edwards)
2010-08-12 11:51:07 Julian Edwards soyuz: status Triaged In Progress
2010-08-12 17:34:30 Launchpad Janitor branch linked lp:~julian-edwards/launchpad/catch-eintr-bug-616693
2010-08-12 19:57:18 Cody A.W. Somerville tags buildd-manager buildd-manager oem-services
2010-08-12 20:16:32 Francis J. Lacoste soyuz: importance High Critical
2010-08-12 20:49:02 Robert Collins description

Old value:
In the new buildd manager, which now interleaves Popen with other builder comms, we are getting "Interrupted system call" errors when polling builders. This is most likely because the reset script for another builder just finished and we get a SIGCHLD, which terminates the socket receive op.

{{{
2010-08-12 11:05:38+0100 [-] Disabling builder: http://yellow.buildd:8221/ -- (4, 'Interrupted system call')
2010-08-12 11:05:38+0100 [-] Traceback (most recent call last):
2010-08-12 11:05:38+0100 [-]   File "/srv/launchpad.net/codelines/soyuz-production-rev-9648/lib/lp/buildmaster/model/builder.py", line 204, in updateBuilderStatus
2010-08-12 11:05:38+0100 [-]     builder.checkSlaveAlive()
2010-08-12 11:05:38+0100 [-]   File "/srv/launchpad.net/codelines/soyuz-production-rev-9648/lib/lp/buildmaster/model/builder.py", line 286, in checkSlaveAlive
2010-08-12 11:05:38+0100 [-]     if self.slave.echo("Test")[0] != "Test":
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/xmlrpclib.py", line 1147, in __call__
2010-08-12 11:05:38+0100 [-]     return self.__send(self.__name, args)
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/xmlrpclib.py", line 1437, in __request
2010-08-12 11:05:38+0100 [-]     verbose=self.__verbose
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/xmlrpclib.py", line 1185, in request
2010-08-12 11:05:38+0100 [-]     errcode, errmsg, headers = h.getreply()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/httplib.py", line 1199, in getreply
2010-08-12 11:05:38+0100 [-]     response = self._conn.getresponse()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/httplib.py", line 928, in getresponse
2010-08-12 11:05:38+0100 [-]     response.begin()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/httplib.py", line 385, in begin
2010-08-12 11:05:38+0100 [-]     version, status, reason = self._read_status()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/httplib.py", line 343, in _read_status
2010-08-12 11:05:38+0100 [-]     line = self.fp.readline()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/socket.py", line 331, in readline
2010-08-12 11:05:38+0100 [-]     data = recv(1)
2010-08-12 11:05:38+0100 [-] error: (4, 'Interrupted system call')
2010-08-12 11:05:39+0100 [-] yellow was made unavailable, resetting attached job
2010-08-12 11:05:40+0100 [-] Dispatching: <lawrencium:http://lawrencium.ppa:8221/>
}}}

New value:
In the new buildd manager, which now interleaves subprocesses via spawnProcess with other builder comms, we are getting "Interrupted system call" errors when polling builders. This is most likely because the reset script for another builder just finished and we get a SIGCHLD, which terminates the socket receive op.

{{{
2010-08-12 11:05:38+0100 [-] Disabling builder: http://yellow.buildd:8221/ -- (4, 'Interrupted system call')
2010-08-12 11:05:38+0100 [-] Traceback (most recent call last):
2010-08-12 11:05:38+0100 [-]   File "/srv/launchpad.net/codelines/soyuz-production-rev-9648/lib/lp/buildmaster/model/builder.py", line 204, in updateBuilderStatus
2010-08-12 11:05:38+0100 [-]     builder.checkSlaveAlive()
2010-08-12 11:05:38+0100 [-]   File "/srv/launchpad.net/codelines/soyuz-production-rev-9648/lib/lp/buildmaster/model/builder.py", line 286, in checkSlaveAlive
2010-08-12 11:05:38+0100 [-]     if self.slave.echo("Test")[0] != "Test":
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/xmlrpclib.py", line 1147, in __call__
2010-08-12 11:05:38+0100 [-]     return self.__send(self.__name, args)
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/xmlrpclib.py", line 1437, in __request
2010-08-12 11:05:38+0100 [-]     verbose=self.__verbose
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/xmlrpclib.py", line 1185, in request
2010-08-12 11:05:38+0100 [-]     errcode, errmsg, headers = h.getreply()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/httplib.py", line 1199, in getreply
2010-08-12 11:05:38+0100 [-]     response = self._conn.getresponse()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/httplib.py", line 928, in getresponse
2010-08-12 11:05:38+0100 [-]     response.begin()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/httplib.py", line 385, in begin
2010-08-12 11:05:38+0100 [-]     version, status, reason = self._read_status()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/httplib.py", line 343, in _read_status
2010-08-12 11:05:38+0100 [-]     line = self.fp.readline()
2010-08-12 11:05:38+0100 [-]   File "/usr/lib/python2.5/socket.py", line 331, in readline
2010-08-12 11:05:38+0100 [-]     data = recv(1)
2010-08-12 11:05:38+0100 [-] error: (4, 'Interrupted system call')
2010-08-12 11:05:39+0100 [-] yellow was made unavailable, resetting attached job
2010-08-12 11:05:40+0100 [-] Dispatching: <lawrencium:http://lawrencium.ppa:8221/>
}}}
2010-08-17 07:31:30 Launchpad QA Bot tags buildd-manager oem-services buildd-manager oem-services qa-needstesting
2010-08-17 10:07:47 Julian Edwards soyuz: status In Progress Fix Released
2010-08-17 10:07:57 Julian Edwards tags buildd-manager oem-services qa-needstesting buildd-manager oem-services qa-ok
2010-08-17 12:49:27 Ursula Junque soyuz: milestone 10.09
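
Note on the failure mode in the description: errno 4 is EINTR, which a blocking call such as recv() reports when a signal (here, most likely SIGCHLD) arrives before any data does. A common way to cope is to retry the interrupted call. The sketch below is only an illustration of that general pattern; it is not taken from the linked branch, whose contents are not shown in this log. The helper name retry_on_eintr and the max_attempts parameter are hypothetical, and the code is written for Python 3 so it can be run today, not for the Python 2.5 in the traceback.

```python
import errno


def retry_on_eintr(func, *args, max_attempts=5):
    """Call func(*args), retrying while it fails with EINTR.

    Hypothetical helper for illustration only. Any error other than
    EINTR is re-raised immediately; if every attempt is interrupted,
    the last EINTR error is re-raised.
    """
    last_error = None
    for _ in range(max_attempts):
        try:
            return func(*args)
        except OSError as exc:
            if exc.errno != errno.EINTR:
                raise
            # Interrupted by a signal before completing: try again.
            last_error = exc
    raise last_error
```

Under this sketch, the liveness check from the traceback might be wrapped as retry_on_eintr(self.slave.echo, "Test"), assuming the call is safe to repeat. On Python 3.5 and later the interpreter already retries most system calls interrupted by a signal (PEP 475), so an explicit wrapper like this matters mainly on older interpreters.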