iocLogServer connection problem on win32-x86 platform

Bug #1188026 reported by Janez Golob
This bug affects 1 person
Affects: EPICS Base
Status: Incomplete
Importance: Low
Assigned to: mdavidsaver

Bug Description

I am running iocLogServer on the win32-x86 platform (base 3.14.12, VS 2010 Express, Windows 7).

If the connection to the log server is inactive for more than 90 seconds (more precisely, if no messages are logged), the following error is reported by the IOC (IOC log client):

epics> log client: lost contact with log server at "127.0.0.1:7111" because "An existing connection was forcibly closed by the remote host. "

I dug into the iocLogServer.c source code a bit, and it seems to me that the following statement causes the problem:

494 status = shutdown(pclient->insock, SHUT_WR);

It seems to me that shutting down one half of the connection can also close the opposite half. At the socket level, the two short Python programs below reproduce the behaviour. Since the shutdown function is meant for closing a connection gracefully, I think line 494 should be removed.

The IOC log client code also uses the shutdown function, to shut down the receive part of the socket. In my experience this works fine on all platforms, but it might still be a good idea to remove it as well. At least on the win32 platform, no resources are freed until the close function is called.

Regards,
Janez

Server side:

import socket

HOST = ''
PORT = 50007

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind((HOST, PORT))
s.listen(1)
conn, addr = s.accept()
print 'Connected by', addr
conn.shutdown(socket.SHUT_WR)
while 1:
    data = conn.recv(1024)
    if not data: break
    print '"{}"'.format(data)
conn.close()

Client side:

import socket
import time

HOST = 'localhost'
PORT = 50007

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
s.send('This should succeed')
time.sleep(100)
s.send('This should fail')
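
For anyone who wants to retry this without Python 2, here is a minimal single-process sketch of the same experiment in Python 3. It is an assumption-laden port, not the original scripts: the function name run_repro, the idle_seconds parameter, the use of an ephemeral loopback port and the server thread are all introduced here for illustration. Per the report, the second send is expected to fail on Windows after a long enough idle period; the sketch only records what happens and does not assert either outcome.

```python
import socket
import threading
import time


def run_repro(idle_seconds=100.0):
    """Half-close the server's send side, then send twice with a pause.

    Returns (bytes_received_by_server, client_error_or_None).
    Hypothetical helper: name and parameters are not from the report.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # ephemeral port instead of 50007
    listener.listen(1)
    port = listener.getsockname()[1]
    received = []

    def server():
        conn, _addr = listener.accept()
        conn.shutdown(socket.SHUT_WR)    # same call as in the server script
        try:
            while True:
                data = conn.recv(1024)
                if not data:
                    break
                received.append(data)
        except OSError:
            pass                         # connection reset by the OS
        finally:
            conn.close()

    worker = threading.Thread(target=server)
    worker.start()

    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    error = None
    try:
        client.connect(("127.0.0.1", port))
        client.sendall(b"This should succeed")
        time.sleep(idle_seconds)         # the report saw failures after ~90 s
        client.sendall(b"This should fail")
    except OSError as exc:
        error = exc                      # e.g. a connection reset on send
    finally:
        client.close()
        worker.join()
        listener.close()
    return b"".join(received), error
```

Calling run_repro() with the default idle time mirrors the client script's 100 second sleep; a shorter value is handy for checking that the harness itself works.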

Jeff Hill (johill-lanl) wrote :

Presumably the shutdown call can be removed. The main issue here will be to make certain that any changes do not negatively impact other OS.

mdavidsaver (mdavidsaver) wrote :

Can someone with a Windows system replicate this issue? The provided test scripts work on Linux (both lines are received by the server).

Changed in epics-base:
status: New → Incomplete
importance: Undecided → Low
assignee: nobody → mdavidsaver (mdavidsaver)
Janez Golob (janez-golob) wrote :

According to the Python documentation, the effect of shutting down one half of the connection is platform-specific:

Shut down one or both halves of the connection. If how is SHUT_RD, further receives are disallowed. If how is SHUT_WR, further sends are disallowed. If how is SHUT_RDWR, further sends and receives are disallowed. Depending on the platform, shutting down one half of the connection can also close the opposite half (e.g. on Mac OS X, shutdown(SHUT_WR) does not allow further reads on the other end of the connection).

See https://docs.python.org/2/library/socket.html.

I am not an expert in this area, so it may be that this behavior can be prevented by changing some registry settings or similar (at least on Windows). In general, though, the purpose of the shutdown function is to terminate a TCP connection cleanly (or, if you wish, gracefully) before the close function is called:

1) Close one half of the connection (send part)
2) Wait until the recv function completes with success and indicates that zero bytes were received
3) Close socket
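
The three steps above can be sketched in Python 3 roughly as follows. This is only an illustration under stated assumptions: graceful_close and demo are hypothetical names, and the loopback peer exists just to exercise the sequence. It is not the iocLogServer or logClient code.

```python
import socket
import threading


def graceful_close(sock, bufsize=1024):
    """Half-close, drain until the peer closes, then close the socket."""
    sock.shutdown(socket.SHUT_WR)        # 1) close the send half
    leftover = b""
    while True:
        chunk = sock.recv(bufsize)       # 2) wait until recv returns zero bytes
        if not chunk:
            break
        leftover += chunk
    sock.close()                         # 3) close the socket
    return leftover


def demo():
    """Run graceful_close against a loopback peer (illustrative only)."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(1)

    def peer():
        conn, _addr = listener.accept()
        while conn.recv(1024):           # read until the client's half-close
            pass
        conn.sendall(b"bye")             # final data before closing
        conn.close()

    worker = threading.Thread(target=peer)
    worker.start()
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect(listener.getsockname())
    client.sendall(b"hello")
    leftover = graceful_close(client)
    worker.join()
    listener.close()
    return leftover
```

In this pattern the half-close is the first step of tearing the connection down, which is different from calling shutdown on a connection that is expected to stay open.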

Jeff commented that closing one half of the connection might free some resources, but at least on Windows I can tell that resources are freed only after the close function is called. I don't know about other platforms.

mdavidsaver (mdavidsaver) wrote :

I'm asking if this issue can be reproduced because I suspect this might be an artifact of a site policy of pruning "idle" connections rather than something intrinsic to the OS. Knowing that it has been observed on two differently administered networks would decide this.

Also, your original report doesn't actually say if you tried removing the shutdown(SHUT_WR) call to see if the connection is maintained. If your analysis is correct we may also need to remove shutdown(SHUT_RD) from logClient.c.

Just to be clear, the use of shutdown() here is an optimization. The theoretical saving in memory comes from the fact that the read or write buffer associated with a socket could be freed when that side is shut down. This is probably on the order of 8 KB per socket, which even on the server end doesn't add up to much. And that assumes the OS actually frees the buffer, which isn't specified.

If this is a Windows-specific default behavior, then the call to shutdown() can simply be #ifdef'd out on that OS. I'd just like to be certain that the underlying problem has been fully understood.
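
As a rough Python 3 analogue of that idea (the real change would be a preprocessor guard in the C source, which is not shown here), the half-close could be skipped on Windows only. The names HALF_CLOSE_ENABLED and release_send_side are hypothetical, and whether Windows is really the only affected platform is still an open question in this thread.

```python
import socket
import sys

# Assumption for illustration: only Windows needs the half-close skipped.
HALF_CLOSE_ENABLED = not sys.platform.startswith("win")


def release_send_side(sock):
    """Optionally shut down the unused send half of a connected socket.

    Returns True if shutdown(SHUT_WR) was called, False if it was skipped.
    """
    if not HALF_CLOSE_ENABLED:
        return False                     # leave the connection fully open
    sock.shutdown(socket.SHUT_WR)        # the optimization discussed above
    return True
```

Keeping the decision in one flag makes it easy to flip once the behaviour on other platforms is confirmed.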

Janez Golob (janez-golob) wrote :

I ended up removing the shutdown calls on both ends. I haven't tried the other two possible combinations (removing only one of them).
