Thanks to Nickolai Zeldovich <firstname.lastname@example.org> and Richard Sullivan
<RSullivan@refco.com> for their responses on this.
The problem actually was that the poll system call was indicating that the
socket was connected when it really wasn't. Our programmer was ultimately able
to solve the problem by using getpeername() to test the socket. Failure of
getpeername() would indicate that the socket was in fact not connected.
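For anyone who hits the same thing, here is a rough sketch in C of the approach described above: a non-blocking connect(), a select() with a timeout, then getpeername() to confirm that the socket really is connected. The helper name connect_with_timeout and its parameters are made up for illustration, this is not our programmer's actual code, and it assumes a system that provides inet_pton().

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper (name and signature are assumptions, not from the
 * original application). Returns a connected, still non-blocking socket,
 * or -1 on refusal, timeout, or any other error.
 */
int connect_with_timeout(const char *host_ip, unsigned short port, int timeout_ms)
{
    struct sockaddr_in addr, peer;
    socklen_t peer_len = sizeof(peer);
    struct timeval tv;
    fd_set wfds;
    int s, flags;

    assert(host_ip != NULL);

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, host_ip, &addr.sin_addr) != 1)
        return -1;                      /* not a dotted-quad IPv4 address */

    s = socket(AF_INET, SOCK_STREAM, 0);
    if (s < 0)
        return -1;

    /* Switch the socket to non-blocking mode so connect() returns at once. */
    flags = fcntl(s, F_GETFL, 0);
    if (flags < 0 || fcntl(s, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    if (connect(s, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        if (errno != EINPROGRESS)
            goto fail;                  /* immediate failure, e.g. refused */

        /* Wait up to timeout_ms for the socket to become writable. */
        FD_ZERO(&wfds);
        FD_SET(s, &wfds);
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;
        if (select(s + 1, NULL, &wfds, NULL, &tv) <= 0)
            goto fail;                  /* timed out or select() failed */
    }

    /*
     * Writable does not necessarily mean connected: a failed connect also
     * wakes select(). getpeername() fails if there is no peer.
     */
    if (getpeername(s, (struct sockaddr *)&peer, &peer_len) < 0)
        goto fail;

    return s;

fail:
    close(s);
    return -1;
}
```

A caller would use something like connect_with_timeout("192.0.2.10", 7777, 1000), where the address is only a placeholder, and treat -1 as "server not available". Another common way to do the final check is getsockopt() with SO_ERROR, which also reports why the connect failed.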
On Feb 15, 18:39, Bill Fenwick wrote:
>Apologies in advance for the unclear explanation... one of our programmers has
>run into a problem with sockets and C libraries, and I'm relaying his
>explanation second-hand without a very clear understanding of C or sockets...
>We've got an Ultra 1 and an Ultra 2, both running Solaris 2.6 5/98. Our
>application is using the Ultra 1 as a server and the Ultra 2 as a client. The
>application needs to set up a socket on the server which can be accessed from
>the client, so our programmer does the following:
>Server:
>s = socket()
>bind(s,7777)            # binds the socket to port 7777
>listen(s)               # listen for requests
>c = accept(s)           # blocks until a client connects
>Client:
>s = socket()
>connect(s,server/7777)  # send a connect request to that port
>This works correctly if the server is online and has the socket set up and
>bound (immediate acceptance), or if the server is online but does not have the
>socket bound (immediate refusal), or if the server has been offline for a while
>(immediate refusal). However, it does not work if the server was recently
>online and the client still has information on it. In that case, the connect
>request hangs for several minutes before coming back with a refusal. I gather
>this is normal.
>To get around this, our programmer changed the client's procedure to this:
>s = socket()
>fcntl(s,NONBLOCK)          # set the socket to non-blocking
>connect(s,server/7777)     # send the connect request
># At this point, we get an immediate return of EINPROGRESS, indicating that the
># connection is in progress, though not necessarily accepted or declined yet
>k = select(s,writing,1000) # Wait up to 1000 milliseconds for the socket to
>                           # become writable
>if k then                  # If we get a response, then connection is up
>    # connected
>else                       # Otherwise, close the connection
>    close(s)
>Now, with this modification, if the server is online, we get a pretty much
>immediate acceptance. If the server is offline, we get a refusal after one
>second.
>The problem is, if the server is online but the socket is not bound, we still
>get an almost immediate acceptance. Then when the application tries to use
>that socket, it of course dies.
>The interesting thing is, if we use a Solaris 7 machine as a client, the
>modified client procedure works perfectly; acceptance if the socket is bound,
>refusal if the socket is not bound or the server is offline.
>So I'm wondering if there is a patch on the Solaris 2.6 side that addresses
>this problem. I searched the archives and SunSolve and found a few patches
>that we didn't have, but nothing that solved this problem. Of course, being a
>socket and C novice, I'm probably looking for the wrong things.
>Help, anybody? If this explanation is hopelessly garbled, let me know and
>I'll try to get the programmer to knock some sense into me.
-- 
Bill Fenwick                    Email: email@example.com
Digicomp Research               Voice: (607) 273-5900 ext 32