Re: IIOP fix because unix io sucks



Hi,

> From: Michael Rumpf <michael rumpfonline de>
> To: orbit-list gnome org
> Subject: Re: question object servers going down
> Date: Wed, 13 Sep 2000 23:31:07 +0200
>
> Hi,
>
> I just tested this with the "echo-server" and the "echo-client IOR_of_Server
> 10000" and took the server down while the client was running. The client
> just hangs without any error message.

I had the same problem: the client just hangs after the server goes
down.

gdb says:

#0  0x40157dd3 in __writev (fd=7, vector=0x804e988, count=11) at ../sysdeps/unix/sysv/linux/writev.c:50
#1  0x400202a0 in giop_send_buffer_write () from /usr/lib/libIIOP.so.0

...

That pointed me to the file
ORBit-0.5.7/src/IIOP/giop-msg-buffer.c.
After reading through this file carefully, I think there is a bug in
it.

The first argument of writev() is a file descriptor, fd, and that
descriptor can become unusable at any time.  In the echo example above,
once the server is down only one half of the TCP connection has been
terminated, and nothing happens on the client side.  When the client
then writes to such a connection, SIGPIPE is raised.  Unfortunately,
giop_send_buffer_write() doesn't handle that event.
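
To make the SIGPIPE/EPIPE behaviour concrete, here is a minimal standalone sketch.  It is not ORBit code: both function names are hypothetical, and a pipe with its read end closed stands in for the dead server.  Unlike the patch, it uses sigaction() and restores whatever SIGPIPE disposition was installed before, instead of resetting it to SIG_DFL.

```c
#define _XOPEN_SOURCE 700

#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper (not part of ORBit): call writev() with SIGPIPE
 * ignored, then restore the disposition that was in place before.
 * Returns writev()'s result; on failure errno is left as writev() set it. */
static ssize_t writev_ignoring_sigpipe(int fd, const struct iovec *vec, int count)
{
    struct sigaction ignore, previous;
    ssize_t res;
    int saved_errno;

    ignore.sa_handler = SIG_IGN;
    sigemptyset(&ignore.sa_mask);
    ignore.sa_flags = 0;
    if (sigaction(SIGPIPE, &ignore, &previous) < 0)
        return -1;

    res = writev(fd, vec, count);
    saved_errno = errno;            /* sigaction() below may change errno */

    sigaction(SIGPIPE, &previous, NULL);
    errno = saved_errno;
    return res;
}

/* Hypothetical demo: write to a pipe whose read end is already closed.
 * Returns the errno of the failed write, 0 if the write unexpectedly
 * succeeded, or -1 if the pipe could not be created. */
static int errno_after_peer_close(void)
{
    int fds[2];
    char msg[] = "ping";
    struct iovec vec;
    ssize_t res;
    int err;

    if (pipe(fds) < 0)
        return -1;
    close(fds[0]);                  /* the "peer" goes away */

    vec.iov_base = msg;
    vec.iov_len = sizeof msg - 1;
    res = writev_ignoring_sigpipe(fds[1], &vec, 1);
    err = (res < 0) ? errno : 0;

    close(fds[1]);
    return err;
}
```

Calling errno_after_peer_close() from a small main() should return EPIPE rather than killing the process, which is the condition the patch checks for after writev().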

Attached is my patch to fix this bug.  Has this bug already been reported?

Regards,
Yonghong

BTW, to apply the patch:
	cd ORBit-0.5.7
	patch -p1 <ORBit-0.5.7.patch
--- ORBit-0.5.7/src/IIOP/giop-msg-buffer.c	Mon Nov 20 10:34:04 2000
+++ ORBit-0.5.7/src/IIOP/giop-msg-buffer.c	Thu Mar  8 14:05:50 2001
@@ -21,6 +21,7 @@
 #include <string.h>
 #include <unistd.h>
 #include <stdio.h>
+#include <signal.h>
 #include <errno.h>
 #include <sys/types.h>
 #include <fcntl.h>
@@ -173,7 +174,7 @@
 giop_send_buffer_write(GIOPSendBuffer *send_buffer)
 {
   gulong nvecs;
-  glong res, sum, t;
+  glong res=-1, sum, t;
   struct iovec *curvec;
   int fd;
   GIOPConnection *cnx;
@@ -197,19 +198,68 @@
 	    sum);
   }
 #endif
+  /* If the peer process goes down while the local process is running,
+   * only one half of the TCP connection is terminated and nothing
+   * happens on this side.  Then writev (really the underlying write)
+   * raises SIGPIPE.  If SIGPIPE is ignored, the call fails instead with
+   * errno set to EPIPE.  That is how I fix this bug.
+   */
+  signal(SIGPIPE, SIG_IGN);
   res = writev(fd, curvec, nvecs);
+  signal(SIGPIPE, SIG_DFL);
 
-  sum = (GIOP_MESSAGE_BUFFER(send_buffer)->message_header.message_size + sizeof(GIOPMessageHeader));
-  if(res < sum) {
-    if(res < 0) {
-      if(errno != EAGAIN) {
-	giop_main_handle_connection_exception(cnx);
-	goto out;
+  /* It makes more sense to check the writev return value right here
+   * instead of later. */
+  if(res < 0) {
+    if(errno == EPIPE) {
+//      giop_main_handle_connection_exception(cnx);
+	/* It seems giop_main_handle_connection_exception does too much:
+	 * if it is called directly, CORBA_Environment ev's _major will
+	 * be set to CORBA_NO_EXCEPTION.
+	 */
+
+      /* The following statements are a copy of the original
+       * giop_main_handle_connection_exception, with some lines commented out.
+       */
+      giop_connection_ref(cnx);
+
+//      giop_connection_remove_from_list(cnx);
+	/* It seems that if this connection cnx is removed from the global
+	 * giop_connection_list, CORBA_Environment ev's _major will be
+	 * set to CORBA_NO_EXCEPTION, so I have to comment this out.
+	 */
+
+      shutdown(fd, 2);
+//      close(fd);
+//      GIOP_CONNECTION(cnx)->fd = -1;
+	/* UNIX documents say that once an fd has been closed with close(),
+	 * it can no longer be used in that process.
+	 * Since this connection has not been removed from the global
+	 * list, this fd is still used somewhere later.
+	 * In fact, when I use close(), I get some warning messages.
+	 * On the other hand, shutdown() already shuts down this connection.
+	 */
+      cnx->is_valid = FALSE;
+
+      if(cnx->incoming_msg) {
+	giop_recv_buffer_unuse(cnx->incoming_msg);
+	cnx->incoming_msg = NULL;
       }
 
-      res = 0;
+      giop_connection_unref(cnx); 
+      /* End of giop_main_handle_connection_exception. */
+
+      goto out;
+    } else if(errno != EAGAIN) {
+      giop_main_handle_connection_exception(cnx);
+      goto out;
     }
 
+    res = 0;
+  }
+
+  sum = (GIOP_MESSAGE_BUFFER(send_buffer)->message_header.message_size + sizeof(GIOPMessageHeader));
+  if(res < sum) {
     /* wrote 7, iovecs 3, 2, 2, 4:
        0 + 3 !> 7
        3 + 2 !> 7

