Re: Outbox traffic jam?



On Thu, 28 June 11:30 Carlos Morgado wrote:
| 
| On 2001.06.28 10:41:03 +0100 Brian Stafford wrote:
| > On Thu, 28 June 10:21 Carlos Morgado wrote:

| > | i believe the next step is to build a debug libesmtp :)
| > 
| > Actually libESMTP should have built with the -g option anyway.
| > Gdb should be able to be used no problem.
| > 
| i would expect it. must check my rpm
| 
| > Useful info, this is not where I suspected unwanted blocking.
| > I'll check over this for obvious bugs ... watch this space!
| > 
| 
| rings a bell ? :)

I thought I'd found a problem in the program logic, but there wasn't
one. :-(

The only issue I found with raw_read and raw_write is that the calls
to poll() don't handle errno == EINTR, so a poll() interrupted by a
signal is treated as a failure.  The following patch applied to
siobuf.c will address this.

--------------------------------
--- siobuf.c.orig       Thu Jun 28 12:37:21 2001
+++ siobuf.c    Thu Jun 28 11:21:08 2001
@@ -324,7 +324,7 @@
 static void
 raw_write (struct siobuf *sio, const char *buf, int len)
 {
-  int n, total;
+  int n, total, status;
   struct pollfd pollfd;
 
   assert (sio != NULL && buf != NULL);
@@ -358,8 +358,11 @@
              return;
 
            pollfd.revents = 0;
-           n = poll (&pollfd, 1, sio->milliseconds);
-           if (n <= 0 || !(pollfd.revents & POLLOUT))
+           errno = 0;
+           while ((status = poll (&pollfd, 1, sio->milliseconds)) < 0)
+             if (errno != EINTR)
+               return;
+           if (status <= 0 || !(pollfd.revents & POLLOUT))
              return;
          }
       }
@@ -429,7 +432,7 @@
 static int
 raw_read (struct siobuf *sio, char *buf, int len)
 {
-  int n;
+  int n, status;
   struct pollfd pollfd;
 
   assert (sio != NULL && buf != NULL && len > 0);
@@ -460,9 +463,12 @@
            return 0;
 
          pollfd.revents = 0;
-         n = poll (&pollfd, 1, sio->milliseconds);
-         if (n <= 0 || !(pollfd.revents & POLLIN))
-           return 0;
+         errno = 0;
+         while ((status = poll (&pollfd, 1, sio->milliseconds)) < 0)
+           if (errno != EINTR)
+             return -1;
+         if (status <= 0 || !(pollfd.revents & POLLIN))
+           return -1;
        }
     }
   return n;
--------------------------------
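
The retry idiom in the patch can also be read on its own.  What follows
is only a minimal sketch, not code from libESMTP: wait_ready() and
pipe_demo() are hypothetical names made up for illustration.  Like the
patch, it restarts poll() with the full timeout after each EINTR, so a
steady stream of signals could stretch the total wait.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not part of libESMTP): wait until `events' is
   signalled on `fd', restarting poll() if a signal interrupts it.
   Returns 1 if ready, 0 on timeout, -1 on any other failure.  */
static int
wait_ready (int fd, short events, int milliseconds)
{
  struct pollfd pollfd;
  int status;

  assert (fd >= 0);
  pollfd.fd = fd;
  pollfd.events = events;
  do
    {
      pollfd.revents = 0;
      status = poll (&pollfd, 1, milliseconds);
    }
  while (status < 0 && errno == EINTR);	/* interrupted: try again */

  if (status < 0)
    return -1;			/* genuine poll() error */
  if (status == 0)
    return 0;			/* timed out */
  /* Something was reported; succeed only if it is what was asked for
     (POLLERR or POLLHUP alone counts as failure).  */
  return (pollfd.revents & events) ? 1 : -1;
}

/* Usage sketch: an empty pipe times out for reading, and becomes
   readable once a byte has been written.  Returns 1 if both hold.  */
static int
pipe_demo (void)
{
  int fds[2], before, after;

  if (pipe (fds) < 0)
    return -1;
  before = wait_ready (fds[0], POLLIN, 0);	/* nothing to read yet */
  if (write (fds[1], "x", 1) != 1)
    before = -1;
  after = wait_ready (fds[0], POLLIN, 0);	/* one byte is waiting */
  close (fds[0]);
  close (fds[1]);
  return before == 0 && after == 1;
}
```

Unlike the bare status test in the patch, this keeps timeout and error
apart in the return value, which may help when trying to tell a slow
server from a real failure.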

Folks,

Try as I might, I cannot make either Balsa or the example program
block in the poll() in raw_write() except when it is legitimately
waiting for the server to accept more data.

I have sent messages up to about 8.5 MB using both Balsa and the
example program, each with and without strace, and via three
different MTAs (qmail, sendmail and M$ Exchange).  All messages
were either delivered or bounced after handing off to the remote
MTA.  In all cases the MTA responded to DATA with success.

I cannot find a path through the code that could lead to deadlock
while copying the message to the MTA, either by single stepping under
gdb or by inspection.

Right now I suspect there isn't actually a problem in libESMTP at all,
unless there is some strange interaction with signals.  For the time
being I am not putting any more effort into resolving this issue.

To progress this any further I need solid, documented evidence of
deadlock, i.e. server and client trying to write to each other
simultaneously or trying to read from each other simultaneously.  I
will also need a method by which I can reproduce the problem
consistently.

Brian




