Re: [Evolution] kernel 2.5 and evolution



On Tue, 2003-01-07 at 02:57, Chris Toshok wrote:
Attached is a program that runs fine on both freebsd 5.0 rc2 and linux
2.4.18-3, and gives the same output.  When run like so:

$ sun_test 2 & sun_test 1

you'll see:

sock2's peer = '/tmp/sock1'
sock1's peer = '/tmp/sock2'

I can't speak to what 2.5.x's output is for this program, but i bet it's
the same.  If it's not it must be a bug in my code :P

If you comment out the bind(2) call before sock1's connect(2) call then
sock2's peer shows up as "", but that's because it's not being bound.
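
For anyone reading without the attachment, here is a minimal single-process sketch of the same experiment. It is not the original sun_test program: the helper names (make_addr, accepted_peer_path) and any socket paths passed to it are made up for illustration. It listens on one unix socket, connects a client that is optionally bound to its own path first, and reports what getpeername(2) returns for the accepted end.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un for `path`; returns -1 if the path does not fit. */
static int make_addr(struct sockaddr_un *sa, const char *path)
{
    memset(sa, 0, sizeof(*sa));
    sa->sun_family = AF_UNIX;
    if (strlen(path) >= sizeof(sa->sun_path))
        return -1;
    strcpy(sa->sun_path, path);
    return 0;
}

/* Hypothetical helper: listen on srv_path, connect a client (bound to
 * cli_path first when cli_path is non-NULL), accept it, and copy the path
 * that getpeername(2) reports for the accepted socket into `out`.
 * Returns 0 on success, -1 on any error. */
static int accepted_peer_path(const char *srv_path, const char *cli_path,
                              char *out, size_t outlen)
{
    struct sockaddr_un srv, cli, peer;
    socklen_t n = sizeof(peer);
    size_t len = sizeof(peer.sun_path);
    int lfd = -1, cfd = -1, afd = -1, rc = -1;

    assert(srv_path != NULL && out != NULL);
    if (outlen == 0 || make_addr(&srv, srv_path) < 0)
        return -1;
    if (cli_path != NULL && make_addr(&cli, cli_path) < 0)
        return -1;

    /* Remove stale socket files left over from an earlier run. */
    unlink(srv_path);
    if (cli_path != NULL)
        unlink(cli_path);

    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    cfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        goto done;
    if (bind(lfd, (struct sockaddr *)&srv, sizeof(srv)) < 0 ||
        listen(lfd, 1) < 0)
        goto done;
    /* This is the bind(2) before connect(2) discussed above. */
    if (cli_path != NULL &&
        bind(cfd, (struct sockaddr *)&cli, sizeof(cli)) < 0)
        goto done;
    if (connect(cfd, (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto done;
    afd = accept(lfd, NULL, NULL);
    if (afd < 0)
        goto done;

    /* Zero the struct first so sun_path reads as "" when nothing is filled. */
    memset(&peer, 0, sizeof(peer));
    if (getpeername(afd, (struct sockaddr *)&peer, &n) < 0)
        goto done;

    if (len > outlen - 1)
        len = outlen - 1;
    memcpy(out, peer.sun_path, len);
    out[len] = '\0';
    rc = 0;

done:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    unlink(srv_path);
    if (cli_path != NULL)
        unlink(cli_path);
    return rc;
}
```

Passing NULL for cli_path corresponds to commenting out the bind(2) call: the accepted socket's peer should then come back as "".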

You're right about the purpose for clobbering the path (to force ORBit
to create a new socket), but it's a horrible bandaid to work around a
kernel bug, and a patch that shouldn't be blindly applied (much less
rolled into .rpm's or .deb's).  It's almost like finding a double free
in your program and fixing it by recompiling with "#define free(x)": 
"There, the problem's gone, so the fix must be good.  All hail
stability."

I suggest you run an strace on evolution, just to actually see what's
going on. 
I don't understand what the big deal is anyway. Gnome 1.X is deprecated
from my point of view, Gnome 2.X is a rewrite in that area, and in a
year we will have programs which can generate applications like Gnome
in a month.
Ronald

Chris

On Mon, 2003-01-06 at 23:27, Mika Liljeberg wrote:
If I recall the earlier discussion, linux 2.4 did not correctly return
the peername from an accepted unix socket but instead returned "". This
was apparently fixed in linux 2.5.x. Ronald's patch emulates the 2.4
behaviour.

My guess (from a VERY cursory glance at the ORBit code) is that
clobbering the path prevents the ORB from reusing an accepted connection
for sending GIOP requests the other way and instead forces the ORB to
create a new client connection. Why unix sockets wouldn't work for
bidirectional GIOP is anybody's guess. I don't think anyone has
bothered to find out, since everyone is convinced the bug is someone
else's problem.

Cheers,

    MikaL

On Tue, 2003-01-07 at 04:04, Jeffrey Stedfast wrote:
If the problem is that the usock.sun_path char array isn't
nul-terminated anymore (and it used to be), then the correct fix would
be something more like:

if (getpeername (GIOP_CONNECTION_GET_FD(fd_cnx),
                 (struct sockaddr *)&fd_cnx->u.usock, &n) < 0)
        fd_cnx->u.usock.sun_path[0] = '\0';
else
        ((char *) &fd_cnx->u.usock)[n] = '\0';

or, to be safer... perhaps zero the struct using memset before calling
getpeername?

either way Ron's fix is wrong as it negates the sun_path variable
completely, making the call to getpeername utterly worthless.
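
A minimal sketch of the "memset before getpeername" variant, written as a standalone helper rather than as a patch against connection.c. The name peer_unix_path and its signature are hypothetical and not part of ORBit. It zeroes the struct first, clamps the returned length, and always hands back a nul-terminated path (empty if the peer is unbound or the call fails).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/un.h>

/* Hypothetical helper, not an ORBit function: copy the peer's unix socket
 * path for `fd` into `out` as a nul-terminated string.
 * Returns 0 on success (an unbound peer yields ""), -1 if getpeername fails;
 * `out` is "" in the failure case as well. */
static int peer_unix_path(int fd, char *out, size_t outlen)
{
    struct sockaddr_un addr;
    socklen_t n = sizeof(addr);
    size_t len = 0;
    const char *end;

    assert(out != NULL && outlen > 0);
    out[0] = '\0';

    /* Zero the whole struct so stale bytes never leak into the result. */
    memset(&addr, 0, sizeof(addr));
    if (getpeername(fd, (struct sockaddr *)&addr, &n) < 0)
        return -1;

    /* n can exceed the buffer size if the kernel truncated the address. */
    if (n > sizeof(addr))
        n = sizeof(addr);

    /* Number of sun_path bytes the kernel actually filled in. */
    if (n > offsetof(struct sockaddr_un, sun_path))
        len = n - offsetof(struct sockaddr_un, sun_path);

    /* Stop at the first nul, if the kernel included one. */
    end = memchr(addr.sun_path, '\0', len);
    if (end != NULL)
        len = (size_t)(end - addr.sun_path);

    if (len >= outlen)
        len = outlen - 1;
    memcpy(out, addr.sun_path, len);
    out[len] = '\0';
    return 0;
}
```

Copying into a separate buffer avoids writing the terminator at offset n inside the struct, which would land one byte past its end whenever n equals sizeof(struct sockaddr_un).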

Jeff

On Mon, 2003-01-06 at 20:29, Chris Toshok wrote:
I must be missing something, but what on earth is this fix supposed to
accomplish?  It actually fixes something?

I can see doing something like:

if (getpeername (GIOP_CONNECTION_GET_FD(fd_cnx),
                 (struct sockaddr *)&fd_cnx->u.usock, &n) < 0)
        fd_cnx->u.usock.sun_path[0] = '\0';

but blindly clobbering the path to the socket on the other end seems
like a bad idea.

Chris

On Sun, 2003-01-05 at 21:43, Ronald Kuetemeier wrote:
On Sun, Jan 05, 2003 at 10:41:43PM +0200, Mika Liljeberg wrote:
By the way, your patch is included in Debian unstable
[liborbit0-0.5.17-5]:


--- orbit-0.5.17.orig/src/IIOP/connection.c
+++ orbit-0.5.17/src/IIOP/connection.c
@@ -459,6 +459,7 @@
     fd_cnx->u.usock.sun_family = AF_UNIX;
     getpeername(GIOP_CONNECTION_GET_FD(fd_cnx),
        (struct sockaddr *)&fd_cnx->u.usock, &n);
+    fd_cnx->u.usock.sun_path[0] = '\0';
     break;

 #ifdef HAVE_IPV6

It may not be the correct fix but at least it solves the immediate
problem.
It is the only place to fix it without interfering with other programs,
that's why I fixed it there.
Good to see that some distributions prefer a stable system.
Ronald

Cheers,

    MikaL


On Sun, 2003-01-05 at 20:23, Ronald Kuetemeier wrote:
On Mon, Jan 06, 2003 at 05:40:45PM +0100, Joaquim Fellmann wrote:
On Sat, 2003-01-04 at 02:13, Ronald Kuetemeier wrote:

Sorry but _NO_ it's not 2.5, Gnome is broken,
you can read all about it and get a patch on the evolution-hackers list.

Wrong too.
It seems to be Orbit assuming a kernel routine to return some value but
receiving something else. 
Actually it was a kernel bug (that got fixed) on which Orbit was
relying.
Problem is that Orbit didn't get fixed.
Maybe you should read the thread on evolution-hackers, and then contact some kernel hackers,
Alan, Dave and Al come to mind.
My patch resets the new 2.5 behavior for/in Orbit to the 2.4 behavior. But the real problem is within
Gnome; so far I only hear from the Gnome/Orbit maintainers that it's the kernel, without any proof.
Just saying so is not enough. I know it's kind of hard to find a problem in a few hundred
thousand kernel and Gnome/evolution source lines. Been there, done that. And if you take a look
at the Gnome 2.X source you might find that it's moot to talk about this any further, if you
understand the problem.
Ronald 


A message on the linux kernel mailing list refers to a BitKeeper
changeset at the origin of the "evolution case". Before this changeset
Evolution works correctly with kernel 2.5, and after this changeset it
doesn't work anymore.


See http://www.cs.helsinki.fi/linux/linux-kernel/2002-41/0444.html


Regards



-- 

Joaquim Fellmann <mljf altern org>


_______________________________________________
evolution maillist  -  evolution ximian com
http://lists.ximian.com/mailman/listinfo/evolution
