Attached is a program that runs fine on both FreeBSD 5.0 RC2 and Linux 2.4.18-3, and gives the same output on each. When run like so:

    $ sun_test 2 & sun_test 1

you'll see:

    sock2's peer = '/tmp/sock1'
    sock1's peer = '/tmp/sock2'

I can't speak to what 2.5.x's output is for this program, but I bet it's the same. If it's not, it must be a bug in my code :P

If you comment out the bind(2) call before sock1's connect(2) call then sock2's peer shows up as "", but that's because sock1 is not being bound.

You're right about the purpose for clobbering the path (to force ORBit to create a new socket), but it's a horrible band-aid to work around a kernel bug, and a patch that shouldn't be blindly applied (much less rolled into .rpms or .debs).

It's almost like finding a double free in your program and fixing it by recompiling with "#define free(x)": "There, the problem's gone, so the fix must be good. All hail stability."

Chris

On Mon, 2003-01-06 at 23:27, Mika Liljeberg wrote:
If I recall the earlier discussion, linux 2.4 did not correctly return the peername from an accepted unix socket but instead returned "". This was apparently fixed in linux 2.5.x. Ronald's patch emulates the 2.4 behaviour.

My guess (from a VERY cursory glance at the ORBIT code) is that clobbering the path prevents the ORB from reusing an accepted connection for sending GIOP requests the other way and instead forces the ORB to create a new client connection.

Why unix sockets wouldn't work for bidirectional GIOP is anybody's guess. I don't think anyone has bothered to find out, since everyone is convinced the bug is someone else's problem.

Cheers,
MikaL

On Tue, 2003-01-07 at 04:04, Jeffrey Stedfast wrote:

If the problem is that the usock.sun_path char array isn't nul-terminated anymore (and it used to be), then the correct fix would be something more like:

    if (getpeername (GIOP_CONNECTION_GET_FD(fd_cnx), (struct sockaddr *)&fd_cnx->u.usock, &n) < 0)
        fd_cnx->u.usock.sun_path[0] = '\0';
    else
        ((char *) &fd_cnx->u.usock)[n] = '\0';

or, to be safer... perhaps zero the struct using memset before calling getpeername?

Either way, Ron's fix is wrong as it negates the sun_path variable completely, making the call to getpeername utterly worthless.

Jeff

On Mon, 2003-01-06 at 20:29, Chris Toshok wrote:

I must be missing something, but what on earth is this fix supposed to accomplish? It actually fixes something?

I can see doing something like:

    if (getpeername (GIOP_CONNECTION_GET_FD(fd_cnx), (struct sockaddr *)&fd_cnx->u.usock, &n) < 0)
        fd_cnx->u.usock.sun_path[0] = '\0';

but blindly clobbering the path to the socket on the other end seems like a bad idea.
Chris

On Sun, 2003-01-05 at 21:43, Ronald Kuetemeier wrote:
On Sun, Jan 05, 2003 at 10:41:43PM +0200, Mika Liljeberg wrote:

By the way, your patch is included in Debian unstable [liborbit0-0.5.17-5]:

    --- orbit-0.5.17.orig/src/IIOP/connection.c
    +++ orbit-0.5.17/src/IIOP/connection.c
    @@ -459,6 +459,7 @@
         fd_cnx->u.usock.sun_family = AF_UNIX;
         getpeername(GIOP_CONNECTION_GET_FD(fd_cnx), (struct sockaddr *)&fd_cnx->u.usock, &n);
    +    fd_cnx->u.usock.sun_path[0] = '\0';
         break;
     #ifdef HAVE_IPV6

It may not be the correct fix but at least it solves the immediate problem.

It is the only place to fix it without interfering with other programs, that's why I fixed it there. Good to see that some distributions prefer a stable system.

Ronald

Cheers,
MikaL

On Sun, 2003-01-05 at 20:23, Ronald Kuetemeier wrote:
On Mon, Jan 06, 2003 at 05:40:45PM +0100, Joaquim Fellmann wrote:
On Sat, 2003-01-04 at 02:13, Ronald Kuetemeier wrote:

Sorry but _NO_ it's not 2.5, Gnome is broken, you can read all about it and get a patch on the evolution-hackers list.

Wrong too. It seems to be Orbit assuming a kernel routine to return some value but receiving something else. Actually it was a kernel bug (that got fixed) on which Orbit was relying. Problem is that Orbit didn't get fixed.

Maybe you should read the thread on evolution-hackers, and then contact some kernel hackers; Alan, Dave and Al come to mind. My patch resets the new 2.5 behavior for/in Orbit to the 2.4 behavior. But the real problem is within Gnome; so far I only hear from the Gnome/Orbit maintainers that it's the kernel, without any proof. Just saying so is not enough. I know it's kind of hard to find a problem in a few hundred thousand kernel and Gnome/evolution source lines. Been there, done that. And if you take a look at the Gnome 2.X source you might find that it's moot to talk about this any further, if you understand the problem.

Ronald

A message on the linux kernel mailing list is referring to a BitKeeper changeset at the origin of the "evolution case".
Before this changeset Evolution works correctly with kernel 2.5, and after this changeset it doesn't work anymore.

See http://www.cs.helsinki.fi/linux/linux-kernel/2002-41/0444.html

Regards
--
Joaquim Fellmann <mljf altern org>

_______________________________________________
evolution maillist - evolution ximian com
http://lists.ximian.com/mailman/listinfo/evolution
Attachment:
sun_test.c
Description: Text Data
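
The attachment's contents are not reproduced in this archive. As a rough illustration of the kind of check described at the top of the thread, here is a minimal sketch; it is not Chris's sun_test.c. It assumes a single process instead of the two invocations (`sun_test 2 & sun_test 1`), and the names `make_addr`, `peer_path` and `peer_name_demo` are hypothetical. It binds the connecting socket before connect(2), and it zeroes the address struct before getpeername(2) along the lines Jeff suggests, so that sun_path reads back as "" when the peer is unbound.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Fill a sockaddr_un for `path`; returns -1 if the path does not fit. */
static int make_addr(struct sockaddr_un *addr, const char *path)
{
    if (strlen(path) >= sizeof(addr->sun_path))
        return -1;
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    strcpy(addr->sun_path, path);
    return 0;
}

/* Copy the peer path of `fd` into `out`; an unbound peer yields "". */
static int peer_path(int fd, char *out, size_t outlen)
{
    struct sockaddr_un peer;
    socklen_t n = sizeof(peer);

    /* Zero the struct first so sun_path is terminated even if the
     * kernel writes only the address family. */
    memset(&peer, 0, sizeof(peer));
    if (getpeername(fd, (struct sockaddr *)&peer, &n) < 0)
        return -1;
    /* The precision bounds the read in case sun_path is completely full. */
    snprintf(out, outlen, "%.*s", (int)sizeof(peer.sun_path), peer.sun_path);
    return 0;
}

/* Hypothetical helper: connect a client bound to `client_path` to a
 * listener bound to `server_path`, then report each end's peer name. */
static int peer_name_demo(const char *server_path, const char *client_path,
                          char *accepted_peer, char *client_peer, size_t len)
{
    struct sockaddr_un srv, cli;
    int lfd = -1, cfd = -1, afd = -1, rc = -1;

    if (make_addr(&srv, server_path) < 0 || make_addr(&cli, client_path) < 0)
        return -1;
    unlink(server_path);            /* remove stale socket files, if any */
    unlink(client_path);

    lfd = socket(AF_UNIX, SOCK_STREAM, 0);
    cfd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (lfd < 0 || cfd < 0)
        goto out;
    if (bind(lfd, (struct sockaddr *)&srv, sizeof(srv)) < 0 || listen(lfd, 1) < 0)
        goto out;
    /* Without this bind, the accepting side sees an empty peer path. */
    if (bind(cfd, (struct sockaddr *)&cli, sizeof(cli)) < 0)
        goto out;
    if (connect(cfd, (struct sockaddr *)&srv, sizeof(srv)) < 0)
        goto out;
    afd = accept(lfd, NULL, NULL);
    if (afd < 0)
        goto out;
    if (peer_path(afd, accepted_peer, len) == 0 &&
        peer_path(cfd, client_peer, len) == 0)
        rc = 0;
out:
    if (afd >= 0) close(afd);
    if (cfd >= 0) close(cfd);
    if (lfd >= 0) close(lfd);
    unlink(server_path);
    unlink(client_path);
    return rc;
}
```

Called from a main() as peer_name_demo("/tmp/sock1", "/tmp/sock2", a, b, sizeof(a)), this should leave the client's path in `a` and the listener's path in `b` on a kernel that reports peer names for accepted unix sockets. Dropping the client-side bind() should leave `a` empty, matching the "" case described at the top of the thread.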