comm_failure for large sequences using giop between two machines
- From: Bowie Owens <bowie owens csiro au>
- To: orbit-list gnome org
- Subject: comm_failure for large sequences using giop between two machines
- Date: Thu, 12 Jun 2003 10:24:19 +1000
Hi,
I am getting COMM_FAILURE exceptions when I try to send large sequences
(large messages, say a sequence of 100,000 doubles) using GIOP from a
server to a client on separate machines. I am using ORBit 2.7.1. I have
tracked the problem as far as giop_recv_buffer_get. Everything looks
fine to begin with (ent->cnx is a valid pointer). giop_thread_self returns
a null pointer, so the program takes the non-threaded branch. However,
after a single iteration of linc_main_iteration, ent->cnx is set to
null. The loop then exits and the function returns ent->buffer, which
is also null. orbit_small_demarshal then exits with a return code that
signals an exception. The client debug trace:messages:giop output is
as follows (with a few small additions of my own to help me find where
things are going wrong):
p58551 : ([140124e60])->new_set_variable_seq (Align = 28
Marshal: id 0x1fffb730
't' : kind - 17, i 0, 'b' : kind - 17, i 1, 'len' : kind - 5, i
0xd2f)Outgoing IIOP data:
0x0000: 47 49 4f 50 01 02 01 00 70 00 00 00 XX XX XX XX | GIOP....p...****
---
0x000c: 30 b7 ff 1f 03 00 00 00 00 00 00 00 1c 00 00 00 | 0...............
0x001c: 00 00 00 00 7f 7c 87 b6 6c d1 6e d3 09 c0 6a d6 | .....|..l.n...j.
0x002c: c3 35 e3 f1 01 00 00 00 cf e4 b0 18 15 00 00 00 | .5..............
0x003c: 6e 65 77 5f 73 65 74 5f 76 61 72 69 61 62 6c 65 | new_set_variable
0x004c: 5f 73 65 71 00 00 00 00 01 00 00 00 01 00 00 00 | _seq............
0x005c: 0c 00 00 00 01 01 01 01 01 00 01 05 09 01 01 00 | ................
0x006c: 20 20 20 20 00 00 00 00 01 00 00 00 2f 0d 00 00 | ............/...
---
giop_recv_buffer_get ent->cnx 1401d9480
perform linc_main_iteration
Incoming IIOP header:
0x0000: 47 49 4f 50 01 02 01 01 64 fb 13 00 XX XX XX XX | GIOP....d...****
---
!ent->cnx
No recv buffer ...
Sys exception incomplete on id 0x1fffb730
[System exception comm failure in ORBit_small_invoke_stub] )
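The two 12-byte GIOP headers in the client trace above can be decoded by hand, which shows how large the incoming message claims to be. The following is a minimal sketch in Python, not part of ORBit: the function name decode_giop_header is hypothetical, and it assumes the standard GIOP header layout (4-byte magic, major and minor version, a flags byte whose lowest bit is set for little-endian, a message type byte, then a 4-byte message size).

```python
import struct

# Message type codes as defined for GIOP 1.2.
MESSAGE_TYPES = {
    0: "Request", 1: "Reply", 2: "CancelRequest", 3: "LocateRequest",
    4: "LocateReply", 5: "CloseConnection", 6: "MessageError", 7: "Fragment",
}

def decode_giop_header(hex_bytes):
    """Decode a 12-byte GIOP header given as a hex string (hypothetical helper)."""
    raw = bytes.fromhex(hex_bytes)
    if len(raw) != 12 or raw[:4] != b"GIOP":
        raise ValueError("not a 12-byte GIOP header")
    major, minor, flags, msg_type = raw[4], raw[5], raw[6], raw[7]
    little_endian = bool(flags & 0x01)  # bit 0 of the flags byte is the byte order
    (size,) = struct.unpack("<I" if little_endian else ">I", raw[8:12])
    return {
        "version": (major, minor),
        "little_endian": little_endian,
        "type": MESSAGE_TYPES.get(msg_type, "unknown"),
        "size": size,  # body size in bytes, excluding the 12-byte header
    }

# Header bytes copied from the client trace above.
outgoing = decode_giop_header("47 49 4f 50 01 02 01 00 70 00 00 00")
incoming = decode_giop_header("47 49 4f 50 01 02 01 01 64 fb 13 00")
print(outgoing)
print(incoming)
```

Under those assumptions, the outgoing message is a GIOP 1.2 Request with a 112-byte body, and the incoming header announces a GIOP 1.2 Reply with a body of 1,309,540 bytes (0x0013fb64). So the connection is dropped right after the client reads the header of a reply of roughly 1.3 MB.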
Some of the server debug traces:messages output:
p 9948 : ([0x816ec48])->new_set_variable_seq (0, 1, 0xd2f) =>;
seq[3375]={ very large number of object references }
Skimming the huge giop trace at the server end, it looks like the server
is sending the message properly. Any insights into what is going wrong
(or how to track it down) would be most appreciated.
--
Bowie Owens
CSIRO Mathematical & Information Sciences
phone : +61 3 9545 8055
fax : +61 3 9545 8080
mobile : 0425 729 875
email : Bowie.Owens@csiro.au