Re: DBus performance with file-data / thoughts on a future VFS



nf2 wrote:
As Gnome-VFS has switched to D-Bus - and some modules (like SMB) use D-Bus to transport file-data - I would like to raise the question whether D-Bus is a good choice for this purpose...

Here is a comparison of D-Bus with a simple IPC message protocol (vio_trans *)), sending 1 GB of data as one-way messages (4096-byte data chunks) over sockets (from client to server)...

D-Bus:
client:
16.60user 2.10system 1:16.05elapsed 24%CPU
server:
49.39user 2.93system 1:18.58elapsed 66%CPU

vio_trans:
client:
0.38user 2.08system 0:06.70elapsed 36%CPU
server:
0.52user 2.01system 0:09.76elapsed 26%CPU

--> D-Bus is more than 10 times slower. Of course D-Bus performance for byte-arrays could be improved, but the protocol overhead will remain. I guess D-Bus is just not optimal for this particular purpose...

How I tested:

client test:
$ dd if=/dev/zero bs=1000000000 count=1 | time vio_trans_streamtest client /tmp/testsock

server test:
$ time vio_trans_streamtest server /tmp/testsock | wc -c
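
Just to make the comparison concrete: the client side of such a one-way streaming test boils down to a loop like the one below (a rough sketch with a made-up length-prefixed framing - not the actual vio_trans code). It reads 4096-byte chunks from stdin and pushes each one as a one-way message into a connected UNIX socket.

/* Sketch only: read 4096-byte chunks from stdin and push them as
 * simple length-prefixed one-way messages over a UNIX socket.
 * This is NOT the vio_trans wire format, just an illustration. */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define CHUNK_SIZE 4096

static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

int main(void)
{
    struct sockaddr_un addr;
    char chunk[CHUNK_SIZE];
    ssize_t n;
    int fd;

    fd = socket(AF_UNIX, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket");
        return 1;
    }

    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, "/tmp/testsock", sizeof addr.sun_path - 1);
    if (connect(fd, (struct sockaddr *) &addr, sizeof addr) < 0) {
        perror("connect");
        return 1;
    }

    /* one-way messages: a 32-bit length header followed by the payload */
    while ((n = read(STDIN_FILENO, chunk, sizeof chunk)) > 0) {
        uint32_t len = (uint32_t) n;
        if (write_all(fd, &len, sizeof len) < 0 ||
            write_all(fd, chunk, (size_t) n) < 0)
            break;
    }
    close(fd);
    return 0;
}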

A simple IPC protocol specially designed for VFS file-operations could be pretty fast. Perhaps all protocol-handlers could run inside the VFS daemon and the clients connect via a socket. It's easier to provide an async client interface this way - you don't need threads in the client...
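
Roughly, the client library could then be as thin as this (only a sketch - the socket path and function names are made up): it hands the application a non-blocking fd to the daemon, and the application's main loop watches it with poll().

#include <fcntl.h>
#include <poll.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

#define VFS_DAEMON_SOCKET "/tmp/vfs-daemon.sock"   /* hypothetical path */

/* connect to the (hypothetical) VFS daemon and return a non-blocking fd */
int vfs_client_connect(void)
{
    struct sockaddr_un addr;
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof addr);
    addr.sun_family = AF_UNIX;
    strncpy(addr.sun_path, VFS_DAEMON_SOCKET, sizeof addr.sun_path - 1);
    if (connect(fd, (struct sockaddr *) &addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    fcntl(fd, F_SETFL, O_NONBLOCK);   /* never block the application */
    return fd;
}

/* the application (or a thin glib wrapper) simply watches the fd */
void vfs_client_wait(int fd)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };

    poll(&pfd, 1, -1);   /* readable => a response or data message arrived */
}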

Also, the VFS client library could be pretty lightweight, perhaps even without a glib dependency, and therefore more attractive for everyone to use (even KIO?). Another advantage of moving protocol handling to a daemon is that buggy protocol handlers cannot crash the client application...

In the VFS-daemon there would be one thread per client connection. Because protocol-handlers (modules) are synchronous, the client has to open multiple connections for concurrent file-operations...
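
The daemon side could be as simple as an accept loop that spawns one worker thread per connection (again just a sketch; handle_client_sync() stands for whatever drives the synchronous module calls for that connection):

#include <pthread.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <unistd.h>

/* placeholder for the code that runs the synchronous protocol-handler
 * (module) calls for one client connection */
extern void handle_client_sync(int client_fd);

static void *connection_thread(void *arg)
{
    int client_fd = *(int *) arg;

    free(arg);
    handle_client_sync(client_fd);   /* blocking module calls live here */
    close(client_fd);
    return NULL;
}

/* one thread per client connection */
void daemon_accept_loop(int listen_fd)
{
    for (;;) {
        pthread_t tid;
        int *arg;
        int fd = accept(listen_fd, NULL, NULL);

        if (fd < 0)
            continue;
        arg = malloc(sizeof *arg);
        if (!arg) {
            close(fd);
            continue;
        }
        *arg = fd;
        pthread_create(&tid, NULL, connection_thread, arg);
        pthread_detach(tid);
    }
}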

*) The vio_trans library and the tests can be downloaded from here:
http://www.scheinwelt.at/~norbertf/dadapt/files/vio_trans/


May I add another thought?

One problem with the design of Gnome-VFS - especially for the async operations, or for running modules behind an IPC or thread bridge in general - is that every data-chunk or dir-entry *) requires a context-switch.

For instance, if you read a file:

[client] <--> [VFS module]

-- open -->
<-- open response --
-- read -->
<-- data chunk --
-- read -->
<-- data chunk --
-- read -->
<-- data chunk --
...
-- close -->
<-- close response --
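
In code, the client side of this model is essentially the following loop, where every chunk costs a full blocking round trip (vfs_read_chunk() is only a placeholder for "send one read request across the bridge and wait for the reply", not a real Gnome-VFS call):

#include <sys/types.h>

#define CHUNK_SIZE 4096

/* hypothetical placeholders, not real Gnome-VFS API */
extern ssize_t vfs_read_chunk(void *handle, void *buf, size_t len);
extern void consume(const void *buf, size_t len);

void read_file_request_response(void *handle)
{
    char buf[CHUNK_SIZE];
    ssize_t n;

    /* each iteration: request --> context switch --> one data-chunk reply */
    while ((n = vfs_read_chunk(handle, buf, sizeof buf)) > 0)
        consume(buf, (size_t) n);
}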

A lot faster (I guess) would be a model where file-data chunks (or dir-entries) are just pushed through a socket or pipe as one-way messages.

[client] <--> [VFS module]

-- open (read) -->
<-- data chunk --
<-- data chunk --
<-- data chunk --
...
<-- finished --

The protocol handler will read ahead a number of data-chunks and place them as messages in the socket buffer, probably until the buffer is full. The context switch can happen at any time, and then the client will read a number of data-chunk messages...
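
On the module side that would be little more than a plain write loop - a blocking write() parks the handler as soon as the socket buffer is full, so whenever the client gets scheduled there is already a whole buffer of chunk messages waiting for it (module_read() and the length-prefixed framing are made up for the sketch):

#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define CHUNK_SIZE 4096

/* hypothetical module call that produces the next piece of file-data */
extern ssize_t module_read(void *handle, void *buf, size_t len);

static int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);   /* blocks when the socket buffer is full */
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t) n;
    }
    return 0;
}

void push_file_to_client(void *handle, int client_fd)
{
    char chunk[CHUNK_SIZE];
    uint32_t done = 0;
    ssize_t n;

    while ((n = module_read(handle, chunk, sizeof chunk)) > 0) {
        uint32_t len = (uint32_t) n;              /* simple length-prefixed framing */
        if (write_all(client_fd, &len, sizeof len) < 0 ||
            write_all(client_fd, chunk, (size_t) n) < 0)
            return;
    }
    /* end of file: a zero-length message serves as the 'finished' marker */
    write_all(client_fd, &done, sizeof done);
}

The client then only has to read messages off the socket until it sees the zero-length 'finished' marker.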

I think that's why streaming files through pipes on the command line with '|' is so fast. The context-switch can happen at any time and there is always something to do for either the receiving or the sending process (and no idle time waiting for the next context-switch).

I guess the only difficulty with such a 'one-way' message model is seek, tell and random file access.

Please correct me if I'm wrong...

Norbert

*) I know that gnome_vfs_async_load_directory () has an 'items_per_notification' arg, but that looks a bit inconvenient to me - how should the application know the optimal number of dir-entries to read at once?






