Re: [PATCH] Vastly improve GnomeVFS xfer performance



On Mon, 2006-03-13 at 12:49 +0100, Christian Neumair wrote:
> Am Montag, den 13.03.2006, 10:22 +0100 schrieb Alexander Larsson:
> > On Sun, 2006-03-12 at 16:12 +0100, Christian Neumair wrote:
> > > The attached patch vastly improves the Xfer perfomance of GnomeVFS, in
> > > particular the sftp performance.
> > > 
> > > Changes
> > > 
> > > * When copying file data, consider both the source and the target I/O
> > > block size, reading with source block-sized buffer, writing with target
> > > block-sized buffer. It could be further refined with odd I/O block sizes
> > > (cf. the FIXME comment).
> > 
> > The block size is not in any way a limit, its more like a hint to the
> > minimal size and good block size multiple to the app doing i/o, so I
> > believe just using MAX(source-size, target-size) as one buffer is as
> > good as using both sizes in more complicated ways. 
> > 
> > Also, the way you loop reading from the source in order to get at least
> > isn't ideal. If you're reading from a slow source you'd block reading
> > for quite some time before even starting to write to the target. What
> > you want to do is to request enough data
> 
> How do you determine what "enough" means? For now, it's MAX
> (source_block_size, target_block_size), so it's still sort of a lower
> boundary. Of course, because we map the whole buffer in-memory, passing
> a huge size may not be wise either, but was told that write() and read()
> should receive the max. possible size. I'm not sure how relevant all
> these considerations are for network I/O, maybe somebody knows good
> literature on this, which provides statistical evidence of buffer size
> and other design decisions under concrete scenarios, instead of mere
> thumb rules and guesses.

I commited a patch based on yours.

The ideal solution wouldn't work the way the current loop does where we
first read a large chunk, then write it. The ideal system would have
parallel reading and writing into a queue, where we limit the reads if
the buffer is full. With the simpler system we have now we get pretty
good results if you tweak the buffer sizes a bit though.

> I'd also like to replace some of the really stupid g_return_if_fail
> macros in the modules by either g_asserts, or remove them completely.,
> or enclose them in some DEBUG statement.
>
> For instance, we're initializing, clearing, reading, writing, sending
> and freeing buffers all the time in the sftp module, and all of them
> have g_return_if_fail macros. Besides, the error conditions are really
> braindead and just hint at you having misused a buffer, or at a failed
> malloc. We should rather check for NULL mallocs in the realloc/init
> function. I also noted that the _init function uses g_new0, but during
> realloc we'll have invalid bytes at the end anyway, and we always write
> to the buffer before reading.

I think g_return_if_fail macros should only be used in functions that
are called from external code, to protect users passing strange
arguments. Doing it in functions that are internally called only makes
things slow, in that case they should be asserts if we need them at all.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
 Alexander Larsson                                            Red Hat, Inc 
                   alexl redhat com    alla lysator liu se 
He's a hate-fuelled zombie cyborg possessed of the uncanny powers of an 
insect. She's a beautiful red-headed mermaid who inherited a spooky stately 
manor from her late maiden aunt. They fight crime! 




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]