Re: Notes when applying IO channel patch



> From: Owen Taylor <otaylor redhat com>
>
>
> OK, I've just finished going over your latest version of the
> IO channel code, and in fact, am about to apply the patch to CVS.
> I've done no testing at all of the changes I've made other
> than they compile, so things are most likely quite broken. But
> I wanted to get it into CVS, so we could concentrate
> on finishing up the API.

Just a few notes on your comments. I'll try to get a patch on the
API/urgent-de-break-ifying stuff ASAP.


> * Empty strings are not the same as NULL - if string->len == 0,
>   made g_io_channel_read_line g_io_channel_read_to_end()
>   return an empty string,

This is correct for g_io_channel_read_line. In general,
g_io_channel_read_to_end does not return a string
(as it can be used to read binary data with the default
encoding), so it should return NULL when 0 bytes are read.

> * Removed G_IO_CHANNEL_DEFAULT_LINE_TERM ... in my opinion,
>   files with the wrong termination on the wrong platform
>   are common enough that it is better to make a single
>   cross-platform choice and encourage people to autodetect
>   if they don't have a definite idea.

This isn't for reading, it's for writing, for example:

g_io_channel_write_chars (channel, string, -1, &bytes, NULL);
g_io_channel_write_chars (channel, G_IO_CHANNEL_DEFAULT_LINE_TERM, -1, &bytes, NULL);

will write a string and add a sensible choice for the line terminator.
I'd like to reinstate this, but if you don't feel it's appropriate
that's okay.

> * I'm not happy with the mode specification for g_io_channel_new_file().
>   I'm thinking that it would be better to either have:
>
>    g_io_channel_new_file (fname, "w+")

This is the way Toshi was doing it originally. I'll change it back.

> * What should be the handling of EPIPE: is it really an
>   error not a status? (should EOF by recycled?). Would
>   it make sense to use send() instead of write to 
>   avoid SIGPIPE? Should that be optional?

Mabye define G_IO_STATUS_ILLEGAL_SEEK? It could be defined to
be the same as G_IO_STATUS_EOF, but that seems unnecessary.

> * The set of errors in GIOChannelError seems a bit random - 
>   contains obscure errors like EPERM, ENOLCK, EDEADLK, but certainly
>   not nearly every error that could occur from open().

I think when I wrote this it was only being used by
read/write/seek, and I added it to g_io_channel_new_file() later.
I'll need to go through open an add all the errors. Are there
any particular errors you wanted removed?

> * Now that G_IO_STATUS_INTR is gone, is it needed or useful
>   to add a timed-write facility?

Could you explain more? I don't know what a timed-write facility is.

>
> * Need to make a read-only-returns-whole-characters guarantee
>   for reading for UTF-8, and validate incoming data.

What is a good way for finding the ends of wide characters
in a UTF-8 string?

> * I think we need to hide partial chars when writing from
>   the user - the total possible partial char size is only
>   known and is 5 bytes because it is UTF-8. (4 if we assume
>   that it UTF-8 encoding valid unicode characters)

If I recall correctly, g_io_channel_write_chars() only returns
G_IO_STATUS_PARTIAL_CHARS if the first character in the buffer
is partial. Otherwise, it writes as many full characters
as it can and returns normally.

> * g_io_channel_set_encoding() leaves IO channel in inconsistent
>   state on error. Would be better to do nothing if converters
>   couldn't be opened.

I don't see this. As near as I can tell, the current
implementation fails cleanly if it doesn't create one of the converters.

>
> * seek position doesn't seem to properly handle offset < 0
>   from SEEK_CUR, or seeks of less than the buffer length.

Once again, I don't see this. All that happens for SEEK_CUR is
that the offset is shifted by the length of the buffer, so
the seek is relative to the current position of the file pointer
instead of the head of the buffer.

> * Set a flag to catch reads/writes on closed channels cleanly?

This is currently being done by a call to g_io_channel_get_flags()
in g_io_channel_write_chars(). It is unnecessary for reads as
long as the buffer is flushed when the channel is closed,
as the first thing that will happen in a read is an
attempt to fill the buffer that will fail with EBADF.

If we cache the read-only channel flags, as you reccomend later,
that will also work. Those flags will need to be changed when
the user calls one of:

close()		as long as they use the API version instead of the native system
		call we're fine

shutdown()	terminate read and/or write for sockets, we probably need to
		catch this in any watch or something

> * Calling g_io_channel_seek() in g_io_channel_write_chars()
>   is rather odd.

This has to do with mixing read/write on a buffered channel.

If a write is followed by a read, the contents of the write buffer
need to be flushed to disk before the read, so that the read starts
after the last character written to the buffer.

Similarly, if a read is followed by a write, the read buffer
needs to be "flushed" so that the write starts immediately
after the last character read from the buffer. This
is done by calling seek() to reposition the file pointer
at the head of the buffer. This is also why mixing
reading and writing doesn't work for encoded channels
(which can't use SEEK_CUR).

> * There is no reason (IMO) to expand the write buffer arbitrarily
>   when writing characters - just flush out each chunk as 
>   it converted. 

With the current implementation, the buffer does not grow
arbitrarily, it is flushed whenever it is larger than
channel->buf_size. Do you mean that the buffer should be
flushed whenever is contains _any_ characters? This
would make g_io_channel_flush() sort of pointless.

>
> * If the previous write flushed the entire buffer, get_flags() is 
>   called again by g_io_channel_write_chars(), this could be
>   expensive? If someone is writing to a non-writeable buffer,
>   this is a programming error, so we dont' need to try
>   and catch it cleanly.

See comments above about checking for closed files.

> * Add recognition of unicode paragraph separator to
>   autodetect routine?

What is the byte value of this character?

> * Except for O_NONBLOCK, O_APPEND flags should be cached
>   rather than recomputed in get_flags(). The checking
>   can be avoided entirely for new_file().

See above comments about flags.

> * O_BINARY handling on win32... should it be automatic?

I don't understand the question.



I may have missed a few points, but I think everything else is
either straightforward or deals only with the internal
implementation.

Ron




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]