Notes when applying IO channel patch



OK, I've just finished going over your latest version of the
IO channel code, and in fact, am about to apply the patch to CVS.
I've done no testing at all of the changes I've made other
than they compile, so things are most likely quite broken. But
I wanted to get it into CVS, so we could concentrate
on finishing up the API.

(The win32 stuff is definitely not going to compile.)

The following are my notes about things I changed, things
I think need (or might need) changing, and things that 
still need to be looked at some more.

There are lots of notes below, and I also took a fairly
liberal hand at fixing up things to work the way I thought 
they should work, but in general, the patch was in 
extremely good shape - clean code, and carefully thought
through.

I'll mail you the diff between what I'm committing and your
last patch separately - I doubt it is of much interest
to anybody else.

Regards,
                                        Owen

Changes made
============

* I think g_string_append_len (string, NULL, 200) is rather bizarre.
  Instead I went with g_string_set_size (string, str->len + 200)
  to be consistent with GArray.

* I exposed GString->alloc, since I think it is necessary for 
  efficient use of a GString as a buffer. Otherwise, you are
  unnecessarily often using smaller buffer sizes than the 
  space you have available.

  I haven't actually fixed up the code to take advantage of this
  yet.

* There were a few general style isseus:
   'func()'     => 'func ()'.
   !strcmp(a,b)    strcmp(a,b) == 0
   Continuation lines should obey parentheses (see GNU coding standards,
    though we often _don't_ obey the coding standards about where
    to put the operator on a line break.)
   Don't need return blah; break;

* Added boolean return to g_io_channel_set_encoding for convenience

  if (!g_io_channel_set_encoding (channel, "ISO-8859-1"))j
    {
       /* Handle error */
    }

* Empty strings are not the same as NULL - if string->len == 0,
  made g_io_channel_read_line g_io_channel_read_to_end()
  return an empty string,

* Replace g_memmove with memcopy for non-overlapping buffers.

* lseek can't return EINTR, EGAIN, neither can fcntl, fstat.

* G_IO_FILE_MODE_APPEND should be "a", not "a+"

* Change GChannelError to GIOChannelError and GChannelStatus 
  to GIOStatus

* Removed G_IO_STATUS_BAD_INPUT, and simply used G_IO_STATUS_ERROR
  The reason here is that it never would make sense for a 
  program to check for G_IO_STATUS_BAD_INPUT; G_IO_STATUS_ERROR
  may cause a segfault when the caller tries to check for
  error->message, but that is OK - nothing is supposed to
  work after a g_return_if_fail().

* Made g_io_channel_get_line_term() g_io_channel_get_encoding()
  be G_CONST_RETURN. If we ever want to worry about thread 
  safety, with changes from multiple threads, we can
  use quarks internally and return g_quark_to_string(value) - 
  the possible set of strings is limited.

* For a couple of reasons, get_line() functions need to return
  the line separator in the line returned:

   * Consistency with standard functions such as fgets(),
    
   * So you can read and write a file and keep it the same
   
  In order to accomodate this, I removed the neat
  dos_mac_line_term_check hack and simply made it block 
  until a complete line was received. As a consequence of this,
  you must not use autodetection when using GIOChannel in
  a line-oriented protocol. (But such a protocol should be 
  specifying the terminator anyways.)

* To make it a bit more convenient to handle the case where
  you don't want the separator, (especially when doing   
  autodetection), added an extra out parameter to 
  g_io_channel_read_line[_string]: 'gsize *terminator_pos'
  which (if non-%NULL) is a location to store the position
  of the terminator. Setting result[terminator_pos] = '\0'
  strips off the terminator.

* Removed return value and GError * for g_io_channel_get_flags() -
  there is absolutely nothing a program could do useful if 
  this fails, so implementations should make sure that it never
  fails.

* Changed guint => gsize, offset for seek to long for 64 bit
  cleanliness throughout.

* Allowed g_io_channel_write_chars() to take a negative length
  to indicate that the buffer is nul-terminated.

* Removed G_IO_STATUS_INTR and simply made the system automatically
  retry. GIOChannel is supposed to be an abstraction, and
  EINTR is a ugly detail of Unix.

* Added error return and code to ->io_close virtual function,
  since the syscall close is allowed to fail and checking
  that can be interesting. Deprecated g_io_close() and
  replaced with g_io_shutdown() that can report errorrs,
  and also has a 'flush' parameter so the caller can
  close a file descriptor such as a socket without waiting
  for it to flush.

* Reordered GIOStatus enumeration so that for functions
  where you only care ERROR/NOT, you can write 
  if (!blah()) to check for error. (Just a hack for
  C convenience)

* Removed G_IO_CHANNEL_DEFAULT_LINE_TERM ... in my opinion,
  files with the wrong termination on the wrong platform
  are common enough that it is better to make a single
  cross-platform choice and encourage people to autodetect
  if they don't have a definite idea.

* Mask out non-set-flags in g_io_channel_set_flags() rather
  than g_return_if_failing - this allows 
  set_flags (get_flags | foo).


More or less API Issues
=======================

* I'm not happy with the mode specification for g_io_channel_new_file().
  I'm thinking that it would be better to either have:

   g_io_channel_new_file (fname, "
w+")

  or:

   g_io_channel_new_file (fname, G_IO_FILE_MODE_READWRITE,
                          G_IO_FILE_CREATE | G_IO_FILE_TRUNCATE);

   The latter is clearer unless you are familiar with fopen's 
   oddities, but the need to specify CREATE | TRUNCATE for "w"
   makes it rather verbose.

* Mode handling should either be:
   - Exactly (name, string) same as fopen
   - In the open model with or'red flags

* Should functions like g_io_channel_set_flags(),
  g_io_channel_seek(), g_io_channel_set_encoding(), 
  g_io_channel_close(), which can only return NORMAL/ERROR
  really return a boolean with %TRUE for success?
  (Change to make ERROR numerically 0 probably good 
  enough here)

* What should be the handling of EPIPE: is it really an
  error not a status? (should EOF by recycled?). Would
  it make sense to use send() instead of write to 
  avoid SIGPIPE? Should that be optional?

* The set of errors in GIOChannelError seems a bit random - 
  contains obscure errors like EPERM, ENOLCK, EDEADLK, but certainly
  not nearly every error that could occur from open().

* Now that G_IO_STATUS_INTR is gone, is it needed or useful
  to add a timed-write facility?

* Need to make a read-only-returns-whole-characters guarantee
  for reading for UTF-8, and validate incoming data.

* I think we need to hide partial chars when writing from
  the user - the total possible partial char size is only
  known and is 5 bytes because it is UTF-8. (4 if we assume
  that it UTF-8 encoding valid unicode characters)


Implementatation issues still to fix
====================================

* g_io_channel_set_encoding() leaves IO channel in inconsistent
  state on error. Would be better to do nothing if converters
  couldn't be opened.

* g_io_channel_set_encoding() should unset channel->read_cd,
  channel->write_cd when not doing conversion afterwards.
  (Probably to (GIconv)-1.

* seek position doesn't seem to properly handle offset < 0
  from SEEK_CUR, or seeks of less than the buffer length.

* Needs some comments for all the flags in the GIOChannel
  structure - they are a bit confusing.

* There is no reason ever to make outbytes_left smaller
  than the available space in the output buffer when
  calling iconv(). Also, might want to use a "low water
  mark" of buf_size / 2 to avoid expanding the buffer
  by small amounts. Exposing string->alloc should help
  here.

* g_io_channel_get_buffer_condition should not return
  G_IO_IN if there isn't at least a whole character.
  G_IO_OUT is completely wrong - should be something
  like channel->write_buf->len < channel->buf_size.

  Question - how do you detect EOF on a buffered channel?

* g_io_unix_get_flags() doesn't look close to right for
  anything but files.


Possible implementation issues
============================

* Almost certainly want to lazily allocate read and write buffers.

* Should lazily open read and write converters? (Error 
  reporting a bit of a problem - you could just always open
  _one_ (say the read converter)

* Set a flag to catch reads/writes on closed channels cleanly?

* g_io_channel_read_line() should probably try to keep the buffer
  returned as small as possible

* Need to use open(), not fopen() for g_io_channel_new_file(),
  I think.

* Calling g_io_channel_seek() in g_io_channel_write_chars()
  is rather odd.

* There is no reason (IMO) to expand the write buffer arbitrarily
  when writing characters - just flush out each chunk as 
  it converted. 

* If the previous write flushed the entire buffer, get_flags() is 
  called again by g_io_channel_write_chars(), this could be
  expensive? If someone is writing to a non-writeable buffer,
  this is a programming error, so we dont' need to try
  and catch it cleanly.

* g_io_channel_read_line_string() should use the supplied
  buffer possible to avoid excessive copying around. 
  If this is too hard or doesn't make sense, then 
  g_io_channel_read_line_string() and g_io_channel_read_line()
  should share a common backend to avoid having to create
  an extraneous GString.

* g_io_channel_read_line_string() should not use strchr()
  twice - should have a custom routine that loops manually.

* Add recognition of unicode paragraph separator to
  autodetect routine?

* Except for O_NONBLOCK, O_APPEND flags should be cached
  rather than recomputed in get_flags(). The checking
  can be avoided entirely for new_file().

* O_BINARY handling on win32... should it be automatic?


Things I haven't really fully reviewed yet
==========================================

* Interaction of all the code with non-blocking streams

* Interaction of the code with IO watches - what would 
  you do for instance, if you wanted to set up 
  a calllback to be called on each incoming line 
  as it arrived.

* Handling recovery of some sort on encoding failure
  in the middle of a text file.




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]