Re[3]: [PATCH] edit.c, init_dynamic_edit_buffers (was Re[2]: CR/LF translation)

From: Pavel Tsekov <ptsekov gmx net>
To: Pavel Roskin <proski gnu org>
Cc: mc-devel gnome org
Subject: Re[3]: [PATCH] edit.c, init_dynamic_edit_buffers (was Re[2]: CR/LF translation)
Date: Fri, 23 Aug 2002 11:43:40 +0200
Hello, Pavel! :)

Friday, August 23, 2002, 2:50:13 AM, you wrote:

>> Should be fixed now.

PR> Yes, everything compiles now.  It even work for me.

Wow! :)

PR> As we discussed before, I would prever the editor not to rely on data from 
PR> stat(), but read the file until EOF.  Please consider if it can be part of 
PR> your patch, or it's better to fix separately.

I missed that point, sorry! :) But read below (your comment on
mc_read) and see why we can't replace it easily.

PR>     int crlf_maybe = FALSE;    /* See below. */

PR> That's hard to read.  I would prefer an enum with 3 values: CRLF_TRUE, 
PR> CRLF_FALSE and CRLF_MAYBE.  gdb understands enums, so it will be easier to 
PR> debug.

Ok, you right. But I think it is possible that you missed the point of
that flag. It is used only in a specific situation. When there is an
'\r' right at the end of the input buffer. So I set the flag if this
is the case and a check is made when the next block is read.

But.. good that you remind me of that one because it is not handled
properly in the current patch - it should read like this:

        /* Handle a case where a '\r' can be found at the last byte in
           the previous buffer and the '\n' in the first byte of the
           current buffer. */
        if (crlf_maybe == TRUE) { <- This line is wrong, should be..
        if (crlf_maybe == TRUE && *curp == '\n')
            *(putp - 1) = '\n';
            curp++;
            crlf_maybe = FALSE;
        }

PR>     size_t nblkout = 0;                /* The number of output blockd 

PR> I think you mean "blocks".

Obviously :) I stand corrected. I try to correct all my typings as
I note them.

PR>                        /* We can use memcpy () here. */ 
PR>                        memcpy (putp, getp, nbytes);

PR> The comment is unclear.  Actually, we should use memcpy() instead of 
PR> strcpy(), because we want to support binary zeroes.

Yes - it is not informative. The meaning is " We can use it instead of
memmove ()". Some comments I put so they remind me of something - I'll
be stripping such so that it doesn't puzzle other readers.


PR>                        /* Ugly! */ 
PR>                        size_t nbytes_sav = nbytes;

Again - a comment for me.

PR>            if (mc_read (file, (char *) edit->buffers2[j], nread) !=
PR>                        nread) { 
PR>                /* TODO: Add some message here - I dunno how :( */ 

PR> mc_read() is usually read(), i.e. low-level read().  Unlike fread() which 
PR> is buffered in libc, it's common for read() to return less bytes than 
PR> requested.  From info libc:

[snip]

You don't think that I'll provide a patch if I don't know that, do you ?
:)

Ok let me explain - there is a reason. MC editor splits the file in
multiple blocks each of size EDIT_BUF_SIZE and from what I get it wont
work properly if you have a block with less data in it - except if it
is the first block. Here is how it loads the blocks:

1. Get the number of full blocks (EDIT_BUF_SIZE) this file can be
separated in (size_of_file / EDIT_BUF_SIZE).
2. Get the remainder (size_of_file % EDIT_BUF_SIZE)
3. Load 'remainder' bytes from the beginning of the file into the
edit->buffer2 array (the offset in the array is index size_of_file / EDIT_BUF_SIZE)
4. Load the remaining data which is now multiple of EDIT_BUF_SIZE into
block with indices from 0 to size_of_file / EDIT_BUF_SIZE

I hope this help. It explains many things - i think. I stated that
this patch tries to work with the current code without interfering.
It is so 'thick' because it must think of various things.

PR> Looking at the size of the new code, I start wondering if it's really
PR> worth the trouble to write all that just to skip CR characters from input.

Well, again - it is big because it has to adapt to the code :) It can
be made smaller though with some rearranging of the code which may
require using 'goto' constructs. And some lines of code may be
stripped - though right now I only one place comes to mind ... there is
a "optimisation" for multiple \r\n following one after the other. This
can be scratched saving some lines. Some comment can easily go also
and it'll become smaller.

Hey thats why I wanted to see it before I go further.

PR> Do you expect any other positive effects from your patch other than CRLF
PR> suppport?

What other effects should there be ? :) I decided to to this based on
the original discussion we're following. The original patch from
Alexander has it own code which does CRLF->LF translation on the
input, but it was not acceptable. I thought this code can help pieces
of his code to get in.

>> Btw there are some unpleasent bugs in the editor - and I dont really
>> think that it can open gigabyte files. Actually it is limited to 65Mb
>> from what I can see.

PR> That's right.  But I think it should be possible to make it work with
PR> large files, even with files over 4G on 32-bit systems.  That's because we
PR> have two-level addressing - buffer number and byte number in the buffer.

Ok - you were refering to the technology used. But is sounded like it
works right now though :)

Thanks! :)

Pavel Tsekov
References:
- Re[2]: [PATCH] edit.c, init_dynamic_edit_buffers (was Re[2]: CR/LF translation)
  - From: Pavel Roskin
[Date Prev][Date Next] [Thread Prev][Thread Next] [Thread Index] [Date Index] [Author Index]