Re: iterator resolution




> 
> Hi,
> 
> For people that care this is what I ended up doing.

Is this checked in yet?  I was looking it over but couldn't
find where stuff was fully defined.  Also I could not build it
because the makefile was depending on GNOME being installed which
I don't have.

[...]
> So I did the following with iterators:
> 
>  - iterators are static objects you can place on the stack; 
>    no memory management

Yeah!!
 
>  - iterators become invalid if any indexable segments are added 
>    or removed (i.e. if the number of valid indexes changes). 
>    A magic stamp integer as described by Karl is incremented on 
>    indexable segment modification; this is used to reliably 
>    detect invalid iterators. Invalid iterators should generate
>    a g_warning() 100% of the time, they will never simply segfault.

That sounds quite cool.

>  - This means that non-indexable segments like tags and marks can be
>    applied/removed at will without breaking your iterators.
 
>  - the iterator uses a second magic stamp integer to cache a pointer
>    to the segment being indexed; this avoids a lot of linear searching
>    into lines and makes the iterators quite fast. I also have the 
>    iterators caching some other stuff right now, such as their 
>    character index into the btree; we can tune exactly what gets
>    cached and what gets calculated as we get more empirical use
>    data.
> 
>    Anyway, the upshot is that operator++ and operator* should both
>    be O(1) operations. operator== continues to be O(1). operator-- 
>    is perhaps a bit sluggish since everything is singly-linked, 
>    but I don't think it's worth the memory overhead to make it 
>    faster. It is O(1) as long as you are moving back within a 
>    single character segment, it just requires a search from the
>    front of the line when you need to go to the previous segment.
 
operator -- being slow is not that big of deal.  I suspect line
up and line down are equally slow.
 
>  - iterator objects are kind of large, something like 40 bytes;
>    I don't think this matters, usually you only have 1 or 2 of them
>    at a time.

Likely some memory can be shaved off of that, but I will
have to see it first.
 
>  - Another suggestion from Karl: froo_tkxt_buffer_insert(),
>    froo_tkxt_buffer_delete(), and other functions when appropriate
>    will have the convenience feature that they re-initialize
>    the iterators they have just invalidated.
> 
>    That is, say you do this:
> 
>     FrooTkxtIter iter;
>     froo_tkxt_buffer_get_iter_at_char(buffer, &iter, 0);
>     froo_tkxt_buffer_insert(buffer, &iter, "blah", -1);
>     
>    The insertion operation invalidates iter because the magic stamp
>    was incremented when you changed some indexable segments. However,
>    the insert function resets iter so it's still valid (other 
>    outstanding iterators have all become invalid, though). 
>    
>    Basically this allows you to insert a bunch of stuff at a single
>    point:
>      froo_tkxt_buffer_insert(buffer, &iter, "blah", -1);
>      froo_tkxt_buffer_insert(buffer, &iter, "blah blah", -1);
>      froo_tkxt_buffer_insert(buffer, &iter, "blah blah blah", -1);
>   
>    It's very easy to efficiently re-init the iterator after the 
>    insert, so there's no real cost to this feature.
> 
>    An outstanding issue is what we re-init the iterator to point to
>    (start or end of inserted text, end seems like it might be more
>    useful but start seems more intuitive).

end actually makes a lot more sense.  And iterator is just
like a cursor so where does one expect a cursor after you add
an item?  Intuitively it should be at the end.  If you want 
to keep in a spot, just add a mark and use it to return to that
spot.

  FrooTkxtIndex iter;
  FrooTkxtMark  *mark;
  ...
   
  mark=froo_tkxt_mark_new_index(&iter);
  for (i=0; i<10; i++)
    {
      froo_tkxt_insert_text(&iter, "hello");
      froo_tkxt_index_at_mark(&iter,mark);
    }
  froo_tkxt_mark_unref(mark);


>  - I am going to make marks easier to use, specifically by allowing
>    unnamed anonymous marks (right now the Tcl legacy requires you 
>    to refer to marks by a string name). Thus there are at least 
>    three ways to keep a text position that persists across buffer
>    mutations:
>      - setting a mark
>      - keep the character index (like GtkText)
>      - keep the line number and character offset into the line
>  
>    There are easy functions to convert iterators to/from any of these
>    representations. So if you need a persistent position reference
>    it should never be a big problem.

That sounds good.  You should be able to create a named mark from
a iterator as well as an anonymous mark.

>  - The iterator does contain a pointer to the buffer, and you can 
>    froo_tkxt_iter_get_buffer(). Right now I still have some functions
>    that are buffer methods like this:
> 
>      froo_tkxt_buffer_insert(buffer, &iter, "blah");
> 
>    These could be:
>     
>      froo_tkxt_iter_insert(&iter, "blah");
> 
>    I consider this irrelevant for language bindings (language binders
>    could easily wrap things differently, whichever way we choose to 
>    do it), but it is an outstanding C API issue.

It isn't relevant to the bindings, but isn't it cumbersome for
C.   After all even in GList where the list functions as the
iterator, passing the list separately is not required.

After all doing things like
  mark=froo_tkxt_index_mark_new(iter);

works just fine without the buffer.  

>    The iter_insert() is shorter, but perhaps confusing, since it 
>    will result in a signal being emitted on the buffer, and is 
>    basically a buffer operation.

What is the value added for requiring the buffer?

--Karl



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]