Re: New tile unref logic and various leak fixes



HI Pierre-Luc,

On Fri, Mar 19, 2010 at 16:54, Pierre-Luc Beaudoin
<pierre-luc pierlux com> wrote:
> Hi Jiř,
>
> Here are my comments about your changes on the unref branch.
>
> Please reinstate view->priv instead of GET_PRIVATE in
> aa04b92d25592044c31775f16027f7c10349fa7e. Accessing a pointer is faster
> than the G_TYPE_INSTANCE_GET_PRIVATE macro.  The priv pointer cannot be
> read outside of champlain-view.c because the definition of the struct is
> undefied outsite it (see the typedefs).
> Inheriting from ClutterGroup is not a good solution: the Clutter devs
> have said we should rather implement a ClutterActor that implements the
> ClutterContainer interface that does the layouting.
> The rest of the changes in this commit are good.
>

Yes, I agree the complete removal of priv was unnecessary and I've
fixed that. I have also removed the last of the "public private"
members (in ChamplainBaseMarker), which were originally the main
reason for the change (it's evil!).

Concerning ClutterGroup inheritance I agree in principle - when we
inherit from it, we inherit also all the container methods, which,
however, shouldn't be used by the user. On the other hand, it's quite
a lot of work to do and I'd postpone the correct implementation for
0.8. And in fact, I haven't introduced this to libchamplain - it was
already there - the following classes were derived from ChamplainGroup
before my changes:

ChamplainBaseMarker
ChamplainView
ChamplainLayer

So I'll change my classes after you fix your classes ;-)

> in 7cdde727495633976ffd210c45903ab5f9aeb7c2, the coding style we use
> (Telepathy's) define headers this way:
> +static void
> +fade_in_completed (ClutterAnimation *animation,
> +    ChamplainTile *self)
> This may be odd, but that's this way for now every where in the code :)
>

You know how to bore someone to death - you are responsible for the
most boring hour in my life fixing that.

> In 2efa6379de623c173d825bc404e92b4ceaac1c58, if you are removeing the
> {}, you need to move the var declaration at the top of the function :) I
> lazily accepted the commit as is because it had the curly brackets :)
>

Done.

> In 98ab7c94afd0edcc854102d75db57e2d0cdacf43: cool addition! :D  It'd be
> even nicer with a go_to ;-)

Not really, I have tried it. the reason is that the speed of go_to()
depends on the distance where you go - and with gps, the distance on
the map is usually small, so the speed is low as well. This means that
the point starts moving from the center until it gets far enough to
reach equilibrium so the speed equals the speed of the point - that's
not very nice though. We'd need something like go_to_with_speed() to
make it look good. For GPS navigation I also wanted to avoid problems
when you don't have signal for a while so the point gets stuck on one
place and then, when you get the signal again 100km far from the place
where you lost the signal with go_to _all_ the tiles in between would
have to be filled - a bit problematic.

>
> Regarding the comment in ad4686924e08d024c0208652706fa38e176867a5:
> viewport_pos_changed_cb has a view_load_visible_tiles, it should
> download the tiles.  But I like your change anyway, it's cleaner.

I tested it and the original version didn't work - but maybe it was
due to some changes I made in between.

>
> In 0f32705faafff12c439c5b4bd3eb83504ac6e98d, this condition was meant to
> save a lot of looping.  This callback is called really often with the
> same values.  I think it should stay.

Ah, no, you missed my point (or I didn't explain it clearly enough).
Look at the original code:

  if (x == priv->viewport_size.x &&
      y == priv->viewport_size.y)
      return;

  if (fabs (x - priv->viewport_size.x) < 100 &&
      fabs (y - priv->viewport_size.y) < 100)
      return;

Do you see it? The second condition catches also the case when x ==
priv->viewport_size.x because 0 < 100. I just removed the unnecessary
condition that was handled by the more general condition below.

(by the way, the alignment of return should be only 2 spaces here ;-)

>
> In 2942103542c53a93ac0351183cb93250f81ce05c, these calls will make
> features like "show the currently cached tiles on zoom out" hard to
> do :)

Not only these, I have more comments below.

>
> In 26fa47ea332c5e3261671778f2d860eb0c6eca80, I believe a more static
> array for tile_map would save a lot of CPU.  It could be resized when
> needed, but not reallocated on every run of that function, which is
> quite often.

Well, I don't think it will make a noticeable difference. If you
consider that you do this allocation once and inside the loop you have
to dynamically create multiple tiles, for each of the loaded tiles you
have to create (quite a huge) buffer into which the data is loaded
either from hard disk or network (which adds noticeable slowdown by
itself) and then pass the buffer into the graphics pipeline, the
allocation of small amount of memory is totally insignificant compared
to this and the optimisation wouldn't bring any speed up.

>
> In 1bd883f7c19432f59818e1b63927f03409e0030b, it used to be in a idle
> callback until the animation-marker demo was created.  The animations
> seems to eat the idle's time and tiles were sometimes not loaded.  Did
> you experience that?

I experienced that clutter operates somewhere between NORMAL and HIGH
priority. When I set the idle function priority to normal, the tiles
were loaded only after the panning was completed. But after setting it
to high, I haven't experienced this behaviour any more.

>
> In 9ef262a7c43e47cc8d9b66435e57762bfc9770f1, it is indeed a wtf
> moment... It seems that TidyViewport returns invalid values when we
> update the Adjustments.  Could be worth investigating. But your later
> commits that block the callback might be a solution, did you retry to
> remove that fix again after?

I think the reason is quite clear now ( = get prepared for a
complicated explanation):

1. You call champlain_view_set_zoom_level()
2. champlain_view_set_zoom_level() calls resize_viewport()
3. resize_viewport() calls

  g_object_set (hadjust, "lower", lower, "upper", upper,
      "page-size", 1.0, "step-increment", 1.0, "elastic", TRUE, NULL);

with updated values.
4. This causes that the viewport needs to be updated and because we
are connected to the signal, viewport_pos_changed_cb() is executed.
5. viewport_pos_changed_cb() executes update_viewport()
6. In this case, viewport relocation is needed so the if-condition is
satisfied and the contents of the block inside executed. Before my
change we weren't blocking the signal here and there was return at the
end of the if-block. Based on the relocation difference you update the
viewport origin.
7. Updating viewport origin causes that viewport_pos_changed_cb() is
invoked again. The viewport origin is set correctly but you forgot to
update priv->viewport_size.x and priv->viewport_size.y based on the
new anchor so these two still have the old values.
8. update_viewport() is called again, the if-block is not executed
this time. The latitude and longitude are updated using
viewport_get_current_longitude() and viewport_get_current_latitude().
Look at their implementation:

static gdouble
viewport_get_current_longitude (ChamplainViewPrivate *priv)
{
  return viewport_get_longitude_at (priv, priv->anchor.x +
      priv->viewport_size.x + priv->viewport_size.width / 2.0);
}

Clear now, isn't it? You are combining the updated anchor with the old
priv->viewport_size.x and the result is crap.

9. After all this is finished (update_viewport() terminates and
update_viewport() that called it terminates too), resize_viewport()
can resume and after it terminates we have the wrong values of
latitude and longitude.

OK, we could fix it by setting riv->viewport_size.x and
priv->viewport_size.y inside the if-block in update_viewport() - but
do we want these crazy "invisible" calls? I think that disabling the
signal makes it much easier to reason about the code - and speeds up
things too.

>
> All in all, those a huge improvements! And the widget really feels
> faster and more responsive. I've looked into fixing those memory fixes
> many many times, throwing hours at it but I was locked in my mental
> model of the problem. I am still experiencing a memory leak, but it
> seems to be much more contained than before (and as previously thought,
> it could be a leak in my 3d driver). You've also neatly simplified the
> code! :)
>

Yes, what I was trying was to demystify the code. I think even I can
understand it a little now :-). For 0.8 I'd like to (at least partly)
hide the anchor and relocation thing into a class (I was thinking
about inheriting from TidyViewport and putting it there). I've also
tried avoiding it completely since clutter now uses floats for tile
coordinates. No luck. At higher zoom levels gaps start appearing
between tiles. After thinking about it, it's actually not much
surprising. For coordinates we need at zoom level 18 with tile size
256 values in 2^18 * 2^8 = 2^26, but float has only 23 bits for
significant values. Bad luck here :-(.

By the way, do you think it would be possible to put some traces into
ChamplainView? What I did was that I put printf("%s", __FUNCTION__);
at the beginning of every function there - it's very helpful for
debugging - especially with signals and problems like the complicated
one above. I did this about five times now - inserting this trace for
every function and then removing for the final commit - a little
boring. Having some macro permanently present in every function would
be nice.

I still wonder about the memory problem. One thing is clear - it's not
because the tile isn't deallocated. I've added one more optimization
(based on oprofile output), where view_update_state() consumed too
much time because it had to go through all the tiles every time tile
changed its state. I'm now just keeping a counter of tiles that are
being loaded and when their number drops to 0, the state changes to
DONE. I do this by connecting to the tile's destroy signal so as long
as the state indication works, we can be sure that all tiles are
deallocated - quite a good leak test as well :-).

If we are not leaking tiles, we might still be leaking something else.
I'll make some simple test program that checks whether these leaks
appear only when using clutter textures. Personally I suspect that
this:

https://bugs.freedesktop.org/show_bug.cgi?id=23530

didn't get to Karmic, which I use. Will try after Jaunty is out. What
graphics card and linux distribution do you use?

> My only complain about the widget now is something that I introduced in
> 0.4.3: before that, there never used to be a "black out" between zoom
> levels.  It'd be nice if the previously visible tiles were still visible
> while the new ones are loaded.  But that can be fixed after we merged
> all this.
>

This one is quite simple (I promise, it'll be short now). You did use
a kind of double buffer before that worked about this way:

new_screen <- view_load_visible_tiles()
swap new and old screen

This worked when filling tiles was synchronous - after calling
view_load_visible_tiles() you could be sure that all the tiles are
loaded and swapping the old screen with the new screen did what you
expected. With asynchronous tile loading it's different. When
view_load_visible_tiles() finishes, _no_ tiles have been loaded yet so
new_screen is empty when you swap the screens and you get the "black
out". I don't see any simple solution how it fix it with asynchronous
tile loading (which speeds up things a lot, so I wouldn't sacrifice it
in order to get the original behaviour). Well, actually I do. Use
white (or other light) background instead of black background like
launcher demo does. Instead of "black out" you get "white out", which
is much less disturbing because the maps are light too. Is it possible
to change the background colour in Champlain-gtk? Would be nice to
have.

> Thanks for your work.

No problem, hope it helps :-).

Jiri

>
> Pierre-Luc
>


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]