Re: News on the text rendering front



On Fri, 2006-10-13 at 19:50 +0200, Hans Breuer wrote:
Hi Lars,
sorry for the late answer. There were constantly other things dragging
away my time, preventing me from looking closely enough into Dia's weakest area ;)

I know exactly how it goes.  I really can't blame you on that front, as
I'm currently going through my "must answer" mails -- from two years
ago!

On 05.09.2006 20:34, Lars Clausen wrote:
The last couple of weeks, I have been working on secret text rendering
stuff, trying out various things I had hoped would work to improve the
quality and speed.  To that end, I have introduced a new object, called
TextLine, which comprises a single line of text with its font and height
and possibly some cached information.

This is an interesting idea but at the moment the TextLine objects seem to
be too short-lived to have much positive impact. In fact they only seem to
be created within the draw_text methods, so any performance boost through
caching seems moot, when looking at the code. [In fact there is a huge
speed improvement, but I don't completely understand why that is.]

It's from not having to make a huge number of PangoLayout objects in
order to scale it correctly.  Limiting that to just two (could be one
with more work) helps a lot.

The main reason behind the TextLine object is that a text should
*always* have its width available.  Without this, we cannot hope to make
the various renderers agree on the width -- we're having enough trouble
with the Pango renderers, the exporters to other formats have had no
chance.  So I took the minimum amount of information necessary to
determine the width and bundled it up in the TextLine, storing text
width information within.
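A minimal sketch of that idea, for illustration only and not the actual TextLine code: the struct bundles the string, the height and a stand-in for the font, and caches the width so that it is always available. All names here (TextLineSketch, MeasureFunc, fixed_pitch_measure) are hypothetical, and the function pointer merely takes the place of a real font.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a font: returns the width of text at height. */
typedef double (*MeasureFunc)(const char *text, double height);

typedef struct {
    char       *text;         /* the single line of text */
    double      height;       /* font height */
    MeasureFunc measure;      /* stand-in for the font */
    double      cached_width; /* meaningful only when width_valid is set */
    int         width_valid;
} TextLineSketch;

static char *copy_string(const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = malloc(n);
    assert(copy != NULL);
    memcpy(copy, s, n);
    return copy;
}

TextLineSketch *text_line_sketch_new(const char *text, double height,
                                     MeasureFunc measure)
{
    TextLineSketch *line = malloc(sizeof *line);
    assert(line != NULL && measure != NULL);
    line->text = copy_string(text);
    line->height = height;
    line->measure = measure;
    line->cached_width = 0.0;
    line->width_valid = 0;
    return line;
}

void text_line_sketch_set_string(TextLineSketch *line, const char *text)
{
    char *copy = copy_string(text);  /* copy first, in case text aliases */
    free(line->text);
    line->text = copy;
    line->width_valid = 0;           /* any change invalidates the cache */
}

double text_line_sketch_get_width(TextLineSketch *line)
{
    if (!line->width_valid) {
        /* Measure once; later calls reuse the cached value. */
        line->cached_width = line->measure(line->text, line->height);
        line->width_valid = 1;
    }
    return line->cached_width;
}

void text_line_sketch_free(TextLineSketch *line)
{
    free(line->text);
    free(line);
}

/* Toy measurer for demonstration: every character is height/2 wide. */
int measure_calls = 0;

double fixed_pitch_measure(const char *text, double height)
{
    measure_calls++;
    return (double) strlen(text) * height * 0.5;
}
```

Asking for the width twice measures only once; changing the string forces one new measurement. A renderer or exporter that reads the width from such an object gets the same number as every other renderer.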

The TextLine is not to be another layer, it is supposed to entirely
replace the ability to render strings directly.  This can be done either
by putting in (as yet undefined) TextLine property definitions, by using
the current definitions but storing the values from the properties into
the TextLine object, or by having the separate values on the object and
keeping a TextLine object updated on the side.  The first solution is the
best, the last one should merely be a stop-gap measure.

That the TextLine object allowed me to find a way to improve the
rendering speed was a lucky side-effect.  What it did for me first was
isolate the rendering issues enough that it was easy to experiment -- I
tried using a matrix, and was ready to try scaling a bitmap as the next
idea.

As it currently stands, the TextLine object is, as you say, only used
from within draw_string.  I have some not-quite-reliable UML code that
uses it directly (with the third solution above), but it still needs to
handle line wrapping correctly.  On the renderer side, it is implemented in the
Freetype-based GDK/libart renderers, in the PostScript renderers, and in
the C-based SVG renderer.

Also your implementation heavily depends on direct Freetype usage which is
a problem for portability. [Although Freetype is available for win32, Dia
does not depend on it, but instead transparently uses the win32 native font
backend through Gtk+ and Pango (or Pango/cairo since gtk+-2.8)]

Not true.  It depends on looking into the PangoLayout objects, but not
at anything Freetype-specific.  See app/diapsrenderer.c for an example.

So instead of poking into the internals of the various backends I've asked
google for help about the root problem, switching off font hinting:
http://www.google.com/search?q=win32+font+hinting

The first match is a post [1] from Owen Taylor of Gtk+, Pango and cairo fame.

Owen's answer to specific rendering needs for Pango is PangoRenderer [2,3],
introduced with Pango 1.8, which is a requirement of Gtk+-2.6, so
acceptable as Dia's minimal version. PangoRenderer allows setting a matrix to
adapt the rendering, which also seems to be the only way to switch off
hinting in win32 [1].

I did experiment with that, but not successfully.

Just completely turning off dia_font_scaled_build_layout() and instead using
an appropriate pango_matrix_scale() gave very encouraging results, Dia's
strings were perfectly matching the width of the box - regardless of the
zoom - for the first time in Dia's history.

The remaining issues were:
1) if you use a large scale factor the missing hinting is very visible,
   to the extent that single glyphs are not drawn at all, see:
   http://hans.breuer.org/dia/text-box-1282-1.png
   But with a combination of scaled font and font matrix it got much better.

That is rather hefty mishinting, I'll say.  With a combination of
scaling and matrix, won't we get the text width issues again?

2) the alignment adjustment is somewhat wrong, probably a combination
   from still using dia_font_get_scaled_string_width() to offset but
   already somewhat adjusting the offset for Pango with the matrix
   http://hans.breuer.org/dia/text-box-1282-1.png

Absolutely.  Using TextLine's width info will help.

3) the cursor position is wrong when zoomed. This is not new with the
   new approach, but should be fixed anyway.

That is something to look at once we've settled on one solution.

If this matrix/scale system works better than adjusting the glyphs, I'm
all for using that.  It can easily replace the current TextLine
implementations.

This allowed me to fiddle enough with rendering to finally find what
appears to be the root cause of the font rendering problems:  Pango
seems to round the width of glyphs in its layout to pixels.  Thus
rendering the same text at different font sizes is almost guaranteed to
give different relative widths. Our current kludge of trying to find a
proper height was not only very slow, potentially creating many layouts
per rendering, but also could not possibly work in all cases.
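To illustrate the rounding effect with plain arithmetic (a toy model, not Pango code): the sketch below treats glyph advances as fractions of the font size and rounds each one to a whole pixel. The function names and the advance values are assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Round a non-negative value to the nearest whole pixel. */
static double round_to_pixel(double x)
{
    return (double) (long) (x + 0.5);
}

/* advances[] holds hypothetical glyph advances as fractions of the font
 * size; each glyph is rounded to pixels separately before summing. */
double rounded_text_width(const double *advances, size_t n, double font_px)
{
    double total = 0.0;
    assert(font_px > 0.0);
    for (size_t i = 0; i < n; i++)
        total += round_to_pixel(advances[i] * font_px);
    return total;
}

/* Text width expressed as a multiple of the font size. */
double relative_width(const double *advances, size_t n, double font_px)
{
    return rounded_text_width(advances, n, font_px) / font_px;
}
```

With four glyphs that are each 0.25 of the font size wide, a 10-pixel font rounds every glyph from 2.5 to 3 pixels, so the text is 12 pixels, or 1.2 times the font size. At 20 pixels every glyph is exactly 5 pixels and the text is 1.0 times the font size. The same string thus has different relative widths at the two sizes.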

But that kludge is still used in CVS, isn't it?

Only because I haven't gotten through enough of the code to be sure I
can weed it out.

In the TextLine rendering function for the GDK FT2 rendering, I make use
of cached values for the sizes of the glyphs at 100% to "manually"
adjust the sizes at other zoom levels to match.  This has allowed me to
sidestep the whole ugly kludge and render directly with the desired
size.
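The adjustment can be sketched roughly as follows. This is a simplification on plain arrays, assuming the cached per-glyph widths at 100% are available as doubles; the real code works on the glyph data inside the PangoLayout, and the name scale_cached_widths is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Derive glyph widths at a zoom level from the widths cached at 100%,
 * instead of letting each zoom level round its own widths.
 * Writes the scaled widths to scaled[] and returns their sum. */
double scale_cached_widths(const double *widths_at_100, size_t n,
                           double zoom, double *scaled)
{
    double total = 0.0;
    assert(zoom > 0.0);
    for (size_t i = 0; i < n; i++) {
        scaled[i] = widths_at_100[i] * zoom;
        total += scaled[i];
    }
    return total;
}
```

Because every glyph is scaled from the same cached value, the total width at any zoom is the 100% width times the zoom factor, so the text keeps its proportions relative to the rest of the diagram.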

Could you explain to me in simple words why we need another high-level Text
object, or better, how the new TextLine is conceptually different from the
long-existing Text object, which already used to cache its size? Even
better, it clearly belongs to the object side of rendering, so many object
implementations already benefit from it.

See above.  It differs from Text in that it is single-line and does not
by default allow editing (I'm leaving it for later whether to allow
that, but at least it will be possible).  It does indeed belong on the
objects, but I haven't had time to go through every object that calls
draw_string and use a TextLine there instead.

The outcome:

    Text rendering is now up to 30 times (not just 30%, 30 times!)
    faster than before, text width is actually accurate, and we can
    toss out our ugly kludge.  Three very annoying birds with one
    stone.  I am very happy.

There's still some amount of work for this to be used throughout the
renderers, and the objects should use TextLine instead of ever calling
draw_string.

Getting rid of draw_string, where font and text are completely separated,
may be a worthwhile goal, but neither of our approaches really needs this.

That is my goal.  The width thing is merely a side effect.

I expect to be working on this in the upcoming weeks, and
any help would be appreciated, especially from those who know the ins
and outs of the various renderers.

The patch I've done just calculates the width of the string a second
time and returns the deviation from the desired width as a scale factor.
It does not need the new TextLine object to be almost as fast, and I would
prefer not to have another object between DiaObjects and DiaRenderers, at
least if we can do as well without.
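As a rough illustration of that approach (not the patch itself): the factor is assumed here to be the desired width divided by the width actually measured, and the function name is hypothetical.

```c
#include <assert.h>

/* Scale factor that maps the measured width of a string onto the width
 * the diagram expects. Assumption: factor = desired / measured. */
double width_scale_factor(double desired_width, double measured_width)
{
    assert(desired_width >= 0.0);
    if (measured_width <= 0.0)
        return 1.0;  /* nothing measurable (e.g. empty string): leave unscaled */
    return desired_width / measured_width;
}
```

A string measured at 50 units that should occupy 100 gets a factor of 2.0; one measured at 100 that should occupy 50 gets 0.5.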

See above.

One thing that the TextLine ought to do when implemented in objects is
store the calculated width in the XML as a hint.  Then batch processing
can use it, too.

-Lars



