Unicode normalization and text display differences
- From: Benjamin Kiessling <mittagessen l unchti me>
- To: gtk-i18n-list gnome org
- Subject: Unicode normalization and text display differences
- Date: Wed, 6 Jan 2016 15:00:52 +0100
Hi everybody,
I am building a simple line generation tool to produce training data for
OCR neural networks. Everything works fine except for one oddity: the
placement of diacritics changes depending on the Unicode normalization
form of the input text.
In [0] (text normalized to NFC) the diacritics are placed correctly, while
in [1] (text normalized to NFD) they are rendered next to the preceding
base character instead of being attached to it.
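
For illustration, here is a minimal sketch of what the two forms look like at the code point level. It uses Python's standard unicodedata module. The sample string is a made-up example, not a line from the actual training text.

```python
import unicodedata

# Hypothetical sample: "é" as a single precomposed code point.
sample = "\u00e9"

nfc = unicodedata.normalize("NFC", sample)
nfd = unicodedata.normalize("NFD", sample)


def code_points(text):
    """Return the code points of text in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


print(code_points(nfc))  # ['U+00E9']
print(code_points(nfd))  # ['U+0065', 'U+0301']
```

In the NFD form the accent is a separate combining mark (U+0301) following the base letter, so the renderer has to position it. In the NFC form the font can supply a single precomposed glyph.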
If I understand Unicode correctly, canonically equivalent sequences like
these should display identically, and a presentation about Pango from
2004 claims that they do.
Is this a known issue or expected behavior? Is there some preprocessing
necessary before using pango_layout_set_text()?
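
The kind of preprocessing in question could be sketched as follows, assuming the text is prepared in Python before it is handed to the layout. The helper name prepare_for_layout is hypothetical, and this is a possible workaround rather than a confirmed fix.

```python
import unicodedata


def prepare_for_layout(text):
    """Normalize to NFC so precomposed characters are used where they exist.

    Hypothetical helper: the result would be passed on to
    pango_layout_set_text() (or a binding's equivalent).
    """
    return unicodedata.normalize("NFC", text)


# Decomposed input: "e" + combining acute, "a" + combining diaeresis.
decomposed = "e\u0301a\u0308"
composed = prepare_for_layout(decomposed)
print([f"U+{ord(ch):04X}" for ch in composed])  # ['U+00E9', 'U+00E4']

# No precomposed "q with acute" exists, so the combining mark survives NFC.
leftover = prepare_for_layout("q\u0301")
print([f"U+{ord(ch):04X}" for ch in leftover])  # ['U+0071', 'U+0301']
```

Note that NFC only helps where a precomposed character exists. Combinations such as the second one still rely on the renderer placing the combining mark, so normalizing would hide the problem rather than solve it.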
All Best,
Ben
[0] http://l.unchti.me/dump/nfc.png
[1] http://l.unchti.me/dump/nfd.png