Re: [scintilla] Re: [Anjuta-list] Russian in Anjuta 1.0.1 - Conclusion



   Hi Biswapesh,

> Thanks for taking out the time to investigate this. It does seem a bit
> convoluted but I'm hoping that once Scintilla has switched to using
> Pango, these things will be fairly automatic.
>
> Neil, I'm totally ignorant about these matters, so: will switching to
> Pango give us correct rendering of UTF-8 automatically?

   The file may be in an encoding other than UTF-8. For Russian, the file
will most likely be in the KOI-8 encoding, and some piece of code needs to
deal with this. Anjuta could convert files to UTF-8 on input and convert
back on save. This has the advantage of not requiring locale switching
within Anjuta when other open files use different encodings, but the
disadvantage that all input must be checked to ensure it contains no text
that cannot be represented in the file's encoding on saving.
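
   A minimal sketch of that convert-on-input approach, in Python rather
than Anjuta's own code. The function names are hypothetical, and the
"koi8_r" codec is an assumption standing in for whichever single byte
Russian encoding the file really uses.

```python
def load_as_utf8(raw: bytes, file_encoding: str = "koi8_r") -> bytes:
    """Convert bytes read from disk into UTF-8 for the editor buffer."""
    return raw.decode(file_encoding).encode("utf-8")


def unrepresentable(text_utf8: bytes, file_encoding: str = "koi8_r") -> list:
    """List (index, character) pairs that the file encoding cannot store."""
    bad = []
    for index, ch in enumerate(text_utf8.decode("utf-8")):
        try:
            ch.encode(file_encoding)
        except UnicodeEncodeError:
            bad.append((index, ch))
    return bad


def save_from_utf8(text_utf8: bytes, file_encoding: str = "koi8_r") -> bytes:
    """Convert the UTF-8 buffer back to the file encoding, refusing to
    lose text silently."""
    bad = unrepresentable(text_utf8, file_encoding)
    if bad:
        raise ValueError(f"cannot save in {file_encoding}: {bad!r}")
    return text_utf8.decode("utf-8").encode(file_encoding)
```

   The check in unrepresentable() is the cost mentioned above: it has to
run over everything typed or pasted before the file can be written back.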

   You should discuss with Russian users what their requirements are in
terms of file encodings. Those I have communicated with prefer single byte
Russian encodings (although there is some disagreement about which encoding)
but as the world is moving towards UTF-8, you may find that UTF-8 is
acceptable in the context of Anjuta.

   My preference is to operate on the file in its original encoding,
converting to UTF-8 as required for display, and from UTF-8 when inserting
user input or pasting. For this to work, Scintilla has to be told the
encoding. The current GTK+ version does not support converting from KOI-8
to UTF-8 and will need to call iconv to do this.
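
   As a rough illustration of this second approach, the sketch below keeps
the buffer in the file's own encoding and converts only at the edges. It is
Python, not Scintilla code: the EncodedBuffer class is hypothetical, and
Python's codecs stand in for the iconv calls Scintilla would need.

```python
class EncodedBuffer:
    """Text buffer stored in the file's original encoding (hypothetical)."""

    def __init__(self, raw: bytes, encoding: str):
        self.raw = bytearray(raw)   # bytes exactly as they are on disk
        self.encoding = encoding    # the editor has to be told this

    def display_text(self) -> bytes:
        """UTF-8 bytes to hand to a UTF-8 based renderer."""
        return self.raw.decode(self.encoding).encode("utf-8")

    def insert_utf8(self, byte_offset: int, utf8_input: bytes) -> None:
        """Insert typed or pasted UTF-8 text, converted to the buffer's
        encoding. Raises UnicodeEncodeError if it cannot be represented."""
        encoded = utf8_input.decode("utf-8").encode(self.encoding)
        self.raw[byte_offset:byte_offset] = encoded

    def file_bytes(self) -> bytes:
        """Bytes to write on save; no conversion is needed."""
        return bytes(self.raw)
```

   Saving needs no conversion step here, and any text that cannot be
represented is rejected at the moment it is inserted rather than at save
time.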

# To enable writing Russian in Anjuta >= 1.0.1, put the line
# "character.set=204" in the "user.properties" file. (This parameter
# comes from the scintilla.h file in the Anjuta source package, where it
# is defined as #define SC_CHARSET_RUSSIAN 204; parameters for other
# languages are in this header file too.)
# After that, normal text in the Anjuta editor will become italicized
# (slanted) because the lucidawriter font is incomplete for the Russian
# character set. You should choose some other font (courier-adobe with
# the iso646-1 charset, for example).
# Strangely, the "code.page=0" and "LC_CTYPE=ru_RU.UTF-8" properties seem
# to be ambiguous - it works either with these properties set or without
# them.

   The code.page setting is for input encoding and is important for double
byte character sets (DBCS). LC_CTYPE is also mostly used in Scintilla for
DBCS.
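
   To show why the code page matters for double byte text and not for a
single byte Russian encoding, here is a small Python sketch. It is not
Scintilla code and the function names are hypothetical. It uses Python's
"cp932" codec (Shift-JIS as used on Windows), where the second byte of a
two byte character can have the same value as an ASCII backslash.

```python
def has_backslash_bytewise(raw: bytes) -> bool:
    """Search byte by byte, ignoring the code page."""
    return b"\\" in raw


def has_backslash_dbcs(raw: bytes, code_page: str) -> bool:
    """Search whole characters, decoded according to the code page."""
    return "\\" in raw.decode(code_page)


# One Japanese character, stored as two bytes in cp932.
raw = "表".encode("cp932")
```

   The bytewise search reports a backslash inside that single character,
while the code page aware search does not. In a single byte encoding every
byte is a whole character, so the two searches always agree.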

   Neil




