Re: [gedit-list] Plugin development question
- From: Jesse van den Kieboom <jesse icecrew nl>
- To: Jon Leighton <j jonathanleighton com>
- Cc: gedit-list gnome org
- Subject: Re: [gedit-list] Plugin development question
- Date: Sun, 10 Jul 2011 16:24:59 +0200
2011/7/10 Jon Leighton <j jonathanleighton com>:
> On Sun, 2011-07-10 at 15:20 +0200, Jesse van den Kieboom wrote:
>> Just FYI, you probably need to be a bit careful with this approach and
>> take care of proper utf-8 handling (i.e. the difference between
>> character offsets and byte offsets, etc.).
>
> Yes, good point. I have made a change to make sure Python knows it's
> dealing with a Unicode string when matching the regexp:
>
> https://github.com/jonleighton/gedit-trailsave/commit/84b965fd02379a93a68d35cbc20a784ec6fe7e31
>
> This means that the offsets provided by Python are now based on
> characters rather than bytes, which seems to work.
>
> Is the string returned by self.doc.get_text() always going to be utf-8
> though? Or do I need to be able to deal with other encodings too?
It is guaranteed to always be utf-8, so no need to do any conversions.
>
> Thanks.
>
> --
> http://jonathanleighton.com/
>
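
For anyone reading this thread later, here is a minimal sketch of the character-offset versus byte-offset difference discussed above. It is not the code from the linked commit. The sample text, the regexp, and the helper names (trailing_whitespace_spans, char_to_byte_offset) are made up for illustration, and in a real plugin the text would come from the document rather than a literal. It only assumes the text is valid utf-8, as stated above.

```python
import re

# Hypothetical sample text; a plugin would get this from the document instead.
raw = "caf\u00e9   \nna\u00efve\t\n".encode("utf-8")  # utf-8 encoded bytes
text = raw.decode("utf-8")                            # decoded string

# Illustrative regexp for trailing spaces/tabs at the end of each line.
TRAILING_WS = re.compile(r"[ \t]+$", re.MULTILINE)
TRAILING_WS_BYTES = re.compile(rb"[ \t]+$", re.MULTILINE)


def trailing_whitespace_spans(s):
    """Return (start, end) spans of trailing whitespace, in character offsets."""
    return [m.span() for m in TRAILING_WS.finditer(s)]


def char_to_byte_offset(s, char_offset):
    """Convert a character offset in s to a byte offset in its utf-8 encoding."""
    return len(s[:char_offset].encode("utf-8"))


# Matching the decoded string gives character offsets.
char_spans = trailing_whitespace_spans(text)

# Matching the raw bytes gives byte offsets, which drift after each
# multi-byte character (here the accented letters take two bytes each).
byte_spans = [m.span() for m in TRAILING_WS_BYTES.finditer(raw)]

print(char_spans)
print(byte_spans)
```

The two lists differ as soon as a non-ASCII character appears before a match, so offsets taken from one representation should not be applied to the other without a conversion such as char_to_byte_offset.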