Re: [gedit-list] Plugin development question



On Sun, 2011-07-10 at 15:20 +0200, Jesse van den Kieboom wrote:
> Just FYI, you need to probably be a bit careful with this approach and
> take care of proper utf-8 handling (i.e. the difference between
> character offsets and byte offsets etc.).

Yes, good point. I have made a change so that Python knows it is
dealing with a Unicode string when matching the regexp:

https://github.com/jonleighton/gedit-trailsave/commit/84b965fd02379a93a68d35cbc20a784ec6fe7e31

This means that the offsets Python returns for each match are now
counted in characters rather than bytes, which seems to work.
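
To illustrate the difference, here is a minimal sketch. It is not the
code from the commit above: the sample string and the variable names
are made up for illustration, and it is written for Python 3, where
bytes and str are separate types.

```python
import re

# Hypothetical sample line with trailing spaces. "é" is one character
# but takes two bytes when encoded as UTF-8.
raw = "café  \n".encode("utf-8")  # undecoded bytes
text = raw.decode("utf-8")         # str, a sequence of characters

# The same trailing-whitespace pattern, once on bytes and once on str.
byte_match = re.search(rb"[ \t]+$", raw, re.MULTILINE)
char_match = re.search(r"[ \t]+$", text, re.MULTILINE)

byte_offset = byte_match.start()  # counted in bytes
char_offset = char_match.start()  # counted in characters

print(byte_offset, char_offset)
```

The two offsets differ by one here because of the single two-byte
character before the match. Only the character offset is safe to use
with APIs that count in characters.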

Is the string returned by self.doc.get_text() always going to be UTF-8,
though? Or do I need to be able to deal with other encodings too?

Thanks.

-- 
http://jonathanleighton.com/



