Re: [rt.cpan.org #73177] ->show_text does not handle latin1 strings correctly



On 11.12.2011 18:40, Slaven_Rezic via RT wrote:
Sun Dec 11 12:40:59 2011: Request 73177 was acted upon.
Transaction: Ticket created by SREZIC
       Queue: Cairo
     Subject: ->show_text does not handle latin1 strings correctly
   Broken in: 1.081
    Severity: (no value)
       Owner: Nobody
  Requestors: SREZIC cpan org
      Status: new
 Ticket <URL: https://rt.cpan.org/Ticket/Display.html?id=73177 >


The ->show_text method does not handle codepoints > 127 in a perl string
without utf8 flag correctly. See the attached sample script.

A quick glance in Cairo.xs shows that SvPV is used. Probably SvPVutf8
should be used instead here to fix the problem. See also "perldoc perlapi".

Yeah, the current implementation basically burdens the user with ensuring the correct encoding. As your attached program shows, for latin1-encoded strings that boils down to calling utf8::upgrade() before passing them in. I agree that it would be convenient if Cairo auto-upgraded strings wherever utf8 is expected. But I worry about backwards compatibility:

* If someone used utf8::encode() instead of utf8::upgrade(), then auto-upgrading will lead to double encoding.

* If someone used utf8 literals without specifying "use utf8", then auto-upgrading will also lead to double encoding.

Technically, both of these usages are incorrect. But they work OK with current Cairo, and they would break if we implemented auto-upgrading.
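
To make the concern concrete, here is a rough pure-Perl sketch of what the C side would receive in each case. It does not call Cairo at all. The helpers svpv_hex() and svpvutf8_hex() are hypothetical names that only approximate what SvPV and SvPVutf8 hand to the library, assuming SvPV returns the internal buffer as-is and SvPVutf8 returns the UTF-8 encoding of the string's characters.

```perl
use strict;
use warnings;

# Hypothetical helper: hex dump of the internal buffer, i.e. roughly
# what an XS function gets when it reads the scalar with SvPV.
sub svpv_hex {
    my ($s) = @_;
    my $copy = $s;
    # A string with the utf8 flag is stored as UTF-8 internally;
    # encoding the copy exposes exactly those bytes.
    utf8::encode($copy) if utf8::is_utf8($copy);
    return unpack("H*", $copy);
}

# Hypothetical helper: hex dump of what SvPVutf8 would roughly return,
# i.e. the UTF-8 encoding of the characters, whatever the flag says.
sub svpvutf8_hex {
    my ($s) = @_;
    my $copy = $s;
    utf8::encode($copy);
    return unpack("H*", $copy);
}

# Plain latin1 string, no utf8 flag (the case from the ticket).
my $latin1 = "caf\xe9";

# The workaround from the attached script: utf8::upgrade().
my $upgraded = $latin1;
utf8::upgrade($upgraded);

# The technically incorrect usage: utf8::encode(). A UTF-8 literal in
# a source file without "use utf8" ends up as the same bytes.
my $encoded = $latin1;
utf8::encode($encoded);

for my $case ([latin1 => $latin1], [upgraded => $upgraded], [encoded => $encoded]) {
    my ($name, $str) = @$case;
    printf "%-8s SvPV: %-14s SvPVutf8: %s\n",
        $name, svpv_hex($str), svpvutf8_hex($str);
}
# latin1   SvPV: 636166e9       SvPVutf8: 636166c3a9
# upgraded SvPV: 636166c3a9     SvPVutf8: 636166c3a9
# encoded  SvPV: 636166c3a9     SvPVutf8: 636166c383c2a9
```

The first row is the bug from the ticket (a bare e9 byte is not valid UTF-8), the second row is unaffected by the change, and the third row is the double encoding I am worried about.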

What do you think? I pushed a branch to the git repo which implements the necessary changes: <http://git.gnome.org/browse/perl-Cairo/commit/?h=auto-utf8&id=f40ee313502510aedcd2df1a225fa6095717c2d2>. I am also CCing the mailing list for further opinions.


