Re: [evince] possible wrong error by evince



>>>>> "DK" == David Kastrup <dak gnu org> writes:

DK> Sure, it isn't.  But pdfmarks are not encoded in UTF-8.  They are
DK> encoded either in PDFDocEncoding (a subset of Latin-1) or in UTF16BE
DK> with byte order mark.

The error evince reports is about the /Metadata obj (20 0 obj),
which *is* xml.  Try something like:

  mupdfshow -b unicode.pdf 20

The first line of the stream is:

  <?xpacket begin='<U+FEFF>' id='W5M0MpCehiHzreSzNTczkc9d'?>

where the <U+FEFF> is the character, encoded in UTF-8.

At the end of the xml, one finds:

  <rdf:li><E4></rdf:li>

where the <E4> is a single octet, the ISO 8859-1 encoding of ä (U+00E4).

So evince's complaint is correct: a lone 0xE4 octet is not valid
UTF-8, so the metadata stream is not well-formed XML.
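
To illustrate, here is a minimal Python sketch of the check.  The byte
strings are reconstructed by hand from the two lines quoted above, not
read from unicode.pdf, and the helper name first_invalid_utf8_offset
is made up for this example:

```python
# Bytes reconstructed from the two lines quoted in this message; this is
# an illustration, not the actual contents of unicode.pdf.
BOM_UTF8 = b"\xef\xbb\xbf"        # U+FEFF encoded in UTF-8
header = b"<?xpacket begin='" + BOM_UTF8 + b"' id='W5M0MpCehiHzreSzNTczkc9d'?>"
item = b"<rdf:li>\xe4</rdf:li>"   # lone 0xE4 octet: ISO 8859-1 for U+00E4


def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte that breaks UTF-8, or None."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError as err:
        return err.start
    return None


# The xpacket header decodes cleanly, so the packet really is UTF-8.
header_offset = first_invalid_utf8_offset(header)

# The list item does not: 0xE4 opens a three-byte sequence, but the next
# octet is '<' (0x3C), which is not a continuation byte.
item_offset = first_invalid_utf8_offset(item)

# Reading the item as ISO 8859-1 and re-encoding gives what a valid
# packet would contain: U+00E4 as the two octets C3 A4.
item_fixed = item.decode("iso-8859-1").encode("utf-8")
```

With these inputs header_offset is None and item_offset points at the
0xE4 octet, which is presumably the kind of failure evince's XML parser
is reporting.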

-JimC
-- 
James Cloos <cloos jhcloos com>         OpenPGP: 1024D/ED7DAEA6

