[xml] Parser converts html entities



I am using a rather old version of libxml2 (2.6.8),
and am in the process of merging with the latest.

When the parser encounters references like &nbsp;
or &bull;, my callbacks receive the already-converted
characters. For example, &nbsp; is converted to a
literal space ' ', and &bull; is converted to the
bullet character.
Is this the default behaviour?
This happens when I am using the HTML parser APIs from
HTMLparser.c.
Is there an option to turn off this conversion?

I searched through the message archives on the website,
but couldn't get much info, though there were some
discussions regarding "ampersand" and "entities".

Best Regards,
GPN


