Re: [xml] last version libxml2 2.4.23 problem in international "href" and "src" escaping
- From: Daniel Veillard <veillard redhat com>
- To: romis <romis nbc com ua>
- Cc: xml gnome org
- Subject: Re: [xml] last version libxml2 2.4.23 problem in international "href" and "src" escaping
- Date: Mon, 29 Jul 2002 10:54:32 -0400
On Mon, Jul 29, 2002 at 05:35:17PM +0300, romis wrote:
> Sorry for my bad English.
>
> I'm using libxml in an Apache module at work. After this change:
>
> htmlAttrDump(xmlBufferPtr buf, xmlDocPtr doc, xmlAttrPtr cur) {
> ....................................
> ....................................
> if ((xmlStrEqual(cur->name, BAD_CAST "href")) ||
> (xmlStrEqual(cur->name, BAD_CAST "src"))) {
> xmlChar *escaped;
> xmlChar *tmp = value;
>
> while (IS_BLANK(*tmp)) tmp++;
>
> escaped = xmlURIEscapeStr(tmp, BAD_CAST"@/:=?;#%&");
> _______^^^^^^^^^^^^^^^^^^^^^^^^^^^________________
> if (escaped != NULL) {
> xmlBufferWriteQuotedString(buf, escaped);
> xmlFree(escaped);
> } else {
> xmlBufferWriteQuotedString(buf, value);
> }
> ....................................
> ....................................
>
> After the call to xmlURIEscapeStr, strings that were converted from koi8-r to UTF-8
> through libiconv are dumped as a raw UTF-8 sequence, without being decoded back.
>
> For example:
> a string that used to be transmitted as %EC%E0%EA%E0%F0%EE%ED - 7 bytes in koi8-r
> becomes %D0%BC%D0%B0%D0%BA%D0%B0%D1%80%D0%BE%D0%BD - 14 bytes in utf8
>
> I don't know on which side this should be changed. But previous versions didn't do
> anything with "href" and "src"; now it is more correct.
I'm not sure I understand the change you want to make.
My understanding of URI escaping is precisely that the string
must first be transcoded to UTF-8 before being escaped with %xx codes.
It's a principle of the Web to keep URLs as context-independent as
possible, and UTF-8 was selected for this. The best reference I can offer
for this is:
http://www.w3.org/TR/xptr/#uri-escaping
Libxml2 does convert the strings to UTF-8 internally, so that is taken care
of. Is the libxml2 HTML serializer broken with %xx escaping? Are you suggesting
a fix to make sure the string is properly %xx escaped where this was missing?
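To make the "UTF-8 first, then %xx" rule concrete, here is a minimal sketch in
plain C. It is not the libxml2 implementation: percent_escape_utf8 is a
hypothetical helper written only for illustration, and it assumes its input is
already UTF-8. It escapes every byte that is neither an unreserved ASCII
character nor listed in the `keep` argument, which plays the same role as the
"@/:=?;#%&" exception list in the quoted code.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, NOT part of libxml2: percent-escape every byte of a
 * UTF-8 string that is neither an unreserved ASCII character nor listed in
 * `keep`. Returns a malloc'ed string the caller must free, or NULL if the
 * allocation fails. */
static char *percent_escape_utf8(const char *utf8, const char *keep)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t len = strlen(utf8);
    char *out = malloc(3 * len + 1);   /* worst case: every byte becomes %XX */
    char *p = out;

    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = (unsigned char) utf8[i];
        int unreserved = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
                         (c >= '0' && c <= '9') ||
                         c == '-' || c == '_' || c == '.' || c == '~';
        if (unreserved || (c < 0x80 && strchr(keep, c) != NULL)) {
            *p++ = (char) c;           /* copied through unchanged */
        } else {
            *p++ = '%';                /* each remaining byte becomes %XX */
            *p++ = hex[c >> 4];
            *p++ = hex[c & 0x0F];
        }
    }
    *p = '\0';
    return out;
}
```

Called on the 14 UTF-8 bytes D0 BC D0 B0 D0 BA D0 B0 D1 80 D0 BE D0 BD with
keep set to "@/:=?;#%&", this yields %D0%BC%D0%B0%D0%BA%D0%B0%D1%80%D0%BE%D0%BD,
the same string as in the report above. Each non-ASCII letter occupies two
bytes in UTF-8, so 7 letters give 14 escapes.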
Just to be sure I understand the problem and the suggested fix,
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard redhat com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/