[xml] xmlSetProp reports error - "error : string is not in UTF-8" for a URL !



Hi,


This is using C++/gcc with LIBXML 2.7.2.

I am trying to add an attribute to a node, and this raises the error
"error : string is not in UTF-8"

I am using the API call
xmlSetProp(currentNode, (const xmlChar *) kAttribName, (const xmlChar *) "http://www.w3.org/2000/09/xmldsig#");


Looking at the stack trace, the error originates from xmlNewPropInternal(..),
where xmlCheckUTF8(value) returns 0.
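
One way to narrow this down is to look at the bytes that actually reach the call. The following is only a minimal sketch of such a check: hex_dump_string is a hypothetical helper name, not part of libxml2, and it assumes the value is an ordinary zero-terminated C string.

```c
#include <stdio.h>
#include <stddef.h>

/*
 * Hypothetical debugging helper (not a libxml2 function).
 * Writes the bytes of the zero-terminated string s into out as
 * space-separated two-digit hex values and returns the number of
 * bytes dumped. Stops early if out is too small.
 */
size_t hex_dump_string(const unsigned char *s, char *out, size_t out_size)
{
    size_t n = 0;     /* bytes consumed from s */
    size_t used = 0;  /* characters written to out */

    if (out == NULL || out_size == 0)
        return 0;
    out[0] = '\0';
    if (s == NULL)
        return 0;

    /* Each byte needs at most 3 characters (" xx") plus the terminator. */
    while (s[n] != 0 && used + 4 <= out_size) {
        used += (size_t)snprintf(out + used, out_size - used,
                                 n == 0 ? "%02x" : " %02x",
                                 (unsigned int)s[n]);
        n++;
    }
    return n;
}
```

If every value printed for the URL is below 80 hex, the string is plain ASCII and the value itself cannot be the invalid part.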

I am baffled as to why xmlCheckUTF8 would fail for this string: "http://www.w3.org/2000/09/xmldsig#"
Basically, inside the for loop the first if statement, if ((c & 0x80) == 0x00), is the branch that matches.

As far as I can see there isn't a check for the terminating NUL, so the loop walks past the NUL character at the end of the string, picks up garbage, and ultimately returns 0.
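
For comparison, here is a minimal sketch of the loop I would have expected, one that stops at the terminating zero byte. The name check_utf8_terminated is hypothetical and this is not the libxml2 source; it only applies the same structural bit-pattern tests and does not reject overlong forms or surrogates.

```c
#include <stddef.h>

/*
 * Hypothetical stand-in for illustration only (not the libxml2 source).
 * Returns 1 if utf is a zero-terminated string made of structurally
 * valid 1- to 4-byte UTF-8 sequences, 0 otherwise.
 */
int check_utf8_terminated(const unsigned char *utf)
{
    size_t ix = 0;
    unsigned char c;

    if (utf == NULL)
        return 0;

    /* Stop at the terminating zero byte instead of reading past it. */
    while ((c = utf[ix]) != 0) {
        if ((c & 0x80) == 0x00) {            /* 1-byte code: 0xxxxxxx */
            ix += 1;
        } else if ((c & 0xe0) == 0xc0) {     /* 2-byte code: 110xxxxx */
            if ((utf[ix + 1] & 0xc0) != 0x80)
                return 0;
            ix += 2;
        } else if ((c & 0xf0) == 0xe0) {     /* 3-byte code: 1110xxxx */
            if ((utf[ix + 1] & 0xc0) != 0x80 ||
                (utf[ix + 2] & 0xc0) != 0x80)
                return 0;
            ix += 3;
        } else if ((c & 0xf8) == 0xf0) {     /* 4-byte code: 11110xxx */
            if ((utf[ix + 1] & 0xc0) != 0x80 ||
                (utf[ix + 2] & 0xc0) != 0x80 ||
                (utf[ix + 3] & 0xc0) != 0x80)
                return 0;
            ix += 4;
        } else {                             /* not a valid lead byte */
            return 0;
        }
    }
    return 1;
}
```

Because a zero byte never matches the 10xxxxxx continuation pattern, the short-circuiting || tests return 0 before any read beyond the terminator. With this loop the URL above, being pure ASCII, passes the check.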


int
xmlCheckUTF8(const unsigned char *utf)
{
  int ix;
  unsigned char c;

  if (utf == NULL)
    return(0);
  /*
   * utf is a string of 1, 2, 3 or 4 bytes. The valid strings
   * are as follows (in "bit format"):
   *  0xxxxxxx                             valid 1-byte
   *  110xxxxx 10xxxxxx                    valid 2-byte
   *  1110xxxx 10xxxxxx 10xxxxxx           valid 3-byte
   *  11110xxx 10xxxxxx 10xxxxxx 10xxxxxx  valid 4-byte
   */
  for (ix = 0;;) {   /* string is 0-terminated */
    c = utf[ix];
    if ((c & 0x80) == 0x00) { /* 1-byte code, starts with 10 */
      ix++;
    } else if ((c & 0xe0) == 0xc0) { /* 2-byte code, starts with 110 */
      if ((utf[ix+1] & 0xc0) != 0x80)
        return 0;
      ix += 2;
    } else if ((c & 0xf0) == 0xe0) { /* 3-byte code, starts with 1110 */
      if (((utf[ix+1] & 0xc0) != 0x80) ||
          ((utf[ix+2] & 0xc0) != 0x80))
        return 0;
      ix += 3;
    } else if ((c & 0xf8) == 0xf0) { /* 4-byte code, starts with 11110 */
      if (((utf[ix+1] & 0xc0) != 0x80) ||
          ((utf[ix+2] & 0xc0) != 0x80) ||
          ((utf[ix+3] & 0xc0) != 0x80))
        return 0;
      ix += 4;
    } else /* unknown encoding */
      return 0;
  }
  return(1);
}


Am I missing something very fundamental here?

Thanks


