Re: [xml] Making Sure Output is XML safe



Hi,

2008/11/25 Danie van der Walt <dvdwalt foneworx co za>:
Hi Elvis

I seem to have run into another problem :(.
I have a MySQL database storing messages in UTF-8, and I select from it
using C/C++ and libmysql.

I get a warning from libxml saying:
    error : xmlEncodeEntitiesReentrant : input not UTF-8

I'm not sure if it is the data that is getting returned by MySQL, or the
format of the string.

You have to explain a bit more here, or better yet, show the code. In
any case, xmlEncodeEntitiesReentrant() is not lying: if it says the
input is not UTF-8, it is not UTF-8. There's nothing libxml2 can do to
help you with that. You need to pass a NUL-terminated sequence of
UTF-8 bytes to the libxml2 library. Period.
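
One way to find out where a string stops being UTF-8 is to run it
through a small checker before handing it to libxml2. The following is
only a sketch: first_invalid_utf8() is a made-up helper, not a libxml2
function. libxml2 does ship xmlCheckUTF8() in <libxml/xmlstring.h>, but
its documentation describes it as not super-strict, so this version
also rejects overlong forms and surrogates and reports the offset of
the first bad byte.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not part of libxml2): returns -1 if the
 * NUL-terminated string s is well-formed UTF-8, otherwise the byte
 * offset at which the first offending sequence starts.
 */
long first_invalid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *) s;
    size_t i = 0;

    while (p[i] != 0) {
        unsigned char c = p[i];
        size_t n;            /* continuation bytes expected */
        unsigned long cp;    /* decoded code point */
        unsigned long min;   /* smallest code point valid for this length */
        size_t k;

        if (c < 0x80) { i++; continue; }                  /* plain ASCII */
        else if ((c & 0xE0) == 0xC0) { n = 1; cp = c & 0x1F; min = 0x80; }
        else if ((c & 0xF0) == 0xE0) { n = 2; cp = c & 0x0F; min = 0x800; }
        else if ((c & 0xF8) == 0xF0) { n = 3; cp = c & 0x07; min = 0x10000; }
        else return (long) i;    /* stray continuation or invalid lead byte */

        for (k = 1; k <= n; k++) {
            /* a NUL here fails the test too, so we never read past the end */
            if ((p[i + k] & 0xC0) != 0x80)
                return (long) i;
            cp = (cp << 6) | (unsigned long) (p[i + k] & 0x3F);
        }

        /* reject overlong forms, UTF-16 surrogates, values above U+10FFFF */
        if (cp < min || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            return (long) i;

        i += n + 1;
    }
    return -1;
}
```

Calling it on the string just before xmlEncodeEntitiesReentrant() and
printing the returned offset should show whether the bytes are the
problem, and if so which ones.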

When I store the data plain (latin1) it seems to work. The only problem is
that I need to store the information in UTF-8.

What storage are you talking about here? Your MySQL database? In any
case, no matter if your data is in UTF-8 in your database or not, it
must be UTF-8 when it is passed to libxml2.
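
If the bytes arriving from the client library turn out to be Latin-1
(that is an assumption about your setup, nothing in this thread
confirms it), they can be converted before the libxml2 call. Below is a
minimal sketch; latin1_to_utf8() is a hypothetical helper name. libxml2
has its own converter, isolat1ToUTF8() in <libxml/encoding.h>, which
works on caller-supplied buffers instead.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: converts a NUL-terminated ISO-8859-1 string into
 * a newly allocated NUL-terminated UTF-8 string. The caller releases
 * the result with free(). Returns NULL if the allocation fails.
 */
char *latin1_to_utf8(const char *in)
{
    const unsigned char *src = (const unsigned char *) in;
    size_t len = strlen(in);
    /* each Latin-1 byte becomes at most two UTF-8 bytes */
    unsigned char *out = (unsigned char *) malloc(2 * len + 1);
    unsigned char *dst = out;

    if (out == NULL)
        return NULL;

    for (; *src != 0; src++) {
        if (*src < 0x80) {
            *dst++ = *src;                                   /* ASCII as-is */
        } else {
            *dst++ = (unsigned char) (0xC0 | (*src >> 6));   /* lead byte */
            *dst++ = (unsigned char) (0x80 | (*src & 0x3F)); /* continuation */
        }
    }
    *dst = 0;
    return (char *) out;
}
```

This is only correct when the input really is Latin-1; running it on
data that is already UTF-8 would double-encode every non-ASCII
character.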


Have you come across a similar problem?

No, I have never parsed XML coming from a MySQL database with libxml2,
and it has been a long time since I used the MySQL C client library. But I
think that is beside the point. The only thing that matters here is
that you are passing invalid UTF-8 to libxml2, and that will never
work. I really don't think this is a libxml2 question. Just debug your
code and make sure that at the call to xmlEncodeEntitiesReentrant(),
the string you pass is valid UTF-8 bytes, and NUL-terminated.

Good luck,
Elvis


*********************************************************
Danie van der Walt
FoneWorx
Senior Programmer
Tel : +27112930000
MSN : predetor_me hotmail com
GoogleTalk : predetorlinux gmail com
*********************************************************


Elvis Stansvik wrote:

I forgot the footnote:

[1] http://xmlsoft.org/html/libxml-entities.html#xmlEncodeEntitiesReentrant

Elvis

2008/11/14 Elvis Stansvik <elvstone gmail com>:


Hi Danie,

2008/11/5 Danie van der Walt <dvdwalt foneworx co za>:


Hi guys

I hope you can help me.
I'm currently using libxml to parse incoming XML, but simply using printf
to generate my reply XML.

I have one variable that may contain characters that are not XML
safe/friendly, like '<' as an example.
Is there any way that I can pass some text to a function and get an XML
"safe/friendly" output that I can use
in my app?


Use xmlEncodeEntitiesReentrant() [1] to encode entities in a string. Like
this:

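#include <stdio.h>
#include <string.h>

#include <libxml/parser.h>
#include <libxml/entities.h>

int main (void)
{
   LIBXML_TEST_VERSION

   /* libxml2 strings are xmlChar (unsigned char), hence the BAD_CAST. */
   const xmlChar *str = BAD_CAST "string with < and > in it";
   const char *xml = "<foo />";

   /* Pass the real buffer length, without the terminating NUL byte. */
   xmlDoc *doc = xmlReadMemory(xml, (int) strlen(xml), "noname.xml", "UTF-8", 0);
   if (doc == NULL)
      return(1);

   /* Returns a newly allocated string that must be released with xmlFree(). */
   xmlChar *safe_str = xmlEncodeEntitiesReentrant(doc, str);
   if (safe_str != NULL) {
      printf("%s\n", (const char *) safe_str);
      xmlFree(safe_str);
   }

   xmlFreeDoc(doc);
   xmlCleanupParser();

   return(0);
}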

Note that you need to pass it your document pointer as an argument too,
so that it will know about all entities and not just &lt;, &gt;, etc.
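
To make concrete what "XML safe" means for the predefined markup
characters, here is a dependency-free sketch of a simpler, different
technique: replacing the five characters by hand. xml_escape() is a
hypothetical name, and this makes no attempt to replicate what
xmlEncodeEntitiesReentrant() does; it knows nothing about the document
or about encodings.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libxml2: returns a newly allocated
 * copy of the NUL-terminated string in, with <, >, &, " and ' replaced
 * by the predefined XML entities. The caller releases it with free().
 */
char *xml_escape(const char *in)
{
    size_t len = strlen(in);
    /* worst case: every byte expands to a 6-byte entity such as &quot; */
    char *out = (char *) malloc(6 * len + 1);
    char *dst = out;

    if (out == NULL)
        return NULL;

    for (; *in != '\0'; in++) {
        const char *rep = NULL;

        switch (*in) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '"':  rep = "&quot;"; break;
        case '\'': rep = "&apos;"; break;
        default:   break;
        }

        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(dst, rep, n);   /* copy the entity text */
            dst += n;
        } else {
            *dst++ = *in;          /* ordinary byte, copy unchanged */
        }
    }
    *dst = '\0';
    return out;
}
```

The result could then be dropped into a printf-generated reply. The
input still has to be valid text in the encoding the reply declares.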

Regards,
Elvis



Regards
Danie


_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml







