Re: [xml] Making Sure Output is XML safe
- From: "Elvis Stansvik" <elvstone gmail com>
- To: "Danie van der Walt" <dvdwalt foneworx co za>
- Cc: xml gnome org
- Subject: Re: [xml] Making Sure Output is XML safe
- Date: Tue, 25 Nov 2008 12:17:12 +0100
Hi,
2008/11/25 Danie van der Walt <dvdwalt foneworx co za>:
Hi Elvis
I seem to have run into another problem :(.
I have a MySQL database storing messages in UTF-8. When I select from it
using C/C++ and libmysql,
I get a warning from libxml saying:
error : xmlEncodeEntitiesReentrant : input not UTF-8
I'm not sure if it is the data that is getting returned by MySQL, or the
format of the string.
You have to explain a bit better here, or better yet show the code. In
any case, xmlEncodeEntitiesReentrant() is not lying: if it says the
input is not UTF-8, it is not UTF-8. There's nothing libxml2 can do to
help you with that. You need to pass a NUL-terminated sequence of
UTF-8 bytes to the libxml2 library. Period.
When I store the data plain (latin1) it seems to work. The only problem is I
need to store the information in UTF-8.
What storage are you talking about here? Your MySQL database? In any
case, no matter if your data is in UTF-8 in your database or not, it
must be UTF-8 when it is passed to libxml2.
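If it turns out the bytes reaching your code are really latin1, one option is to convert them before the libxml2 call. This is only a minimal sketch in plain C, assuming the input is ISO-8859-1; the function name latin1_to_utf8() is made up for illustration and is not part of libxml2 or libmysql.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: convert a NUL-terminated ISO-8859-1 (Latin-1)
 * string to UTF-8. Returns a malloc()ed, NUL-terminated buffer that the
 * caller must free(), or NULL if the allocation fails. */
char *latin1_to_utf8(const char *in)
{
    size_t len = strlen(in);
    /* Each Latin-1 byte expands to at most two UTF-8 bytes. */
    unsigned char *out = malloc(2 * len + 1);
    unsigned char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)in[i];
        if (c < 0x80) {
            *p++ = c;                                   /* ASCII is unchanged */
        } else {
            *p++ = (unsigned char)(0xC0 | (c >> 6));    /* lead byte 110000xx */
            *p++ = (unsigned char)(0x80 | (c & 0x3F));  /* continuation 10xxxxxx */
        }
    }
    *p = '\0';
    return (char *)out;
}
```

The result can then be cast to const xmlChar * and handed to libxml2. If the data is already UTF-8, do not run it through this, since that would encode it twice.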
Have you come across a similar problem?
No, I have never parsed XML coming from a MySQL database with libxml2,
and it has been a long time since I used the MySQL C client library. But I
think that is beside the point. The only thing that matters here is
that you are passing invalid UTF-8 to libxml2, and that will never
work. I really don't think this is a libxml2 question. Just debug your
code and make sure that at the call to xmlEncodeEntitiesReentrant(),
the string you pass is valid UTF-8 and NUL-terminated.
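As a debugging aid, you could check the string yourself right before the call. Below is a rough standalone sketch in plain C; is_valid_utf8() is a name I made up, not a libxml2 function. libxml2 also has its own check, xmlCheckUTF8() in <libxml/xmlstring.h>, which may be the simpler choice since you already link against the library.

```c
/* Hypothetical helper: return 1 if s is a well-formed, NUL-terminated
 * UTF-8 string, 0 otherwise. */
int is_valid_utf8(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;

    while (*p != 0) {
        unsigned long cp;   /* decoded code point */
        int extra;          /* continuation bytes expected after the lead byte */
        int i;

        if (*p < 0x80)                { cp = *p;        extra = 0; }
        else if ((*p & 0xE0) == 0xC0) { cp = *p & 0x1F; extra = 1; }
        else if ((*p & 0xF0) == 0xE0) { cp = *p & 0x0F; extra = 2; }
        else if ((*p & 0xF8) == 0xF0) { cp = *p & 0x07; extra = 3; }
        else return 0;      /* stray continuation byte or invalid lead byte */
        p++;

        for (i = 0; i < extra; i++, p++) {
            if ((*p & 0xC0) != 0x80)
                return 0;   /* also catches a NUL in the middle of a sequence */
            cp = (cp << 6) | (unsigned long)(*p & 0x3F);
        }

        /* Reject overlong forms, UTF-16 surrogates and values above U+10FFFF. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000) ||
            (cp >= 0xD800 && cp <= 0xDFFF) ||
            cp > 0x10FFFF)
            return 0;
    }
    return 1;
}
```

If this returns 0 for a row coming out of MySQL, the problem is upstream of libxml2: either the stored bytes or the character set used on the client connection.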
Good luck,
Elvis
*********************************************************
Danie van der Walt
FoneWorx
Senior Programmer
Tel : +27112930000
MSN : predetor_me hotmail com
GoogleTalk : predetorlinux gmail com
*********************************************************
Elvis Stansvik wrote:
I forgot the footnote:
[1] http://xmlsoft.org/html/libxml-entities.html#xmlEncodeEntitiesReentrant
Elvis
2008/11/14 Elvis Stansvik <elvstone gmail com>:
Hi Danie,
2008/11/5 Danie van der Walt <dvdwalt foneworx co za>:
Hi guys
I hope you can help me.
I'm currently using libxml to parse incoming XML, but simply using printf
to generate my reply XML.
I have one variable that may contain characters that are not XML
safe/friendly, like '<' as an example.
Is there any way that I can pass some text to a function and get an XML
"safe/friendly" output that I can use
in my app?
Use xmlEncodeEntitiesReentrant() [1] to encode entities in a string. Like
this:
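#include <stdio.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/entities.h>

int main(void)
{
    /* xmlChar is unsigned char, so string literals need a BAD_CAST. */
    const xmlChar *str = BAD_CAST "string with < and > in it";
    const char *xml = "<foo />";
    xmlDoc *doc;
    xmlChar *safe_str;

    LIBXML_TEST_VERSION

    /* Parse a minimal document; the length excludes the trailing NUL. */
    doc = xmlReadMemory(xml, (int)strlen(xml), "xml", "UTF-8", 0);
    if (doc == NULL)
        return 1;

    /* Returns a newly allocated copy with the entities encoded. */
    safe_str = xmlEncodeEntitiesReentrant(doc, str);
    if (safe_str != NULL) {
        printf("%s\n", (const char *)safe_str);
        xmlFree(safe_str);
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}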
Note that you need to pass it your document pointer as an argument too,
so that it will know about all entities and not just <, >, etc.
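For comparison, if the reply really is built with printf alone and only the basic special characters matter, the escaping can be sketched by hand. This is an illustration under that assumption, not what libxml2 does internally: xml_escape() is a hypothetical name, it knows nothing about entities declared in a document, and it does not validate the encoding of its input.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: replace <, >, & and " in a NUL-terminated string
 * with the predefined XML entities. Returns a malloc()ed string that the
 * caller must free(), or NULL if the allocation fails. */
char *xml_escape(const char *in)
{
    size_t len = strlen(in);
    /* Worst case: every byte becomes "&quot;", which is 6 bytes. */
    char *out = malloc(6 * len + 1);
    char *p = out;
    size_t i;

    if (out == NULL)
        return NULL;

    for (i = 0; i < len; i++) {
        const char *rep = NULL;

        switch (in[i]) {
        case '<': rep = "&lt;";   break;
        case '>': rep = "&gt;";   break;
        case '&': rep = "&amp;";  break;
        case '"': rep = "&quot;"; break;  /* matters inside attribute values */
        }

        if (rep != NULL) {
            size_t n = strlen(rep);
            memcpy(p, rep, n);
            p += n;
        } else {
            *p++ = in[i];   /* all other bytes are copied through untouched */
        }
    }
    *p = '\0';
    return out;
}
```

Usage would look like printf("<msg>%s</msg>\n", escaped) followed by free(escaped). The library call is still the safer route when a parsed document is available.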
Regards,
Elvis
Regards
Danie
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml