Re: [xml] Making Sure Output is XML safe
- From: "Elvis Stansvik" <elvstone gmail com>
- To: "Danie van der Walt" <dvdwalt foneworx co za>
- Cc: xml gnome org
- Subject: Re: [xml] Making Sure Output is XML safe
- Date: Tue, 25 Nov 2008 14:11:57 +0100
2008/11/25 Danie van der Walt <dvdwalt foneworx co za>:
Hi Elvis
Here is a code snippet:
const xmlChar *xml = "<foo />";
xmlDoc *doc = xmlReadMemory(xml, 8, "xml", "UTF-8", 0);
//xmlDoc *doc = xmlReadMemory(xml, 8, "xml", NULL, 0);
xmlChar *send_sms_str_xml_safe=NULL;
.
.
.
.
.
if(mysql_real_query(&gMY_DB,g_statement,strlen(g_statement)))
{
    ReturnXmlError("System Error");
    fprintf(g_fptr,"Sql ERROR |%s|%s|\n",g_statement,mysql_error(&gMY_DB));
    exit(0);
}
g_result = mysql_store_result(&gMY_DB);
if (mysql_num_rows(g_result)>0)
{
    printf("<?xml version=\"1.0\"?>\n<sms_api>\n");
    //memset(row,0,sizeof(MYSQL_ROW));
    while((g_row = mysql_fetch_row(g_result))!=NULL)
    {
        printf(" <sms>\n");
        printf(" <sms_id>%s</sms_id>\n",g_row[0]);
        printf(" <status_id>%s</status_id>\n",g_row[1]);
        printf(" <status_text>%s</status_text>\n",g_row[7]);
        if(atoi(give_detail)==1)
        {
            send_sms_str_xml_safe = xmlEncodeEntitiesReentrant(doc, g_row[2]);
            printf(" <short_message>%s</short_message>\n",send_sms_str_xml_safe);
            /* strings allocated by libxml2 are released with xmlFree() */
            xmlFree(send_sms_str_xml_safe);
            printf(" <source_addr>%s</source_addr>\n",g_row[3]);
            printf(" <destination_addr>%s</destination_addr>\n",g_row[4]);
        }
        printf(" <time_submitted>%s</time_submitted>\n",g_row[5]);
        printf(" <time_processed>%s</time_processed>\n",g_row[6]);
        printf(" <rule>%s</rule>\n",g_row[8]);
        printf(" </sms>\n");
        g_row= NULL;
    }
    mysql_free_result (g_result);
    printf("</sms_api>\n");
}
..
.
.
.
.
g_row[2] - is a text field, with mysql(utf-8) saved information.
Thanks Danie, but I'm not going to debug your code for you. I did
however make a small test case myself, which works as expected. Maybe
it can help you figure out where you're going wrong:
CREATE DATABASE xml2 CHARACTER SET = utf8;
GRANT ALL ON xml2.* TO xml2@localhost IDENTIFIED BY 'xml2';
CREATE TABLE xml2.test (id INT(10) NOT NULL AUTO_INCREMENT, text VARCHAR(255) CHARACTER SET utf8, PRIMARY KEY(id));
INSERT INTO xml2.test (text) VALUES ('Hello World™ & and Merry Christmas!');
Code:
#include <stdio.h>
#include <stdlib.h>
#include <mysql.h>
#include <libxml/parser.h>
#include <libxml/tree.h>

int main (int argc, char *argv[])
{
    MYSQL *db;
    MYSQL_RES *result;
    MYSQL_ROW row;
    xmlDoc *doc;
    unsigned long *field_lengths;
    int ret;

    LIBXML_TEST_VERSION

    /* Parse test document; the size excludes the terminating NUL */
    doc = xmlReadMemory("<xml/>", 6, "xml", "UTF-8", 0);
    if (doc == NULL)
        exit(1);

    /* Initialize MySQL */
    db = mysql_init(NULL);
    if (db == NULL) {
        fprintf(stderr, "Out of memory!\n");
        exit(1);
    }
    /* Ask the server to send results over the connection as UTF-8 */
    mysql_options(db, MYSQL_SET_CHARSET_NAME, "utf8");

    /* Connect to MySQL server; keep db intact so mysql_error() can use it */
    if (mysql_real_connect(db, "localhost", "xml2", "xml2", "xml2",
                           3306, NULL, 0) == NULL) {
        fprintf(stderr, "MySQL: %s\n", mysql_error(db));
        exit(1);
    }

    /* Select all rows from table `test` */
    ret = mysql_query(db, "SELECT * FROM test");
    if (ret != 0) {
        fprintf(stderr, "MySQL: %s\n", mysql_error(db));
        exit(1);
    }

    /* Fetch result */
    result = mysql_use_result(db);
    if (result == NULL) {
        fprintf(stderr, "MySQL: %s\n", mysql_error(db));
        exit(1);
    }

    /*
     * Do entity encoding on strings found in the second field
     */
    while ((row = mysql_fetch_row(result))) {
        field_lengths = mysql_fetch_lengths(result);
        if (field_lengths[1] != 0) {
            xmlChar *safe = xmlEncodeEntitiesReentrant(doc, BAD_CAST row[1]);
            if (safe != NULL) {
                printf("%s\n", safe);
                xmlFree(safe); /* the caller owns the returned string */
            }
        }
    }

    mysql_free_result(result);
    mysql_close(db);
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return(0);
}
$ gcc -Wall -o test `xml2-config --libs --cflags` `mysql_config --libs --cflags` test.c
$ ./test
Hello World™ &amp; and Merry Christmas!
This was with MySQL Ver 14.14 Distrib 5.1.29-rc and libxml2 2.6.32.
Just make sure that g_row[2] is what you think it is: print it
and examine it. It's basic programming.
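As one way to follow that advice, the sketch below dumps a string's bytes as hex so the encoding can be checked by eye. hex_dump() is a hypothetical helper written for this illustration; it is not part of libxml2 or the MySQL client library. In UTF-8 the trademark sign (U+2122) from the test row is the three bytes e2 84 a2. A lone byte such as 99 in that position would suggest the data arrived in a single-byte encoding such as cp1252 instead.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical debugging helper (an assumption, not a library call):
 * write the bytes of the NUL-terminated string `s` into `out` as
 * space-separated lowercase hex. Returns 0 on success, or -1 if
 * `out` is too small to hold the result.
 */
int hex_dump(const char *s, char *out, size_t outsize)
{
    size_t len = strlen(s);
    size_t i;

    /* Each byte needs two hex digits plus a separator or the final NUL. */
    if (outsize < (len == 0 ? 1 : len * 3))
        return -1;

    out[0] = '\0';
    for (i = 0; i < len; i++)
        sprintf(out + i * 3, i + 1 < len ? "%02x " : "%02x",
                (unsigned char)s[i]);
    return 0;
}
```

Calling hex_dump(g_row[2], buf, sizeof buf) just before the xmlEncodeEntitiesReentrant() call and printing buf to stderr shows exactly which bytes libxml2 is being handed.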
Good luck,
Elvis
*********************************************************
Danie van der Walt
FoneWorx
Senior Programmer
Tel : +27112930000
MSN : predetor_me hotmail com
GoogleTalk : predetorlinux gmail com
*********************************************************
Elvis Stansvik wrote:
Hi,
2008/11/25 Danie van der Walt <dvdwalt foneworx co za>:
Hi Elvis
I seem to have run into another problem :(.
I have a MySQL database storing messages in UTF-8. When I select from it
using C/C++ and libmysql, I get a warning from libxml saying:
error : xmlEncodeEntitiesReentrant : input not UTF-8
I'm not sure whether the problem is the data returned by MySQL or the
format of the string.
You have to explain a bit better here, or better yet show the code. In
any case, xmlEncodeEntitiesReentrant() is not lying: if it says the
input is not UTF-8, it is not UTF-8. There's nothing libxml2 can do to
help you with that. You need to pass a sequence of NUL-terminated
UTF-8 bytes to the libxml2 library. Period.
When I store the data plain (latin1) it seems to work. The only problem is
that I need to store the information in UTF-8.
What storage are you talking about here? Your MySQL database? In any
case, no matter if your data is in UTF-8 in your database or not, it
must be UTF-8 when it is passed to libxml2.
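If the bytes reaching the application really are ISO-8859-1, they have to be converted before the libxml2 call. The sketch below shows the idea with a hypothetical helper, latin1_to_utf8(), written for this illustration. It assumes strict ISO-8859-1 input; MySQL's "latin1" is closer to cp1252, so bytes in the 0x80-0x9F range would need a lookup table instead. libxml2 ships its own isolat1ToUTF8() in <libxml/encoding.h>, and asking the server for UTF-8 on the connection is usually the simpler fix.

```c
#include <stddef.h>

/*
 * Hypothetical helper (an assumption for illustration): convert the
 * NUL-terminated ISO-8859-1 string `in` to UTF-8 in `out`.
 * Returns the number of bytes written, not counting the terminating
 * NUL, or -1 if `out` is too small. A buffer of 2 * strlen(in) + 1
 * bytes is always large enough.
 */
long latin1_to_utf8(const unsigned char *in, unsigned char *out,
                    size_t outsize)
{
    size_t o = 0;

    if (outsize == 0)
        return -1;

    for (; *in != '\0'; in++) {
        size_t need = (*in < 0x80) ? 1 : 2;

        /* Always keep one byte free for the terminating NUL. */
        if (o + need + 1 > outsize)
            return -1;

        if (*in < 0x80) {
            out[o++] = *in;                       /* ASCII is unchanged */
        } else {
            /* U+0080..U+00FF become a two-byte UTF-8 sequence. */
            out[o++] = (unsigned char)(0xC0 | (*in >> 6));
            out[o++] = (unsigned char)(0x80 | (*in & 0x3F));
        }
    }
    out[o] = '\0';
    return (long)o;
}
```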
Have you come across a similar problem?
No, I have never parsed XML coming from a MySQL database with libxml2,
and it has been a long time since I used the MySQL C client library. But I
think that is beside the point. The only thing that matters here is
that you are passing invalid UTF-8 to libxml2, and that will never
work. I really don't think this is a libxml2 question. Just debug your
code and make sure that at the call to xmlEncodeEntitiesReentrant(),
the string you pass is valid UTF-8 bytes, and NUL-terminated.
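A minimal sketch of such a check, assuming a plain NUL-terminated byte string: is_valid_utf8() below is a hypothetical helper written for this illustration, not a libxml2 function. libxml2 itself provides xmlCheckUTF8() in <libxml/xmlstring.h>, which can serve the same purpose.

```c
/*
 * Hypothetical helper (an assumption for illustration): return 1 if
 * the NUL-terminated string `s` is well-formed UTF-8, 0 otherwise.
 * Overlong encodings, UTF-16 surrogates and code points above
 * U+10FFFF are rejected.
 */
int is_valid_utf8(const unsigned char *s)
{
    while (*s != '\0') {
        unsigned long cp;
        int extra, i;

        if (*s < 0x80) {                 /* plain ASCII byte */
            s++;
            continue;
        } else if ((*s & 0xE0) == 0xC0) {
            cp = *s & 0x1F; extra = 1;   /* lead byte of 2-byte sequence */
        } else if ((*s & 0xF0) == 0xE0) {
            cp = *s & 0x0F; extra = 2;   /* lead byte of 3-byte sequence */
        } else if ((*s & 0xF8) == 0xF0) {
            cp = *s & 0x07; extra = 3;   /* lead byte of 4-byte sequence */
        } else {
            return 0;                    /* stray continuation or bad lead */
        }

        for (i = 1; i <= extra; i++) {
            /* A premature NUL fails this test too, so we never overrun. */
            if ((s[i] & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (s[i] & 0x3F);
        }

        /* Reject overlong forms, surrogates and out-of-range values. */
        if ((extra == 1 && cp < 0x80) ||
            (extra == 2 && cp < 0x800) ||
            (extra == 3 && cp < 0x10000))
            return 0;
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;

        s += extra + 1;
    }
    return 1;
}
```

If a check like this fails for g_row[2], the problem lies upstream of libxml2, typically in the character set negotiated for the MySQL connection.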
Good luck,
Elvis
Elvis Stansvik wrote:
I forgot the footnote:
[1] http://xmlsoft.org/html/libxml-entities.html#xmlEncodeEntitiesReentrant
Elvis
2008/11/14 Elvis Stansvik <elvstone gmail com>:
Hi Danie,
2008/11/5 Danie van der Walt <dvdwalt foneworx co za>:
Hi guys
I hope you can help me.
I'm currently using libxml to parse incoming XML, but simply using printf
to generate my reply XML.
I have one variable that may contain characters that are not XML
safe/friendly, like '<' as an example.
Is there any way that I can pass some text to a function and get an XML
"safe/friendly" output that I can use
in my app?
Use xmlEncodeEntitiesReentrant() [1] to encode entities in a string. Like
this:
#include <stdio.h>
#include <libxml/parser.h>
#include <libxml/entities.h>

int main (int argc, char *argv[])
{
    LIBXML_TEST_VERSION

    /* libxml2 strings are xmlChar (unsigned char), so cast the literal */
    const xmlChar *str = BAD_CAST "string with < and > in it";
    const char *xml = "<foo />";
    /* The size is the length in bytes, not counting the terminating NUL */
    xmlDoc *doc = xmlReadMemory(xml, 7, "xml", "UTF-8", 0);
    if (doc == NULL)
        return(1);
    xmlChar *safe_str = xmlEncodeEntitiesReentrant(doc, str);
    if (safe_str != NULL) {
        printf("%s\n", safe_str);
        xmlFree(safe_str);
    }
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return(0);
}
Note that you need to pass it your document pointer as an argument too,
so that it will know about all entities and not just <, >, etc.
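To make concrete what the encoding step does for the characters that matter in element content, here is a hand-rolled sketch. escape_xml_text() is a hypothetical helper written for this illustration and is not how libxml2 implements it: it only replaces &, < and >, knows nothing about entities declared in a document, and does no encoding checks, which is why the libxml2 call is the better choice in real code.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (an assumption for illustration): return a
 * malloc()ed copy of `s` with '&', '<' and '>' replaced by their
 * predefined entity references. The caller releases the result with
 * free(). Returns NULL if memory allocation fails.
 */
char *escape_xml_text(const char *s)
{
    size_t len = strlen(s);
    /* Worst case: every byte becomes "&amp;", which is 5 bytes. */
    char *out = malloc(len * 5 + 1);
    char *p = out;

    if (out == NULL)
        return NULL;

    for (; *s != '\0'; s++) {
        switch (*s) {
        case '&': memcpy(p, "&amp;", 5); p += 5; break;
        case '<': memcpy(p, "&lt;", 4);  p += 4; break;
        case '>': memcpy(p, "&gt;", 4);  p += 4; break;
        default:  *p++ = *s;             break;
        }
    }
    *p = '\0';
    return out;
}
```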
Regards,
Elvis
Regards
Danie
_______________________________________________
xml mailing list, project page http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml