[xml] Beginner Question : How to parse Html ? how to complete this code fraction ?
- From: Meir Yanovich <meiry242 gmail com>
- To: xml gnome org
- Subject: [xml] Beginner Question : How to parse Html ? how to complete this code fraction ?
- Date: Sun, 4 Jul 2010 14:29:47 +0300
Hello,
While searching for an HTML parser, I found that libxml2 can parse HTML.
I found only one code example showing how to use this part of the library, but it is not complete.
I need help completing it so that I can understand the API.
It comes from this site:
http://laurentparenteau.com/blog/2009/12/parsing-xhtml-in-c-a-libxml2-tutorial/
Here is the code. It compiles just fine, but I am missing the logic for how to open an HTML file and how to read it:
#include <stdio.h>
#include <libxml/HTMLparser.h>

void walkTree(xmlNode *a_node)
{
    xmlNode *cur_node = NULL;
    xmlAttr *cur_attr = NULL;
    for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
        // Do something with the node information, e.g. print the tag's name and attributes.
        printf("Got tag : %s\n", cur_node->name);
        for (cur_attr = cur_node->properties; cur_attr; cur_attr = cur_attr->next) {
            printf(" -> with attribute : %s\n", cur_attr->name);
        }
        // Recurse into the children of this node.
        walkTree(cur_node->children);
    }
}
int main(int argc, char **argv)
{
    htmlParserCtxtPtr parser = htmlCreatePushParserCtxt(NULL, NULL, NULL, 0, NULL, XML_CHAR_ENCODING_NONE);
    htmlCtxtUseOptions(parser, HTML_PARSE_NOBLANKS | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING | HTML_PARSE_NONET);
    char *data = NULL; // buffer containing part of the web page (not filled in yet: this is the missing part)
    int len = 0;       // number of bytes in data
    // Last argument is 0 if the web page isn't complete, and 1 for the final call.
    htmlParseChunk(parser, data, len, 0);
    walkTree(xmlDocGetRootElement(parser->myDoc));
    return 0;
}
Can you please help me complete the code?
Thanks
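
The missing piece is the loop that opens the file and hands its contents to the parser piece by piece. Below is a minimal sketch of that loop, assuming only the standard C I/O functions. The names ChunkSink and feedFileInChunks are hypothetical helpers introduced for illustration and are not part of libxml2; the callback's last argument follows the convention described above for htmlParseChunk (0 while more data may follow, 1 on the final call).

```cpp
#include <cstddef>
#include <cstdio>

// Hypothetical callback type (not a libxml2 type): receives one chunk of
// the file plus a flag that is 1 only on the final call, mirroring the
// last argument of htmlParseChunk.
typedef int (*ChunkSink)(void *user, const char *data, int len, int last);

// Reads `path` in fixed-size pieces and forwards each piece to `sink`.
// Returns 0 on success, -1 if the file cannot be opened or read,
// or the first non-zero value returned by `sink`.
int feedFileInChunks(const char *path, ChunkSink sink, void *user)
{
    std::FILE *fp = std::fopen(path, "rb");
    if (fp == NULL)
        return -1;

    char buf[4096];
    int rc = 0;
    std::size_t n;
    while (rc == 0 && (n = std::fread(buf, 1, sizeof buf, fp)) > 0)
        rc = sink(user, buf, (int)n, 0);   // more data may follow

    if (rc == 0 && std::ferror(fp))
        rc = -1;                           // read error, not end of file
    std::fclose(fp);

    if (rc == 0)
        rc = sink(user, NULL, 0, 1);       // final call: no data, last = 1
    return rc;
}
```

With libxml2, the sink would presumably be a small function that casts `user` back to the htmlParserCtxtPtr and returns htmlParseChunk(parser, data, len, last). Once feedFileInChunks returns 0, parser->myDoc should hold the finished tree to pass to walkTree; afterwards the document and the parser context can be released with xmlFreeDoc and htmlFreeParserCtxt. If the whole file is available on disk anyway, htmlReadFile(filename, encoding, options) is a one-call alternative that returns the document directly.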