[xml] Beginner question: How to parse HTML? How to complete this code fragment?

From: Meir Yanovich <meiry242 gmail com>
To: xml gnome org
Subject: [xml] Beginner question: How to parse HTML? How to complete this code fragment?
Date: Sun, 4 Jul 2010 14:29:47 +0300

Hello,
While searching for an HTML parser I found that libxml2 can parse HTML.
I found only one code example showing how to use this part of the library, but it is not complete.
I need help completing it so that I can understand the API.
The example comes from this site:
http://laurentparenteau.com/blog/2009/12/parsing-xhtml-in-c-a-libxml2-tutorial/
Here is the code. It compiles fine, but I am missing the logic for opening an HTML file and reading it:

#include <stdio.h>
#include <libxml/HTMLparser.h>
#include <libxml/tree.h>

void walkTree(xmlNode *a_node)
{
    xmlNode *cur_node = NULL;
    xmlAttr *cur_attr = NULL;

    for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
        // do something with that node's information, like printing the tag's name and attributes
        printf("Got tag : %s\n", (const char *)cur_node->name);
        for (cur_attr = cur_node->properties; cur_attr; cur_attr = cur_attr->next) {
            printf(" -> with attribute : %s\n", (const char *)cur_attr->name);
        }
        // recurse into the children of the current node
        walkTree(cur_node->children);
    }
}

int main(int argc, char **argv)
{
    htmlParserCtxtPtr parser = htmlCreatePushParserCtxt(NULL, NULL, NULL, 0, NULL, XML_CHAR_ENCODING_NONE);
    htmlCtxtUseOptions(parser, HTML_PARSE_NOBLANKS | HTML_PARSE_NOERROR | HTML_PARSE_NOWARNING | HTML_PARSE_NONET);

    char *data = NULL; // buffer containing part of the web page (not filled in yet)
    int len = 0;       // number of bytes in data
    // Last argument is 0 if the web page isn't complete, and 1 for the final call.
    htmlParseChunk(parser, data, len, 0);

    walkTree(xmlDocGetRootElement(parser->myDoc));
    return 0;
}

Can you please help me complete the code?
Thanks
