[xml] question about parsing html files

sorry for a possible double post, i forgot the subject.


i'm parsing an html file whose body contains this code:

    TOC <em>emphasied text</em> and <strong>strong text</strong>

I parse it in the following way:

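  xmlNodePtr node;
  xmlChar *name;
  for (node = body_node ; node ; node = node->next){
    if (xmlStrcasecmp (node->name, BAD_CAST "div") == 0){
      /* builds a string from the TEXT and ENTITY_REF nodes of the list */
      name = xmlNodeListGetString (file, node->xmlChildrenNode, 1);
      if (name) {
        printf ("%s\n", (char *)name);
        xmlFree (name);  /* the returned string must be freed */
      }
    }
  }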

this code displays

    TOC  and

that is, name is a single string containing only 'TOC' and 'and'. Hence, i
can't display the emphasized and strong strings before and after 'and'.

is there a way to modify the code above so that i can retrieve 'TOC' and
'and' separately?

thank you

Vincent Torri
