Re: [xml] Remove whitespaces from text nodes


I wrote a small function for this purpose some time ago.
I have not tested it with recent versions of libxml2, nor have I checked that the code is correct, but it
should give you an idea of a method you can use to remove blank nodes.

The usage is for instance:

doc = xmlReadFile (xmlfile, NULL, 0);
if (doc == NULL)
  {
    /* Deal with error... */
    return 1;
  }
glbRemoveBlankNodes (xmlDocGetRootElement (doc));

Hope this helps,

Best regards,

Georges-André SILBER

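/* Recursively remove blank (whitespace-only) text nodes below n. */
int
glbRemoveBlankNodes (xmlNodePtr n)
{
  xmlNodePtr cur;
  xmlNodePtr next;

  if (n == NULL)
    return 0;

  cur = n->children;
  while (cur)
    {
      /* Save the sibling first: cur may be freed below. */
      next = cur->next;
      if (xmlIsBlankNode (cur))
        {
          xmlUnlinkNode (cur);
          xmlFreeNode (cur);
        }
      else
        /* Only descend into nodes that were kept. */
        glbRemoveBlankNodes (cur);
      cur = next;
    }

  return 0;
}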

On 16 Feb 2012, at 08:57, spam spam spam spam free fr wrote:

Yes, you are right.
But I am not sure my own function would do a good job.
I know a couple of whitespace characters (" ", "\t", ...), but I am not sure that I know all of them.
My function would probably forget to strip some of them...
This is why I would like to use an already defined function.

Is there a function which does this job?

----- Original Message -----
From: "Liam R E Quin" <liam holoweb net>
To: "spam spam spam spam" <spam spam spam spam free fr>
Cc: xml gnome org
Sent: Thursday, 16 February 2012 08:40:31
Subject: Re: [xml] Remove whitespaces from text nodes

On Thu, 2012-02-16 at 08:28 +0100, spam spam spam spam free fr wrote:
Anyway, there seems to be no other solution with libxml2 alone.

The spaces are part of the text of the document, so it's not likely that
a conformant XML parser will strip them for you.

You could of course remove the spaces in C after parsing, just as if you
decided to remove every occurrence of an upper-case "B" from the input.

That's just standard C string processing.
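
As a rough sketch of that kind of string processing, assuming the goal is to trim leading and trailing white space from a string obtained from a text node: XML 1.0 defines white space as exactly four characters (space, tab, carriage return, line feed), so there is no longer list to worry about. The names is_xml_space and trim_xml_space are hypothetical helpers, not part of libxml2.

```c
#include <stddef.h>

/* XML 1.0 white space (production S): space, tab, CR, LF. */
static int
is_xml_space (char c)
{
  return c == ' ' || c == '\t' || c == '\r' || c == '\n';
}

/* Hypothetical helper, not a libxml2 function: trim leading and
   trailing XML white space from s, in place.  Returns s. */
char *
trim_xml_space (char *s)
{
  size_t start = 0;
  size_t end;
  size_t i;

  if (s == NULL)
    return NULL;

  /* Skip leading white space. */
  while (is_xml_space (s[start]))
    start++;

  /* Find one past the last non-blank character. */
  end = start;
  for (i = start; s[i] != '\0'; i++)
    if (!is_xml_space (s[i]))
      end = i + 1;

  /* Shift the kept part to the front and terminate it. */
  for (i = start; i < end; i++)
    s[i - start] = s[i];
  s[end - start] = '\0';

  return s;
}
```

Called on a writable buffer holding "  \t hello world \r\n", this leaves "hello world" in the buffer. With libxml2 it could presumably be applied to the string returned by xmlNodeGetContent and the result written back with xmlNodeSetContent; that part is not shown here and is untested.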

Liam Quin - XML Activity Lead, W3C,
Pictures from old books:
