RE: [xml] character entity replacements

Sorry for the late reply.

Okay, first point is that if you want to validate your file, I tend to
think that checking that entities are "well known" seems IMHO part of the
checks one really ought to do.

Absolutely, and I have. All the entities are part of the common ISO sets
but the customer can't handle them.

I've decided on editing my iso-???.ent files to accomplish this on my end
before delivery.
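
For anyone who would rather not touch the .ent files, a rough alternative
is to rewrite the named references as numeric character references just
before delivery. This is only a sketch under assumptions: the function name
to_numeric_refs is made up for illustration, and the name table is Python's
html.entities.html5, which overlaps with the ISO sets but may not match them
entry for entry. It also works on raw text, so it would rewrite references
inside comments and CDATA sections as well.

```python
import re
from html.entities import html5

# The five entities predefined by XML itself; left as they are.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# A general entity reference such as &eacute; (simplified name syntax).
ENTITY_REF = re.compile(r"&([A-Za-z][A-Za-z0-9._-]*);")


def to_numeric_refs(text):
    """Replace known named entity references with numeric character references."""

    def replace(match):
        name = match.group(1)
        if name in XML_PREDEFINED:
            return match.group(0)
        # Assumption: the html5 table stands in for the ISO entity sets.
        expansion = html5.get(name + ";")
        if expansion is None:
            # Unknown name: leave it alone so the problem stays visible.
            return match.group(0)
        return "".join("&#x%X;" % ord(ch) for ch in expansion)

    return ENTITY_REF.sub(replace, text)


if __name__ == "__main__":
    print(to_numeric_refs("<doc>caf&eacute; &amp; &foo; &mdash;</doc>"))
```

Unknown names such as &foo; are deliberately left untouched rather than
dropped, so a later validation step can still complain about them.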



 That said, making the technical change you're suggesting is not trivial and
would require some coding. The problem is the following: when running with
--noent, the parser is instructed not to generate any entity reference nodes.
So at that point the parser has no information left that would allow it to
represent the entity when reserializing the tree:

paphio:~/XML -> cat tst.xml
<!DOCTYPE doc SYSTEM "foo.dtd">
<doc>hello this is &foo; some text</doc>
paphio:~/XML -> cat foo.dtd
paphio:~/XML -> ./xmllint --shell --valid --debug --noent tst.xml
tst.xml:2: error: Entity 'foo' not defined
<doc>hello this is &foo; some text</doc>
tst.xml:2: validity error: Element doc content doesn't follow the DTD
Expecting (CDATA), got (CDATA)
<doc>hello this is &foo; some text</doc>
/ > ls
?--        1 doc
---        1 doc
/ > cd doc
doc > ls
t--       24 hello this is  some text
doc >

  Actually, when there is an undeclared entity, libxml doesn't produce
any entity reference in the generated tree, whether --noent is specified or
not:

(gdb) p *doc->children->next->children
$4 = {_private = 0x0, type = XML_TEXT_NODE, name = 0x80a3260 "text",
  children = 0x0, last = 0x0, parent = 0x80f4ce0, next = 0x0, prev = 0x0,
  doc = 0x80e4960, ns = 0x0, content = 0x80e4e60 "hello this is  some text",
  properties = 0x0, nsDef = 0x0}

  This is an error handling condition; I don't know what the most
appropriate thing to do is in this case.
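
Since the undeclared reference simply vanishes from the tree, one possible
safeguard is to look for such references in the raw text before parsing.
The following is a minimal sketch, not anything libxml does: the function
name undeclared_entities is hypothetical, and the regular expressions only
approximate XML name syntax and <!ENTITY ...> declarations (parameter
entities and included files are not followed).

```python
import re

# The five entities predefined by XML itself.
XML_PREDEFINED = {"amp", "lt", "gt", "apos", "quot"}

# General entity declarations: <!ENTITY name ...>. Parameter entity
# declarations (<!ENTITY % name ...>) are intentionally not matched.
ENTITY_DECL = re.compile(r"<!ENTITY\s+([^\s%>]+)\s")

# General entity references such as &foo; (simplified name syntax).
ENTITY_REF = re.compile(r"&([A-Za-z_:][\w.:-]*);")


def undeclared_entities(document, dtd_text):
    """Return referenced entity names with no declaration, in first-seen order."""
    declared = set(ENTITY_DECL.findall(dtd_text)) | XML_PREDEFINED
    missing = []
    for name in ENTITY_REF.findall(document):
        if name not in declared and name not in missing:
            missing.append(name)
    return missing


if __name__ == "__main__":
    doc = '<doc>hello this is &foo; some text</doc>'
    # A DTD text with no declaration for foo, so the reference is reported.
    print(undeclared_entities(doc, ""))
```

Running such a check first would at least turn the silent loss into an
explicit list of names to fix.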


Daniel Veillard      | Red Hat Network
veillard redhat com  | libxml Gnome XML XSLT toolkit | Rpmfind RPM search engine
