Re: [xml] indentation (again!)

  Hi Bruce,

On Sat, Jul 25, 2015 at 08:12:39AM -0400, Bruce Miller wrote:
Hi all;
  I find unindented XML to be virtually impossible
to debug, but...

Frankly, I'm impressed at how well the built-in heuristic
for formatting works (apparently the rule is: once it sees
mixed content, it turns off indentation below that level).
Of course, it isn't "correct" and is occasionally giving me
really messy problems.

In principle, and perhaps naively, it seems the Correct rule
is rather simple, provided you have access to the DTD/Schema/whatever.

  The only correct rule is: if there is a character on input, it goes
on output. Indentation is a trick, but not what the standard suggests.

Namely, if an element allows mixed content, do not add whitespace.
(And you CAN apply indentation on descendants that do NOT allow mixed
content.)

Is there some switch or method that I'm overlooking to
achieve this effect?

  No, and indeed this would make sense ... assuming you have a schema, etc.
But when you serialize a document, while the DTD *might* be available,
you can't really find out the RNG or XSD associated with it (or trust them,
download them, etc.). It opens a can of worms, TBH.
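
For illustration only, here is a minimal sketch of the rule discussed above,
written with Python's standard xml.etree.ElementTree rather than libxml2 or
XML::LibXML. It is not an existing libxml2 switch. The `mixed` set is a
hypothetical stand-in for what a DTD or schema would tell you, and the
function name and sample document are made up for the example. Whitespace is
added only inside elements that are not declared mixed, and only when their
existing text and tails are blank, so no significant character is dropped.

```python
import xml.etree.ElementTree as ET


def _blank(s):
    """True if s is missing or whitespace-only."""
    return s is None or s.strip() == ""


def indent_non_mixed(elem, mixed, level=0, step="  "):
    """Indent the children of elem unless elem allows mixed content.

    `mixed` is a set of tag names assumed to allow mixed content
    (hypothetical: in practice this would come from a DTD or schema).
    Mixed elements are left untouched, but their descendants are still
    visited so that element-only descendants can be indented.
    """
    children = list(elem)
    # Only touch whitespace that carries no text, as a safety net in case
    # the `mixed` set is wrong for this document.
    safe = _blank(elem.text) and all(_blank(c.tail) for c in children)
    if children and elem.tag not in mixed and safe:
        pad = "\n" + step * (level + 1)
        elem.text = pad
        for child in children:
            child.tail = pad
        # The last child is followed by the parent's closing tag.
        children[-1].tail = "\n" + step * level
    for child in children:
        indent_non_mixed(child, mixed, level + 1, step)


doc = ET.fromstring(
    "<doc><sec><p>Some <b>bold</b> text</p>"
    "<list><item>a</item><item>b</item></list></sec></doc>"
)
indent_non_mixed(doc, mixed={"p"})
out = ET.tostring(doc, encoding="unicode")
print(out)
```

The paragraph element stays on one line exactly as parsed, while the
element-only containers around it are indented. The hard part remains the one
described above: obtaining a trustworthy `mixed` set at serialization time.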

Is this likely to be relatively easy
to implement? (I've managed to avoid learning libxml2's C API,
as I use it via Perl's XML::LibXML.)

I'd hate to have to give up indentation or write
my own serializer....

  Have you looked at xmllint --pretty 2?

It does pretty printing without adding (or removing) any significant
character: it uses only the non-significant spaces within markup,
which are discarded at parsing time.

  I know it's not what you asked for but might still be useful :-)
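
As a small illustration of why whitespace inside tags is safe to play with,
the sketch below uses Python's standard xml.etree.ElementTree (an assumption
made for the example; it does not call xmllint). It parses the same made-up
fragment twice, once compact and once with line breaks placed only inside the
tags themselves, and compares the results.

```python
import xml.etree.ElementTree as ET

compact = '<p class="x">Some <b>bold</b> text</p>'
# Same fragment, with line breaks placed only inside the tags themselves.
spread = '<p\n  class="x"\n>Some <b\n>bold</b\n> text</p\n>'

a = ET.fromstring(compact)
b = ET.fromstring(spread)

# The parser discards whitespace inside tags, so both inputs yield the
# same tree and therefore the same serialization.
same = ET.tostring(a) == ET.tostring(b)
print(same)
```

The text content, including the spaces around the inline element, is
identical in both trees; only the layout of the markup differed.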



Daniel Veillard      | Open Source and Standards, Red Hat
veillard redhat com  | libxml Gnome XML XSLT toolkit | virtualization library
