RE: [libxml++] Blunt Q: what C++ bindings for XML?



> I don't really understand this. All XML parsers allow you to read the tag
> names and attributes values. But they won't give you the raw text, because
> they are XML _parsers_. You shouldn't need the raw contents and if you
> really think you do then you should use normal file I/O.


Common misconception: that parsers must drop the original text.
That is an artifact of early lexer generators, like lex.

Much recent work, e.g. in error handling, macro processing,
and refactoring, has provided access to the original text
even in the parsed stream.

E.g. consider error handling. It should be possible
to trace any error back to every file, line, and character
associated with it. In the presence of macros,
you should be able to say where the error occurred
in every phase of macro expansion. Parsers that
drop the original text lose this ability.
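
A minimal sketch of the idea, in Python for brevity. The names Token and
lex are made up for illustration and come from no particular library.
Each token keeps its exact source characters plus the line and column
where it starts, so whitespace is a token too and nothing is thrown away:

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    kind: str  # "space", "word", or "punct"
    text: str  # exact source characters, never normalized
    line: int  # 1-based line of the token's first character
    col: int   # 1-based column of the token's first character


# The three alternatives together match every possible character,
# so consecutive matches leave no gaps in the input.
_TOKEN_RE = re.compile(r"(?P<space>\s+)|(?P<word>\w+)|(?P<punct>[^\w\s])")


def lex(source):
    """Split source into tokens that remember where they came from."""
    tokens = []
    line, col = 1, 1
    for m in _TOKEN_RE.finditer(source):
        text = m.group()
        tokens.append(Token(m.lastgroup, text, line, col))
        newlines = text.count("\n")
        if newlines:
            # Position of the character that follows the last newline.
            line += newlines
            col = len(text) - text.rfind("\n")
        else:
            col += len(text)
    return tokens
```

Joining the text of all tokens gives back the input unchanged, and any
token can be reported by line and column. A parser built on such tokens
can attach the same positions to its tree nodes.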

E.g. consider an XML-like tool, given this input:

      <foo>
      <bar><blick attr2="3"
                  attr1="4">biff</blick></bar>
      </foo>

Say I want to indent this. But not to the
fully indented form typical of most XML printers:

      <foo>
        <bar>
           <blick attr1="4" attr2="3">
              biff
           </blick>
        </bar>
      </foo>

Instead, I want to indent without adding any newlines,
in either the text or the tags/attributes:

      <foo>
        <bar><blick attr2="3"
                    attr1="4">biff</blick></bar>
      </foo>

The blick indentation (attr1 lined up under attr2) seems to
require knowledge of the raw text, in particular of where the
original line breaks fall inside the tag.
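
A rough sketch of such a reindenter, in Python for brevity. The name
reindent and the step parameter are hypothetical. It assumes well-formed
input made only of elements and text (no comments or CDATA, no ">"
inside attribute values). It works on the raw text, keeps every original
line break, and only rewrites the whitespace at the start of each line:

```python
import re

# A tag (which may span several lines) or a run of text.
_PIECE_RE = re.compile(r"<[^>]*>|[^<]+")


def reindent(source, step=2):
    """Re-indent XML-like text without adding or removing line breaks."""
    out = []    # finished output lines
    cur = ""    # output line under construction
    depth = 0   # number of currently open elements

    def emit(piece, first_indent, cont_indent):
        nonlocal cur
        for i, seg in enumerate(piece.split("\n")):
            if i > 0:            # an original line break: keep it
                out.append(cur.rstrip())
                cur = ""
            if cur == "":        # at line start: replace old indentation
                seg = seg.lstrip()
                if seg:
                    cur = " " * (cont_indent if i else first_indent) + seg
            else:
                cur += seg

    for m in _PIECE_RE.finditer(source):
        piece = m.group()
        if piece.startswith("</"):
            depth = max(depth - 1, 0)
            emit(piece, step * depth, step * depth)
        elif piece.startswith("<"):
            # Output column where this tag's first attribute starts;
            # continuation lines inside the tag are aligned to it.
            start = len(cur) if cur else step * depth
            name = re.match(r"<[^\s>]*[ \t]*", piece)
            attr_col = start + max(name.end(), len(name.group().rstrip()) + 1)
            emit(piece, step * depth, attr_col)
            if not piece.endswith("/>") and piece[1] not in "?!":
                depth += 1
        else:
            emit(piece, step * depth, step * depth)
    out.append(cur.rstrip())
    return "\n".join(out)
```

Given the four-line foo/bar/blick input, this returns four lines again,
with bar indented one step and attr1 aligned under attr2. A tree-only
API that discards line breaks inside tags could not reproduce that
layout.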

Prettyprinters for programming languages do this - I have coded
them - by *PARSING* in a way that gives significance to whitespace.

I'm looking for such a tool for XML.






