RE: [libxml++] keep blanks.



> From: Christophe de VIENNE [mailto:cdevienne alphacent com] 
> 
> On Wednesday 28 May 2003 14:46, Murray Cumming Comneon com wrote:
> > > From: Christophe de VIENNE [mailto:cdevienne alphacent com]
> > > Hi,
> > >
> > > Not keeping blanks at parsing time, or adding some when writing in
> > > order to indent, breaks the XML specification.
> > > However, I think it can be useful to do so. For example, in my
> > > application I have XML files with no content nodes that I have to
> > > edit by hand. Having them indented automatically is _very_ much
> > > easier than including artificial content nodes. I'm sure I'm not
> > > alone in this situation.
> > >
> > > So I propose modifying the API of libxml to make it possible not to
> > > keep blanks (the xmlKeepBlanksDefault option in libxml).
> >
> > To start with I think the whole keep_blanks name is very confusing.
> 
> Indeed.
> 
> > We seem to be dealing with 2 things here:
> >
> > 1. _Ignoring_ significant white space when parsing the document.
> > 2. _Adding_ indents when writing the document.
> >
> > Those are easy to understand. If we add the features then I think we
> > should add them separately and with meaningful names in the API.
> 
> 1. Does renaming set_keepblancs to set_ignore_whitespaces (and the
> keepblancs parameter to ignore_whitespace) seem OK to you?

Yes. But see below.

> 2. I think write_to_formated_xxx is meaningful enough,

Sounds OK, but I'd probably call them
write_to_file_formatted()
write_to_string_formatted()
(just moving the xxx),
because it's the writing that is formatted, not the file or string, and
because I like prefixing stuff rather than suffixing.
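
To make the naming concrete, here is a rough sketch of a plain writer next to
a formatted one. Everything in it is an assumption for illustration: Element
is a toy type, not a libxml++ class, and the two free functions only borrow
the names suggested above. The two-space indent is likewise arbitrary.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy element type (NOT the libxml++ Node/Element classes).
struct Element {
    std::string name;
    std::vector<Element> children;
};

// Writes exactly what is in the tree; no white space is added.
std::string write_to_string(const Element& e) {
    if (e.children.empty()) return "<" + e.name + "/>";
    std::string out = "<" + e.name + ">";
    for (const Element& c : e.children) out += write_to_string(c);
    return out + "</" + e.name + ">";
}

// Writes the same tree, adding a newline and two spaces per depth level.
// The added white space is new content, which is why it is a separate call.
std::string write_to_string_formatted(const Element& e, int depth = 0) {
    const std::string pad(static_cast<std::size_t>(depth) * 2, ' ');
    if (e.children.empty()) return pad + "<" + e.name + "/>\n";
    std::string out = pad + "<" + e.name + ">\n";
    for (const Element& c : e.children)
        out += write_to_string_formatted(c, depth + 1);
    return out + pad + "</" + e.name + ">\n";
}
```

Keeping the formatted variant as its own function means the default writer
never changes the document behind the caller's back.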

> but I'm open to something better if somebody proposes it. Or do you
> prefer some external functions?
> 
> >
> > I think other parsers add 2. as some kind of pretty-printing method.
> > I don't know if any other parsers add 1.
> > 1. does seem much more XML-spec-breaking than 2.
> 
> Probably. But if we have content nodes, it makes the use of 2. easier.

I'm happy to let the application developers ignore their own white space. I
don't want to encourage misuse and misunderstanding (and the resultant
FAQ-questions on this list every week) by having a please_break_the_spec()
method that's too easy to use and forget about.
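
As a rough sketch of what I mean by the application ignoring its own white
space: the parser keeps every text node, and the application skips the
white-space-only ones when it walks the children. ChildNode and
skip_blank_text() are hypothetical names on a toy type, not part of
libxml++, and "blank" is assumed here to mean only the four XML white space
characters (space, tab, CR, LF).

```cpp
#include <string>
#include <vector>

// Hypothetical toy child node (NOT a libxml++ class): either a text node
// holding its content, or an element holding its name.
struct ChildNode {
    bool is_text;
    std::string value;
};

// True if the string is empty or contains only XML white space characters.
bool is_blank(const std::string& s) {
    return s.find_first_not_of(" \t\r\n") == std::string::npos;
}

// Returns the children with white-space-only text nodes left out.
// The input is untouched, so the parsed document stays spec-conformant.
std::vector<ChildNode> skip_blank_text(const std::vector<ChildNode>& children) {
    std::vector<ChildNode> kept;
    for (const ChildNode& c : children) {
        if (c.is_text && is_blank(c.value)) continue;
        kept.push_back(c);
    }
    return kept;
}
```

The decision to drop white space then lives in the one application that
knows it is safe, rather than in a parser-wide switch.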

> (1. is easy to avoid if we don't have content nodes: we can just filter
> the elements and ignore the content nodes.)

I'm fairly happy to have 2. but I'm against 1. I don't think other parsers
have 1. except for backwards compatibility, and I think most of them wish
they didn't. I could be wrong. I'm not the maintainer.

Murray Cumming
murrayc usa net
www.murrayc.com 




