Re: [xml] how to interpret/reproduce this type of xml?

From: Bob Sabiston <bob flatblackfilms com>
To: xml gnome org
Subject: Re: [xml] how to interpret/reproduce this type of xml?
Date: Wed, 7 Oct 2009 09:54:45 -0500

So it's all due to the formatting that I'm having trouble, but does
anyone know how to do this? Specifically, if I'm reading the file and I
get the text between the brackets, how do I know where the formatting
ends and the real text starts? If I'm writing the file, what do I do to
write it in this format?

<richcontent TYPE="NOTE"><html>
 <head>

 </head>
 <body>
 <p>
 Notes 1
 </p>
 <p>
 Notes 2
 </p>
 <p>
 Notes 3
 </p>
 </body>
</html>
</richcontent>

Date: Wed, 7 Oct 2009 12:25:21 +0200
From: "LAUN, Wolfgang" <wolfgang laun thalesgroup com>

Hi Bob,

Your question is not at all ignorant or simple.

In XML, all characters between a "<tag...>" and its counterpart "</tag>"
are relevant, either being this element's content, or a subordinate
element. Therefore, you cannot decide, just by looking at some content
text, whether a blank or a newline is content as set by the XML text
creator - or merely a formatting quirk.

Therefore, XML processing can handle the content adequately only by
taking the kind of document and element into account, or with the help
of an XML schema's information. If you are dealing with XHTML, the
content of the paragraph element (and also some others) should be
interpreted by trimming leading and trailing whitespace and collapsing
embedded runs of whitespace to a single blank. (This is the "collapse"
value of XML Schema's whiteSpace facet.) With (X)HTML, it's the task of
a renderer (printer or browser) - possibly assisted by style sheets - to
supply spacing before and after a paragraph's text, indentation of the
first line, alignment, line breaks, etc.
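
As a rough illustration of the "collapse" rule described above, here is
a minimal Python sketch. The function name collapse() is made up for
this example, and it treats only the four XML whitespace characters
(space, tab, CR, LF) as whitespace; it is not taken from any library.

```python
import re

# XML whitespace is limited to space, tab, carriage return and line feed.
_XML_WS_RUN = re.compile(r"[ \t\r\n]+")


def collapse(text: str) -> str:
    """Replace each whitespace run with one blank, then trim both ends."""
    return _XML_WS_RUN.sub(" ", text).strip(" ")


# Text as it might come back from a parser for an indented paragraph.
raw = "\n      Notes 1\n    "
print(repr(collapse(raw)))
```

Applied to text read from the file, this removes the indentation and
line breaks that only serve as formatting, whatever their number.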

Moreover, notice that <body> has "content", too - the result of all the
characters surrounding the contained <p>-elements. But the
interpretation of <body> does not require processing of its content
value at all.
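
To make the point about <body> concrete, here is a small sketch using
Python's standard xml.etree.ElementTree module. The SAMPLE string is an
assumption: it presumes the notes sit in <p> elements inside <body>,
and the indentation is arbitrary. It shows that the parser reports the
formatting whitespace as text of <body>, and that it can simply be
ignored there, while the text of each <p> gets collapsed.

```python
import xml.etree.ElementTree as ET

# Assumed sample: notes wrapped in <p> elements, indented for readability.
SAMPLE = """<richcontent TYPE="NOTE"><html>
  <head>

  </head>
  <body>
    <p>
      Notes 1
    </p>
    <p>
      Notes 2
    </p>
  </body>
</html>
</richcontent>"""

root = ET.fromstring(SAMPLE)
body = root.find("./html/body")

# The characters between <body> and the first <p> arrive as body.text;
# here they consist of nothing but a line break and indentation.
body_is_formatting_only = body.text.strip() == ""

# For each paragraph, trim the ends and collapse inner whitespace runs.
notes = [" ".join(p.text.split()) for p in body.findall("p")]

print(body_is_formatting_only, notes)
```

Under that assumption there is no need to count spaces when reading:
the amount of padding around each note does not affect the result.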

-W

Thank you for the explanation! However, I am still not sure what I need to do. I don't think I am dealing with XHTML -- how can I tell? And I found the schema for the XML I am using, but I guess it hasn't been updated in a while. There isn't anything in the schema about the richcontent or the "NOTE" -- these are new features of the software, and I guess they didn't update the schema.

My only real problem seems to be knowing how many spaces to insert or take out. Is there any way to tell? Maybe with this particular node, it is always the same and I just need to count them.

Thanks

Bob

