Re: [xml] how do I...



On Tue, May 24, 2005 at 02:47:12PM -0600, Sebastian Kuzminsky wrote:
I've got a couple of processes that trade XML messages over the network.
The sender writes each XML message on a single newline-terminated line.
The receiver uses select and read, and reads until it gets a '\n',
then passes the line to xmlParseMemory and validates the resulting doc
with xmlValidateDtd.


This works well, but it's a little annoying to have to use '\n' as the
message separator.  What I'd really like is to let the sender spread
its message out over several lines if it wants, and have the receiver
detect the end-of-message without any gross hacks.


I'm imagining a stateful, incremental parser function that I can
repeatedly call with the buffers I read from the network, and it'll
consume up to the end-of-buffer or end-of-message, whichever comes
first, and return me a doc if it finished one, or NULL if it didn't.
If the function didn't consume the whole buffer (because it found an
end-of-message before the end-of-buffer), it'll have to tell me where
it left off so I can call it again with the rest later.
[...]
Any clues for the clueless?

  By the definition of XML this is not possible. Packing multiple
XML documents onto a single stream without out-of-band markers is a frequent
but huge design flaw. The demonstration is obvious to anybody who has
read the first two pages of the XML standard:

   http://www.w3.org/TR/REC-xml/#sec-well-formed

   First production of the XML specification:
   [1] document ::= prolog element Misc*

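 Misc* means there is no upper limit on the number of Misc items (comments,
processing instructions, white space) that may follow the root element, and
finding anything else there is a fatal error. So the end of the root element
does not mark the end of the document.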
 The direct result is that the parser must be told that the document
is finished. The libxml2 API, being strictly conformant, does not offer
what you are asking for.
 I strongly suggest you redesign your network format to include markers
or document sizes in the pipe; the current design sounds broken.
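
 As an illustration of the "document size in the pipe" option, here is a
minimal sketch of receiver-side framing. The wire format (a 4-byte
big-endian length followed by that many bytes of XML) and the name
next_frame are assumptions made up for this example; nothing in libxml2
or in the original protocol prescribes them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire format (an assumption, not part of libxml2):
 * 4-byte big-endian payload length, then the XML document bytes. */
#define FRAME_HEADER_LEN 4

/*
 * Look for one complete frame at the start of buf.
 * On success, sets *payload and *payload_len to the XML bytes and
 * returns the total number of bytes the frame occupies (header plus
 * payload). Returns 0 if the buffer does not yet hold a whole frame,
 * in which case the caller should read more data and try again.
 */
size_t next_frame(const unsigned char *buf, size_t avail,
                  const unsigned char **payload, size_t *payload_len)
{
    assert(buf != NULL && payload != NULL && payload_len != NULL);

    if (avail < FRAME_HEADER_LEN)
        return 0;                       /* header incomplete */

    /* Decode the big-endian length byte by byte (endian-independent). */
    uint32_t len = ((uint32_t)buf[0] << 24) | ((uint32_t)buf[1] << 16) |
                   ((uint32_t)buf[2] << 8)  |  (uint32_t)buf[3];

    if (avail - FRAME_HEADER_LEN < len)
        return 0;                       /* payload incomplete */

    *payload = buf + FRAME_HEADER_LEN;
    *payload_len = len;
    return FRAME_HEADER_LEN + (size_t)len;
}
```

 The receiver would append whatever read() returns to a buffer, call
next_frame in a loop, hand each payload to xmlParseMemory (whose size
argument is an int, so a real implementation should also cap the accepted
length), and then drop the consumed bytes from the front of the buffer.
The sender is then free to spread a message over as many lines as it
likes, since '\n' no longer carries any meaning.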

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/


