Re: [xml] xmlParseFile;encode - newbie question

On 23.02.2005 12:21, Pieter Louw wrote:

Daniel Veillard wrote:

On Wed, Feb 23, 2005 at 12:55:40PM +0200, Pieter Louw wrote:
xml data:

<name>Murray & Karl</name>

 This is not XML. You cannot parse this with an XML parser; it must
raise an error and not deliver data. Whatever is generating this must be fixed.

 You probably need
   <name>Murray &amp; Karl</name>

 this is not specific to libxml2.


That is my question: how do I convert the & to &amp;?
Or is there no function to do this in libxml?

How you convert it to &amp; is your burden. Libxml is an XML parser. It parses XML. It does not parse some weird thing you put at its feet, it parses only XML. What you delivered is not XML, no matter how similar to XML it looks to you.
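For what it's worth, the usual place to do that conversion is in whatever writes the file, on the raw text value, before it gets wrapped in tags. Here is a minimal sketch in plain C, assuming the generator has the name as an ordinary NUL-terminated string. The helper name xml_escape_text is made up for illustration; it is not a libxml2 function.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libxml2: returns a malloc'd copy of
 * `text` with &, < and > replaced by their predefined entity references.
 * The caller frees the result. Returns NULL if allocation fails. */
char *xml_escape_text(const char *text)
{
    assert(text != NULL);

    /* Worst case: every byte becomes "&amp;" (5 bytes). */
    size_t len = strlen(text);
    char *out = malloc(len * 5 + 1);
    if (out == NULL)
        return NULL;

    char *p = out;
    for (const char *s = text; *s != '\0'; s++) {
        switch (*s) {
        case '&': memcpy(p, "&amp;", 5); p += 5; break;
        case '<': memcpy(p, "&lt;", 4);  p += 4; break;
        case '>': memcpy(p, "&gt;", 4);  p += 4; break;
        default:  *p++ = *s;             break;
        }
    }
    *p = '\0';
    return out;
}
```

Calling xml_escape_text("Murray & Karl") should give back "Murray &amp; Karl", which can then be placed between <name> and </name>. It must only be applied to raw text, never to a string that already contains markup or entity references, or those get escaped a second time. libxml2 does ship xmlEncodeEntitiesReentrant() for escaping a piece of text content, but it has the same requirement: it works on the raw value before the document is assembled, not on a broken file.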

Your Acrobat Reader will parse PDF, it won't parse Macromedia Flash. Your web browser will parse HTML, it won't be pleased if you give it Postscript.

You must understand, XML is a language of its own. You are bound to its strict syntax if you want to use it. If you have data which is saved in another format, another language, then you must convert it to XML before libxml even thinks about handling it. The same is true for any language/processor combination you can think of.
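If the generator really cannot be fixed, the conversion has to happen as a separate pass before the data reaches libxml. The following is only a rough sketch of such a pass, under the assumption that the sole defect is bare ampersands in text. It leaves alone anything that already looks like "&name;", "&#38;" or "&#x26;" and escapes every other "&". Both function names are hypothetical, and the heuristic knows nothing about CDATA sections or comments, where a bare "&" is legal, so it is a stopgap and not a substitute for producing proper XML.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns the length of the reference starting at s (which points at '&'),
 * or 0 if what follows does not look like "&name;", "&#123;" or "&#x1F;". */
static size_t reference_length(const char *s)
{
    const char *p = s + 1;

    if (*p == '#') {
        p++;
        int hex = (*p == 'x');
        if (hex)
            p++;
        const char *digits = p;
        while (hex ? isxdigit((unsigned char)*p) : isdigit((unsigned char)*p))
            p++;
        if (p == digits)
            return 0;               /* "&#;" or "&#x;" has no digits */
    } else {
        if (!isalpha((unsigned char)*p) && *p != '_' && *p != ':')
            return 0;               /* not the start of a name */
        while (isalnum((unsigned char)*p) || *p == '_' || *p == ':' ||
               *p == '-' || *p == '.')
            p++;
    }
    return (*p == ';') ? (size_t)(p - s) + 1 : 0;
}

/* Hypothetical repair pass: returns a malloc'd copy of `input` in which
 * every '&' that does not begin a reference is replaced by "&amp;".
 * The caller frees the result. Returns NULL if allocation fails. */
char *escape_bare_ampersands(const char *input)
{
    assert(input != NULL);

    char *out = malloc(strlen(input) * 5 + 1);  /* worst case: all '&' */
    if (out == NULL)
        return NULL;

    char *p = out;
    const char *s = input;
    while (*s != '\0') {
        size_t n = (*s == '&') ? reference_length(s) : 0;
        if (n > 0) {                 /* existing reference: copy verbatim */
            memcpy(p, s, n);
            p += n;
            s += n;
        } else if (*s == '&') {      /* bare ampersand: escape it */
            memcpy(p, "&amp;", 5);
            p += 5;
            s++;
        } else {
            *p++ = *s++;
        }
    }
    *p = '\0';
    return out;
}
```

The output of escape_bare_ampersands() is what would then be handed to the parser. Note that the guess can be wrong: text such as "AT&T; more" looks like a reference to an entity named T and is passed through untouched, after which the parser will still reject it.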

In my opinion, XML is the most misunderstood language in the world. Everyone seems to think its processors should understand every variation one could come up with. Despite its intuitiveness, XML is a language designed to be understood by a machine, not by a human. You cannot simply invent a new syntax and expect today's deterministic, unintelligent machines to understand it. The more gigahertz you have, the sooner you will get the error message. One fine day, far in the future, when you can talk to your computer the way you talk to your girlfriend and emerge as puzzled from that conversation, that day, perhaps then, you will be pleased. Until then, you are bound to XML's strict syntax if you wish to have it parsed by a machine.

But, of course, creativity knows no limits. If you can come up with a way for a machine to understand the human interpretation of XML without breaking the specs, without losing the data, without jeopardising the interoperability... well, we will listen, and gladly follow your lead.

Blah, blah, blah... :)

