Re: [xml] Substituting Umlauts by Unicode (&#x...) entities


Thanks a lot for your reply!

On Thu, 23 Jan 2003, Daniel Veillard wrote:

  Sometimes, I really wonder why I lose my time getting documentation,
a web site with search capabilities, commented code, etc ... I don't have
more time than you, simply searching on the web site for
    "save encoding"

It seems like you misunderstood me in my first mail. I didn't want to save
the output, I wanted to simply store the output in memory (but with the UTF-8
characters substituted by their corresponding character references).
And actually not the entire document, but just a specific text node. Maybe
xmlNodeDumpOutput() would have been more appropriate for dumping a single
node.

Besides, the attempt with the xmlOutputBufferPtr was just in order to emulate the
behavior of xmlDocDump() except for the part where it comes to saving the
output to a file. Is it possible to view the contents of an
xmlOutputBufferPtr using xmlBufferContent() or not? (That's one of the
important questions in my first mail.)
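
To make the desired result concrete, here is a minimal sketch in plain C of the substitution itself: UTF-8 text in, ASCII text out, with every non-ASCII character replaced by a hexadecimal character reference. The function name utf8_to_char_refs() is hypothetical and not part of libxml2. It only illustrates the expected output and says nothing about how libxml2 implements its own serialization. It assumes well-formed UTF-8, does not check for overlong sequences, and does not escape markup characters such as & or <.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper (NOT a libxml2 function): copy the UTF-8 string `in`
 * to `out`, replacing each non-ASCII character with a hexadecimal
 * character reference such as &#xFC;.
 *
 * Returns the number of bytes written (excluding the terminating NUL),
 * or -1 if the input is malformed or `out` is too small.
 */
int utf8_to_char_refs(const char *in, char *out, size_t outlen)
{
    const unsigned char *p = (const unsigned char *) in;
    size_t used = 0;

    if (outlen == 0)
        return -1;

    while (*p != 0) {
        unsigned long cp;   /* decoded code point */
        int extra;          /* number of continuation bytes expected */
        int n;

        /* Decode the lead byte of the UTF-8 sequence. */
        if (*p < 0x80) {
            cp = *p;
            extra = 0;
        } else if ((*p & 0xE0) == 0xC0) {
            cp = *p & 0x1F;
            extra = 1;
        } else if ((*p & 0xF0) == 0xE0) {
            cp = *p & 0x0F;
            extra = 2;
        } else if ((*p & 0xF8) == 0xF0) {
            cp = *p & 0x07;
            extra = 3;
        } else {
            return -1;      /* stray continuation byte or invalid lead byte */
        }
        p++;

        /* Fold in the continuation bytes (10xxxxxx). */
        while (extra-- > 0) {
            if ((*p & 0xC0) != 0x80)
                return -1;  /* truncated or malformed sequence */
            cp = (cp << 6) | (unsigned long) (*p & 0x3F);
            p++;
        }

        /* ASCII is copied through, everything else becomes &#x...; */
        if (cp < 0x80)
            n = snprintf(out + used, outlen - used, "%c", (int) cp);
        else
            n = snprintf(out + used, outlen - used, "&#x%lX;", cp);

        if (n < 0 || (size_t) n >= outlen - used)
            return -1;      /* output buffer too small */
        used += (size_t) n;
    }

    out[used] = '\0';
    return (int) used;
}
```

Called on the UTF-8 bytes of a word containing u-umlaut (0xC3 0xBC), the helper writes &#xFC; in place of that character and leaves the ASCII letters untouched. That is the form I would like to obtain in memory for a single text node.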

Sorry, but as a libxml2 *user* who has never dealt with the issue described
in my first mail, I just cannot know whether the xmlOutputBuffer stuff is
the right tool for what I'm trying to achieve.

In case using xmlOutputBuffers is wrong, does xmlDocDumpMemoryEnc() replace
UTF-8 characters with the corresponding character entities?

Furthermore, two more things should have become obvious from my first mail:

- I *did* spend quite some time on the issue (more precisely, two days.
Otherwise, I couldn't have come up with such a list of items in my first
mail.)
- I reached a point where I got stuck and really didn't know how to continue.

Brings back in the first page:
     - a pointer to the encoding document description

In case you are talking about

I read that. It just doesn't include the info I'm looking for, so that's not a
good example to come up with.

     - a pointer to the encoding part of the documentation
     - the correct function to use to do so
     54  xmlSaveFileEnc  function  tree  Dump an XML document, converting it to the given encoding
       with a pointer to

Why should that be the correct function in case I don't want to save the
output to a file? In case you got that impression because I mentioned
xmlDocDump() in my first mail: That's just because I wanted to come up with
an example of how I wanted the output to appear (UTF-8 characters
substituted by character entities).

  I think the contract is relatively simple:
   1/ I need to do my work
   2/ I cannot answer questions all the time

Right. But you should be able and willing to distinguish between

- basic questions whose answers *really* can be found in the already existing
docs and those that cannot, and
- library users and library developers. Not everybody who wants to
*use* libxml2 in his own apps should need/have to know all the ins and outs of
libxml2 in order to be able to use it.

   3/ answering basic questions which, it seems to me, can be found in the doc
      really frustrates me because I need to do 1/ to be able to keep
      this project running

Well, as you admit yourself, my questions in my first mail were not that
basic (please see below).

2/ is crucial. If 3/ reaches the "I can't stand it anymore" threshold
well I will have to drop messages and not answer them.

I am aware of the fact that you cannot answer questions all the time. I'm also
aware of the things written in red on

Besides, I don't know how I violated any of the points mentioned there.
First of all, I know that there's no guarantee that a question gets answered
and second I know that any mail should be sent to the list so that the
info/question can be shared with others. In the case where I used the wrong
reply function of my MUA, I apologized. In addition, I also understand that
thinking about an issue and formulating an appropriate response takes
some time.

In case you feel that you tend to answer a question many times, the FAQ on
the web site might be the appropriate place for the answer.

Your first message was not trivial, well I understand it wasn't easy to
guess about using "ascii" (though it's documented in the encoding page)
maybe the doc needs more improvements.

I'm glad you admit it. I think it's generally not a good idea to make
judgments based on an impression. You seem to have gotten the impression
that I didn't take the time to look things up in the docs. Well, sorry
to tell you, but this assumption was just wrong (as should have been obvious
from the contents of my first mail). Daniel, I highly appreciate your work 
on libxml2 and the wealth of functions this lib provides (as do many others),
but please refrain from judging things you cannot judge because you have no

Concerning the "ascii" stuff: I'm aware that I can pass a specific encoding
to xmlSaveFileTo(), but it's not obvious to me why I should pass an encoding
if I already used the xmlFindCharEncodingHandler() mechanism. Sorry in case
I'm mixing things up here.

I would also be willing to help with the docs, but that's clearly a job that
requires one to have much more knowledge of libxml2 than I do. (It doesn't
make much sense to help with documenting things that I don't understand
really well.)

Using the right words helps too,
saying "saving unicode char" while the exact terms are "using character
references", but acquiring the right vocabulary needs time.

That's clearly a mistake on my part and may have caused some confusion.

XML is not simple. Libxml2 is not simple. An investment is needed to use them

I agree with you. But what makes you think that I haven't invested time?

I don't see any workaround. I don't see how I could spend more time on
libxml2 than I do. I don't want to stop development to do support.

I understand. But you are talking as if the traffic on this list had reached
more than 100 mails a day, which is clearly not the case. Don't you think
you're exaggerating a bit?

It's a big WARNING, I may at some point stop answering questions,
as a community of users it's your duty to make sure that this doesn't happen.

I agree with you and I try to contribute my share to prevent that from
happening (by taking measures on my own before posting to this list). To be
honest, I don't know what else I can do to help ease the situation.


