RE: [xml] encoding tutorial draft



Hi Igor,

alright, alright. Let's stress the necessity of conversion. But I still
believe that the majority of readers of the tutorial will only need the
hint and will know what this is all about. I would rather point people not
familiar with the idea of encoding to some different source of information
and help the rest to a fast start using libxml. However, since I am not
going to write the tutorial (nobody would like it anyway :), everything is
fine with me!!
Maybe I am too biased, though. I just started programming C but am quite
familiar with Java, XML, MIME, ...
I quickly scanned through the tutorials before my first try on libxml and
loved it being so short :)
But that's just my personal perception ;) I am also very pleased with reading
libxml's source because I am learning this way.

Cheers,
  Marcus

PS: Am I mistaken that every introductory text on XML contains a chapter
about encoding?

-----Original Message-----
From: Igor Zlatkovic [mailto:igor stud fh-frankfurt de]
Sent: Tuesday, November 05, 2002 4:27 PM
To: xml gnome org
Subject: Re: [xml] encoding tutorial draft


I am not sure, Igor, whether you expect too little. Of course there are
programmers who have never given a thought to the general problem of data
representation in computing, but from time to time every programmer is faced
with the need to know the actual representation of data. (This applies not
only to character data but also to numbers and, of course, to the more
complex data types.) Not knowing what "encoding" is leads into many traps.
I think you can expect that most libxml users DO know what encoding is and
that there are many different ways of representing character data, since
everybody has walked into one of these traps at some point.

Sure, you are right. Still, the fact remains that people have problems
believing that the conversion is necessary. Statements like "I have found a
bug, libxml misinterprets my strings when they contain äöü" are too common.
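
To make the "äöü" complaint concrete: in ISO Latin-1 those three characters are the single bytes 0xE4 0xF6 0xFC, and that byte sequence is simply not well-formed UTF-8. The following is only a rough sketch, assuming a hypothetical helper name (looks_like_utf8_sketch is not a libxml function) and a simplified check that ignores overlong forms and surrogates:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not part of libxml.
 * Returns 1 if the NUL-terminated bytes follow the UTF-8 lead/continuation
 * pattern, 0 otherwise. Simplified: overlong forms and surrogates are not
 * rejected. */
int looks_like_utf8_sketch(const unsigned char *s)
{
    assert(s != NULL);
    while (*s) {
        int follow;
        if (*s < 0x80)                follow = 0;  /* ASCII */
        else if ((*s & 0xE0) == 0xC0) follow = 1;  /* lead of 2-byte sequence */
        else if ((*s & 0xF0) == 0xE0) follow = 2;  /* lead of 3-byte sequence */
        else if ((*s & 0xF8) == 0xF0) follow = 3;  /* lead of 4-byte sequence */
        else return 0;          /* stray continuation byte or invalid lead */
        s++;
        while (follow-- > 0) {
            if ((*s & 0xC0) != 0x80)
                return 0;       /* expected a continuation byte (10xxxxxx) */
            s++;
        }
    }
    return 1;
}
```

Fed the Latin-1 bytes for "äöü", this check fails at the first character: 0xE4 announces a three-byte sequence, but 0xF6 is not a continuation byte. The UTF-8 form of the same text (0xC3 0xA4 0xC3 0xB6 0xC3 0xBC) passes.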

I wasn't trying to introduce any reader to C, but to emphasise the necessity
of conversion as much as possible.

 > And don't forget the people we are talking about are C
 > programmers! I emphasize again: C!! Only that they need a small reminder
 > perhaps.

C programmers are the targeted audience and every C programmer was a
beginner once upon a time. Many posts here hint that the poster knows
little beyond the first hello-world lesson. But these beginners in C are
also our audience. What better way to learn programming than to write
software? Libxml is not reserved just for the experienced.

They will be all the more grateful if supplied with a simple way of
achieving the conversion to UTF-8. Maybe the smaller code snippet I made for
the FAQ should at least be mentioned in the tutorial, since 80% of all
applications of conversion will be from ISO Latin-1 (or some local Windows
codepage that is almost the same) and about 10% the other way round (just a
guess, and not counting East Asian programmers, who know the business of
encoding quite well :-):

in = "some null terminated iso latin-1 string";
temp = size = (int)strlen(in)+1; /*terminating null included*/
out_size = size*2-1; /*terminating null is just one byte*/
out = malloc((size_t)out_size);
if (out) { /*convert only if the allocation succeeded*/
    /*fail on a conversion error or if not all input was consumed*/
    if ((ret=isolat1ToUTF8(out, &out_size, in, &temp)) < 0 || temp-size) {
        free(out);
        out=NULL;
    }
}
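
For readers wondering where the size*2-1 comes from: every Latin-1 byte below 0x80 stays one byte in UTF-8, every byte from 0x80 upwards becomes exactly two, and the terminating null stays one byte. A minimal hand-rolled sketch of that mapping, without libxml, might look like the following. The function name latin1_to_utf8_sketch is made up for illustration and is not the FAQ code or a libxml API:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of libxml: converts a NUL-terminated
 * ISO Latin-1 string to a newly allocated UTF-8 string.
 * Returns NULL on allocation failure; the caller frees the result. */
unsigned char *latin1_to_utf8_sketch(const unsigned char *in)
{
    assert(in != NULL);
    size_t len = strlen((const char *)in);
    /* Worst case: every character needs two bytes, plus one NUL byte.
     * This equals size*2-1 when size counts the terminating NUL. */
    unsigned char *out = malloc(len * 2 + 1);
    if (out == NULL)
        return NULL;

    unsigned char *p = out;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = in[i];
        if (c < 0x80) {
            *p++ = c;                   /* ASCII is unchanged */
        } else {
            *p++ = 0xC0 | (c >> 6);     /* lead byte: 0xC2 or 0xC3 */
            *p++ = 0x80 | (c & 0x3F);   /* continuation byte */
        }
    }
    *p = '\0';
    return out;
}
```

With this mapping the Latin-1 byte 0xE4 ("ä") comes out as the two bytes 0xC3 0xA4. In real code the libxml routine is the better choice; the sketch only shows why the output buffer never needs more than twice the input length.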


Sure, everything that makes sense to someone is worth mentioning. Each human
perceives information differently, and only the readers of a tutorial can
judge whether it helped them learn something or not.

The fact remains that libxml is there to process XML, not to transcode data
from one format into another.

Ciao
Igor

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xml gnome org
http://mail.gnome.org/mailman/listinfo/xml