Re: Broken i18n and the new UI handler.



Daniel Veillard <Daniel Veillard w3 org> writes: 
>   So it's mostly a (shared) library size issue...
> 

Also "GLib integration": you would want memory allocated with
g_malloc(), errors reported via GError, lists as GList, that type of thing.

> > I actually think a full XML parser could be small enough (remove some
> > of the addons such as nanohttp and xpath, etc.) - but I'm not sure
> > it's worth doing that given that we already have libxml. I don't
> 
>   It may be worth splitting libxml into more than one shared library,
> I'm not opposed to this at all ... Problem is that the entity resolution
> may need the HTTP or FTP access and I don't know how to load them only
> when needed at runtime (feasible, iconv works this way, but portable ?)
>   Actually libxml's configure allows removing all non-core options
>

Right. One thing that might be possible would be to use libxml
internally in GLib (via the SAX interface) but then export a
GLib-style interface. Sort of a G-wrapper for libxml. I suspect
such a patch wouldn't be accepted for GLib, though I don't know. Maybe
it could ship with GLib as GObject does, in a separate lib. Just
speculation. I'm sure many people would feel that depending on
libxml is bloat.
 
In the Inti C++ bindings I'm planning to do something like this:
export an Inti-style C++ interface, but use libxml internally to do
the parsing.

> > personally care to maintain such an alternative XML parser at least.
> > So since that won't happen GMarkup will have to do.
> 
>   But make sure it's 100% XML ... 
>

It's supposed to save/output valid XML. It certainly can't read/parse
full-blown XML.
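
To make the output side concrete, here is a minimal sketch of escaping
text with the five predefined entities before it is written out. The
name escape_xml_text and its fixed-buffer interface are hypothetical,
chosen for illustration; this is plain C with no GLib dependency and is
not GMarkup's actual code.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not GMarkup API): copy `text` into `out`,
 * replacing < > & ' " with the five predefined XML entities.
 * Returns the number of bytes written, not counting the trailing NUL,
 * or (size_t)-1 if `out` is too small. */
static size_t escape_xml_text(const char *text, char *out, size_t out_size)
{
    size_t used = 0;

    if (out_size == 0)
        return (size_t)-1;

    for (const char *p = text; *p != '\0'; p++) {
        char single[2] = { *p, '\0' };   /* fallback: the byte itself */
        const char *rep;

        switch (*p) {
        case '<':  rep = "&lt;";   break;
        case '>':  rep = "&gt;";   break;
        case '&':  rep = "&amp;";  break;
        case '\'': rep = "&apos;"; break;
        case '"':  rep = "&quot;"; break;
        default:   rep = single;   break;
        }

        size_t len = strlen(rep);
        if (used + len + 1 > out_size)   /* keep room for the NUL */
            return (size_t)-1;
        memcpy(out + used, rep, len);
        used += len;
    }

    out[used] = '\0';
    return used;
}
```

Bytes of multi-byte UTF8 sequences are never equal to any of the five
ASCII characters, so they pass through the default case untouched.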
 
> > >  - How do you intend to handle I18N and L10N issues in GMarkup ?
> > >
> > 
> > GMarkup just loads and saves UTF8, and that's it.
> 
>   Well humm... XML requires UTF16 support, but it's not widely used on
> linux ...
>

Yep. GMarkup can only read UTF8, but the UTF8 it outputs should be
valid XML, since XML processors are required to handle UTF8, I think.
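
A UTF8-only reader can at least notice UTF16 input and refuse it
cleanly, because the XML spec requires UTF16 documents to start with a
byte order mark. A rough sketch follows; sniff_bom and the enum names
are hypothetical, and this is not how GMarkup is known to behave.

```c
#include <stddef.h>

/* Hypothetical result type for the sketch below. */
enum encoding_guess {
    ENC_ASSUME_UTF8,   /* no BOM seen; treat the input as UTF-8 */
    ENC_UTF8_BOM,      /* EF BB BF */
    ENC_UTF16_BE,      /* FE FF */
    ENC_UTF16_LE       /* FF FE */
};

/* Look at the first bytes of a document for a byte order mark.
 * This only distinguishes UTF-8 from UTF-16; it makes no attempt to
 * recognise UTF-32 or to read an encoding declaration. */
static enum encoding_guess sniff_bom(const unsigned char *buf, size_t len)
{
    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF)
        return ENC_UTF8_BOM;
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF)
        return ENC_UTF16_BE;
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE)
        return ENC_UTF16_LE;
    return ENC_ASSUME_UTF8;
}
```

A caller would report an error on either UTF16 result instead of
feeding the bytes to a UTF8 parser.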

> > If you have time to look at GMarkup and help ensure it is a genuine
> > XML subset, that would be a very good thing. As long as it only saves
> > valid XML, people can easily upgrade to libxml in the future.
> 
>   I don't really have much time, I should spend some anyway. Did you
> make a list of the specific productions of the XML specification you 
> do not support ?
> 

GMarkup basically accepts:

 - elements and attributes <foo bar="baz"></foo>
 - the empty-element shortcut <foo bar="baz"/>
 - the standard 5 entities &lt; &gt; &apos; &quot; &amp;
 - character references &#3424;
 - UTF8 encoding
 - parses processing instructions and comments, but doesn't do 
   anything with them (just passes them through in an opaque node,
   and resaves them, but with no interpretation)

It will not handle CDATA sections, custom entities, DTDs or validation, 
etc.
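
As an illustration of the character-reference item in the list above,
here is a minimal sketch of turning a reference such as &#3424; into
UTF8 bytes. The function names are hypothetical and this is not
GMarkup's code; it accepts decimal and &#x hex forms and rejects
surrogates and out-of-range values, but does not enforce the rest of
XML's rules on which characters are legal.

```c
#include <stddef.h>
#include <string.h>

/* Encode one code point as UTF-8 into `out` (room for 4 bytes).
 * Returns the byte count, or 0 for surrogates and out-of-range values. */
static size_t utf8_encode(unsigned long cp, unsigned char *out)
{
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                      /* UTF-16 surrogate range */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Decode "&#NNN;" or "&#xHHH;" starting at `ref` into UTF-8.
 * Returns the number of bytes stored in `out`, or 0 on malformed input. */
static size_t decode_char_ref(const char *ref, unsigned char *out)
{
    unsigned long cp = 0;
    unsigned base = 10;
    const char *digits, *p;

    if (strncmp(ref, "&#", 2) != 0)
        return 0;
    digits = ref + 2;
    if (*digits == 'x') {
        base = 16;
        digits++;
    }

    for (p = digits; *p != ';'; p++) {
        unsigned d;
        if (*p >= '0' && *p <= '9')
            d = (unsigned)(*p - '0');
        else if (base == 16 && *p >= 'a' && *p <= 'f')
            d = (unsigned)(*p - 'a') + 10;
        else if (base == 16 && *p >= 'A' && *p <= 'F')
            d = (unsigned)(*p - 'A') + 10;
        else
            return 0;                      /* bad digit or missing ';' */
        cp = cp * base + d;
        if (cp > 0x10FFFF)
            return 0;                      /* also stops overflow */
    }
    if (p == digits || cp == 0)
        return 0;                          /* "&#;" or NUL character */

    return utf8_encode(cp, out);
}
```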

Havoc



