Re: Content string encoding?



On Fri, 2008-02-01 at 01:01 +0100, Murray Cumming wrote:
> We are wrapping the g_content_type_* functions for giomm, and have a
> question:
> 
> Can/must the content strings here be UTF-8, or are they a blob of data
> of unknown encoding (a bit like a URI)
> http://library.gnome.org/devel/gio/unstable/gio-GContentType.html

I'm not sure. On Unix they are MIME types, and on Windows they
are extension strings like ".doc", "audio", "*". Both of these will in
practice be ASCII strings in all cases, but I don't think anything
prohibits e.g. adding a non-ASCII type to the Windows registry, which
could then be returned to the app via gio.

For Unix the source of MIME types is the freedesktop.org shared MIME-info
spec, and its files are defined to be UTF-8, so all Unix MIME types should
be UTF-8. Maybe we can say that the content type must be UTF-8, and then
we filter out those that are not (in practice none).
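
A rough sketch of that filtering idea, in Python rather than C just to keep
it short. The function name and the sample values are made up for
illustration; inside gio the check would more likely be a g_utf8_validate()
call on each string.

```python
def filter_utf8_content_types(raw_types):
    """Keep only the content types that decode as valid UTF-8.

    raw_types is an iterable of byte strings, as they might come from
    the platform (hypothetical input, not a real gio API).
    """
    valid = []
    for raw in raw_types:
        try:
            valid.append(raw.decode("utf-8"))
        except UnicodeDecodeError:
            # Not valid UTF-8: drop it instead of handing it to the app.
            continue
    return valid


# Hypothetical sample values; b"\xff\xfe" is not valid UTF-8.
samples = [b"text/plain", b".doc", b"*", b"\xff\xfe"]
print(filter_utf8_content_types(samples))
```

Since real-world content types are ASCII in practice, a filter like this
should almost never drop anything.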

Also, the encoding of URIs is not unknown: they are a limited subset of
ASCII. If any non-ASCII character appears unescaped in a URI, the URI is
invalid (by the spec).
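
To illustrate the URI point, a small Python sketch. The helper names and the
example path are hypothetical, and this only looks at non-ASCII characters;
it is not a full URI validator.

```python
from urllib.parse import quote


def has_unescaped_non_ascii(uri):
    """Return True if the URI string contains any raw non-ASCII character."""
    return any(ord(ch) > 127 for ch in uri)


def escape_non_ascii(uri):
    """Percent-encode the UTF-8 bytes of each non-ASCII character.

    ASCII characters are left untouched, so existing escapes such as
    %20 are not escaped a second time.
    """
    return "".join(ch if ord(ch) < 128 else quote(ch) for ch in uri)


bad = "file:///tmp/caf\u00e9.txt"          # raw non-ASCII character
print(has_unescaped_non_ascii(bad))        # True
print(escape_non_ascii(bad))               # file:///tmp/caf%C3%A9.txt
```

After escaping, the string is plain ASCII, which is the form the spec allows.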


