Re: ustring::validate() costs?
- From: Chris Vine <chris cvine freeserve co uk>
- To: gtkmm-list gnome org
- Cc: Matthias Kaeppler <matthias finitestate org>
- Subject: Re: ustring::validate() costs?
- Date: Fri, 2 Dec 2005 23:11:01 +0000
On Friday 02 December 2005 06:55, Matthias Kaeppler wrote:
[snip]
> Yes, I am only referring to the names of files, not their contents. I
> don't think what you say is true though. For example, on my notebook my
> locale is set to ISO-8859-1 and if I create a file in Nautilus, then
> Nautilus will encode the name in UTF-8 and not in my current locale
> encoding. Maybe that's what you are referring to in your other post but
> it's very likely that this happens, actually.
But does it do it if you have G_BROKEN_FILENAMES or G_FILENAME_ENCODING
correctly set (before you start Nautilus, which usually occurs when GNOME
starts up)? If it does, you probably need to report a bug.
> Wait a second, you always separate between locale codeset and filename
> codeset. Aren't filenames always encoded in the codeset specified by the
> current locale, unless some application explicitly creates them in a
> different codeset? (For example all Gtk+ based apps where you enter a
> filename in some widget and which don't call filename_from_utf8() before
> writing it to the disk.)
Your first and second sentences contradict each other. To your question, yes
normally, but glib provides Glib::filename_to_utf8() (and vice versa) to do
differently if you wish. Since there is that option, I assume that there is
some set of circumstances in which it may be desirable to have different
codesets. Forgetting to call Glib::filename_from_utf8() would not be one of
them.
Some history: the glib developers have taken rather a high-handed approach to
filename codesets, as represented by the name of the environmental variable
G_BROKEN_FILENAMES. They would like you to use UTF-8 whatever your
locale codeset (probably so that filenames are portable between systems -
particularly relevant if you are using networked filesystems such as NFS or
CIFS). Most people ignore that advice.
> Well, maybe I'll just post the code I have written so far. Let me tell
> you that it doesn't work, I'm still getting conversion errors thrown at
> me as soon as I read something encoded in non-UTF-8 with umlauts in it:
>
> // Here I am reading files from the disk
> file_info = dir.read_next(file_exists);
> if (!file_exists)
>     break;
>
> filename = file_info->get_name();
> if (filename.validate())
> {
>     Glib::setenv("G_FILENAME_ENCODING", "UTF-8");
>     Glib::setenv("G_BROKEN_FILENAMES", "0");
> }
> else
> {
>     std::string charset;
>     Glib::get_charset(charset);
>     std::cout << "Current locale: " << charset << std::endl;
>     Glib::setenv("G_FILENAME_ENCODING", charset);
>     Glib::setenv("G_BROKEN_FILENAMES", "1");
> }
> // this call throws if the filename contains special chars
> // and is not encoded in UTF-8, how can this happen??
> filename = Glib::filename_to_utf8(filename);
Your code setting environmental variables is pointless. If you want to
convert filenames from the locale codeset to UTF8 (if the filename codeset is
not UTF-8) as a mandatory policy in your program, use Glib::locale_to_utf8().
It is bizarre to programmatically set G_BROKEN_FILENAMES or
G_FILENAME_ENCODING so that Glib::filename_to_utf8() will do the same thing.
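For illustration only, here is a rough sketch of such a policy in plain C++,
written without glibmm so that it stands alone: keep a name that is already
valid UTF-8, otherwise convert it from the locale codeset. It assumes the
locale codeset is ISO-8859-1 (as on your notebook). The helpers
is_valid_utf8(), latin1_to_utf8() and filename_for_display() are hypothetical
stand-ins of mine for ustring::validate() and Glib::locale_to_utf8(); in a
real program you would call the glibmm functions instead.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for Glib::ustring::validate(): a simplified
// structural check (lead/continuation bytes, no overlong forms, no
// surrogates, nothing above U+10FFFF).
bool is_valid_utf8(const std::string& s) {
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};
    std::size_t i = 0, n = s.size();
    while (i < n) {
        unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (c < 0x80) { len = 1; cp = c; }
        else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
        else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
        else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
        else return false;                  // stray continuation or bad lead byte
        if (i + len > n) return false;      // sequence truncated
        for (std::size_t k = 1; k < len; ++k) {
            unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80) return false;
            cp = (cp << 6) | (cc & 0x3F);
        }
        if (cp < min_cp[len] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;
        i += len;
    }
    return true;
}

// Hypothetical stand-in for Glib::locale_to_utf8(), valid ONLY under the
// assumption that the locale codeset is ISO-8859-1: each byte is the code
// point of the same value.
std::string latin1_to_utf8(const std::string& s) {
    std::string out;
    for (unsigned char c : s) {
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// The policy itself: already UTF-8? keep it. Otherwise convert from locale.
std::string filename_for_display(const std::string& raw) {
    return is_valid_utf8(raw) ? raw : latin1_to_utf8(raw);
}
```

Note that the check is only a heuristic: a byte sequence in the locale
codeset can occasionally happen to be valid UTF-8 as well.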
Though it is not relevant for the reasons mentioned above, you should not set
both G_BROKEN_FILENAMES and G_FILENAME_ENCODING - you set one or the other.
If you set both, glib resolves the conflict by choosing G_FILENAME_ENCODING.
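To make that precedence concrete, here is a small sketch of the rule as I
understand it from the glib documentation. resolve_filename_charset() is a
hypothetical helper of mine, not a glib function, and it only models the
decision; it does not read the environment or convert anything. It also
assumes two documented details: G_FILENAME_ENCODING is a comma-separated list
whose first entry is used (with "@locale" standing for the locale codeset),
and G_BROKEN_FILENAMES counts as soon as it is set at all, whatever its value.

```cpp
#include <string>

// Hypothetical helper, not part of glib: models which codeset is assumed
// for filenames. Pass the raw values of the two variables, or nullptr for
// a variable that is unset.
std::string resolve_filename_charset(const char* filename_encoding,
                                     const char* broken_filenames,
                                     const std::string& locale_charset) {
    // G_FILENAME_ENCODING wins whenever it is set and non-empty.
    if (filename_encoding && *filename_encoding) {
        std::string list(filename_encoding);
        std::string first = list.substr(0, list.find(','));
        return first == "@locale" ? locale_charset : first;
    }
    // Otherwise the mere presence of G_BROKEN_FILENAMES selects the locale.
    if (broken_filenames)
        return locale_charset;
    // With neither variable set, filenames are assumed to be UTF-8.
    return "UTF-8";
}
```

If that reading is right, setting G_BROKEN_FILENAMES to "0" as in your code
would not switch it off: with G_FILENAME_ENCODING unset,
resolve_filename_charset(nullptr, "0", "ISO-8859-1") still gives
"ISO-8859-1".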
Chris