Re: ustring::validate() costs?
- From: Matthias Kaeppler <matthias finitestate org>
- To: gtkmm-list gnome org
- Subject: Re: ustring::validate() costs?
- Date: Fri, 02 Dec 2005 07:55:41 +0100
Chris Vine wrote:
Are you worried about the codeset of a file's contents or of its filename?
You begin by referring to filenames, but you appear to end by referring to
the codeset in which a file has been written. All filenames in any one
system will use the same codeset - you cannot have "files with mixed
encodings", as you put it, in that sense.
Yes, I am only referring to the names of files, not their contents. I
don't think what you say is true, though. For example, on my notebook my
locale is set to ISO-8859-1, and if I create a file in Nautilus, then
Nautilus will encode the name in UTF-8 and not in my current locale
encoding. Maybe that's what you are referring to in your other post, but
this is actually very likely to happen.
<snip>
If all you want to do is to force a conversion of a filename from the locale
codeset to UTF-8 and you don't want to bother with the G_BROKEN_FILENAMES or
G_FILENAME_ENCODING environment variables, just use Glib::locale_to_utf8()
(this will have the same effect as calling Glib::filename_to_utf8() with the
G_BROKEN_FILENAMES environment variable set). You lose the flexibility of
being able to cater for the locale codeset and the filename codeset being
different, but how many systems would do something as insane as that anyway?
Wait a second, you always distinguish between locale codeset and filename
codeset. Aren't filenames always encoded in the codeset specified by the
current locale, unless some application explicitly creates them in a
different codeset? (For example all Gtk+ based apps where you enter a
filename in some widget and which don't call filename_from_utf8() before
writing it to the disk.)
Well, maybe I'll just post the code I have written so far. Be warned
that it doesn't work: I'm still getting conversion errors thrown at me as
soon as I read a name with umlauts in it that is not encoded in UTF-8:
// Here I am reading files from the disk
file_info = dir.read_next(file_exists);
if (!file_exists)
    break;

filename = file_info->get_name();
if (filename.validate())
{
    Glib::setenv("G_FILENAME_ENCODING", "UTF-8");
    Glib::setenv("G_BROKEN_FILENAMES", "0");
}
else
{
    std::string charset;
    Glib::get_charset(charset);
    std::cout << "Current locale: " << charset << std::endl;
    Glib::setenv("G_FILENAME_ENCODING", charset);
    Glib::setenv("G_BROKEN_FILENAMES", "1");
}

// this call throws if the filename contains special chars
// and is not encoded in UTF-8, how can this happen??
filename = Glib::filename_to_utf8(filename);
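
For comparison, the per-name decision could also be made without touching any environment variables. The following is only a minimal sketch in plain standard C++, under the assumption that every name which is not valid UTF-8 is ISO-8859-1 (my locale here). The function names are hypothetical, and the code is a hand-rolled stand-in for what ustring::validate() and Glib::locale_to_utf8() do, not their actual implementation:

```cpp
#include <cstddef>
#include <string>

// Returns true if the bytes in `s` form well-formed UTF-8.
// Rejects overlong forms, surrogates and code points above U+10FFFF.
bool is_valid_utf8(const std::string& s)
{
    // Smallest code point that legitimately needs a sequence of `len` bytes.
    static const unsigned long min_cp[] = {0, 0, 0x80, 0x800, 0x10000};

    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len;
        unsigned long cp;
        if (lead < 0x80)                { len = 1; cp = lead; }
        else if ((lead & 0xE0) == 0xC0) { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else return false;              // stray continuation or invalid lead byte

        if (i + len > n)
            return false;               // sequence is cut off at the end

        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            if ((c & 0xC0) != 0x80)
                return false;           // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }

        if (len > 1 && cp < min_cp[len]) return false;   // overlong encoding
        if (cp >= 0xD800 && cp <= 0xDFFF) return false;  // UTF-16 surrogate
        if (cp > 0x10FFFF) return false;                 // outside Unicode

        i += len;
    }
    return true;
}

// Converts ISO-8859-1 bytes to UTF-8. Every Latin-1 byte maps directly to
// the code point with the same value, so this conversion cannot fail.
std::string latin1_to_utf8(const std::string& s)
{
    std::string out;
    out.reserve(s.size() * 2);
    for (std::size_t i = 0; i < s.size(); ++i) {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        if (c < 0x80) {
            out += static_cast<char>(c);
        } else {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}

// Hypothetical helper: keep names that are already UTF-8 and convert
// everything else, ASSUMING the fallback encoding is ISO-8859-1.
std::string filename_for_display(const std::string& raw)
{
    return is_valid_utf8(raw) ? raw : latin1_to_utf8(raw);
}
```

The validity check is a single linear pass over the bytes, so for something as short as a filename its cost should be negligible. With glibmm the same idea would be to call validate() on the name and fall back to Glib::locale_to_utf8() only when it returns false, catching Glib::ConvertError in case the name is in neither encoding.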
Regards,
Matthias