Re: Nautilus using UTF-8 for filenames regardless of locale

On Fri, 16 Jul 2004 13:22:36 -0400, Owen Taylor wrote:
> The problem is that it doesn't work to use locale-encoded filenames:

Well, it has worked remarkably well for me so far...

>  - Multiple people on the system may have different locales

That's right and they should be able to.

>  - If you create a tarball on your system, how should the
>    filenames be encoded?

I'd guess that tar considers the filename as an octet string and doesn't
mess with the encoding at all.

>  - If the kernel is converting Windows filenames from a VFAT
>    or SMB filesystem it has to choose a single encoding; that
>    encoding can't depend on the user's environment.

That's right. But in that case I'd much rather have every user who wants
that to work use the encoding *the admin* has configured, not the one that
GNOME decided on.

> Etc. So the default we chose for GLib was to do something that
> at least works from one GLib/GNOME app to another. Many distributors
> (including Red Hat) do turn on G_BROKEN_FILENAMES by default.

At least that's good to hear.

> UTF-8 is an encoding of the Unicode character set and thus can basically
> handle all the world's languages.

That's right.

> There is no guessing going on.

This is not right. As Maciej already wrote, UTF-8 and the ISO-8859-*
charsets do not share a representation for non-ASCII characters.
IIRC the single bytes that ISO-8859-* uses for non-ASCII characters are
not even valid UTF-8 on their own.
So there certainly is something going on here. When Nautilus comes across
a filename that is obviously not in UTF-8 it somehow gets the right
charset and also displays the filename correctly.
This is what my question was about. Where does Nautilus get the right
character set from?
(BTW: This leads to situations where two different files can show up in
Nautilus with the same filename :))
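
To make that concrete, here is a minimal Python sketch of a "try UTF-8
first, then fall back to a legacy charset" strategy. It is only an
illustration of what might be going on; whether Nautilus really works
this way, and where it would take the fallback charset from, is exactly
what I am asking. The function name display_name and the ISO-8859-1
default are hypothetical, not anything from Nautilus or GLib.

```python
def display_name(raw: bytes, fallback: str = "iso-8859-1") -> str:
    """Decode raw filename bytes for display (hypothetical helper)."""
    try:
        # Valid UTF-8 is taken at face value.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume a legacy charset. The choice of
        # ISO-8859-1 here is an assumption made for this sketch.
        return raw.decode(fallback, errors="replace")


# The same name, "caf\u00e9.txt", stored in two different encodings.
latin1_name = "caf\u00e9.txt".encode("iso-8859-1")  # b'caf\xe9.txt'
utf8_name = "caf\u00e9.txt".encode("utf-8")         # b'caf\xc3\xa9.txt'

# The byte strings differ, so these can be two distinct files in one
# directory, yet both end up with the same displayed name.
assert latin1_name != utf8_name
assert display_name(latin1_name) == display_name(utf8_name)
```

The lone byte 0xE9 in the first name is rejected by the UTF-8 decoder,
which is what triggers the fallback, and that is also how two different
files can end up with the same name on screen.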

> To get filenames working correctly, you need to use a UTF-8 locale.
> That's it.

That's what I was hoping not to hear. The reason is that some of the
applications I use don't cope too well with UTF-8 (the main culprit being
zsh). So that is not an option (for me and at this moment).

But I now know how to switch it off so I'm happy for the moment.

And thanks for your reply.

