Re: Localized Pages



Joakim Ziegler wrote:

Could you try automatic language detection with SourceForge (http://www.sourceforge.net)? SourceForge uses language preference detection based on PHP, roughly the same solution that was discussed here. And yes, SourceForge does this by default.

Sourceforge is in English to me, no matter what I do with IE's language
settings.

I have had no problem with this in Mozilla and Netscape. I suspect your problem in IE is because SourceForge expects a language code (like "ll"), while what IE sends might be a full locale (like "ll-CC" where "ll" is language code and "CC" is country code).

However, this is trivial to fix. For most languages, we don't need a particular country code (since we don't have specific translations for that country, just for languages), so it's just a matter of stripping the trailing country code and using only the language code. Problem solved. Not a very advanced programming hack, and it has been done before.
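To show how small the fix is, here is a minimal sketch in Python rather than PHP. It is not SourceForge's code, and the function name language_code is made up for illustration; it only assumes that the browser sends tags of the form "ll" or "ll-CC".

```python
def language_code(tag: str) -> str:
    """Return the bare language code from a tag such as "sv-SE" or "pl"."""
    # Treat "sv_SE" like "sv-SE", then keep only the part before the first "-".
    normalized = tag.strip().replace("_", "-")
    return normalized.split("-", 1)[0].lower()
```

With this, a browser that sends "ll-CC" and a browser that sends "ll" are both matched against the same translation.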

What is more important is that you *did* after all get English as the fallback, which is what you wanted. However, if the language detection had worked for you, you would have gotten a translated page, no fuss, no mess. If your English were very poor, or you didn't understand English at all, you would still have been able to read the page, which I think is a big plus compared to not being able to understand the page at all.
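The fallback behaviour described above can be sketched roughly as follows. This is only an illustration in Python under my own assumptions: the function names, the set of available translations and the default of "en" are hypothetical, and the header parsing is deliberately simplified (it reads the q-values of an Accept-Language header and ignores everything else).

```python
def preferred_languages(header: str) -> list[str]:
    """Parse an Accept-Language header into language codes, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip()
        if not tag or tag == "*":
            continue
        quality = 1.0  # a missing q-value means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality <= 0:
            continue  # q=0 means "not acceptable"
        # Strip the country part: "pl-PL" becomes "pl".
        code = tag.split("-", 1)[0].lower()
        weighted.append((-quality, position, code))
    ordered = []
    for _, _, code in sorted(weighted):
        if code not in ordered:
            ordered.append(code)
    return ordered


def choose_language(header: str, available: set[str], fallback: str = "en") -> str:
    """Pick the first preferred language we have a translation for."""
    for code in preferred_languages(header):
        if code in available:
            return code
    return fallback  # nothing matched: serve English
```

A visitor whose preferences match none of the translations, or whose browser sends no header at all, simply gets the English page, exactly as today.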

I and obviously others have no problems with this; it works for us, and as I said above, support for IE would be *very* trivial to add.


At this point, it seems like our language detection algorithm is getting
complex, but it's really not. Also, it only needs to be done once, since we
should use a cookie to set the preference.

Actually, I think setting a cookie is only necessary if the user has switched language manually (then we know that he is not satisfied with the detected language). Only then must we explicitly remember his choice. In all other cases the language detection obviously worked and there is no need to set a cookie.
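As a rough sketch of that rule (again in Python, with hypothetical names; how the cookie is actually read and written depends on whatever we end up using on the server):

```python
from typing import Optional, Tuple


def resolve_language(
    detected: str,
    cookie_value: Optional[str],
    manual_choice: Optional[str],
) -> Tuple[str, Optional[str]]:
    """Return (language to serve, cookie value to set or None)."""
    if manual_choice is not None:
        # The user overrode the detection: remember it explicitly.
        return manual_choice, manual_choice
    if cookie_value is not None:
        # An earlier manual choice is already stored: honour it, set nothing.
        return cookie_value, None
    # Detection was not overridden, so no cookie is needed.
    return detected, None
```

The cookie is only ever written in the first branch, so visitors who are happy with the detected language never get one.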


I still think that it should respect the language setting in the browser by default. If it is ok for SourceForge, then why not for the GNOME web site? I think you should try SourceForge's language detection and see if it works for you. If it does, your problem was with the debian.org implementation of language detection (debian.org uses Apache for language detection, I think), and I don't see any good technical reason not to use language detection by default on the GNOME pages to serve content in the detected language.

It doesn't work.

Yes it does. You have only provided *one* example where the results were "fatal" (you got Polish when you wanted English on the debian.org page). This was with IE (if I remember correctly), and more importantly, debian.org uses Apache for determining the language. This is probably not at all what we will do (it seems we will use PHP, and hence we can use our own code for this), and I don't think that it is relevant, since SourceForge, which uses a PHP solution, doesn't give you a Polish page. So that problem was specific to the Debian solution, and hardly relevant to this.

I think picking *one* case like this where it didn't work (a problem with a specific browser and a specific language, and one that probably won't affect our solution), while at the same time disregarding that not respecting language settings will leave a lot of people around the world, regardless of browser, unable to understand the site, is a very strange way to prioritize things.


However, it fails for me in a different way than Debian.org
(I always get English).

See above.


It seems to me that language detection is error-prone
and badly supported.

Not at all. It is not more error-prone than we make it. We can make the language detection exactly like we want it, and as good as we want it. On the other hand, that is no good if we don't use it, which is what this discussion is about.


On the other hand, people *expect* to get sites in
English when they go to a site with a generic TLD, such as .org, unless the
site is something specific to a certain country or language.

I disagree. The only thing that would make a user expect English when he goes to microsoft.com is that he has been there before and it was in English at that time. People, in my experience, go to .com, .net and .org domains because they suspect these are the official sites, and they do want to go to the official site, but not because of a particular language preference.

For example, .nu is very popular. It is the country code for the island of Niue, but that doesn't mean the content on the majority of .nu domains is in the language of Niue, or that people expect it to be.

.com, .org, .net and other TLDs aren't bound to the US, and there is no rule that says that content on those TLDs should be in English. It often is, but not always. Assuming that those common TLDs should be reserved for English is very broken. A lot of countries have *very* strict domain name rules for the country TLD, so that it is in fact impossible to register the domain that you want in that TLD if you're not a company registered in that country and can't prove that you have a company name or trademark matching that domain. Sweden has such rules, for example; that is why we can't register gnome.se, because we can't afford to start a company in Sweden with that name just to get that domain. As you would assume, .com and .net are popular TLDs for Swedish sites, including sites targeted only at Swedish visitors. I know this is true in other countries too; it has happened more than once that I've gone to a .com just to find a site in Japanese, targeted only at Japanese visitors.

Another reason .com is popular is because this is what all browsers will default to. If you just enter "something", it will try to go to "www.something.com". Of course, this is attractive to people around the world when setting up a site, no matter the language, but this assumes that you have registered something.com.

The result is that there are *a lot* of .com-like domains out there that are targeted at audiences other than English speakers, and their number keeps growing. The reason why you might experience that .com, .net and .org domains are targeted at English-speaking people is that advertisements for those sites are the only ones you will see if you live in an English-speaking country. As a person who doesn't live in such a country, I can tell you that that assumption is not at all clear here: you see a lot of advertising for local sites using .com, .net, .org, and .nu, all of them sites in the local language. So an international visitor might not at all expect an English site when he hears about GNOME and goes to www.gnome.org to find out about it.


Automatic language detection is not used by any large, successful commercial sites,

Why does it have to be commercial to count as an example? SourceForge is a large web site, but it isn't commercial, and localization obviously works for them.


such as Yahoo!, Apple, Microsoft, etc., they all rely on subsites (or sites
under the relevant TLDs, which we can also do if someone wants to pay the
registration), and that doesn't seem to scare away any users.

The problem is that in some countries you can't get a domain in that TLD even if you pay an enormous amount of money. You'll have to have a company with that name registered in that country too before they will even look at your domain name application. Microsoft, Yahoo!, Apple, Adobe etc can afford setting up local companies, but I doubt the GNOME project can.

And microsoft.CC (CC=country code) still isn't intuitive to most people; that's why they do an enormous amount of advertising on microsoft.com to attract people to the local sites. A construct like www.se.gnome.org is even less intuitive to someone just trying to guess where the GNOME site is. That is why localization of the main site is so important.


On the other hand, presenting a page in Polish or whatever when people expect
English *is* a potentially fatal error, because people aren't used to Polish
being the standard language, and as such are much more prone to thinking "Oh,
this is a Polish project, it has nothing to do with me", and leave forever.

And how should this be more common than a person coming to gnome.org to learn about GNOME, thinking "oh, this is only in English" and leaving, just because he didn't realize that GNOME isn't at all reserved for English speakers, and that GNOME *is* available in his language by default, but the web page isn't?

You can reverse the scenario like this over and over: an English-speaking user getting the wrong page because of a bug and thus losing interest in using GNOME, or an international visitor getting a page in English, a language that he understands poorly or not at all, and thus losing interest in using GNOME. But I think the latter would be *much* more common than the former. The former is caused by a rare bug that only affects some of the visitors and that is fixable; the latter is caused by a site design that is bad for international visitors and will affect all of them.


Christian




