I'm an English-speaking Canadian living in Germany. Quite often I go to a website like Google or Kayak and find myself looking at a German version of the site.
Okay, I do live in Germany, but why assume that everyone within Germany speaks German? What about visitors from other countries, or even people living here that would prefer to use another language?
Presumably these sites take my IP address, look up the geographical location of that address, and choose the official language of that country. This may work most of the time, but there is an even easier way to choose a language.
Most browsers send an Accept-Language header. For example, mine is set to:
en-ca,en;q=0.8,en-us;q=0.6,de-de;q=0.4,de;q=0.2
What this basically says is that I prefer (in decreasing order of preference) Canadian English, generic English, US English, German as spoken in Germany, and lastly generic German. Canadian English has no q value, so it gets the default of 1. Any website I visit is capable of looking at this list and deciding what language I would prefer.
Of course, no matter what assumptions you make about a visitor, give them a chance to change their language if needed. For example, if you use an Internet cafe in Berlin, you shouldn't be stuck viewing websites in German!
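Here is a rough sketch of how that override could sit on top of the browser preferences. It is only an illustration: the pick_language() function, and the idea of reading the visitor's choice from something like a "lang" query parameter or cookie, are assumptions of mine rather than anything a particular site does.

```php
<?php
// Sketch only: pick_language() and the "lang" parameter/cookie idea are
// illustrative assumptions, not an existing API.
function pick_language(array $supported, $default, $explicit, array $langs)
{
    // 1. A language the visitor chose explicitly always wins.
    if (is_string($explicit) && in_array(strtolower($explicit), $supported, true)) {
        return strtolower($explicit);
    }
    // 2. Otherwise take the first browser preference we support,
    //    comparing only the primary subtag ("en-ca" counts as "en").
    foreach ($langs as $tag => $q) {
        $primary = strtolower(explode('-', (string) $tag)[0]);
        if ((float) $q > 0 && in_array($primary, $supported, true)) {
            return $primary;
        }
    }
    // 3. Nothing matched: fall back to the site default.
    return $default;
}

// Browser preferences, already sorted by q factor (highest first).
$langs = array('en-ca' => 1, 'en' => 0.8, 'en-us' => 0.6, 'de-de' => 0.4, 'de' => 0.2);

// In a real page the explicit choice might come from $_GET['lang'] or
// $_COOKIE['lang']; here it is passed in directly.
echo pick_language(array('de', 'en'), 'de', null, $langs), "\n"; // en
echo pick_language(array('de', 'en'), 'de', 'de', $langs), "\n"; // de
echo pick_language(array('de', 'fr'), 'fr', null, $langs), "\n"; // de
```

The point of the ordering is that the header is only ever a guess, so anything the visitor has said explicitly should beat it.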
One really nice thing: I often see Google Ads and other geographically targeted ads in German, and this makes ignoring the ads much easier! :)
Update: I was inspired to throw together a quick Accept-Language parser in PHP:
$langs = array();
if (isset($_SERVER['HTTP_ACCEPT_LANGUAGE'])) {
    // break up string into pieces (languages and q factors);
    // lowercase first, since browsers may send tags like "en-US"
    preg_match_all('/([a-z]{1,8}(-[a-z0-9]{1,8})*)\s*(;\s*q\s*=\s*(1(?:\.0+)?|0(?:\.[0-9]+)?))?/', strtolower($_SERVER['HTTP_ACCEPT_LANGUAGE']), $lang_parse);
    if (count($lang_parse[1])) {
        // create a list like "en" => 0.8
        $langs = array_combine($lang_parse[1], $lang_parse[4]);
        // set default to 1 for any without q factor
        foreach ($langs as $lang => $val) {
            if ($val === '') $langs[$lang] = 1;
        }
        // sort list based on value
        arsort($langs, SORT_NUMERIC);
    }
}
// look through sorted list and use first one that matches our languages
$site = null;
foreach ($langs as $lang => $val) {
    if ((float) $val <= 0) continue; // q=0 means "not acceptable"
    if (strpos($lang, 'de') === 0) {
        $site = 'de'; // show German site
        break;
    } else if (strpos($lang, 'en') === 0) {
        $site = 'en'; // show English site
        break;
    }
}
// if $site is still null, show default site or prompt for language
This would produce the following structure for my Accept-Language string:
Array
(
[en-ca] => 1
[en] => 0.8
[en-us] => 0.6
[de-de] => 0.4
[de] => 0.2
)
Good advice, indeed. Way simpler than looking up the geolocation of an IP address.
I am only wondering why they once picked that content negotiation format for HTTP headers like Accept. Referring to your example, how would one parse the header easily to know that en-ca has a quality factor of 0.8? Exploding it on “;” or “,” does not really help.
Oh, wait! Now I see, en-ca does not have a quality factor of 0.8 but of 1 (by default) since a “q=” parameter has been omitted.
For some reason I misunderstood this content negotiation syntax for a while. But reading and re-reading the specs cleared things up: http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html
So, sorry for the confusion. Exploding on “,” is the way to go.
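For what it's worth, the explode-on-comma idea can be sketched roughly like this. It is just an illustration with a made-up helper name (parse_accept_language), and it assumes a reasonably well-formed header:

```php
<?php
// Sketch only: parse_accept_language() is a hypothetical helper name.
function parse_accept_language($header)
{
    $langs = array();
    // Each comma-separated entry is either "tag" or "tag;q=value".
    foreach (explode(',', strtolower($header)) as $entry) {
        $parts = explode(';', $entry);
        $tag = trim($parts[0]);
        if ($tag === '') {
            continue; // skip empty entries such as a trailing comma
        }
        $q = 1.0; // no q parameter means a quality factor of 1
        foreach (array_slice($parts, 1) as $param) {
            $pair = explode('=', $param, 2);
            if (count($pair) === 2 && trim($pair[0]) === 'q') {
                $q = (float) trim($pair[1]);
            }
        }
        $langs[$tag] = $q;
    }
    // Highest quality factor first; the tags are kept as array keys.
    arsort($langs, SORT_NUMERIC);
    return $langs;
}

print_r(parse_accept_language('en-ca,en;q=0.8,en-us;q=0.6,de-de;q=0.4,de;q=0.2'));
```

Splitting on "," gives one entry per language, and splitting each entry on ";" separates the tag from its optional q parameter.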
@Geert - You inspired me to throw together a parse script in PHP that deals with the q factor. Feel free to use and rewrite this as much as you like. (See above.)
Wow, that's exactly the way I feel too! I'm a native English speaker living in China and it drives me nuts when I go to some major websites and am automatically given the Chinese version. I can read a little Chinese, but sometimes it's even hard to find the link to go to the English-language version. It's enough to make me want to stop using the website!
I started using a geotargeting feature in our ad server that picks up the user's language preference and targets ads based on that instead of geolocation. I saw that and thought pretty much like you did. If you speak Spanish and live in Texas, then Spanish ads make more sense than if you speak English but live in Spain.