Opened 14 years ago

Closed 13 years ago

#6515 closed bug (fixed)

language mix due to country settings

Reported by: axeld
Owned by: zooey
Priority: high
Milestone: R1
Component: Kits/Locale Kit
Version: R1/Development
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

I have set "English" as language, and "Deutsch (Deutschland)" in the country tab of Locale (which is obviously misnamed, as I don't choose a country there, but common locale settings).

Anyway, when you do that, you get a language mix between German and English in several places: the locale list in the Country tab, the timezones, the up-time display in AboutSystem, and probably more.

Change History (16)

comment:1 by pulkomandy, 14 years ago

Some of these should be fixed, but not all of them: we use the locale for all date and time formatting. This sets not only the order of the short date format, but also the long date format and all the other formats, including the time interval format used in AboutSystem for the uptime. If we make all of them use English, then I don't see the purpose of having separate settings for language and country.

The time preflet should be fixed already.

That leaves us with only the Locale preflet country tab to fix. Unless you have another solution for the other settings... I'm not sure ICU would like trying to set an en_DE locale or something similar.

comment:2 by axeld, 14 years ago

If I can set it, it should be supported - and why shouldn't I be able to set it?

The locale setting should only set the formats to be used, it shouldn't have anything to do with the language. If I have English as language, and use the German date notation, it should still read "5. March, 1990", and not "5. März, 1990" - the latter would be inconsistent with the language setting, and is just wrong.

If this is an ICU limitation, it is a design issue that should be either worked around, or fixed there.

comment:3 by pulkomandy, 14 years ago

For now the preflet sets the locale to de_DE, which is what you requested in the preflet. It is not possible to set en_DE with the current preflet. That's the first problem. The country tab should only list countries, and the locale should be built from the selected country and the favorite language in the other tab.

The second problem is that I'm not sure an en_DE locale would work at all in ICU. I'll have to test. If it doesn't, we would have to build a special locale by doing the following:

  • Build the de_DE locale.
  • Gather all the format strings from it.
  • Build an en_GB or en_US (or whichever) locale, or an empty locale and then set its language.
  • Force the format strings of the new locale to the ones gathered from the de_DE locale.

This will work for the date/time format, but I'm not even sure there is a format string for the interval format.
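The steps above can be sketched in a few lines. This is only an illustration in Python with small hand-written tables: `PATTERNS`, `MONTH_NAMES` and `format_date` are hypothetical stand-ins, not ICU or Locale Kit API. In a real implementation the pattern and the names would be obtained from ICU.

```python
# Hypothetical sketch: take the date pattern from the formatting locale
# and the month names from the language. The tables are hand-written
# stand-ins for what ICU would provide.
import datetime

# Long date patterns per formatting locale (assumed, simplified).
PATTERNS = {
    "de_DE": "{day}. {month} {year}",
    "en_US": "{month} {day}, {year}",
}

# Month names per language.
MONTH_NAMES = {
    "de": ["Januar", "Februar", "März", "April", "Mai", "Juni", "Juli",
           "August", "September", "Oktober", "November", "Dezember"],
    "en": ["January", "February", "March", "April", "May", "June", "July",
           "August", "September", "October", "November", "December"],
}


def format_date(date, format_locale, language):
    """Format `date` with the pattern of one locale and the names of a language."""
    pattern = PATTERNS[format_locale]
    month = MONTH_NAMES[language][date.month - 1]
    return pattern.format(day=date.day, month=month, year=date.year)


# German pattern, English month name.
print(format_date(datetime.date(1990, 3, 5), "de_DE", "en"))  # prints: 5. March 1990
```

Keeping the pattern and the name table as two separate inputs is what makes the "English language, German format" combination expressible without an en_DE locale.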

comment:4 by axeld, 14 years ago

Ah, thanks for the explanation - so the inconsistency is already in our locale kit, which allows you to set two distinct languages, i.e. it differentiates between languages for catalogs and languages for the rest of the system.

That's a problem that needs some care to be solved correctly, as you also don't want to mix your secondary language with another in a single application. If an app does not provide a catalog in French, but only in your secondary chosen language Spanish, the time/name formatting should be consistent within that application (i.e. in Spanish, not French).

in reply to:  4 ; comment:5 by bonefish, 14 years ago

Replying to axeld:

That's a problem that needs some care to be solved correctly, as you also don't want to mix your secondary language with another in a single application. If an app does not provide a catalog in French, but only in your secondary chosen language Spanish, the time/name formatting should be consistent within that application (i.e. in Spanish, not French).

I don't see why. If you have a preference for time, currency, etc. formatting, why should the unavailability of the catalog for your preferred language make the pain even greater by also forcing another formatting on you? I can't imagine that e.g. Germans preferring German as the interface language would be particularly happy with English dates and times in applications that don't have a German catalog.

In case you're only referring to e.g. the names of months and days of the week, then those should probably be consistent with the language for the application, but that has nothing to do with the formatting.

in reply to:  5 comment:6 by axeld, 14 years ago

Replying to bonefish:

In case you're only referring to e.g. the names of months and days of the week, then those should probably be consistent with the language for the application, but that has nothing to do with the formatting.

You just misunderstood me, that's exactly the message I was trying to bring across - the formatting should be used from what I specified, the names should come from the language the application is actually using.

I always found the "Should I proceed? Ja|Nein" requests in Windows to be unacceptable (a non-localized application was still using the default strings from the globally selected language).

in reply to:  2 ; comment:7 by zooey, 14 years ago

Replying to axeld:

The locale setting should only set the formats to be used, it shouldn't have anything to do with the language. If I have English as language, and use the German date notation, it should still read "5. March, 1990", and not "5. März, 1990" - the latter would be inconsistent with the language setting, and is just wrong.

Bold words - I personally think that if anything is wrong in that context, it's a date like "5. March 1990" - at least it doesn't match any existing locale, and it's awkward to read for native English and German speakers. And it will get weirder if you change the example around and use German as language and English formats - can you guess offhand what string you'll get to indicate "am"/"pm"?

Please do a 'ls -l' with LC_MESSAGES=en_US.utf-8 and LC_TIME=de_DE.utf-8 on Linux or try strftime() and you'll see that it'll use German date format with German month names. So there *are* varying opinions out there :-)
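The strftime() behavior can be tried out with a short script. This is a minimal sketch, assuming a de_DE.UTF-8 locale is installed on the system; the helper name `format_with_lc_time` is made up for illustration. Only LC_TIME is changed, so whatever language the month name comes out in is determined by LC_TIME alone.

```python
# Sketch: format a date under a given LC_TIME setting via strftime().
# The locale name passed in must be installed; otherwise None is returned.
import datetime
import locale


def format_with_lc_time(date, lc_time):
    """Return `date` formatted under LC_TIME=`lc_time`, or None if unavailable."""
    saved = locale.setlocale(locale.LC_TIME)  # query the current setting
    try:
        locale.setlocale(locale.LC_TIME, lc_time)
    except locale.Error:
        return None  # locale not installed on this system
    try:
        # %B is the full month name as defined by LC_TIME.
        return date.strftime("%d. %B %Y")
    finally:
        locale.setlocale(locale.LC_TIME, saved)  # restore the previous setting


print(format_with_lc_time(datetime.date(1990, 3, 5), "de_DE.UTF-8"))
```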

If this is an ICU limitation, it is a design issue that should be either worked around, or fixed there.

No, this is not an ICU limitation, it's just that ICU by default uses a single locale per formatting. But it can be used in slightly more complicated ways to mix'n'match formats and "foreign" language strings.

The wrong language being used for the duration format in AboutSystem has already been fixed on my disk, by using the default language instead of the default locale. That was indeed wrong, since a duration format doesn't really involve any pattern-based formatting, it just deals with language.

comment:8 by zooey, 14 years ago

I forgot: the mixture of German/English names in the timezone page of the Time preflet is due to the regions (the outermost items) not being translated yet. Since ICU doesn't have localized names for all region names as they are used by the timezones, we either have to map timezone regions to region names which have localized versions available in ICU, or do the translation ourselves.

comment:9 by zooey, 14 years ago

Owner: changed from pulkomandy to zooey
Status: new → in-progress

comment:10 by axeld, 14 years ago

LC_TIME=de_DE.utf-8 is wrong in this context; as Adrien correctly noted, it would need to be en_DE.utf-8 (neither of them exists on my Linux installation, though, so I can't test it).

For the Time preferences, I think we should use region names as a fallback if there are no localized timezones (and offer the opportunity to change that).

In general, the month/week day names in an application should be consistent with the language the application uses, anything else doesn't make any sense (I stand by my bold words :-)). I can even manually alter the formatting used, so I don't quite see why there should be a connection forced between formatting, and the names being used.

in reply to:  7 comment:11 by bonefish, 14 years ago

Replying to zooey:

Replying to axeld:

The locale setting should only set the formats to be used, it shouldn't have anything to do with the language. If I have English as language, and use the German date notation, it should still read "5. March, 1990", and not "5. März, 1990" - the latter would be inconsistent with the language setting, and is just wrong.

Bold words - I personally think that if anything is wrong in that context, it's a date like "5. March 1990" - at least it doesn't match any existing locale, and it's awkward to read for native English and German speakers.

As a German speaker using the English language and German date formatting under Linux, not only do I not find this awkward to read, it's also exactly what I expect and get. At least in KDE applications -- unfortunately the command line is not set up correctly.

And it will get weirder if you change the example around and use German as language and English formats - can you guess offhand what string you'll get to indicate "am"/"pm"?

I'm too lazy to check, but I would expect "am"/"pm" as Latin abbreviations are pretty stable across languages. :-)

Please do a 'ls -l' with LC_MESSAGES=en_US.utf-8 and LC_TIME=de_DE.utf-8 on Linux or try strftime() and you'll see that it'll use German date format with German month names. So there *are* varying opinions out there :-)

I do consistently get ISO date/time regardless of LC_TIME -- might be some other setting, but I don't see where that would be. date changes its output -- I couldn't say whether the "Sa" and "Aug" in "Sa 28. Aug 17:44:21 CEST 2010" are German or English, but date +%A prints "Samstag". In fact I even think that is correct, since LC_TIME says "de_DE", which specifies the language as German as spoken in Germany, so the week day and month names should be in German. What would be desired is some "en_US..." where the "..." part would specify/override the formatting. Since POSIX declares the strings implementation-defined and I haven't looked up what our or Linux's implementation (ICU and glibc respectively, I suppose) supports, I have no clue whether that is possible. The Unicode LDML identifiers would theoretically allow that, but key-value pairs for the time formats do not seem to be defined.

Anyway, while it would be nice to have the correct settings also in the Terminal, at least in the GUI we should respect the user's wishes, just as KDE does.

Replying to axeld:

LC_TIME=de_DE.utf-8 is wrong in this context; as Adrien correctly noted, it would need to be en_DE.utf-8 (neither of them exists on my Linux installation, though, so I can't test it).

I believe "en_DE" doesn't make sense. AFAIK these two parts aren't independent from each other, but the region part just refines the former. "en_US"/"en_GB" is English as spoken in the US/in GB, but there simply isn't a region in Germany where English is spoken primarily (US bases don't count ;-)).

comment:12 by pulkomandy, 14 years ago

It is possible to do it with ICU:

  • Set the locale to 'en' (no country info needed)
  • Then set the date, time, … formats to the ones you extract from the de_DE locale (using fICULocale->GetDateFormat() and fICULocale->SetDateFormat()).

Now, how do we design the GUI for that? If we use the language from the language list (I have French; Spanish; English there), I'd want French apps to have French dates, and Spanish apps to have Spanish dates, but with the French format. So the locale should be altered when loading the catalog for an app. That is without thinking about add-ons/shared libs that load their own catalogs and may format their own dates. I see great confusion coming here. Also, if a lib or an add-on formats dates but has no catalogs at all, should it use the language from the app it is loaded into? How would it know which one it is?
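The per-application language choice described here could look roughly like the following sketch. `pick_language` is a hypothetical helper, not Locale Kit API; it simply walks the preferred languages in order and returns the first one the app has a catalog for.

```python
# Sketch: choose the language an application actually runs in, given the
# user's ordered preferences and the catalogs the application ships.
def pick_language(preferred, available):
    """Return the first preferred language with a catalog, else None."""
    for language in preferred:
        if language in available:
            return language
    return None  # no catalog at all: the open question for libs/add-ons


preferred = ["fr", "es", "en"]
print(pick_language(preferred, {"fr", "es"}))  # prints: fr
print(pick_language(preferred, {"es", "en"}))  # prints: es
print(pick_language(preferred, set()))         # prints: None
```

The `None` case is exactly the add-on without catalogs: the helper has nothing to go on, so the language would have to be supplied from outside, for example by the hosting application.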

in reply to:  10 comment:13 by zooey, 14 years ago

Replying to axeld:

LC_TIME=de_DE.utf-8 is wrong in this context; as Adrien correctly noted, it would need to be en_DE.utf-8 (neither of them exists on my Linux installation, though, so I can't test it).

Nope, LC_TIME=de_DE.utf-8 is perfectly fine in this context, you are mistaken. As Ingo has already mentioned - a locale named en_DE would represent the English variant spoken in Germany, *not* English language and German formatting rules.

And the POSIX specification is clear about the meaning of the locale given in LC_TIME: it defines the formats, the strings and the separator characters used when formatting dates and times.

Anyway, to have our cake, and eat it, too, we could add one more checkbox to the 'country' page that switches the strings between the language locale and the formatting locale.

This setting could then be transported to the POSIX layer (and from there into its ICU backend) by adding a corresponding keyword to the POSIX/ICU locale specifier (so LC_TIME would be either 'de_DE@strings=time' or 'de_DE@strings=messages'). If you guys insist on it, we could even go with 'strings=messages' as default :-)

This would even solve pulkomandy's example: With preferred languages 'fr,es,en', a French app would use French month names and format and a Spanish app would use the Spanish strings in French format.

I'm happy with that, since I'd still be able to use LC_TIME=de_DE@strings=time - fair enough.
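How such a specifier might be interpreted can be sketched as follows. This is an illustration only, not the code that ended up in the POSIX layer: `parse_lc_time` and `strings_language` are hypothetical helpers, the ';' separator between keywords is assumed from ICU's locale ID syntax, and 'messages' is assumed as the default.

```python
# Sketch: interpret an LC_TIME value such as 'de_DE@strings=time'.
# The 'strings' keyword decides whether month/day names come from the
# LC_TIME locale itself or from the messages (catalog) language.
def parse_lc_time(spec, default_strings="messages"):
    """Split 'de_DE@strings=time' into ('de_DE', 'time')."""
    locale_id, _, modifier = spec.partition("@")
    strings = default_strings
    for item in modifier.split(";"):  # assumed ';'-separated keywords
        key, _, value = item.partition("=")
        if key == "strings" and value in ("time", "messages"):
            strings = value
    return locale_id, strings


def strings_language(lc_time, lc_messages):
    """Return the language code used for month and day names."""
    locale_id, strings = parse_lc_time(lc_time)
    source = locale_id if strings == "time" else lc_messages
    return source.partition("_")[0]  # 'de_DE' -> 'de', 'es' -> 'es'


# French formats, Spanish app: names follow the app's language.
print(strings_language("fr_FR@strings=messages", "es"))  # prints: es
# German formats with German names, regardless of the messages language.
print(strings_language("de_DE@strings=time", "en_US"))   # prints: de
```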


comment:14 by zooey, 14 years ago

hrev39123 has added support to the Locale preflet for switching month-/day-strings between date-locale and messages-locale. However, the same support is still missing in the POSIX layer.

comment:15 by idefix, 13 years ago

Does hrev42394 also fix the POSIX part of this bug?

comment:16 by zooey, 13 years ago

Resolution: fixed
Status: in-progress → closed

No, hrev42394 didn't fix it, but hrev43781 fixes it.
