Opened 3 weeks ago

Last modified 2 weeks ago

#19216 new task

Change userguide language code for Chinese

Reported by: humdinger
Owned by: waddlesplash
Priority: normal
Milestone: Unscheduled
Component: Website/Userguide Translator
Version: R1/beta5
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

As there's currently some activity for Chinese translations, I wonder if we can change the language code for Chinese.

Right now we use "zh_CN" for Chinese, but this has some problems:

  • this is a country code, not a language code
  • currently our "zh_CN" user guide is in "Chinese simplified". There is interest in translating to "Chinese traditional" as well

Often "Chinese traditional" seems to use "zh_TW", but as we should move away from country codes, this wouldn't be a preferred solution.

Pootle already uses "zh_HANS" and "zh_HANT". Can the userguide (and Welcome package) do likewise?

Change History (7)

comment:1 by humdinger, 3 weeks ago

I've just had a look at this stackoverflow page. It appears that it's not just a matter of different characters (zh_HANS vs. zh_HANT), but that translations also differ between regions. For example, there could be zh_HANS_HK and zh_HANT_HK for the Chinese variant that's spoken in Hong Kong...

Maybe we need some more expert input from our Chinese users in the forum.

comment:2 by pulkomandy, 2 weeks ago

There isn't really a problem with country codes here.

For example, this is what we do for Portuguese as well: there is pt_PT (for Portugal) and pt_BR (for Brazil). These are two languages named "Portuguese", but they have diverged quite far from each other.

The situation is similar for Chinese, but there is an additional complication due to political history. In the case of Portuguese, it's easy to say that Portugal Portuguese is the "original" one and the Brazilian variant derived from it (it's not entirely true, but it's acceptable). So, pt_PT is in our case simply "pt".

But for Chinese, this is not so simple, because Taiwan says they are the continuation of the original China. This results in various strange things; for example, in the Olympics they somehow ended up participating as "Chinese Taipei".

In the case of language codes, this led to the decision to use "Traditional Chinese" and "Simplified Chinese" (zh_Hant and zh_Hans) as language codes that do not mention any country name. Note that this is purely a language code, and, like any other language code, you can additionally add a country suffix. This can be useful if a country happens to use both variants of the language. Another example of this would be Serbia, which uses both the Latin and Cyrillic alphabets for the same language. We currently have those in Pootle as sr and sr_Latn. It seems this could be the case in Hong Kong for Chinese, but maybe they can just use the generic language code without a country in our case, because the other settings (date format, currency unit, etc.) are configured separately.
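To illustrate the structure described above (language, optional writing system, optional country suffix), here is a minimal Python sketch. The names `LocaleId` and `parse_locale` are made up for this example and are not part of Pootle, ICU, or the userguide tooling. It assumes only the simple case where a four-letter part is a script and a two-letter part is a country.

```python
from typing import NamedTuple, Optional


class LocaleId(NamedTuple):
    """Hypothetical container for the parts of a locale code."""
    language: str
    script: Optional[str]
    region: Optional[str]


def parse_locale(code: str) -> LocaleId:
    """Split a code such as 'zh_Hant_HK' into language, script and region."""
    # Accept both '-' and '_' as separators, since both appear in this ticket.
    parts = code.replace("-", "_").split("_")
    language = parts[0].lower()
    script = None
    region = None
    for part in parts[1:]:
        if len(part) == 4 and part.isalpha():
            script = part.title()   # four letters: writing system, e.g. 'Hant'
        elif len(part) == 2 and part.isalpha():
            region = part.upper()   # two letters: country, e.g. 'HK'
        else:
            raise ValueError(f"unsupported part {part!r} in {code!r}")
    return LocaleId(language, script, region)


print(parse_locale("zh_Hans"))
print(parse_locale("zh_HANT_HK"))
print(parse_locale("sr_Latn"))
print(parse_locale("pt_BR"))
```

With this split, pt_BR carries a country but no script, sr_Latn carries a script but no country, and zh_Hant_HK carries both.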

In the end, it is as usual with localization: trying to fit complicated historical, political, cultural things into a computer isn't easy.

In any case, for example in https://www.localeplanet.com/icu/iso639.html I don't see a possibility for zh-CN. You can do just zh, or zh_Hans/zh_Hant, and only the latter two optionally get a country suffix. I think the generic "zh" may exist only for spoken things, where the writing system doesn't matter, but for text, you have to specify which writing system to use. It seems some people decided to "imply" the writing system based on the country, but as we have seen, this wouldn't work for Serbian and other languages where two writing systems can co-exist in the same country. So, we'd better do the right thing :)
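As a rough sketch of what moving from country-implied codes to explicit script codes could look like, the Python snippet below maps the two country-only codes mentioned in this ticket to their script-based equivalents. The table and the function name `to_script_code` are assumptions for illustration only. Whether these are the right targets is exactly what Chinese users would need to confirm.

```python
# Illustrative table only: the writing system that a country-only code
# is usually taken to imply. Not taken from Pootle or the userguide.
IMPLIED_SCRIPT = {
    "zh_CN": "zh_Hans",  # Simplified Chinese
    "zh_TW": "zh_Hant",  # Traditional Chinese
}


def to_script_code(code: str) -> str:
    """Return the script-based code for a country-only Chinese code.

    Codes that are not in the table are returned with '_' separators
    but otherwise unchanged.
    """
    normalized = code.replace("-", "_")
    return IMPLIED_SCRIPT.get(normalized, normalized)


for legacy in ("zh_CN", "zh-TW", "zh_Hant_HK", "pt_BR"):
    print(legacy, "->", to_script_code(legacy))
```

Codes that already name a script, such as zh_Hant_HK, pass through untouched, so the mapping only affects the cases where the script was implied by the country.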

comment:3 by nephele, 2 weeks ago

Often "Chinese traditional" seems to use "zh_TW", but as we should move away from country codes, this wouldn't be a preferred solution.

I don't get this. Why?

And honestly, I think the whole distinction is a bit ridiculous. Of course we *can* support several variants of Chinese if we want to. Heck, we even support variants of some languages that are spoken way less than any variant of Chinese.

Anyhow, the distinction between simplified and traditional characters is not something you can answer in isolation, because this is a non-phonetic alphabet. Just like you can't group European languages based on "what version of Latin do you use to write?" We don't write in Latin but in a derived alphabet.

Anyway I would do it like this: Have a variant for Chinese mainland and Chinese Taiwan (like the country codes Linux uses) and leave the question of which alphabet to use to the translators. If they wish to do the work to provide this in both, then we should let them provide it in both too. But you can't map these variants to one area or another, because there are more differences on top of that (even for English UK vs English US, for example).

comment:4 by pulkomandy, 2 weeks ago

Anyway I would do it like this

I don't think we need opinions here. We should follow the practices of either ICU, or what other systems do, and not invent our own thing. Especially if we do so without input from the people who actually use the Chinese language. This isn't really for us to decide.

I have provided context on what we did previously. If people from China and Taiwan say it is not the best solution, we should listen to them.

comment:5 by nephele, 2 weeks ago

Using it like we did before is already consistent with Linux. So what would you change then? I don't see what the initial problem of this ticket is.

If some translator for any of these variants would like a different additional translation we can do that, but other than that I don't see what you want done instead.

comment:6 by humdinger, 2 weeks ago

FWIW, the forum thread didn't help me that much so far... :)

I don't see what the initial problem of this ticket is.

I just noticed that we use zh-HANS and zh-HANT at Pootle for the interface translation and zh-CN for the userguide. I was wondering why the two translations don't use the same codes and, if they should, which one would be the correct one.

comment:7 by MichaelPeppers, 2 weeks ago (in reply to: 6)

Replying to humdinger:

FWIW, the forum thread didn't help me that much so far... :)

Then please speak up about it in the forum and tell us what isn't clear about the stuff said there. Chinese users, as well as people like me who studied somewhat related writing systems, might be so used to dealing with the characters that we gloss over stuff that's very obvious to us.

Replying to nephele:

Using it like we did before is already consistent with Linux.

I'm sorry, but the Chinese users in the forums, the ones who actually use those locales, clearly disagree, otherwise they wouldn't be saying Nano is not being localized.

Version 1, edited 2 weeks ago by MichaelPeppers