Recent versions of Django moved the zh-tw and zh-cn locales to zh-hant and zh-hans. There is some code for backwards compatibility, but that will go away and we'll need to handle that. I had started doing that, but it requires more thought, so I reverted that commit and moved the discussion here. One of the issues is that if we change the locales like this, we need to add redirects from the old locales and migrate data in translated fields.
This blocks See comment. It's converting one thing to something different - although it's true that zh-CN has typically been misused to mean Simplified Chinese anyway. It looks like zh-CN was indeed considered to be Simplified Chinese and that's why they changed it.
So zh-CN, if not resolved directly, should be resolved as zh-Hans. But the scripts used are common within the sets. Re-reading the Django code, I don't think it's a blocker for us. When Django removes the backwards-compatibility code, we'll be in trouble, but at the moment it will still work and produce a deprecation warning.
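One way to prepare for the removal, independent of Django's own compatibility code, is a small lookup that maps the old codes to the new ones before they reach URL routing or translated-field lookups. The following is only a minimal sketch: `LEGACY_LOCALES`, `normalize_locale` and `redirect_target` are hypothetical names, not part of Django, and the URL layout with a leading locale segment is an assumption.

```python
from typing import Optional

# Hypothetical mapping from the old region-based codes to the script-based ones.
LEGACY_LOCALES = {
    "zh-cn": "zh-hans",
    "zh-tw": "zh-hant",
}


def normalize_locale(code: str) -> str:
    """Return the current locale code for a possibly legacy one."""
    key = code.strip().lower().replace("_", "-")
    return LEGACY_LOCALES.get(key, key)


def redirect_target(path: str) -> Optional[str]:
    """Return the rewritten path if it starts with a legacy locale, else None.

    Assumes URLs of the form /<locale>/<rest>.
    """
    head, _, rest = path.lstrip("/").partition("/")
    new_code = LEGACY_LOCALES.get(head.lower())
    if new_code is None:
        return None
    return "/" + new_code + "/" + rest
```

The same mapping could drive the data migration for translated fields, so that the redirect and the migration cannot drift apart.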
Critical to the acceptance of the position of the script subtag was the inclusion of information in the registry to make clear the need to avoid script subtags except where they add useful distinguishing information. Thus, the registry entry for the language subtag "en" (English) has a field called "Suppress-Script" indicating that the script subtag "Latn" should be avoided with that language, since virtually all English documents use the Latin script.
Note that this doesn't mean that "en-Latn" tags will never be used. There are cases where the script will provide information that distinguishes content. For example, a document that contains both Latin script and Braille might need to distinguish the two forms. However, these are unusual cases and the exception will be sensible and even obvious in those cases.
In any case, for virtually any content that does not use a script subtag today, it remains the best practice not to use one in the future. Languages that do use more than one script or are undergoing a script transition - such as those listed above - can and should benefit from identifying content using script subtags. Just over a year from its registration, a quick look at a search engine shows over pages in Simplified Chinese mentioning the tag "zh-Hans" alone.
The generative syntax will greatly assist the use and acceptance of script subtags for languages that need them. The registry is a text file in a special, machine-readable, format called "record-jar". Each subtag has its own record, consisting of several lines of text, which identifies the subtags, their use, and some information useful in selecting which subtags are right for specific circumstances.
Each record contains the subtag itself, its type ("language", in this case), a description (or set of descriptions), and the date that the record was added to the registry. All of the initial records have the date "" as shown above. Additional information is sometimes available. For example, in the record for the Czech language (cs) above, you'll notice a field called "Suppress-Script".
This field indicates that most texts in Czech are written in the Latin script and that the "Latn" script code is inappropriate for most language tags identifying content in Czech. That is, a tag like "cs-CZ" is recommended, while a tag such as "cs-Latn-CZ" is strongly discouraged.
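To make the record layout concrete, here is a rough sketch of reading such records and looking up a language's "Suppress-Script" value. It assumes the record-jar conventions as I understand them: records are separated by a line containing `%%`, each field is `Name: value`, and a line starting with whitespace continues the previous field. The sample text is trimmed and illustrative (the added-date field is left out), and `parse_record_jar` and `suppressed_script` are hypothetical helper names.

```python
from typing import Dict, List, Optional

# Trimmed, illustrative sample; not a copy of the live registry.
SAMPLE = """\
Type: language
Subtag: cs
Description: Czech
Suppress-Script: Latn
%%
Type: language
Subtag: zh
Description: Chinese
"""

Record = Dict[str, List[str]]


def parse_record_jar(text: str) -> List[Record]:
    """Split record-jar text into records mapping field name -> list of values."""
    records: List[Record] = []
    current: Record = {}
    last_field: Optional[str] = None
    for line in text.splitlines():
        if line.strip() == "%%":
            # Record separator: store what we have and start a new record.
            if current:
                records.append(current)
            current, last_field = {}, None
        elif line[:1].isspace() and last_field is not None:
            # Folded line: append to the most recent value of the last field.
            current[last_field][-1] += " " + line.strip()
        elif ":" in line:
            name, _, body = line.partition(":")
            last_field = name.strip()
            current.setdefault(last_field, []).append(body.strip())
    if current:
        records.append(current)
    return records


def suppressed_script(records: List[Record], language: str) -> Optional[str]:
    """Return the Suppress-Script value for a language subtag, if any."""
    for record in records:
        if record.get("Type") == ["language"] and record.get("Subtag") == [language]:
            values = record.get("Suppress-Script")
            return values[0] if values else None
    return None
```

With the sample above, a tag builder could use `suppressed_script(records, "cs")` to decide that "cs-CZ" should be preferred over "cs-Latn-CZ".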
Other fields that can appear include a "Deprecated" field that shows a date on which a particular code was deprecated. This almost always appears with another field called "Preferred-Value", which indicates a more appropriate subtag to use for that value. For example, the code "TP" was deprecated by ISO when that country changed its administration and name. The registration process can still be used to add information to or update information about specific records, as well as adding entire new subtags.
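As an illustration of how a "Preferred-Value" field might be used, the sketch below replaces a deprecated subtag with its preferred one. The records are hand-written stand-ins, not parsed registry data: the "Deprecated" date is omitted, the preferred value "TL" for "TP" is my reading of the published registry, and `preferred_subtag` is a hypothetical helper.

```python
from typing import Dict, List

# Hand-written stand-in records; the Deprecated date is deliberately omitted.
SAMPLE_RECORDS: List[Dict[str, str]] = [
    {"Type": "region", "Subtag": "TP", "Preferred-Value": "TL"},
    {"Type": "region", "Subtag": "TL"},
]


def preferred_subtag(records: List[Dict[str, str]], subtag_type: str, subtag: str) -> str:
    """Return the Preferred-Value for a subtag, or the subtag itself if none applies."""
    wanted = subtag.lower()
    for record in records:
        if record["Type"] == subtag_type and record["Subtag"].lower() == wanted:
            return record.get("Preferred-Value", record["Subtag"])
    # Unknown subtags are passed through unchanged rather than rejected.
    return subtag
```

Passing unknown subtags through is a design choice for this sketch; a stricter validator would reject them instead.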
Records cannot be removed and there are rules to prevent the meaning of a subtag from being "mutated" to mean something completely different. The file itself contains a "File-Date" record, showing the last time the registry was updated. Combined with the various date fields in the records themselves, it is possible to validate any particular tag or its subtags for any given date, past or present.
RFC bis actually consists of three parts. First, there is the document that describes the syntax of language tags and the registry, as well as how language tags are maintained and so forth.
This document is an Internet-Draft called "draft-ietf-ltru-registry" and is about 62 pages long. This document was edited and maintained by Doug Ewell. The last piece of the puzzle is an Internet-Draft on matching of language tags.
This document was being worked on at the time this was written and its current name is "draft-ietf-ltru-matching". The IETF website hosts all of these documents, or you can find the latest versions of them all listed on my personal website and on the W3C site.
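For orientation, the simplest matching scheme, prefix matching, can be sketched in a few lines. This is an illustrative reading of the idea, not the algorithm from the matching draft: a range matches a tag when the range's subtags are a case-insensitive prefix of the tag's subtags. The name `prefix_match` is hypothetical.

```python
def prefix_match(language_range: str, tag: str) -> bool:
    """Return True if `language_range` is a subtag-wise prefix of `tag`."""
    range_parts = language_range.lower().split("-")
    tag_parts = tag.lower().split("-")
    # Compare whole subtags so that "zh" does not match "zho".
    return tag_parts[: len(range_parts)] == range_parts
```

Under this scheme `prefix_match("zh-Hans", "zh-Hans-CN")` holds, while `prefix_match("zh-Hans", "zh-CN")` does not, which is why region-only tags and script tags do not find each other without extra mapping.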
Matching, as noted earlier, is fairly well understood in its simplest, "prefix matching" form, which is described above in the section on scripts. However, there are some intriguing applications for RFC bis style tags in matching, as well as some well-known matching schemes that were not well documented in the earlier RFC. This work is, at the time of writing, awaiting completion of the Last Call process.
Part 1 covers the registration of two-letter codes. ISO is an international standard for language codes.
In defining its language codes, it defines some of them as macrolanguages. Now I need to write a piece of code to verify that I receive a valid language code. But since what I receive is a mix of two-letter language codes and macrolanguage codes, what standard am I supposed to stick with?
Do these codes belong to some sort of mixed (perhaps common) standard? Following RFC at page 4, a language tag can be written in the following form: [language]-[script].
BCP 47 recommends that language codes be written in lower case and that script codes be written "lowercase with the initial letter capitalized", but this is basically for readability.
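A shape check for the [language]-[script] form, with the recommended casing applied, might look like the sketch below. It assumes a language subtag of two or three letters and an optional four-letter script subtag, and it checks form only; it does not confirm that the subtags are actually registered. `normalize_tag` is a hypothetical helper name.

```python
import re
from typing import Optional

# Two- or three-letter language subtag, optionally followed by a four-letter script.
_TAG_RE = re.compile(r"([A-Za-z]{2,3})(?:-([A-Za-z]{4}))?")


def normalize_tag(tag: str) -> Optional[str]:
    """Return the tag in recommended casing, or None if it is not language[-script]."""
    match = _TAG_RE.fullmatch(tag)
    if match is None:
        return None
    language = match.group(1).lower()
    script = match.group(2)
    if script is None:
        return language
    # Script subtags are written with an initial capital, e.g. "Hans".
    return language + "-" + script.capitalize()
```

Because casing is only a readability convention, comparing tags after normalization avoids treating "zh-hans" and "zh-Hans" as different values.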
User:Stevenliuyi find this table. I agree to use the more specific tags, because it's clearer for people to see the relation and difference between, say, "zh-hans" and "zh-hans-cn". Currently people are likely to be confused about the relation between "zh-hans" and "zh-cn".
Besides, according to the current BCP 47 RFC actually "zh-hans-cn" should be "cmn-hans-cn" or "zh-cmn-hans-cn" since it's only used for Mandarin Chinese, but nonetheless I think we can keep the usual, customary way to use "zh" instead of "cmn".
Could somebody enlighten me, please? I think it should be the "user preferences". Other components use these tags because they are in the "user preferences". In [[Special:WantedCategories]] on it.
Does it mean that the "old" categories should be moved to the new ones?