Are there distinct objects for writing systems in CultureInfo?

Hi all -

This is more of a question about C# in general as it relates to languages, but it bears on a translation engine plugin I'm writing for Trados.

Some languages can be written in more than one writing system. For example, Serbian can be written in either the Latin or the Cyrillic alphabet. As far as I can tell, CultureInfo handles this in the language pairs with a neutral CultureInfo object for Serbian Cyrillic named sr-Cyrl, but there doesn't seem to be a way to extract the fact that the writing system is Cyrillic (the same Cyrillic that Russian uses) except by parsing the name. I've looked at the documentation for the TextInfo member, and I suspect I could torture one of the code page fields into coughing up this information, but I'm wondering whether there's a standard way to do it.

Thanks in advance.

  • Answering my own question: the CultureInfo names conform to IETF RFC 4646, and the script codes, when they appear, conform to ISO 15924. A CultureInfo name combines the two-letter ISO 639 language code (or a three-letter code if the language never had a two-letter one) with, optionally, the script and the region. Virtually all the non-neutral (leaf) CultureInfo objects specify the region, but only some of them specify the script, and for some languages that have multiple scripts available the CultureInfo inventory does not distinguish between those scripts. Trados would have to add new CultureInfo objects to handle those cases. The script is not exposed in a field; it has to be extracted from the CultureInfo name. Trados has added some custom CultureInfo objects (they have -x- in their names, as RFC 4646 provides for), but none of them are for languages with multiple scripts, so it's impossible to know what Trados would do if confronted with one of these cases.
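
    For anyone who needs it, here is a rough sketch of pulling the script out of the name. It assumes the name follows the RFC 4646 layout, where the script is the only subtag made of exactly four letters and anything after a single-character subtag such as -x- is private use or an extension. ScriptSubtag and FromCultureName are names I made up for illustration; they are not part of .NET or Trados.

    ```csharp
    using System;
    using System.Globalization;

    public static class ScriptSubtag
    {
        // Hypothetical helper (not a .NET API): returns the ISO 15924 script
        // subtag found in a culture name such as "sr-Cyrl-RS", or null if
        // the name carries no script subtag.
        public static string FromCultureName(string cultureName)
        {
            if (string.IsNullOrEmpty(cultureName)) return null;

            string[] subtags = cultureName.Split('-');

            // Index 0 is the primary language subtag, so start at 1.
            for (int i = 1; i < subtags.Length; i++)
            {
                string subtag = subtags[i];

                // A single-character subtag ("x" or an extension singleton)
                // starts a private-use or extension sequence: stop looking.
                if (subtag.Length == 1) break;

                if (subtag.Length == 4 && IsAsciiLetters(subtag))
                {
                    // Normalize to the conventional title case, e.g. "Cyrl".
                    return char.ToUpperInvariant(subtag[0])
                        + subtag.Substring(1).ToLowerInvariant();
                }
            }
            return null;
        }

        private static bool IsAsciiLetters(string s)
        {
            foreach (char c in s)
            {
                bool isLetter = (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
                if (!isLetter) return false;
            }
            return true;
        }

        public static void Main()
        {
            Console.WriteLine(FromCultureName("sr-Cyrl-RS") ?? "(none)");  // Cyrl
            Console.WriteLine(FromCultureName("ru-RU") ?? "(none)");       // (none)
            Console.WriteLine(FromCultureName("en-x-custom") ?? "(none)"); // (none)

            // With a real CultureInfo object; this assumes the culture is
            // available on the machine running the code.
            CultureInfo culture = CultureInfo.GetCultureInfo("sr-Cyrl-RS");
            Console.WriteLine(FromCultureName(culture.Name) ?? "(none)");
        }
    }
    ```

    Note that this only reports a script when the name spells one out. For a culture like ru-RU it returns null even though Russian is written in Cyrillic, so it does not solve the case where the CultureInfo inventory leaves the script implicit.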

    References:

    docs.microsoft.com/.../system.globalization.cultureinfo
    docs.microsoft.com/.../a9eac961-e77d-41a6-90a5-ce1a8b0cdb9c
    www.unicode.org/.../languages_and_scripts.html
    www.ietf.org/.../rfc4646.txt