Best Practices - Writing DITA with Localization in Mind – Tips for using the “xml:lang” attribute

The “xml:lang” attribute specifies the language of the element content and can be specified on every element in the DITA content model. When no value for “xml:lang” is supplied, processing tools fall back to a default language, typically English.

Tip: Ensure the “xml:lang” attribute is available on the root level

The “xml:lang” attribute must always be present on the root element of a topic, because it identifies the language in which the topic is written. Setting it there ensures that every tool used in the translation process can determine that language without guessing.
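
As a minimal sketch of this tip, a topic with the language declared on its root element might look like the following. The id, the title, and the language code “en-US” are illustrative assumptions, not values taken from a real project.

```xml
<!-- Hypothetical topic: the id, title, and language code are placeholders -->
<topic id="sample_topic" xml:lang="en-US">
  <title>Sample topic</title>
  <body>
    <!-- Child elements inherit the language set on the root element -->
    <p>All content in this topic is written in English.</p>
  </body>
</topic>
```

Because child elements inherit the root value, no further “xml:lang” attributes are needed unless a nested element is written in a different language.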

Tip: Try not to mix different languages in the same topic

Avoid using the “xml:lang” attribute to mix different languages in the same topic. Where possible, use semantic mark-up instead to identify content such as trademarks and file paths. Semantic mark-up refers to elements that describe the nature of the content they contain. Using semantic mark-up instead of the “xml:lang” attribute lets the downstream translation process more readily exclude certain types of information from translation.

Don’t write:

<ph xml:lang="en">Microsoft Windows</ph>

Instead write:

<tm tmtype="reg">Microsoft Windows</tm>

With this approach, translation tools can be configured to leave the content of <tm> elements untranslated.

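In the following example, the file extension “.mtx” is marked up as a file path, so the translation tool can be configured to block the translation of file paths. The English source is followed by its Japanese translation, in which the <filepath> content is unchanged: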

<p>You can browse for Matrix configuration (<filepath>.mtx</filepath>) files.</p>
<p>Matrix 構成ファイル(<filepath>.mtx</filepath>)を参照できます。</p>
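
One possible way to express such a configuration, assuming the translation tool supports the W3C Internationalization Tag Set (ITS) 2.0, is a small rules file like the sketch below. Whether and how a particular tool loads these rules is an assumption; many tools provide their own settings for the same purpose.

```xml
<!-- Sketch of ITS 2.0 global rules; assumes the tool reads ITS rules files -->
<its:rules xmlns:its="http://www.w3.org/2005/11/its" version="2.0">
  <!-- Do not translate trademark content -->
  <its:translateRule selector="//tm" translate="no"/>
  <!-- Do not translate file paths and file extensions -->
  <its:translateRule selector="//filepath" translate="no"/>
</its:rules>
```

Each rule selects elements by name, which works only because the content carries semantic mark-up; the same text wrapped in a generic <ph> element with “xml:lang” could not be singled out this way.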