Abstract
This article explains the issues of software localization, proposes
three concrete approaches and explains the differences between them.
Why Multilingual Architectures?
Multilingual architecture addresses the issues of running a single
software application in several markets or environments at the same
time. Such environments can be characterized by:
- Language
- Country specific dialects or language variants
- Time, date and currency formats
- Character sets
- Address formats (ZIP code formats, telephone number digits,
...)
- Hot key assignments (Ctrl-O opens a file in an English application,
Ctrl-A in a Spanish one)
- Legal differences (privacy regulations in Europe, VAT calculations,
accounting rules, ...)
- Other issues of globalization, such as culture specific elements
that may be perceived differently in other countries.
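As an illustration of the date and number format items in the list
above, the following minimal sketch formats the same values for two
environments. The `LOCALE_FORMATS` table and the function names are
assumptions made for this example; a real application would take such
settings from a locale database instead of hard-coding them.

```python
from datetime import date

# Hypothetical per-locale settings, hard-coded only for illustration.
LOCALE_FORMATS = {
    "en_US": {"date": "%m/%d/%Y", "decimal": ".", "thousands": ","},
    "de_DE": {"date": "%d.%m.%Y", "decimal": ",", "thousands": "."},
}


def format_date(value: date, locale_id: str) -> str:
    """Format a date with the pattern configured for the locale."""
    return value.strftime(LOCALE_FORMATS[locale_id]["date"])


def format_amount(value: float, locale_id: str) -> str:
    """Format a number with two decimals and locale-specific separators."""
    fmt = LOCALE_FORMATS[locale_id]
    # Format with US-style separators first, then swap both in one pass.
    text = f"{value:,.2f}"
    table = str.maketrans({",": fmt["thousands"], ".": fmt["decimal"]})
    return text.translate(table)


if __name__ == "__main__":
    for locale_id in LOCALE_FORMATS:
        print(locale_id,
              format_date(date(2024, 3, 5), locale_id),
              format_amount(1234.5, locale_id))
```

The program code stays identical for both environments; only the
table entry selected at runtime differs.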
Most software applications are created monolingual and only face
the challenge of operating in an international environment later in
their lifecycle (see CFO's view). In
this situation, a company has to learn a lot about foreign markets,
mentalities etc., so the goal is to keep the effort of localizing
the software product to a minimum. Typically, such a company
will contact one of the leaders of the localization industry to
investigate its options.
Localization of Windows Executables
The solution that most "localization companies" will
offer is to pass your .EXE and .DLL files through some special "localization
tools". These tools are able to extract resources from Windows
binaries into a translator friendly format and to compile executable
target binaries with the translated text.
These localization tools are very popular with localization providers
because:
- They are easy to use, even for technically less sophisticated
translators (who represent the vast majority).
- They maintain the context of the translatable
items, allowing the translators to choose the precise expression
and tone for every translatable item.
- They can be integrated with the translation
memories of the translators, to keep the translation consistent
with online help and other documentation.
- They normally deliver running target executables.
It is important to understand that the "translation"
of software is a surprisingly complex task, supported by its own
set of software tools and restricted to a small elite of translation
companies who have the tools, the tech skills and the right people.
Localization tools reduce both the complexity of localization
projects and the skills required of the translators.
Changing to a Multilingual Architecture
Having learned about the basics of localization and the advantages
of localization tools, it may come as a surprise that most large
software companies handle localization in-house, without localization
tools (an approach then frequently called internationalization or
I18N). Doing so has several advantages:
- Working with "localization tools" ties you to a single
provider, who will take advantage of the lock-in effect and charge
you a premium price.
- Extracting the English text from the source files isn't really
that complicated.
- Localization tools do not help you with the internationalization
(I18N) of other software components, such as date and number formats
etc. (see above). You have to adapt your software anyway.
- Keeping the language specific text in separate files allows
you to deliver a single product that can run in several countries,
instead of delivering different binaries for each country.
The last point in particular gains a lot of weight if you have
many countries and short software release cycles or patch intervals.
Imagine if Microsoft had to deliver new binaries every two
weeks for 50 countries for several hundred products...
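A minimal sketch of the "single product, separate text files" idea
follows. The resource contents, keys and the `Application` class are
hypothetical; the JSON strings stand in for separate resource files
that would be shipped next to one and the same binary.

```python
import json

# Stand-ins for separate resource files (e.g. one file per language).
# File contents, keys and texts are hypothetical.
RESOURCE_FILES = {
    "en": '{"menu.open": "Open", "menu.save": "Save"}',
    "es": '{"menu.open": "Abrir", "menu.save": "Guardar"}',
}


def load_catalog(language: str) -> dict:
    """Parse the resource file of one language into a key -> text mapping."""
    return json.loads(RESOURCE_FILES[language])


class Application:
    """The same program code serves every market; only the catalog differs."""

    def __init__(self, language: str) -> None:
        self.catalog = load_catalog(language)

    def label(self, key: str) -> str:
        """Return the display text for a user interface element."""
        return self.catalog[key]


if __name__ == "__main__":
    for language in ("en", "es"):
        print(language, Application(language).label("menu.open"))
```

Adding a country then means adding one resource file, not building
and shipping another binary.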
So sooner or later you will have to implement your own multilingual
architecture. The real question is the best timing. But
before we drill into that, let us look at what happens if a company
starts with a multilingual architecture without being familiar with
internationalization.
Traps and Issues with Multilingual Architectures
Software architects typically make several important errors
when designing multilingual architectures:
- Most prominently, architects don't design the system to provide
context to the translators.
A translator is lost if he or she has to translate, for example,
the English word "Account" for a German ERP software, where there
are over 20 different terms in German, depending on the specific type
and use of the account.
- Duplicated text strings can cause problems if they have to be
translated differently in different pages of a program.
- Software is not translated "once and forever".
New releases and patches require a tight integration with the
same translator during the lifetime of the application.
"Tight" because the translation company needs to be
able to react fast and "same translator" because you
can't have a patchwork of personal styles and expressions throughout
your application.
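The first two traps above (missing context and duplicated strings
that need different translations) can be addressed by making the
context part of the lookup key. The sketch below is one possible way
to do this; the context names and the German terms are illustrative
assumptions, not entries from a real ERP glossary.

```python
# Catalog keyed by (context, source text). Contexts and German terms
# are illustrative assumptions.
GERMAN = {
    ("ledger", "Account"): "Konto",
    ("login", "Account"): "Benutzerkonto",
    ("sales", "Account"): "Kundenkonto",
}


def translate(context: str, text: str, catalog: dict) -> str:
    """Look up a text within its context; return the source text if missing."""
    return catalog.get((context, text), text)


def export_for_translator(catalog_keys) -> list:
    """Produce the lines sent to a translator, with the context attached."""
    return [f"[{context}] {text}" for context, text in sorted(catalog_keys)]


if __name__ == "__main__":
    print(translate("login", "Account", GERMAN))
    print(translate("ledger", "Account", GERMAN))
    for line in export_for_translator(GERMAN):
        print(line)
```

Because the same English word appears once per context, the
translator sees three separate items instead of one ambiguous string.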
Multilingual Architecture Requirements
These issues translate into a number of requirements that a multilingual
architecture has to deal with, beyond the simple separation of source
text from the program code:
- Localization Fallback Handling:
Errors are going to occur even if you are using an automated
translation workflow. Typical cases include missing translations
or missing localized images (any GIF that contains text). In these
cases you want to provide the user with reasonable default values
(such as English text) and you want to inform your translation
workflow about the error as soon as possible.
- Translation Workflow:
Imagine that you have just added one new field to a screen of
your program and that you want to deliver the modified application
to your clients as soon as possible. But before you can do so, the
new field has to be translated by N translators (N being the number
of your target languages), N email replies have to be checked for
completeness and have to be manually integrated into your resource
files. Finally, you have to run N final tests.
So what you need is a workflow application that takes care of
the organizational aspects and that helps you to reduce errors.
Such a workflow application notifies translators when a change has
occurred in the source text and tells you when the last translator
has returned his or her results.
- Translation Cost:
You can be sure that your CFO is going to ask you after your second
software release why the translation costs are so high. This is
because translation is a manual process and good translators charge
daily rates comparable to software developers. So you will have
to add cost saving features to your translation workflow such
as computing the "diff" between the current version
and the last version. Or you have to close a special deal with
your translation agency.
- Translation Context:
Another issue that is going to occur is that your translators
are going to complain about missing "context" for their
translations. Missing context leads to wrong translations and
can have very funny results. The question is whether your clients
also think that it is funny... So you need to provide your translators
with the context of their phrases.
- Terminology Consistency:
Finally, you have to make sure that your translators use the same
words for the same concept throughout your GUI and your documentation.
This issue becomes even more important if you are working with
several translators on the same project or if you are changing
translators. So you have to make sure that your translators use
a Translation Memory and a Terminology Maintenance Application
at home. Unfortunately, these tools are very expensive and they
only work together with Microsoft Word, which unfortunately is
not the file format of your resource files.
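Two of the requirements above, fallback handling and the "diff"
between releases, can be sketched in a few lines. This is a minimal
illustration under stated assumptions: the `Localizer` class, the
`fallback_events` list and `keys_needing_translation` are hypothetical
names, and a real system would forward the events to the translation
workflow instead of keeping them in memory.

```python
from typing import Dict, List, Tuple

Catalog = Dict[str, str]


class Localizer:
    """Looks up texts and records every fallback to the default language."""

    def __init__(self, catalogs: Dict[str, Catalog],
                 default_language: str = "en") -> None:
        self.catalogs = catalogs
        self.default_language = default_language
        # (language, key) pairs to be reported to the translation workflow.
        self.fallback_events: List[Tuple[str, str]] = []

    def text(self, key: str, language: str) -> str:
        catalog = self.catalogs.get(language, {})
        if key in catalog:
            return catalog[key]
        # Missing translation: record the event, then use the default text.
        self.fallback_events.append((language, key))
        default = self.catalogs.get(self.default_language, {})
        # Last resort: show the key itself rather than fail.
        return default.get(key, key)


def keys_needing_translation(previous: Catalog, current: Catalog) -> List[str]:
    """Keys that are new or whose source text changed since the last release."""
    return sorted(k for k, v in current.items() if previous.get(k) != v)


if __name__ == "__main__":
    localizer = Localizer({
        "en": {"save": "Save", "open": "Open"},
        "de": {"save": "Speichern"},
    })
    print(localizer.text("open", "de"), localizer.fallback_events)
    print(keys_needing_translation(
        {"save": "Save", "open": "Open"},
        {"save": "Save", "open": "Open file", "close": "Close"},
    ))
```

Only the keys returned by the diff have to be sent to the N
translators, which is where the cost saving comes from.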
Two MLA Examples
I want to present two extreme examples of multilingual architectures:
The Car Configurator application represents a typical object-oriented
Java architecture. The ]project-open[ Web Application represents a
typical Web Application with a database backend.
| Aspect | Car Configurator | ]project-open[ Web Application |
|---|---|---|
| System Architecture | Java Servlets and Business Objects with local state | Stateless Tcl architecture with central Oracle database |
| Resource File Format | Flat File | Database |
| Translation Lookup | Centralized "Localization Subsystem" cached in memory | Centralized translation lookup using a single database lookup table |
| Translation Workflow | EMail exchange of MS-Word resource files | Online workflow per translation item |
| Fallback strategy | Explicit fallback rules + logging of fallback events for the translation workflow | Return of English text + error events to the translation workflow |
| Templating | No templates | Design templates with separate components for all localizable items |
| Performance | Translation strings cached in RAM because of slow Java-DB access speed | Database access for all localization strings because of fast DB connection |
Both multilingual architectures have proved to work well in practice.
However, the way of maintaining the systems is very different:
- The Car Configurator maintains all translatable items in Microsoft
Word resource files structured as tables. This makes the work
very easy for the translators, who can use the Trados translation
memory and the MultiTerm terminology maintenance component. It also
makes it easy to communicate (by EMail) with translators, who
are normally not very technology savvy. The organizational overhead
is limited because there is only a single resource file per language.
- The Competitiveness.com Marketplace includes its own online
Translation Workflow module that allows translators to work online.
Consistency and error control are automatic. However, translators
complained about missing context and expensive online time.
Translation is now done by in-house translators in the Competitiveness.com
offices.
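The two translation lookup strategies from the comparison (a database
query per string versus strings cached in memory) can be contrasted
in a small sketch. SQLite stands in for the real database here, and
the table layout, column names and function names are assumptions
made for this example; neither system's actual schema is known. For
brevity the cache is filled from the same table rather than from a
flat file.

```python
import sqlite3


def create_store() -> sqlite3.Connection:
    """Create an in-memory lookup table; SQLite stands in for the database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE translations ("
        " message_key TEXT, language TEXT, message TEXT,"
        " PRIMARY KEY (message_key, language))"
    )
    conn.executemany(
        "INSERT INTO translations VALUES (?, ?, ?)",
        [("button.ok", "en", "OK"),
         ("button.cancel", "en", "Cancel"),
         ("button.cancel", "de", "Abbrechen")],
    )
    return conn


def lookup_db(conn, key, language, default_language="en"):
    """One query per string: prefer the requested language, else the default."""
    row = conn.execute(
        "SELECT message FROM translations"
        " WHERE message_key = ? AND language IN (?, ?)"
        " ORDER BY language = ? DESC LIMIT 1",
        (key, language, default_language, language),
    ).fetchone()
    return row[0] if row else key


def load_cache(conn) -> dict:
    """Read the whole table once; later lookups never touch the database."""
    rows = conn.execute(
        "SELECT message_key, language, message FROM translations")
    return {(key, lang): msg for key, lang, msg in rows}


def lookup_cached(cache, key, language, default_language="en"):
    """Dictionary lookup with the same fallback order as lookup_db."""
    return cache.get((key, language), cache.get((key, default_language), key))


if __name__ == "__main__":
    conn = create_store()
    cache = load_cache(conn)
    print(lookup_db(conn, "button.cancel", "de"),
          lookup_cached(cache, "button.ok", "de"))
```

The database variant always sees the latest translations, while the
cached variant trades freshness for speed and has to be reloaded
after each translation update.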
Conclusion
Both multilingual architectures present viable options. However,
I recommend clearly defining the linguistic and organizational
aspects before implementing an MLA.
Please contact us if you
have doubts, questions or comments. Tell me if you want me to put
up your banner ad. Also, I am available as a freelance consultant.