
Open-Source Software and Localization

An introduction to OSS and its impact on the language industry.
Published March 2005 in Multilingual Computing & Technology.
By Frank Bergmann

Open-source software (OSS) is already part of mainstream information technology. Most medium and large companies in the world already use it in one way or another. Apart from being cheaper, OSS is considered to be more secure and more flexible than its commercial counterparts. Corporate customers love the independence from a particular software vendor and the possibility to customize the software to the company’s needs, making it difficult for closed-source providers to compete with OSS.

However, OSS has only recently become a candidate for “the next big thing” in the IT industry: the driver of a major wave of change that might radically alter market forces, comparable only to the introduction of the PC or the Internet. This time, however, the revolution is not so much about technology as about the business models of IT companies. This article explores some of these potential changes and how they might affect localization customers and providers.

The Rise of Open-Source Software

Before discussing the impact of OSS on the software localization process, we need to understand how OSS grew from its roots to conquer the corporate world. OSS was “born” in the 1960s and 1970s in the university and research environment [1]. Researchers started to use computer programs for their activities and, working in a non-competitive environment, began to share the resulting programs among themselves just as they did with their research findings. These groups of collaborating software developers are known today as “open-source developer communities”.

However, these early open-source developers wanted to make sure that they received the fame and reputation due to them as the authors of the code, similar to the scientific system of quoting research publications. So the “GNU General Public License” (GPL) emerged [2], implementing the scientific citation rule in the domain of intellectual property rights. The GPL states that everybody can use, modify and redistribute “GPLed” software, provided that the initial authorship information is maintained. However, modifications and additions to GPLed software are GPLed again, creating what is known today as a “viral effect”. The GPL “infects” other code when combined with it, so the body of OSS has kept growing ever since.

OSS Leaves the Academic Niche

A major breakthrough for OSS came with the advent of the dot.com boom. The Internet initially developed in research institutions, and most of it is based on OSS. Also, the first industry-strength versions of Linux appeared during this time, creating an ideal environment for the young entrepreneurs. So it is no surprise that many startups during the dot.com boom used the readily available OSS as a base for their business. Google, eBay, Yahoo and Amazon are all still using this infrastructure.

Another breakthrough came with the need of these first OSS companies to support and maintain their software. So they started to outsource these services to other companies, effectively creating a market for the first Linux distribution companies such as RedHat and SuSE. The business model of these companies is based on selling professional services around the free OSS product.

The support work of these companies contributed to the quality of the OSS, lifting it into the same quality dimension as its closed-source competitors. And the availability of professional services made OSS an attractive choice for companies of all sizes who had to slash costs after the dot.com bust.

Finally, another important wave of change is just starting: OSS-based companies have started to offer “mixed-source” [3] software, extending OSS with proprietary functionality. These companies use OSS merely as a base, while providing the same service level to their customers as their closed-source competitors. As a result, the marketing muscles of these companies now push OSS. The most famous examples in this field are IBM and Novell with their Linux strategy and Sun Microsystems with its StarOffice/OpenOffice and Java Desktop products.

The “Pure OSS” L10N Market

But what will the l10n market created by these new players look like? To answer this question, we differentiate between “pure” OSS and “mixed-source” software.

Looking at the l10n needs of “pure” open-source developer communities, we may find that these communities are not very attractive customers, because they do not earn any revenues from their software products. Instead, they have to rely on volunteers from within the OS community in the same way as they rely on volunteers for software development. The quality of these translations is in general not as high as in closed-source software. However, this situation actually stimulates unhappy users to participate in the OS project and to contribute an improved translation.

However, there are some notable exceptions to this system, namely when OSS customers are willing to pay for a professional l10n. In particular, this is the case in the public sector where government agencies around the world seem to favor OSS over proprietary software [4,5]. There are bodies in the European Union facilitating these efforts [6], so we may expect an increasing standardization in the products being employed and a need for professional l10n.

The Mixed-Source L10N Market

The situation is more promising in the realm of mixed-source companies, which combine OSS with proprietary software in order to deliver a professional product to the market. These companies need to provide high-quality l10ns and have a budget and an organization in place to provide this service. For instance, Melissa Biggs from Sun Microsystems’ Globalization Engineering Group reported to us in a telephone interview that the “l10n processes for OpenOffice are basically the same as for other Sun products”.

However, mixed-source companies can also rely on the l10n volunteers from the OS community, depending on quality and completeness requirements and the available budget. The Sun G11N Engineering Group, for instance, has started a “Pilot Process” to “improve communication” between the Sun g11n group and the OS community [7].

OSS L10N Technology

We now turn our focus to the technical resources and skills that a l10n company needs in order to enter the OSS l10n market. To shed some light on this area, we present below the l10n architectures of three very different OSS applications: Linux is an operating system, OpenOffice is a desktop application similar to Microsoft Office and ]project-open[ is a web-based application.

The three systems also differ considerably in how their l10n is organized. Linux is “pure” OSS and is localized by community volunteers. OpenOffice l10n management is split by language (Sun manages 10 languages, the OS community the rest). ]project-open[ l10n is split by application module.

Common to all three systems is that their l10n processes are considerably different from the ones used for standard Windows applications. Every system comes with its own set of l10n tools and philosophy, requiring a considerable learning effort from a potential l10n provider.

Frank Bergmann is a l10n consultant and founder of ]project-open[. He can be reached at frank.bergmann@project_dash_open.com

Conclusion

OSS l10n is probably not an interesting mainstream l10n market yet, and pure OSS will probably never be. However, the overall share of OSS is growing fast and mixed-source l10n will become an interesting market in the near future.

Companies that are determined to enter this market will need considerable in-house technology resources. Getting involved in a particular OSS project may be a good way to start investigating the new terrain.

Case Studies

Below we present three different open-source software packages and compare the technical and linguistic aspects of their localization.

"Tux", the Linux Penguin. Linux is probably the most well known open-source product.

Linux Localization

Linux [8] is probably the most well-known open-source product. Linux servers represented 15.6% of the overall server market in 2003, with growth rates of 40% annually (IDC). Linux is currently localized into some 73 languages.

The Linux l10n software architecture is based on the GNU “gettext” tool suite [9], together with a range of gettext-compatible translator’s tools such as KBabel [10], PO-Edit, GTranslator and others. Gettext identifies translatable strings in the Linux source code and extracts them into a format suitable for KBabel and the other l10n tools. This l10n architecture is shared by the majority of open-source projects, forming the de facto standard in open-source l10n.
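
The extraction step described above can be illustrated with a small sketch. It is not the gettext tool suite itself: it assumes that translatable strings are marked in the source code as `_("...")`, and it writes them out as entries with an empty translation in the `msgid`/`msgstr` style of gettext catalogs. The function names and the sample source are hypothetical, and escapes, plural forms and translator comments are ignored.

```python
import re

# Matches calls of the form _("...") with a plain double-quoted literal.
# Simplified assumption: no escaped quotes, plurals or comments are handled.
MARKED_STRING = re.compile(r'_\("([^"\\]*)"\)')


def extract_messages(source: str) -> list[str]:
    """Return the unique marked strings in order of first appearance."""
    seen = []
    for match in MARKED_STRING.finditer(source):
        text = match.group(1)
        if text not in seen:
            seen.append(text)
    return seen


def to_po_template(messages: list[str]) -> str:
    """Render the messages as catalog entries with empty translations."""
    entries = [f'msgid "{m}"\nmsgstr ""' for m in messages]
    return "\n\n".join(entries) + "\n"


# Hypothetical source code with marked strings (one of them repeated).
SAMPLE_SOURCE = '''
print(_("File not found"))
print(_("Permission denied"))
print(_("File not found"))
'''

if __name__ == "__main__":
    print(to_po_template(extract_messages(SAMPLE_SOURCE)))
```

A translator's tool would then load such a catalog and let the translator fill in each empty `msgstr` line.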

The quality requirements for the Linux operating system and server software in general are comparatively low, because most Linux users are system administrators with a high level of English. Also, users of open-source software typically don’t expect a very high level of translation quality and completeness.

The l10n “market” of gettext is organized as groups of volunteers from the target language countries. Most of these volunteers are university students who are using the software for their own purposes.

OpenOffice Localization

OpenOffice [11] is an open-source office suite similar to Microsoft Office, including applications such as a word processor, spreadsheet, presentations and drawing. OpenOffice has been localized into 25 languages and has been downloaded by more than 16 million users. OpenOffice is an open-source variant of Sun Microsystems’ StarOffice product and is localized under the organizational umbrella of Sun.

The OpenOffice l10n architecture is similar to the GNU gettext architecture explained above. A specific localization tool called “localize.pl” [12] is used to extract translatable strings from the source code. This list can be converted into the gettext format suitable for KBabel or into a format suitable for Trados and other translation memories.

The l10n quality requirements for OpenOffice depend on the language. OpenOffice inherits the professional l10n of the 10 languages under the responsibility of Sun’s G11N Engineering Group (FIGS, Swedish, Brazilian Portuguese, Japanese, Korean, Simplified and Traditional Chinese) [7]. Several open-source groups consisting of volunteers handle the translation of the remaining languages.

OpenOffice is currently developing a “Localization Pilot Process” [7] to involve the open-source community in the l10n process, probably with the goal of cutting costs. If successful, this process will reduce the need for professional l10n outsourcing.
]project-translation[ [13] is a web-based project management and workflow system specifically designed for translation and localization companies.

]project-translation[ Localization

]project-translation[ [13] is a web-based project management and workflow system specifically designed for translation and localization companies. ]project-translation[ is “mixed source” software because most of its modules are open-source, while a company provides professional services and extension modules.

Being a typical web-based application, ]project-translation[ can rely on a relational database to store its localization strings. This organization allows ]project-translation[ to provide several l10n tools via a web interface. In particular, it supports a “translation mode” (see screenshots) that allows for online translations within the application context, similar to the Catalyst and Passolo resource editors.

The quality requirements for such a mixed-source web application are in line with industry standards.
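
To make the idea of database-backed localization strings concrete, here is a minimal sketch using SQLite. The table name `lang_messages`, its columns, the message keys and the fallback to a default locale are all assumptions made for illustration; they are not taken from the actual ]project-translation[ data model.

```python
import sqlite3


def open_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog with one row per (key, locale) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE lang_messages ("
        " message_key TEXT NOT NULL,"
        " locale TEXT NOT NULL,"
        " message TEXT NOT NULL,"
        " PRIMARY KEY (message_key, locale))"
    )
    return conn


def set_message(conn, key, locale, message):
    """Insert or overwrite a translation, as an online editor might do."""
    conn.execute(
        "INSERT OR REPLACE INTO lang_messages VALUES (?, ?, ?)",
        (key, locale, message),
    )


def lookup(conn, key, locale, default_locale="en_US"):
    """Return the translation, falling back to the default locale, then the key."""
    for candidate in (locale, default_locale):
        row = conn.execute(
            "SELECT message FROM lang_messages"
            " WHERE message_key = ? AND locale = ?",
            (key, candidate),
        ).fetchone()
        if row is not None:
            return row[0]
    return key


if __name__ == "__main__":
    conn = open_catalog()
    set_message(conn, "project.name", "en_US", "Project name")
    set_message(conn, "project.name", "de_DE", "Projektname")
    print(lookup(conn, "project.name", "de_DE"))
    print(lookup(conn, "project.name", "fr_FR"))
```

Because every string is an ordinary database row, a web page can look up each label at render time and a translator can update it through a form, which is what makes an in-context translation mode feasible.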

Members of the open-source community are currently carrying out most of the translation work of the OS modules. The l10n of the closed-source modules is outsourced to professional translators.

Screenshots


The KBabel main translation screen.


The KBabel Catalog Screen allows keeping up with translation in large projects.


KBabel directory – basic terminology maintenance


An example screen from ]project-translation[.


The same screen again, but in translation mode.
Green dots appear behind all translatable strings, allowing the translator to
work in the linguistic context of the application.


The ]project-open[ translation screen from the example above.


The ]project-open[ catalog screen, showing the list of all translations in a specific module.

References & Resources

[1] “A Brief History of Free/Open Source Software Movement”
http://www.openknowledge.org/writing/open-source/scb/brief-open-source-history.html

[2] The GNU General Public License
http://www.gnu.org/copyleft/gpl.html

[3] “Open, closed: Novell's 'mixed source' software”
http://star-techcentral.com/tech/story.asp?file=/2004/9/10/technology/8872977&sec=technology

[4] “Governments Mull Open-Source”
http://www.businessempowered.com/issues/2004/03/en/dept_shortcuts.shtml#opensource

[5] Open-Source and Government: The “FLOSS” Final Report
http://www.infonomics.nl/FLOSS/report/

[6] European Commission IDA Open Source Observatory
http://europa.eu.int/ida/en/chapter/452

[7] “OpenOffice Localization Pilot Process”
http://l10n.openoffice.org/localization/L10n_pilotprocess.html

[8] Linux Homepage
http://www.linux.org/

[9] The Gettext Localization Suite
http://www.gnu.org/software/gettext/manual/html_mono/gettext.html

[10] KBabel L10N Tool
http://i18n.kde.org/tools/kbabel/

[11] OpenOffice Homepage
http://www.openoffice.org/

[12] OpenOffice L10N Framework (“localize.pl”)
http://l10n.openoffice.org/L10N_Framework/iso_code_build2.html

[13] ]project-open[ Homepage
http://www.project-open.com/

 
 