Let a Thousand Languages Bloom!

Popular Indian Linux magazine “Linux For You” has included this article on Transifex in its latest issue on Localization in open source projects. The article gives an introduction to the challenges of open source localization and showcases Transifex as a way to improve productivity. It also includes more technical details and information on how to become a contributor.


For most open source projects,
translations are pretty
important. Projects that are
used by desktop users, such as
desktop environments, GUI applications,
and distributions, most frequently ship
localised user interfaces, documentation,
websites and other types of resources.

Take Fedora, for example — one of
the most popular Linux distributions out
there. Around 60 per cent of its users use
a localised desktop, and the percentage
is probably higher with other major
desktop environments. In the case of
Fedora content, this translates to
something like 3-5 million users. With
contributions having such a large audience
and impact, it’s no surprise that the
open source translation community is
very active, and most major open source
projects enjoy an active community
devoted to translating the project into
various languages.

Challenges in FOSS localisation

Typically, software developers use an
internationalisation platform like gettext,
which parses the source code and extracts
the translatable strings from the code into
special PO files. These files are handed
to translators, who translate them into a
target language using a variety of tools.
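
To make the extraction step concrete, here is a minimal sketch in
Python. It is not how gettext itself is implemented (real projects
would normally run a tool such as xgettext); it only illustrates the
idea of scanning source code for strings wrapped in _() and writing
them out as template entries with empty translations. The function
names and the sample source are our own, chosen for illustration.

```python
import ast


def extract_strings(source):
    """Return the literals passed to _() calls, de-duplicated, in source order."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Name)
            and node.func.id == "_"
            and node.args
            and isinstance(node.args[0], ast.Constant)
            and isinstance(node.args[0].value, str)
        ):
            hits.append((node.lineno, node.col_offset, node.args[0].value))
    ordered = []
    for _line, _col, text in sorted(hits):
        if text not in ordered:
            ordered.append(text)
    return ordered


def to_pot(strings):
    """Render the strings as PO-style template entries with empty translations."""
    entries = []
    for text in strings:
        escaped = (
            text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
        )
        entries.append('msgid "%s"\nmsgstr ""\n' % escaped)
    return "\n".join(entries)


# Hypothetical application code; it is only parsed, never executed.
SAMPLE = '''
print(_("Install package"))
print(_("Remove package"))
print(_("Install package"))
'''

if __name__ == "__main__":
    print(to_pot(extract_strings(SAMPLE)))
```

Each msgid is the original string and each msgstr is the slot a
translator fills in for the target language.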

The challenge for most projects lies in
receiving those translation files back in
their version control system (VCS).
Giving access to your VCS to a
few developers is usually okay, but
having to administer accounts
for hundreds of translators could
be a challenge. To avoid that,
some developers even decide to
only accept translations through bug
reports or e-mail attachments.
But a product under active development
usually means that strings change
often, and with each release,
translators will send a new batch of
translations in. That’s a lot of bug
reports and e-mails.

Larger projects usually have the
advantage of developing their own
translation community. Even then,
however, some developers feel
more productive using a different type
of VCS, and others host
their project on external servers. These
approaches lead either to low
productivity, or to a small number of
translators and lower quality.

Finding a solution

Transifex has been developed as
a solution to these issues, and to
make translations dead-simple both
for developers and translators. The
goal is for Transifex to act
as a translation proxy and handle
the mechanical processes for both
these groups of users, allowing
them to work more efficiently and
effectively.

Developers give Transifex
access to their source repository.
The Transifex “robot” can log in to
a number of different versioning
systems and grab the translation
files for the translators. The latter
log in to a unified, easy-to-use
interface, independent of the
upstream VCS type and location,
and receive the translations they
need. Upon translation, they can
use the same interface to submit
the files back to the VCS.

How it works

Richard Hughes is the developer
of PackageKit. He hosts
his project at packagekit.org, and
needs to find a way to receive
quality translations in a hassle-free
way. He points his browser at an
existing Transifex server (such as
the soon-to-be-launched transifex.net)
and registers his project there.
He then receives an SSH key and
uses it to create a special user on
his server, with write access in the
translation directories. His project
is now ready to receive translations.
At this point, Richard is asked
whether he’d like Transifex to
scan its translation memory from
other projects to bootstrap the
translations of his own project. He’s
delighted to see that his PO files
have been translated to somewhere
between 20 and 40 per cent with no
human interaction.
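
The bootstrapping step can be pictured with a small sketch. The
article does not describe how Transifex's translation memory works
internally, so the code below is an assumption: it reuses a
translation only when the source string matches exactly, whereas a
real translation memory would typically offer fuzzy matches too. The
function names and the Polish sample data are ours.

```python
def build_memory(translated_catalogs):
    """Merge {msgid: msgstr} catalogs from other projects into one lookup table."""
    memory = {}
    for catalog in translated_catalogs:
        for msgid, msgstr in catalog.items():
            if msgstr:  # skip entries that were never translated
                memory.setdefault(msgid, msgstr)  # first translation seen wins
    return memory


def bootstrap(new_catalog, memory):
    """Fill empty translations from memory.

    Returns the filled catalog and the percentage of entries that were reused.
    """
    filled = {}
    reused = 0
    for msgid, msgstr in new_catalog.items():
        if not msgstr and msgid in memory:
            msgstr = memory[msgid]
            reused += 1
        filled[msgid] = msgstr
    percent = 100 * reused // len(new_catalog) if new_catalog else 0
    return filled, percent


# Hypothetical Polish catalog from another, already translated project.
OTHER_PROJECT = {"Cancel": "Anuluj", "Install": "Zainstaluj", "Help": ""}

# Hypothetical new project with nothing translated yet.
NEW_PROJECT = {
    "Cancel": "",
    "Install": "",
    "Refresh cache": "",
    "Update system": "",
    "Remove": "",
}

if __name__ == "__main__":
    catalog, percent = bootstrap(NEW_PROJECT, build_memory([OTHER_PROJECT]))
    print("%d per cent filled from memory" % percent)
```

In this sample, two of the five strings already exist elsewhere, so
the new catalog starts out 40 per cent translated.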

Piotr is a Polish translator who
loves translating free software GUIs.
He has registered with Transifex and
asked to be notified about newly
registered projects that
might interest him. He receives
an e-mail with a direct link to the
Polish PackageKit translation and
another link that he can use to
submit the file back.

Once the file is submitted back,
Richard is notified that the
Polish translation is now at 100
per cent.

Architecture details

Under the hood, Transifex abstracts
all VCSs and runs a clone/checkout
on the repository. It identifies the
i18n method and the translation
files. Depending on the i18n method,
it compares the translation files
with the template file (for example,
the English one) and calculates
translation statistics for each one.
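
As a rough illustration of the statistics step, the sketch below
compares a translation file with its template and reports how many
template strings have a non-empty translation. It is a simplified
stand-in, not Transifex's code: the tiny parser ignores plural forms,
fuzzy flags and escape sequences, and the file contents are made up.

```python
def unquote(part):
    """Strip the surrounding double quotes from a PO string fragment."""
    return part.strip()[1:-1]


def parse_po(text):
    """Return a list of (msgid, msgstr) pairs from simple PO text."""
    entries = []
    msgid = msgstr = None
    target = None  # which field a continuation line belongs to
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            if msgid is not None:
                entries.append((msgid, msgstr))
            msgid, msgstr, target = unquote(line[6:]), "", "id"
        elif line.startswith("msgstr "):
            msgstr, target = unquote(line[7:]), "str"
        elif line.startswith('"') and target == "id":
            msgid += unquote(line)
        elif line.startswith('"') and target == "str":
            msgstr += unquote(line)
        else:
            target = None  # blank line, comment or unsupported keyword
    if msgid is not None:
        entries.append((msgid, msgstr))
    return entries


def translation_stats(template_text, translation_text):
    """Return (translated, total, percent) measured against the template."""
    template_ids = [m for m, _ in parse_po(template_text) if m]  # drop header
    translated = {m for m, s in parse_po(translation_text) if m and s}
    done = sum(1 for m in template_ids if m in translated)
    total = len(template_ids)
    return done, total, (100 * done // total if total else 0)


# Hypothetical template (POT) and Polish translation (PO) contents.
TEMPLATE = r'''
msgid ""
msgstr ""
"Project-Id-Version: demo\n"

msgid "Install"
msgstr ""

msgid "Remove"
msgstr ""

msgid "Update"
msgstr ""

msgid "Search"
msgstr ""
'''

POLISH = r'''
msgid ""
msgstr ""
"Project-Id-Version: demo\n"

msgid "Install"
msgstr "Zainstaluj"

msgid "Remove"
msgstr "Usuń"

msgid "Update"
msgstr "Aktualizuj"

msgid "Search"
msgstr ""
'''

if __name__ == "__main__":
    print("%d of %d strings translated (%d%%)" % translation_stats(TEMPLATE, POLISH))
```

Counting against the template rather than the translation file means
that strings added upstream lower the percentage until a translator
catches up.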

The management burden is
removed from developers, who can
concentrate on what they do best,
which is writing code. Translators
can use their single Transifex
login account to contribute to any
project they like, as long as it’s
registered on Transifex.

As a high-level Python
application, the service includes
hooks that can improve the
workflow in a number of ways. Before a commit, the
file’s syntax is validated, so that broken files do not
break the developer’s build process. Hooks also allow
fine-grained permissions on the files that translators need
access to. After a commit, Transifex can notify language
leaders and others about file submissions, provide RSS
feeds for submissions, etc.
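
A pre-commit check of this kind could look something like the sketch
below. This is an assumption about the general shape of such a hook,
not the check Transifex actually runs; a real deployment would more
likely call gettext's own msgfmt --check. The sketch only verifies
that every msgid is followed by a msgstr and that every string is
properly quoted, and it does not understand plural forms.

```python
import re

# A double-quoted string in which quotes and backslashes are escaped.
QUOTED = re.compile(r'^"(?:[^"\\]|\\.)*"$')


def check_po(text):
    """Return a list of problems found; an empty list means the file passed."""
    errors = []
    expecting = "msgid"  # keyword that must open the next part of an entry
    for number, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are always allowed
        if line.startswith('"'):
            rest = line  # continuation of the previous string
        else:
            keyword, _, rest = line.partition(" ")
            rest = rest.strip()
            if keyword not in ("msgid", "msgstr"):
                errors.append("line %d: unknown keyword %r" % (number, keyword))
                continue
            if keyword != expecting:
                errors.append(
                    "line %d: expected %s, found %s" % (number, expecting, keyword)
                )
            expecting = "msgstr" if keyword == "msgid" else "msgid"
        if not QUOTED.match(rest):
            errors.append("line %d: malformed quoted string" % number)
    if expecting == "msgstr":
        errors.append("end of file: msgid without msgstr")
    return errors


# Hypothetical submissions: the second one is missing a closing quote.
GOOD = 'msgid "Install"\nmsgstr "Zainstaluj"\n'
BAD = 'msgid "Install"\nmsgstr "Zainstaluj\n'

if __name__ == "__main__":
    for name, text in (("good", GOOD), ("bad", BAD)):
        problems = check_po(text)
        print(name, "accepted" if not problems else problems)
```

A hook built this way would refuse the submission whenever the
returned list is not empty, so the broken file never reaches the
developer's repository.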

Transifex currently supports git, hg, cvs, svn and
bzr, and adding more VCSs is a matter of writing a few
lines of code. It currently serves POT-based projects,
and its developers are looking forward to extending the
i18n support to include intltool-based projects (GNOME),
XLIFF, etc.
The login mechanism also supports OpenID.
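
To show why a new VCS can be a matter of a few lines, here is a
minimal sketch of the kind of abstraction described above. The class
and function names are hypothetical and do not come from the Transifex
code base; only the command lines (git clone, hg clone, svn checkout,
bzr branch) are the real ones for each tool. CVS is left out because
its checkout takes a repository root and a module rather than a single
URL.

```python
import subprocess


class Backend:
    """Common interface; each VCS only supplies its own command line."""

    name = None

    def checkout_command(self, url, dest):
        raise NotImplementedError

    def checkout(self, url, dest):
        # Runs the real tool, so it must be installed on the server.
        subprocess.run(self.checkout_command(url, dest), check=True)


class Git(Backend):
    name = "git"

    def checkout_command(self, url, dest):
        return ["git", "clone", url, dest]


class Mercurial(Backend):
    name = "hg"

    def checkout_command(self, url, dest):
        return ["hg", "clone", url, dest]


class Subversion(Backend):
    name = "svn"

    def checkout_command(self, url, dest):
        return ["svn", "checkout", url, dest]


class Bazaar(Backend):
    name = "bzr"

    def checkout_command(self, url, dest):
        return ["bzr", "branch", url, dest]


# Registry keyed by the VCS type a project declares when it registers.
BACKENDS = {cls.name: cls() for cls in (Git, Mercurial, Subversion, Bazaar)}


def get_backend(vcs_type):
    """Look up the backend for a VCS type, rejecting unknown ones."""
    try:
        return BACKENDS[vcs_type]
    except KeyError:
        raise ValueError("unsupported VCS: %s" % vcs_type)


if __name__ == "__main__":
    # Only prints the command; nothing is fetched.
    print(get_backend("hg").checkout_command("https://example.com/repo", "work"))
```

With this layout, supporting another system means adding one small
subclass and listing it in the registry; the rest of the application
only ever talks to the common interface.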

Development of Transifex

The development of Transifex began as a 2007
Google Summer of Code project by Dimitris Glezos.
It was initially written in Python
using the TurboGears framework, and right after the
summer it was put into production in Fedora, serving
more than 100 projects and 500 translators.

The following year, Transifex was presented at more than
10 international conferences, including FOSS.in 2008.
In the summer, Transifex earned three more GSoC
applications and was re-written from scratch using the
Django Python framework, now including many of the
suggestions from existing users. Development has taken
place since then on transifex.org and on the
transifex-devel mailing list.

In the meantime, other projects liked the platform
and joined in our efforts. GNOME’s Damned Lies and
Vertimus tools migrated their code to Django, with the
goal of being merged with Transifex at some point in the
future.

Future features

With more contributors joining the developer team,
Transifex is now moving towards a stabilised platform
to serve independent and upstream software projects
and then on to bigger ones.

One of the immediate features we’d like to add
is per-VCS file monitoring, so that translators can
‘track’ a project and get notified when the translation
percentage for their language changes. Adding
commenting support for projects and submissions, as
well as developing support for file uploads, will enable
translators to collaborate better in QA.

Another often-requested feature is the development
of a command-line interface allowing translators to do
something like the following:

$ tx set-language bn_IN
$ tx get-collection Fedora
Received anaconda/po/bn_IN.po
Received packagekit/po/bn_IN.po
$ # Translation...
$ tx send-collection Fedora
Sent ‘anaconda/po/bn_IN.po’ (100% translated)

The vision: Transifex.net

As mentioned earlier, Transifex allows downstream
communities to send files directly to the VCS of upstream
projects. One might then wonder which Transifex
community an independent project should choose to receive
translations from: Fedora, GNOME, or example.com?

Having a common place where open source
translations take place is key to linking translation
communities together and reaching new levels of
collaboration between translation teams. Here’s the
plan we’re evolving with www.transifex.net: establish a
healthy network where developers can get their
applications translated and translators can contribute to
their favourite projects. Project teams that would rather
not go to the trouble of setting up their own Transifex
instance should have a stable, feature-rich service
where they can join their efforts with the rest of the
open source community, under a common umbrella.

Becoming a contributor

Transifex is written in Python and utilises the
awesome Django framework with its famously top-notch
documentation. This makes it really easy for folks to
join in and extend the platform with the features they’d
like to see added. Development information can be
found at transifex.org/wiki/Development. To set up a
development environment of your own, check out the
documentation at docs.transifex.org/intro/install.html.

An example of an easy task would be to add
support for associating registered projects with their
maintainers/developers. This will give translators a
contact point for more information on the project
and for conflict resolution. Creating a patch that adds
simple support for project maintainers is a matter of
a few lines of code: add a foreign key from the Project
model to the User model, and probably edit the User Profile
page to include a section listing the projects the user
maintains.

Adding support for more VCSs and i18n back-ends is
also quite feasible because of the abstractions Transifex
includes in those areas. For most needs, one just copies
a Python file and changes it accordingly. We’ve marked
quite a few tickets with the ‘easy_task’ keyword, so
check out transifex.org/report/9 to start hacking.

Let a thousand languages bloom!
