Re: open translations database
- From: Stefan Rieken <StefanRieken SoftHome net>
- To: Aoife Dunne - Sunsoft ELC <Aoife Dunne ireland sun com>
- Cc: whampton staffnet com, gnome-i18n gnome org
- Subject: Re: open translations database
- Date: 02 Nov 2000 08:24:28 -0100
> Dear Stephen & All
It's Stefan! ;-)
> My name is Aoife Dunne and I am the project manager responsible
> for the GNOME Localisation at Sun.
> I am writing this mail in the hope that I can take Stephen's
> suggestions one step further, helping the open source community in
> providing localised product versions of GNOME and similar open
> source products thereafter.
> I work for Sun Microsystems who are
> planning on shipping GNOME with the next marketing release of
> Solaris, therefore I am writing this mail with the GNOME project
> in mind.
> However, we want any solution to be for general benefit
> of free and open source software and I would be very interested in
> offering our team assistance across all localised open source
You have a few good points here already. Let me be so bold as to say that Sun could hire professional translators for GNOME. I wouldn't be against that (hint ;-), but it would be a short-term solution, and I guess you realise that. The long-term solution would be to get the translation community to organise itself. All bigger (successful) projects have some kind of organisation, e.g. the Linux module system and the GIMP plugin system. Translation, which I think is a big project, doesn't have such a unifying structure yet.
> How can Sun help:
> Stefan mentioned it would be nice to have a web-accessible
> "database" (or just a simple file) which would contain one or more
> set of standard English and associated translations for standard
> words/terms. Developers and translators of software and
> documentation could use the terminology listings as reference.
> Terming Tool
> We have a script, which extracts terms from the English software
> files, providing suitable terms for the initial database/file. A
> term is defined as no more than one or two words. This script
> extracts terms from the strings, removes duplications and ignores
> terms such as "the", "is", numbers, etc. It is not possible to
> extract the associated translated terms, so it would require
> translators to provide the translated terms. Once this is done,
> the terminology listings can be posted to a web site, where they can
> be updated/modified as development of applications progresses. It
> is preferred that the suite of applications within a product use
> the same terminology ensuring consistency, however by defining the
> application it is possible to use different terms when
> English Term | English Definition | Translated | Application
> Initially it may not be possible for me to supply the source of
> the terming tool due to licensing problems, however I can help
> immediately by supplying a simple text file with the English
> terms. Would this be of help?
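Since the source of the terming tool may not be available at first, here is a minimal sketch in Python of the kind of extraction described above. It is not Sun's script: the stop list, the name extract_terms and the choice to drop stop words before pairing neighbouring words are all assumptions made for illustration.

```python
import re

# Hypothetical stop list; the list used by the real terming tool is not known.
STOP_WORDS = {"the", "is", "a", "an", "of", "to", "and"}


def extract_terms(strings, max_words=2):
    """Return unique one- and two-word candidate terms in first-seen order."""
    seen = set()
    terms = []
    for s in strings:
        # Keep alphabetic words only, which also drops numbers.
        words = [w.lower() for w in re.findall(r"[A-Za-z]+", s)]
        # Assumption: stop words are removed before neighbours are paired.
        words = [w for w in words if w not in STOP_WORDS]
        for n in range(1, max_words + 1):
            for i in range(len(words) - n + 1):
                term = " ".join(words[i:i + n])
                if term not in seen:  # remove duplications
                    seen.add(term)
                    terms.append(term)
    return terms
```

The resulting list would only be the English column; the translated terms still have to come from translators, as noted above.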
> Translation Memory
[snip: a very interesting story about the TM]
> The TM system is still in development but is coming close to
> completion. We may be able to help by providing you with a .po
> file parser. However, we would need to look into possible
> licensing issues.
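To give an idea of what a .po file parser involves, here is a minimal sketch in Python. It is not the parser Sun mentions; parse_po is a hypothetical name, and the sketch only handles plain msgid/msgstr pairs with continuation lines. Plural forms, msgctxt and fuzzy flags are ignored, and reading the quoted strings with Python's literal syntax is an approximation of the C-style escapes used in .po files.

```python
import ast


def parse_po(text):
    """Parse simple msgid/msgstr pairs from .po text into a dict."""
    entries = {}
    msgid, msgstr, current = [], [], None

    def flush():
        key = "".join(msgid)
        if key:  # the header entry has an empty msgid and is skipped
            entries[key] = "".join(msgstr)

    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("msgid "):
            flush()  # store the previous entry before starting a new one
            msgid, msgstr = [ast.literal_eval(line[6:].strip())], []
            current = "id"
        elif line.startswith("msgstr "):
            msgstr = [ast.literal_eval(line[7:].strip())]
            current = "str"
        elif line.startswith('"') and current:
            # Continuation line: append to whichever field is open.
            target = msgid if current == "id" else msgstr
            target.append(ast.literal_eval(line))
        else:
            current = None  # comments, blank lines, unsupported keywords
    flush()
    return entries
```

Feeding it a catalogue with the entries "File"/"Bestand" and "Edit"/"Bewerken" would give a dictionary mapping each English string to its Dutch translation, which is the shape a translation memory could be populated from.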
You describe the system I had in mind, so I can't help being very enthusiastic about the idea.
There are a few questions here. The first one, which you touched on politely, is how to make this useful for Free Software folks.
> Style Guides
> We have some localised versions of our style guidelines. These
> guidelines are used to aid the translators; for example, they
> specify how date and time formats should be localised in France. In
> many countries several formats are correct for such data, but the
> style guide settles on one preferred format for the sake of
> consistency. Our style guides could be used as reference and
> updated to create a GNOME specific style guide for all languages.
> Let me know if you are interested and I will send you a copy of
> our country specific style guides.
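The point about one preferred format per country can be pictured with a small Python sketch. The table below is hypothetical and is not taken from Sun's style guides; it only assumes that a guide fixes a single date pattern per locale.

```python
from datetime import date

# Hypothetical style-guide table: one preferred date pattern per locale.
PREFERRED_DATE_FORMAT = {
    "fr": "%d/%m/%Y",     # day first
    "en_US": "%m/%d/%Y",  # month first
}


def format_date(d, locale_code):
    """Format a date using the single pattern the style guide prefers."""
    return d.strftime(PREFERRED_DATE_FORMAT[locale_code])
```

With such a table every translator writes 2 November 2000 the same way for a given locale, even though other notations would also be correct.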
> How else can Sun help:
> * possibly act as the host for the translation memory database,
> populating newly translated products.
> * provide linguistic quality assurance feedback and implement
> linguistic changes if necessary checking for grammar, spelling,
> inconsistencies etc.
> If any of the above suggestions would be of help and if you have
> any other suggestion on what I can bring to the table, please let
> me know. Looking forward to getting any feedback.
> Best Regards
> > Subject: open translations database
> > From: Stefan Rieken <StefanRieken@SoftHome.net>
> > To: email@example.com, firstname.lastname@example.org
> > Date: 12 Oct 2000 16:55:26 -0100
> > List-Id: Internationalization (I18N) of GNOME
> > To the folks at openstandards.org and the gnome-i18n mailing
> > Hello,
> > This mail was sent out to give space to an idea that I developed
> > today. This idea is rough, unimplemented and untested. Nevertheless, I
> > hope that it is of interest for you. This mail was sent to the
> > mentioned above, just because I didn't know any better place to
> > If you believe I shouldn't have sent it to you or your list, I
> > apologise. If you believe I missed someone out, you are free to
> > this. (But I must warn you in advance that this idea is too young for me
> > to know if it will survive my busy schedule.)
> > Problem:
> > The current translation of open source software suffers from a lack of
> > manpower. This usually doesn't result in a lack of translations, but in
> > bad translations. Half of the time translation engines such as
> > are being used. These engines often can't produce correct
> > of small strings because of a lack of context (e.g.: the title of the
> > window I am writing this message in says, directly translated back to
> > English: "is composing a new message" instead of "Compose a new
> > message"). They also don't care about the size of the translated
> > which can be important when used in a program. Translation by
> > individuals can often also cause errors. These vary from
> > to overlooking spelling caveats common for the target language.
> > It would be helpful to have one or more sets of standard
> > for standard words and strings. Translators of software would
> > from this, but also translators of larger documents that contain
> > standard words and strings (such as "radio button"; you'll be
> > to know how hard it is in some languages to come up with a good
> > translation for it).
> > Context:
> > I am writing this with the GNOME project in mind, because I am
> > with it. However, I want my solution to be for the general benefit of
> > free and open source software.
> > There are a lot of standard strings in applications. Many GUI
> > define which ones you can use. Desktop projects such as GNOME often have
> > a set of these standard strings, and their translations, included. They
> > can, however, not provide translations for less commonly
> > Another problem arises when standard strings are part of bigger
> > (e.g. when "show toolbar" is standard, and a string like "show
> > toolbar" is being used). Most open source projects don't really
> > about documenting their use of standard strings, as the
> > should be clear enough.
> > In the past, I have done some minor translation work for ATO. This is an
> > international organisation of translators of Amiga software (the
> > Translation Organisation). They were pretty well organised (but being an
> > Internet development newbie, it took me some time to get acquainted with the
> > organisation). One of the best parts of the organisation (of the
> > division anyway), was a document that described the translation
> > and also contained a list of common Amiga terms and their
> > Because I want my solution to be global, and not e.g. Amiga-specific, I
> > think it is not a good idea to provide a procedure for the
> > process. Different projects may have different standards. I also
> > think that a small list of common terms will do the trick. Again, these
> > terms may vary slightly from one project to another, and if we are going
> > to sum up only a few general words, the result wouldn't be
> > useful.
> > Solution:
> > I was thinking that it would be nice to have a web-accessible
> > being set up to tackle this problem. The "database" (or just a
> > file) would initially be empty, but it would be available for
> > modification through a CGI script. This service should be neutral, so
> > that we wouldn't get duplicate attempts to solve this global
> > (E.g. hosting it at gnome.org wouldn't make it very neutral to KDE folks
> > ;-).
> > The interesting part is how the database should look and behave. I have
> > only given this part a little attention so far. There are, however, a
> > few schemes one could follow, and I imagine that one of these
> > would be more or less ideal.
> > The Economy Scheme:
> > Simply feed the database a list of words and their translations,
> > language. This would be the scheme of preference if it turns out that my
> > time, help and knowledge are really low.
> > The Business Scheme:
> > Same as above, but now with even more features! ;-), including:
> > - an argument-based history of the translation. Example:
> > "English: 'file', Dutch: 'bestand'
> > Previous translation 'bestant' is wrong because of a
> > Previous translation 'document' is inaccurate"
> > - a project-specific translation. Example:
> > "English: 'edit', Dutch:
> > 'Bewerken' (KDE standard)
> > 'Bewerk' (GNOME standard)"
> > - per-project tips and guidelines. Example:
> > "English: 'Are you sure you want to ...',
> > KDE tip: doubting the user is not friendly. Please use
> > confirm ...' instead."
> > - per-language (and per-project?) tips. Example:
> > "English: edit, Dutch: bewerk
> > Dutch language tip (GNOME): always use infinitive[*]"
> > - automatic parsing of your .po files??
> > - automatic updating of a few registered .po files??
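One way to picture the two schemes is as a single record type, sketched here in Python. All names (Entry, Rejected, lookup) are hypothetical, and nothing here is an existing implementation: the Economy Scheme would use only the default translation, while the Business Scheme fills in the remaining fields.

```python
from dataclasses import dataclass, field


@dataclass
class Rejected:
    """A previous translation and the argument against it."""
    text: str
    reason: str


@dataclass
class Entry:
    source: str        # English term
    language: str      # target language code, e.g. "nl"
    default: str = ""  # Economy Scheme: this field alone is enough
    per_project: dict = field(default_factory=dict)  # project -> translation
    history: list = field(default_factory=list)      # Rejected items
    tips: list = field(default_factory=list)         # free-text guidelines

    def lookup(self, project=None):
        """Return the project-specific translation, else the default."""
        return self.per_project.get(project, self.default)


# Data taken from the examples in the list above.
edit = Entry("edit", "nl", default="bewerk",
             per_project={"KDE": "Bewerken", "GNOME": "Bewerk"},
             tips=["Dutch language tip (GNOME): always use infinitive"])
file_entry = Entry("file", "nl", default="bestand",
                   history=[Rejected("document", "inaccurate")])
```

Here edit.lookup("KDE") returns the KDE standard, and a project without its own convention falls back to the default, so the Economy Scheme remains a special case of the Business Scheme.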
> > So this is my plan for a "translation bazaar". As said, the idea is that
> > it is empty at start, and then maybe someone would dump a few GNOME and
> > KDE .po files into this database, and the initial revision process can
> > kick off. But the real idea is that folks supply their own strings they
> > want to have translated, and the database would slowly get filled, while
> > translations grow to be more accurate over time because of
> > But actually I've no idea if this would become a success. I know that I
> > myself have only little time and resources, so I'd be happy already if I
> > only managed to get the Economy scheme. I also never worked with
> > files and stuff. But I did do some CGI and Perl stuff recently,
> > again I can't say that I have a good cgi-bin place to put this. It would
> > be really cool if folks could just file their (not too specific) .po or
> > similar files into the system, and that the system automatically
> > these files translated and up to date. But as said, I don't know
> > anything of this .po stuff, so that really is beyond my potential. But
> > if someone thinks "yeah, this is a really neat idea, and I can do it!",
> > I would be delighted to form some kind of team, of course. It may also
> > take some not-me expertise to support languages with different
> > alphabets.
> > So in fact, it will kind of depend on what you guys think of this idea.
> > Can it succeed? Will it be popular? Will this system become a
> > part of e.g. the rules for GNOME translation, if it works? Do you feel
> > like working on it? Do you have a good CGI space?
> > I must say, I don't know if this is a good idea, or if it is only a nice
> > theory with no practical value. So I really look forward to any
> > feedback.
> > Greets,
> > Stefan
> > [*] I'm not sure if this is the correct term because it's been a
> > since I had to learn it. But the problem Dutch translators have to face
> > is that in English, in "I edit", the word "edit" is the same as in "to
> > edit" and "you edit", while in Dutch it is not. So when translating to
> > Dutch, you need to know which one to choose.
> > _______________________________________________
> > gnome-i18n mailing list
> > email@example.com
> > http://mail.gnome.org/mailman/listinfo/gnome-i18n
> Aoife Dunne
> Program Manager
> European Localisation Centre
> Sun Microsystems Ireland Ltd
> Hamilton House
> East Point Business Park
> Dublin 3
> Tel.: +353-1-8199-266
> Fax: +353-1-8199-261
> Email: aoife.dunne@Ireland.Sun.COM
"As for systems that are not like Unix, such as MSDOS, Windows, the
Macintosh, VMS, and MVS, supporting them is usually so much work that
it is better if you don't." -- Richard Stallman, GNU Coding Standards