[xslt] [xml/xslt] libxml2 industrial use
- From: Jean Senellart <senellart systran fr>
- To: xml gnome org, xslt gnome org
- Subject: [xslt] [xml/xslt] libxml2 industrial use
- Date: Tue, 6 May 2003 07:16:04 -0700
Daniel Veillard wrote:
> Speaking of "industrial uses", I would be interested in getting
> feedback especially from industry professionals, in 2 ways:
>
> - use cases and "success story": basically I don't have any public
> list of use case, if you use libxml2 for serious processing, think
> about making a public statement, it will allow libxml2 to look
> "professional" and also will give me references I can point
> my management to when they wonder if my time is well spent, you
> should consider this as being your own interest if you can do it!
Bonjour Daniel, hi All,
In answer to Daniel's call for industry feedback, here is a quick note on SYSTRAN's "real-use successful scenarios".
[Background information: SYSTRAN develops machine translation systems
(www.systransoft.com, or more commonly the "translate this page" link that
you find on Google, AltaVista (Babelfish), and most other online translation
services on the web). In fact, SYSTRAN performs more than 8 million
translations a day on the web.]
We have been working for about a year on a complete overhaul of our systems
and will release version 5 of our engines in a few months.
XML (we have been using libxml2 and libxslt for about 6 months, replacing an
in-house library we had initially developed) plays a key role in the new
architecture along the following axes:
- (main application - deep user/process interaction) (XML,DTD) the main
"text flow" from formatted text to internal sentence representation is
XML-based. XML gives us a perfect way of preserving several layers of
information structure and lets us handle interactions between those layers
very easily. At every step of the text transformation/analysis, we
can regenerate the initial format with additional markup reflecting the
current stage, and conversely any information contained in the original text
is carried to any of those layers, where it modifies the default behavior
(the key word in machine translation is customization).
- (use of document structure for customizing translation) (XSLT,XPATH)
based on xsltproc and with the addition of XSL extension functions, we build
"translation stylesheets" that allow the definition of accurate
translation options for specific XML areas.
The translation process is directly driven by xsltproc (see attached .xsl
file for a sample stylesheet: extension functions are, for example,
systran:translate). This mechanism supersedes all classical
"formatting filters" since it allows the full separation of formatting and
content, and achieves a very high degree of customization based on document
structure. Consequently, defining a specific "translation stylesheet"
enables the translation of any structured XML document
(OpenOffice.org documents, DocBook, Office11(?)...) with a high degree of
customization.
- (exchanging data) (XML, Schema) XML as the standard format for terminology exchange
- (easing development, simplifying data sharing) (XML,XPATH) "global
option/variable" environment: all options and information shared between the
different modules are stored in a dynamic XML structure and accessed through
XPath. One innovative approach is that XPath is used as a read/write pointer:
first (classically) as a way of finding an identified resource in the existing
structure, but also as a way of creating a path (restricted, of course, to
eligible XPath operators).
This provides an evolving, multi-type dynamic structure that is very
convenient for sharing complex information between modules during the
translation process.
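
The read/write use of XPath described above can be sketched roughly as
follows. This is only an illustration in Python, using the standard library's
xml.etree.ElementTree, and not SYSTRAN's implementation (which is built on
libxml2). The OptionStore class, its get/set methods, and the option names are
hypothetical, and the "eligible" operators for writing are assumed here to be
plain child steps only.

```python
import xml.etree.ElementTree as ET


class OptionStore:
    """Shared option tree addressed by a restricted, XPath-like path.

    Hypothetical sketch: paths are relative to the root element.
    """

    def __init__(self, root_tag="options"):
        self.root = ET.Element(root_tag)

    @staticmethod
    def _steps(path):
        """Split a write path into child steps, rejecting anything else."""
        steps = [s for s in path.strip("/").split("/") if s]
        for step in steps:
            # Assumption: only plain element names may create nodes;
            # predicates, wildcards and axes are refused when writing.
            if not step.replace("_", "").replace("-", "").isalnum():
                raise ValueError(f"unsupported step in write path: {step!r}")
        return steps

    def get(self, path, default=None):
        """Read: classical lookup of an existing node."""
        node = self.root.find(path.strip("/"))
        return default if node is None else node.text

    def set(self, path, value):
        """Write: follow the path, creating each missing element on the way."""
        node = self.root
        for step in self._steps(path):
            child = node.find(step)
            if child is None:
                child = ET.SubElement(node, step)
            node = child
        node.text = str(value)
        return node


store = OptionStore()
store.set("translation/source", "fr")   # creates <translation> and <source>
store.set("translation/target", "en")   # reuses the existing <translation>
print(store.get("translation/source"))  # fr
```

Restricting writes to plain child steps keeps path creation unambiguous: a
predicate or wildcard can select existing nodes, but it does not say which
element should be created when nothing matches.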
For reference, we compile these systems under Linux, Windows, and Solaris.
Thank you for the quality of the libraries you provide, and thanks to all contributors.
Best regards,
Jean Senellart
SYSTRAN R&D Director
book.xsl