Daniel Veillard wrote:
Speaking of "industrial uses", I would be interested in getting feedback, especially from industry professionals, in 2 ways:
- use cases and "success stories": basically I don't have any public list of use cases. If you use libxml2 for serious processing, think about making a public statement. It will allow libxml2 to look "professional" and will also give me references I can point my management to when they wonder if my time is well spent. You should consider this as being in your own interest if you can do it!
Bonjour Daniel, hi All,

Answering Daniel's call for industry feedback, here is a quick note from SYSTRAN on "real-use successful scenarios".

[Background information: SYSTRAN develops machine translation systems (www.systransoft.com, or more commonly the "translate this page" link that you have on google, altavista (babelfish), and most other online translation services on the web). In fact, SYSTRAN performs more than 8 million translations a day on the web.]

We have been working for about a year on a complete overhaul of our systems and will release version 5 of our engines in a few months. XML has a key role in the new architecture (we have been using libxml2 and libxslt for about 6 months; we had initially developed an in-house library, which libxml2 replaced), along the following axes:

- (main application - deep user/process interaction) (XML, DTD) The main "text flow", from formatted text to internal sentence representation, is XML-based. XML gives us a perfect way of preserving several layers of information structure and allows us to handle interactions between those layers very easily. At every step of the text transformation/analysis, we can regenerate the initial format with additional markup reflecting the current stage, and reciprocally any information contained in the original text is carried to any of those layers, where it modifies the default behavior (the keyword of machine translation is customization).

- (use of document structure for customizing translation) (XSLT, XPath) Based on xsltproc, and with the addition of XSL extension functions, we build "translation stylesheets" that allow the definition of accurate translation options for specific XML areas. The translation process is directly driven by xsltproc (see the attached .xsl file for a sample stylesheet; extension functions are, for example, systran:translate). This mechanism goes beyond all classical "formatting filters", since it allows the full separation of formatting and content and achieves a very high degree of customization based on document structure. Consequently, defining a specific "translation stylesheet" enables the translation of any structured XML document (Oo doc, DocBook, Office11(?)...) with a high degree of customization.

- (exchanging data) (XML, Schema) XML as the terminology exchange standard.

- (ease development, simplify data sharing) (XML, XPath) "Global option/variable" environment: all options and information shared between the different modules are stored in a dynamic XML structure and accessed through XPath. One innovative approach is that XPath is used as a read/write pointer: first (classically) as a way of finding an identified resource in the existing structure, but also as a way of creating a path (of course restricted to eligible XPath operators). This provides an evolutionary, multitype, dynamic structure, very convenient for sharing complex information between modules during the translation process.

For reference, we are compiling those systems under Linux, Windows and Solaris.

Thank you for the quality of the libraries you provide, and thanks to all contributors.

Best regards,
Jean Senellart
SYSTRAN R&D Director
Attachment:
book.xsl
Description: Binary data
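
For readers curious about the "XPath as a read/write pointer" idea in the message above, here is a minimal sketch of how such a shared option store might work. It is not SYSTRAN's implementation, and it uses Python's standard xml.etree.ElementTree rather than libxml2. The helper names (get_or_create, set_option, get_option) and the element names in the usage lines are hypothetical. The "eligible operators" are assumed here to be plain child steps with an optional [@attr='value'] predicate; any other step is refused for writing.

```python
import re
import xml.etree.ElementTree as ET

# One step of the restricted path: an element name, optionally followed by
# a single [@attr='value'] predicate. This restriction is an assumption made
# for the sketch; the original message does not list the eligible operators.
_STEP = re.compile(r"^([A-Za-z_][\w-]*)(?:\[@([A-Za-z_][\w-]*)='([^'/]*)'\])?$")


def get_or_create(root, path):
    """Follow `path` below `root`, creating any element that is missing."""
    node = root
    for step in path.split("/"):
        match = _STEP.match(step)
        if match is None:
            raise ValueError(f"step not usable for creation: {step!r}")
        tag, attr, value = match.groups()
        found = node.find(step)  # read: ordinary lookup of the child step
        if found is None:        # write: build the element the step describes
            found = ET.SubElement(node, tag)
            if attr is not None:
                found.set(attr, value)
        node = found
    return node


def set_option(root, path, text):
    """Store `text` at `path`, creating the path if it does not exist yet."""
    get_or_create(root, path).text = text


def get_option(root, path, default=None):
    """Read the text stored at `path`, or `default` when nothing is there."""
    node = root.find(path)
    return default if node is None or node.text is None else node.text


# Hypothetical usage: two modules sharing settings through one tree.
options = ET.Element("options")
set_option(options, "translation/pair[@source='en']/dictionary", "medical")
set_option(options, "translation/pair[@source='fr']/dictionary", "legal")
```

With this in place, get_option(options, "translation/pair[@source='en']/dictionary") returns "medical", and both writes share a single translation element. Restricting creation to steps that name exactly one element keeps the write side unambiguous: a step such as // or a positional predicate does not say what should be built, so the sketch rejects it.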