Re: [xslt] XSLT and Safari



On Tue, Aug 10, 2004 at 04:26:15PM -0700, Mark Vakoc wrote:
> --- David Hyatt <hyatt apple com> wrote:
> > Hello.  My name is Dave Hyatt, and I work on WebCore, the open-source 
> > layout engine that powers the Safari Web Browser.  Some time ago we 
> > switched away from expat to libxml2 for all our XML parsing, and now 
> > I'm working on integrating libxslt into Safari.
> 
> Cool.

  yup, extremely cool :-)

> > I can figure out all the imports/includes, use our own security 
> > routines and recursion defenses when loading, and ultimately end up 
> > with a bunch of xmlDocPtrs.  What I'd then like to do is be able to 
> > tell libxslt to parse a top-level stylesheet, but then register 
> > callbacks, so that for a given import/include URI in a specified parent 
> > stylesheet, I could then hand back the xmlDocPtr without libxslt having 
> > to do any I/O.  I'm not trying to reuse these xmlDocPtrs, so libxslt 
> > would be free to modify them if it needed to.
> > 
> > Does this sound like a reasonable request?
> 
> I think one area that may cause all sorts of problems in this approach would be
> the dictionaries.  libxslt makes use of string interning through dictionaries,
> and the included documents share the same dictionary as the top level document.
> Parsing the includes/imports separately would cause multiple dictionaries (or
> no dictionaries at all) to be used which I think will mess up libxslt's
> internal processing.

  D'oh, right, I overlooked this aspect ... But if one looks at the existing
import code in xsltParseStylesheetImport(), it does
  import = xmlReadFile((const char *) URI, NULL, XSLT_PARSE_OPTIONS);
and the dictionary is not passed down. Ideally the dict should be shared,
but that does not seem to be the case right now. I think the dictionary
unification happens at compile time: all strings are extracted then and
remapped to the dict used for compilation.
  However for includes it's different:
    xsltParseStylesheetInclude() calls 
    xsltLoadStyleDocument() which calls
    xsltParseDocument()
and there the stylesheet's dictionary is shared.
I think the dictionary merge is needed for includes because of the way they
are processed: the subtree is loaded into the top document, so the
dictionaries must be shared to avoid trouble later, for example when freeing
the resulting doc.
  In general it seems the callback API should be doable for includes and
imports, but it needs to pass the stylesheet, the dictionary used, the parsing
options and the URI. The stylesheet is needed to unify error handling, the
dictionary to make sure the document will share strings, the options
because existing APIs need them, and the URI of course.

typedef xmlDocPtr (*xsltDocLoader) (xsltStylesheetPtr style,
                  const xmlChar *URI, xmlDictPtr dict, int options);
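
To make the callback idea concrete, here is a minimal sketch of a loader that
hands back documents the host application already holds, keyed by URI, so the
engine does no I/O. It uses stand-in types (Doc, Dict, Stylesheet) so that it
compiles without libxml2/libxslt; all names in it (DocLoader, preload_add,
cache_loader, resolve_import) are hypothetical and are not part of any
existing libxslt API. Recording the dict on the returned document is a
simplification: a real loader would have to make sure the document was
actually parsed with that dictionary.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for xmlDictPtr, xmlDocPtr and xsltStylesheetPtr (assumption:
 * only the fields this sketch needs are modelled). */
typedef struct { int unused; } Dict;
typedef struct { const char *uri; Dict *dict; } Doc;
typedef struct { Dict *dict; } Stylesheet;

/* Same shape as the proposed loader typedef: stylesheet, URI, dict, options. */
typedef Doc *(*DocLoader)(Stylesheet *style, const char *uri,
                          Dict *dict, int options);

/* Hypothetical host-side table of documents the application already has. */
#define MAX_PRELOADED 8
static Doc preloaded[MAX_PRELOADED];
static size_t n_preloaded;

/* Register a document under its URI; returns 0 on success, -1 if full. */
int preload_add(const char *uri)
{
    if (n_preloaded == MAX_PRELOADED)
        return -1;
    preloaded[n_preloaded].uri = uri;
    preloaded[n_preloaded].dict = NULL;
    n_preloaded++;
    return 0;
}

/* Loader callback: look the URI up in the table instead of doing any I/O.
 * Returns NULL when the host has no document for that URI. */
Doc *cache_loader(Stylesheet *style, const char *uri, Dict *dict, int options)
{
    size_t i;

    (void)style;    /* would be used for unified error reporting */
    (void)options;  /* would be forwarded to the parser */
    for (i = 0; i < n_preloaded; i++) {
        if (strcmp(preloaded[i].uri, uri) == 0) {
            /* Simplification: just record which dict the caller expects. */
            preloaded[i].dict = dict;
            return &preloaded[i];
        }
    }
    return NULL;
}

/* Stand-in for the engine side: resolve one import/include by calling the
 * loader with the stylesheet's own dictionary. */
Doc *resolve_import(Stylesheet *style, const char *uri, DocLoader loader)
{
    return loader(style, uri, style->dict, 0);
}
```

A caller would register its documents with preload_add() up front and pass
cache_loader wherever the engine accepts a loader; a NULL return is the
natural place to report a missing import through the stylesheet.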


Note: it seems the current APIs use the xsltStylesheetPtr as the context
      information for the compilation, while I would really expect
      some kind of compilation context ... As a result, integrating the
      loader into the current set of APIs is not really possible; one would
      need to design a couple more APIs based on xsltParseStylesheetDoc()
      and xsltParseStylesheetFile(), adding the extra loader argument,
      as well as an extended internal version of
      xsltParseStylesheetImportedDoc(). Not very hard in practice.

  If the main requirement is to keep control of the parsing phase, as I would
expect David's need to be, then such an API would work fine. If the intent is
to look ahead and preparse documents before they are needed, this might
be more of a problem.

Daniel

-- 
Daniel Veillard      | Red Hat Desktop team http://redhat.com/
veillard redhat com  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/

