Re: [xslt] xsltproc for very large documents?
- From: Daniel Veillard <veillard redhat com>
- To: xslt gnome org
- Subject: Re: [xslt] xsltproc for very large documents?
- Date: Fri, 17 Jan 2003 04:43:31 -0500
On Fri, Jan 17, 2003 at 08:55:52AM +0100, Eric van der Vlist wrote:
> Hi Stuart,
>
> On Fri, 2003-01-17 at 08:29, Stuart Hungerford wrote:
>
> > I'm working with some very large (well, large by my
> > standards) XML documents of around 300MB each.
>
> I can't speak specifically for libxslt, but AFAIK this is a design issue
> with XSLT, which is not "streamable" in that it requires some kind of
> representation of the input documents in memory to work.
>
> You might be interested in STX [1], a project started a while ago to
> define "Streaming Transformations for XML".
Right.
More specifically for xsltproc, there are two aspects of the problem
I can think of:
- first, XSLT in general requires loading the full document in memory
(well, xsltproc does; other tools try to work around this), which is
a limitation of the general XSLT processing model. This shouldn't
lead to atrocious performance unless you're swapping, i.e. the
working set doesn't fit in memory and the hard drive is used
constantly to swap blocks in and out. Such thrashing usually makes
processing very slow; that's normal, and the only solution is more
RAM.
- second, if no thrashing happens, there may be a problem, reported
once on the list, with libxslt/xsltproc speed when sorting very
large node sets. It was reported only once and was related to very
long sequences of children, for example.
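
The sorting case can be approximated with synthetic data when the real
documents cannot be shared. The following is a minimal sketch in Python,
not part of libxslt: the element names (items, item), the key attribute
and the file names are all hypothetical, and the stylesheet is only one
plausible way to sort a very long sequence of children.

```python
import random

# Hypothetical stylesheet: sorts every <item> child of <items> numerically.
STYLESHEET = """<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>
  <xsl:template match="/items">
    <xsl:for-each select="item">
      <xsl:sort select="@key" data-type="number"/>
      <xsl:value-of select="@key"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
"""


def write_sample(xml_path, xsl_path, n_children, seed=0):
    """Write a document with n_children siblings and a stylesheet sorting them."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    with open(xml_path, "w", encoding="utf-8") as out:
        out.write('<?xml version="1.0"?>\n<items>\n')
        for _ in range(n_children):
            out.write('  <item key="%d"/>\n' % rng.randrange(10**9))
        out.write("</items>\n")
    with open(xsl_path, "w", encoding="utf-8") as out:
        out.write(STYLESHEET)


if __name__ == "__main__":
    # Assumed size; raise it until the slowdown becomes visible.
    write_sample("sample.xml", "sort.xsl", 100000)
```

Running "xsltproc --timing sort.xsl sample.xml > /dev/null" for a few
values of n_children should show whether the time grows much faster than
the input size.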
The first one I can't really fix; the second I can work on, assuming
I have an example. So could you check whether your box is "thrashing",
i.e. the first problem above? If not, could you provide a smaller sample
of your data and a stylesheet exhibiting the problem, with some
quantification of the time needed to process the full data?
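
One way to gather those numbers is sketched below in Python, under the
assumption of a Unix-like system where the resource module is available.
The helper name run_and_measure and the file names in the usage line are
hypothetical. A wall-clock time far above the CPU time, together with a
large count of major page faults, suggests the run is waiting on disk
rather than computing, which is what thrashing looks like; it is an
indication, not a proof.

```python
import resource
import subprocess
import sys
import time


def run_and_measure(cmd):
    """Run cmd and report wall time, CPU time and paging activity."""
    before = resource.getrusage(resource.RUSAGE_CHILDREN)
    start = time.monotonic()
    proc = subprocess.run(cmd, stdout=subprocess.DEVNULL)
    wall = time.monotonic() - start
    after = resource.getrusage(resource.RUSAGE_CHILDREN)
    cpu = (after.ru_utime - before.ru_utime) + (after.ru_stime - before.ru_stime)
    return {
        "returncode": proc.returncode,
        "wall_seconds": wall,
        "cpu_seconds": cpu,
        # Major faults are page faults that needed disk I/O to resolve.
        "major_faults": after.ru_majflt - before.ru_majflt,
        # Peak resident size of the largest child waited for so far
        # (kilobytes on Linux, bytes on macOS).
        "max_rss": after.ru_maxrss,
    }


if __name__ == "__main__":
    # Usage: python measure.py xsltproc style.xsl big.xml
    for name, value in run_and_measure(sys.argv[1:]).items():
        print(name, value)
```

Comparing max_rss with the amount of physical RAM in the machine gives a
rough idea of whether the working set could fit in memory at all.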
Daniel
--
Daniel Veillard | Red Hat Network https://rhn.redhat.com/
veillard@redhat.com | libxml GNOME XML XSLT toolkit http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/