Re: [xslt] Replacing the dict used by the transform context



Salut Daniel,

Daniel Veillard wrote:
> On Mon, Aug 07, 2006 at 03:25:13PM +0200, Stefan Behnel wrote:
>> Sure, I'm not questioning the first rule. I was serious. Would you consider
>> the above a viable solution or could you come up with a way to save the
>> concurrency support in XSLT processing?
> 
>   Well libxslt is thread safe by default. It's the fact of not using a subdict
> for each transformation which makes it non-safe.

I know, that's the problem we face in the single-threaded case.


>   If you clearly point out that all XML/XSLT data should be kept in the
> same thread then yeah, I think you should be safe, but it also depends on
> the user not doing crazy things, and when it comes to threaded programming
> people will do crazy things without even realizing.

To add some more details here, our current setup is this:

* the main thread has a global dictionary.
* other threads create thread-local sub-dictionaries of this dictionary as
  required.
* sharing read-only documents between threads is allowed.
* sharing xsltStylesheets between threads is allowed.
* modifying a document in a thread other than the one it was created in is
  allowed only if the creating thread continues running (i.e. the dict
  reference persists).

That's about it. Now, the problem is, if transform contexts independently
create dictionaries, we can't integrate them into our GC scheme.
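
To make the setup above a bit more concrete, here is a toy model in plain
Python. It is only a sketch: StringDict, GLOBAL_DICT and thread_dict() are
names I made up for illustration, not lxml or libxml2 API. It loosely mimics
what a sub-dictionary does: look in the thread's own table first, then read
from the parent without writing to it, and intern new strings locally.

```python
import threading


class StringDict:
    """Toy string-interning table (hypothetical, for illustration only)."""

    def __init__(self, parent=None):
        self._strings = {}
        self._parent = parent  # treated as read-only from this dict

    def lookup(self, s):
        # 1. Search our own table.
        found = self._strings.get(s)
        # 2. Fall back to the parent, reading only (one level in this sketch).
        if found is None and self._parent is not None:
            found = self._parent._strings.get(s)
        # 3. Otherwise intern the string in our own table.
        if found is None:
            found = self._strings.setdefault(s, s)
        return found

    def owns(self, s):
        return s in self._strings


GLOBAL_DICT = StringDict()      # the main thread's dictionary
_local = threading.local()      # holds one sub-dictionary per worker thread


def thread_dict():
    """Return the global dict in the main thread, a lazy sub-dict elsewhere."""
    if threading.current_thread() is threading.main_thread():
        return GLOBAL_DICT
    d = getattr(_local, "dict", None)
    if d is None:
        d = _local.dict = StringDict(parent=GLOBAL_DICT)
    return d
```

A worker thread calling thread_dict().lookup("foo") never writes to
GLOBAL_DICT, which is the property the scheme relies on. A transform context
that creates its own unrelated dictionary sits outside this picture entirely.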

So, to comply with the rule of deriving the transform dictionary from the
stylesheet dictionary, we must reuse the stylesheet dictionary as transform
dictionary. I currently see two ways:

* cancel stylesheet concurrency, keep each stylesheet inside the thread of
  its own parser

* parse and compile stylesheets only in the main thread (where the global dict
  is writable) and then use them everywhere

I personally prefer the second solution, as it should be relatively easy to
achieve and allows for concurrency. Any objections to such an approach?

An alternative would be to support both usage patterns and just check the
restrictions internally, i.e. if the stylesheet comes from the main thread,
allow it everywhere, otherwise allow it only in the thread it was parsed in.
Hey, that sounds like a good solution!
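
Roughly, the internal check could look like the sketch below. This is not
existing lxml code: StylesheetHandle, usable_here() and apply() are
hypothetical names, and the "transform" is a placeholder string. It only
illustrates the rule "main-thread stylesheets everywhere, others only in
their own thread".

```python
import threading


class StylesheetHandle:
    """Hypothetical wrapper that remembers which thread parsed a stylesheet."""

    def __init__(self, name):
        self.name = name
        # Record the thread that parsed (created) the stylesheet.
        self.owner = threading.current_thread()

    def usable_here(self):
        # Stylesheets parsed in the main thread use the global dict,
        # so they are allowed in every thread.
        if self.owner is threading.main_thread():
            return True
        # Anything else is restricted to the thread that parsed it.
        return self.owner is threading.current_thread()

    def apply(self, document):
        if not self.usable_here():
            raise RuntimeError(
                "stylesheet %r was parsed in thread %r and cannot be used "
                "in thread %r" % (self.name, self.owner.name,
                                  threading.current_thread().name))
        # Placeholder for the actual transformation.
        return "%s(%s)" % (self.name, document)
```

The check costs one identity comparison per transform, and the error shows up
at the call site instead of as a corrupted dictionary later on.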


> The big claim of lxml
> was

Uhum: "is"

> that it would be pythonic in removing the constraint for a programmer
> to keep track of the lifetime of XML objects. You end up with having to
> track the thread 'owning' an object instead, which would be better for
> some apps but could turn nightmarish for others.

Well, then the solution for the latter is: don't use threads. It's just like
regular expressions: if they don't solve your problem, don't use them.

Stefan

