Re: Jinja2 syntax for conditionals



Hi Sam,

Thanks a lot for sharing more on this thread, it's really important
these days that I unblock this and get passed it...


On Tue, 2017-09-19 at 13:52 +0100, Sam Thursfield wrote:
Thanks for the response here.

To be clear I don't have any particular attachment to Jinja2, I just 
didn't want to propose "please use infix syntax that looks like Python" 
without some kind of implementation to back it up.

Ok I see.

I did look for a standalone "Python-like DSL for conditional 
expressions" library and found nothing; I'm sure other projects 
implement such a thing (Meson does, for example) but none of them have 
gone to the extra effort of maintaining it as a separate shared library. 
(Which is perhaps a point in favour of also doing our own thing in 
BuildStream... although that still feels wrong to me).

Yeah, there are a handful of pragmatic reasons for rolling our own
thing here, especially if the only practical alternative is to depend
on a subsystem of another large project.

With our other dependencies, we do use probably >= 90% of all of them;
if ever those projects flake somehow; it will be very easy to fork and
distribute with BuildStream - click is probably our biggest dependency
aside from ruamel, and I've seriously been considering forking click;
already I've forked the bash completions because upstream's completions
just wont work for us, and the maintainer doesnt want to improve it (or
in fairness, is just too overworked to really consider the changes I
proposed and come to a compromise which fixes the issues we have).

jinja2 is also a pocoo, which through my experience with click, raises
a bit of a red flag, this red flag would indeed be lessened if it were
specifically just the syntax parser parts in a standalone lib: It gives
us the security that we can just fork it with little expense.

On 19/09/17 04:30, Tristan Van Berkom wrote:
The whole thing is only 150 lines, and could be less if we didn't care
about the "is defined" / "is undefined" operators (which require some
special case hackery).

Right we dont really need that I think, considering that we already
mandate that the project define what these variables are before their
values can ever be controlled by a user (so the proper response for a
reference to something undefined is to simply error out anyway).

Yeah that makes sense on reflection. I just copied that code path across 
from Ansible anyway.

To make it work we just embed the condition expression into a template
and then render it through jinja2 to get a result. Of course it feels a
bit dirty, but Ansible has been doing this since 2012 to implement
'when' conditions in playbooks and it seems to be working well enough
for their tens of thousands of users.

So jinja2 seems to be a pretty huge thing, even includes it's own
execution sandbox (no idea why this is, or if it's a real sandbox or
some kind of python venv or what)...

It's 7K lines of Python, with 1 external dependency (MarkupSafe). The 
package is 2.4MB installed, while BuildStream is 2.8MB. So it's not a 
lightweight dep, but I don't think "it's too big" on its own can rule 
out this approach.

  o Its too big for what we use in it means that the burden of a fork
    is much greater (we'd probably actually have to understand what the
    code does, which is not really true for our other dependencies, we
    only really need to understand their APIs).

  o It's main purpose is not what we want it for, which means our needs
    can easily diverge from the project's needs - understandably
    creating situations where it is difficult to land patches;
    contributing to the likeliness that we will want to fork this.

I wouldnt really take this lightly, I would certainly prefer to roll my
own whole custom parser (if I cant find something generic, simple, and
not S-Expressions) - rather than to put the project in what I feel to
be a precarious position, by depending on this.


The "sandbox" is just a bunch of small mixin classes. I think it's there 
because you can pass arbitrary Python objects as variables into 
templates, and Python objects can do operator overloading so 
theoretically you might end up with 'a == b' formatting your hard drive 
because type(a).__eq__() does something horrible. That wouldn't affect 
us as we'd control everything we pass in.

So it doesn't do anything like BuildStream's bubblewrap sandbox or 
virtualenv, it's just intercepting functional calls that may have side 
effects.

And the way it is used in your example is to have the jinja2 templating
engine resolve a template in memory, and then evaluate the resulting
generated text to return a boolean value from.


This does seem vastly complex and roundabout in terms of codepaths
reached; for the simple things we need... and then what we get out of
it is... a rigid syntax or set of existing operators that cannot be
easily extended ?

There's a certain ugliness to it indeed, but on the other hand these 
codepaths have probably hundreds of thousands of users already and are 
maintained by other people.

Extensibility is certainly needed. I don't think there should be any 
need to redefine what "+" or "==" mean beyond the existing 
Jinja2/Python-like interpretations of them, but we'll want to add helper 
functions. Jinja2 expressions can execute callables using the () 
operator, so let's say we want to implement a join_path() method, we 
could write that in Python and pass it in as a global to the template 
environment, and then write conditions like:

     full_path == join_path(prefix, libdir)

Yeah, this looks a bit scary to me (like an invitation to make
conditional resolution notations actually become a little programming
environment).

On the other hand, when not taken too far, it's comforting to know that
with a python like syntax we can still add helpers (which can help us
wiggle out of corners if we get stuck).


That said the jury is still out on the general direction of this; you
seem to be after the exact opposite of what I'm looking for and I'm
trying to understand / balance the reasoning behind this:

   Do we want something fully specified and very cute, with a rigid
   non-extensible syntax that other programmers may recognize because
   they might have used that syntax before ?

Or

   Do we only want a data serialization format that is a bit more
   practical than YAML is for the purpose of serializing optionally
   quoted strings and numbers ?
   The exact same thing can be done with JSON or YAML, it just happens

   to be practical to use S expressions for these one liners.

I don't see this as the question at all. Either way we're adding logical 
expressions to what was previously a serialization format, and making 
something that is (a) a serialization format with syntax sugar for 
variants, or (b) something else.

Sure you can represent the syntax tree of the S-Expressions as a YAML 
data structure, the same is true of the syntax tree of the Jinja2 
expressions.

For me the questions are:

1) do we want to use S-Expressions, or a more Python-like syntax?

2) do we want to write and maintain the execution engine ourselves, or 
reuse something that exists?

My understanding is that with S expressions, your answer to (2) is that
we basically use something that exists, and that the few functions for
crawling the loaded data and resolving the expression is negligible.

For (1), I would really like to find something more attractive as a
project to depend on than jinja2 for this.


Also, when you say:

   "What I'm suggesting is that we avoid creating a new DSL for
    expressions in the first place"

I'm not sure that this qualifies; i.e. if parsing some serialized data
to be interpreted as nested conditional expressions qualifies as
creating a new DSL, certainly the BuildStream format itself, even
though expressed through YAML, is also it's own DSL.

There's no universally agreed place where we draw the line between 
"code" and "data". In my view, once we have self-modifying conditional 
expressions we're crossing line into "code", regardless of whether the 
syntax used is S-Expressions or Jinja2 expressions, or YAML. Where the 
line is doesn't really matter, my point is that (if we ignore 
implementation details) our proposals amount to the same thing.


While looking for alternative serialization formats for expressions, I
did come across this which is interesting:

    http://readable.sourceforge.net/

Which is a thing called "sweet expressions" and basically is just a
slightly more exotic parser around S expressions, making them appear to
be more like fancy programming languages look like.

That's interesting indeed.

Unfortunately it seems to be tied to common lisp and not just the S
Expressions that lisp is built on, and then requires some weird stuff
to actually run; but as a concept seems usable yeah.

To be honest though; I personally dont really see the sweet expressions
as being better or even more readable than just the S Expressions - but
that is personal opinion I guess, also comes down to how you format
things.

I.e. nesting for instance is very clear and natural to follow with S
Expressions:

  (and
     (ifeq foo "foo")
     (ifeq bar "bar")
  )


Anyway, lets sleep on this for today...


Cheers,
    -Tristan



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]