Re: incremental lexing



----- Original Message ----- 
From: "Mark Slicker" <jamess1 wwnet net>
To: "Owen Fraser-Green" <owen discobabe net>
Cc: <gnome-devtools gnome org>
Sent: Wednesday, April 25, 2001 12:01 PM
Subject: Re: incremental lexing


[...]

> > Yes, it's very regrettable that they haven't made any code available. I sent 
> > a mail to the head of faculty, Susan Graham, but so far haven't received any 
> > reply. I read that Ensemble (Harmonia's predecessor) consisted of over 
> > 300,000 lines of code, so I would guess writing even just the incremental 
> > lexing part would be a mammoth task.
> 
> Not really. A good deal of incremental lexing is handled by the batch
> lexer, flex, and the incremental algorithm is pretty easy to follow.
> In fact, I had most of this done, along with incremental parsing, which
> is similarly handled by bison. The parts I considered more
> difficult, or requiring a lot of work, are the grammar tools used for
> turning a language specification into a language module used by the
> system.

The grammar tools are actually not that involved.  We use standard flex with a custom lexer class (using the C++ option) and a modified variant of bison that outputs parse tables and the AST class definitions rather than the parser code.  Though I agree that building these from scratch would be a bit tricky.
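To make the incremental relexing idea above concrete, here is a minimal sketch in the style of Wagner's algorithm.  All names are illustrative (this is not Harmonia's code, and the toy lexer has no multi-token lookahead): an ordinary batch lexer is restarted just before the edit, and relexing stops as soon as its output resynchronizes with the old token stream, so only the damaged region is relexed.

```python
import re

# Toy flex-like batch lexer: whitespace, identifiers, numbers, or any
# single character, as (start_offset, lexeme) pairs.
TOKEN_RE = re.compile(r'\s+|[A-Za-z_]\w*|\d+|.')

def batch_lex(text, pos=0):
    """Yield (start, lexeme) pairs from `pos` to the end of `text`."""
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        yield (pos, m.group())
        pos = m.end()

def incremental_lex(new_text, old_tokens, edit_start, old_len, new_len):
    """Relex only the region damaged by replacing `old_len` characters
    at `edit_start` with `new_len` new characters."""
    delta = new_len - old_len
    # 1. Reuse old tokens that end strictly before the edit.
    prefix = [t for t in old_tokens if t[0] + len(t[1]) < edit_start]
    restart = prefix[-1][0] + len(prefix[-1][1]) if prefix else 0
    # 2. Old tokens past the edited span survive, shifted by `delta`.
    tail = {s + delta: lx for (s, lx) in old_tokens
            if s >= edit_start + old_len}
    # 3. Relex from the restart point until the batch lexer reproduces
    #    a surviving old token past the edit (resynchronization).
    mid = []
    for (s, lx) in batch_lex(new_text, restart):
        if s >= edit_start + new_len and tail.get(s) == lx:
            rest = sorted(t for t in tail.items() if t[0] >= s)
            return prefix + mid + rest
        mid.append((s, lx))
    return prefix + mid  # edit ran to end of text: no resync possible
```

A real implementation tracks each token's lookahead so it knows exactly which old tokens an edit can invalidate; the strict "ends before the edit" test above is a conservative stand-in for that bookkeeping.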

> Also, incremental semantic analysis looks fairly difficult, mainly
> because there isn't an easy-to-follow paper like Tim Wagner's thesis. 

Yes, incremental semantics is difficult.  There have been two earlier efforts in our group to build an incremental semantic analysis framework, but neither has been entirely successful.  (See http://www.cs.berkeley.edu/~harmonia/publications/ensemble-pubs.html#maddox-thesis).  Our short-term solution is to use an ad-hoc, hand-coded semantic analyzer that is incremental at the granularity of a translation unit (e.g. a class definition, in Java).  There is a possibility that we will once more try our luck in building a real incremental semantic analyzer driven by a declarative specification, but I don't know if/when this will happen.
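The translation-unit-granularity approach can be sketched as follows.  This is a hypothetical illustration, not Harmonia's analyzer: each top-level unit (e.g. a Java class) carries a dirty bit, an edit marks only its unit dirty, and reanalysis visits dirty units rather than the whole program.

```python
class TranslationUnit:
    """One top-level unit, e.g. a class definition."""
    def __init__(self, name, source):
        self.name = name
        self.source = source
        self.dirty = True     # not yet analyzed
        self.symbols = set()  # results of the last analysis

class Program:
    def __init__(self):
        self.units = {}

    def edit(self, name, new_source):
        unit = self.units[name]
        unit.source = new_source
        unit.dirty = True     # only this unit needs reanalysis

    def analyze(self):
        """Reanalyze dirty units only; return how many were redone."""
        redone = 0
        for unit in self.units.values():
            if unit.dirty:
                unit.symbols = self._analyze_unit(unit)
                unit.dirty = False
                redone += 1
        return redone

    def _analyze_unit(self, unit):
        # Stand-in for a real hand-coded analyzer: treat words ending
        # in ':' as declarations and collect them as symbols.
        return {w.rstrip(':') for w in unit.source.split()
                if w.endswith(':')}
```

The obvious weakness, and the reason this is only a short-term solution, is that a true incremental analyzer must also track dependencies *between* units, so that changing one class can invalidate analyses of its clients.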

> And
> on top of all of this there is incremental preprocessing required for a
> language like C or C++. I had a good idea of how to handle this, but it is
> still a bit of work to implement it properly.

Yes, the preprocessor is a tar pit.  We've had some ideas on incrementalizing preprocessing, but I don't think that's the right approach.  My current view is that it is best to first translate the preprocessor directives to be structural rather than textual and incorporate them into the grammar of the language.  (See http://www.cs.berkeley.edu/~harmonia/projects/projects-available.html#preprocessor).  However, being a research group, we have the luxury of ignoring the incremental preprocessing issue and concentrating on more modern languages such as Java.  Not to mention that we simply lack the manpower to pursue this project.
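Here is one way to picture the "structural rather than textual" idea; the names are illustrative, not from the Harmonia project page.  Instead of textually expanding #ifdef/#else before parsing, the conditional becomes a node in the syntax tree whose branches are both parsed and retained, and evaluating the tree against a set of defined macros selects a branch.

```python
from dataclasses import dataclass
from typing import List, Union

@dataclass
class Code:
    """An ordinary parsed fragment (leaf of the tree)."""
    text: str

@dataclass
class IfDef:
    """A structural #ifdef: both branches are kept in the tree."""
    macro: str
    then: List['Node']
    els: List['Node']

Node = Union[Code, IfDef]

def configure(nodes, defines):
    """Flatten a structural tree for one configuration, choosing the
    appropriate branch of each IfDef node."""
    out = []
    for n in nodes:
        if isinstance(n, Code):
            out.append(n.text)
        else:
            branch = n.then if n.macro in defines else n.els
            out.extend(configure(branch, defines))
    return out
```

The payoff for an editor is that an edit inside one branch is an ordinary tree edit, visible under every configuration, with no re-expansion of the whole file; the cost is that each branch must be parseable on its own, which real-world C preprocessor usage does not always respect.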

Marat.





