Re: parsing bibtex using gscanner



On Mon, Feb 18, 2013 at 05:15:56PM +0000, Rudra Banerjee wrote:
I am trying to parse a bibtex file using gscanner.
The problem is that, due to the many formats accepted by bibtex, it seems
a bit hard to parse it.
What I mean is as long as the bibtex is of the form key="some value",
then g_scanner_get_next_token can get the string.
But it fails if it is in the format key={value}.

And it fails even before braces that escape literal TeX code within
entries, or string macros, have come into play...

I am attaching my code. Some help (other than using btparse/bison) is
needed.

Don't do it this way.  GScanner is a lexical scanner: it just tokenizes
the input, but it does not help with the grammar.

The best approach to parse a grammar is, you know, using a parser.

If you insist on writing one manually, realise that you need to keep
state formally, e.g. the nesting level of braces you are currently at.
Construct the parser the same way you would if you just wrote the BNF
and let the parser be generated, e.g. write subroutines to parse
balanced braces, strings, etc., possibly calling each other recursively.
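
A minimal sketch of what such subroutines might look like, in plain C
without GLib.  The function names (parse_braced, parse_quoted,
parse_value) are made up for illustration, and the grammar in the
comments is a simplification: it covers a single field value only, not
whole entries, @string macro expansion or # concatenation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Append one character to a NUL-terminated buffer; -1 if it would overflow. */
static int append(char *out, size_t outsz, size_t *n, char ch)
{
    if (*n + 1 >= outsz)
        return -1;
    out[(*n)++] = ch;
    out[*n] = '\0';
    return 0;
}

/* braced := '{' ( braced | any char except '{' and '}' )* '}'
 * depth counts the braces already open around this one.  The outermost
 * pair (depth 0) only delimits the value and is not copied to out. */
static int parse_braced(const char **p, char *out, size_t outsz,
                        size_t *n, int depth)
{
    if (**p != '{')
        return -1;
    if (depth > 0 && append(out, outsz, n, '{') < 0)
        return -1;
    (*p)++;
    while (**p != '\0' && **p != '}') {
        if (**p == '{') {
            /* Nested group: recurse instead of counting by hand. */
            if (parse_braced(p, out, outsz, n, depth + 1) < 0)
                return -1;
        } else {
            if (append(out, outsz, n, **p) < 0)
                return -1;
            (*p)++;
        }
    }
    if (**p != '}')
        return -1;              /* end of input: unbalanced braces */
    if (depth > 0 && append(out, outsz, n, '}') < 0)
        return -1;
    (*p)++;
    return 0;
}

/* quoted := '"' ( braced | any char except '"', '{' and '}' )* '"'
 * A '"' inside braces does not end the string; parse_braced eats it. */
static int parse_quoted(const char **p, char *out, size_t outsz, size_t *n)
{
    if (**p != '"')
        return -1;
    (*p)++;
    while (**p != '\0' && **p != '"') {
        if (**p == '{') {
            if (parse_braced(p, out, outsz, n, 1) < 0)
                return -1;
        } else if (**p == '}') {
            return -1;          /* stray closing brace */
        } else {
            if (append(out, outsz, n, **p) < 0)
                return -1;
            (*p)++;
        }
    }
    if (**p != '"')
        return -1;              /* end of input: unterminated string */
    (*p)++;
    return 0;
}

/* value := braced | quoted | bare   (bare: a number or a macro name)
 * Returns 0 on success and leaves *p just past the value. */
int parse_value(const char **p, char *out, size_t outsz)
{
    size_t n = 0;

    assert(outsz > 0);
    out[0] = '\0';
    while (isspace((unsigned char)**p))
        (*p)++;
    if (**p == '{')
        return parse_braced(p, out, outsz, &n, 0);
    if (**p == '"')
        return parse_quoted(p, out, outsz, &n);
    while (isalnum((unsigned char)**p)) {
        if (append(out, outsz, &n, **p) < 0)
            return -1;
        (*p)++;
    }
    return n > 0 ? 0 : -1;
}
```

Called with a pointer to the text right after the '=' of a field,
parse_value() fills the buffer with the value and advances the pointer
past it, so key="some value" and key={value} go through the same entry
point.  Each function corresponds to one rule of the grammar, which is
the point: the nesting state lives in the call stack, not in a pile of
flags.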

Attempting to write code for all the cases that can occur using
sequences of hardcoded ifs will only result in a buggy mess.  You have
been warned.

Yeti


