Re: [gtk-list] Re: g_scanner functions what for?



On Tue, 9 Mar 1999, Andreas Tille wrote:

> On Fri, 5 Mar 1999, Tim Janik  wrote:
> 
> > a GScanner will tokenize your text, that is, it'll return an integer
> > for every word or number that appears in its input stream, following
> > certain (customizable) rules to perform this translation.
> > you still need to write the parsing functions on your own though.
> > here's a little test program that will parse
> > 
> > <SYMBOL> = <OPTIONAL-MINUS> <NUMBER> ;
> > 
> > constructs, while skipping "#\n" and "/**/" style comments.
> > ...
> Thanks for the example.  I tried it and it worked so far.  Because
> I also had to scan things other than floats, I modified your
> example, but without success.  I have attached my code.
> 
> The code fails when scanning gchar * data.  Is there any documentation
> on how to do that right?

your modifications:

/* some test text to be fed into the scanner */
static const gchar *test_text =
( "Datum: 02-16-1999\n"
  "Probenname: Hans_1\n"
  "Modulator f: 500.0\n"
  "Schwingkreis f: 500.0\n"
  "Schwingkreis U: 5.0\n"
  "Temperatur: 25.0\n"
  "Trigger-Delay: -400\n"
  "Punktzahl: 8192\n"
  "Messzeit: 2.0e-006\n" );

"Schwingkreis f" can't be scanned as a single token, because a GScanner
skips spaces by default and does not treat a space as a valid identifier
character.

when you create a new scanner, g_scanner_new() accepts a pointer to a
GScannerConfig structure that holds the settings controlling its
scanning behaviour. if you pass NULL instead, the scanner falls back
to a default configuration, defined in gscanner.c:

static GScannerConfig g_scanner_config_template =
{
  (
   " \t\r\n"
   )                    /* cset_skip_characters */,
  (
   G_CSET_a_2_z
   "_"
   G_CSET_A_2_Z
   )                    /* cset_identifier_first */,
  (
   G_CSET_a_2_z
   "_0123456789"
   G_CSET_A_2_Z
   G_CSET_LATINS
   G_CSET_LATINC
   )                    /* cset_identifier_nth */,
[....]

the field cset_skip_characters is set up with " \t\r\n", so that
spaces, tabs, carriage returns and newlines are automatically
skipped by the scanner.
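
as a rough illustration of what that means for the test text, here is a
stand-alone sketch in plain C. it does not use GLib and is not how
gscanner.c is implemented; SKIP_CHARS and first_word_length() are names
made up for this example. it only measures how far a word extends before
the first skip character is hit:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* same characters as the default cset_skip_characters quoted above */
#define SKIP_CHARS " \t\r\n"

/* hypothetical helper (not a GLib function): step over leading skip
 * characters, then return the length of the run of non-skip
 * characters that follows.
 */
size_t
first_word_length (const char *text)
{
  assert (text != NULL);

  text += strspn (text, SKIP_CHARS);   /* drop leading skip characters */
  return strcspn (text, SKIP_CHARS);   /* stop at the next skip character */
}
```

for "Schwingkreis f" the run ends after "Schwingkreis", so the "f" can only
ever arrive as a separate token, while "Schwingkreis_f" stays in one piece.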

further, "Trigger-Delay" can't be a valid symbol with the default
configuration either. symbols are initially scanned as identifiers, and
are converted to symbols only if a lookup in the scanner's internal hash
table is successful. neither cset_identifier_first nor cset_identifier_nth
contains a '-', so GScanner will not scan identifiers across '-'es.

to get this to work, you have two options. either rename "Trigger-Delay"
and "Schwingkreis f" to "Trigger_Delay" and "Schwingkreis_f", which works
with the default configuration. or tweak the default configuration and use
"Trigger-Delay" and "Schwingkreis-f"; for that you need to add a '-' to
cset_identifier_nth, and then you can't parse constructs like

x = 5; y = x-2;

because the '-' would not be returned as a separate character ("x-2"
would be scanned as a single identifier).
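
to make the trade-off concrete, here is a small stand-alone model of
identifier scanning in plain C. it is a sketch under simplifying
assumptions, not GScanner's code: the character sets are cut-down
stand-ins for cset_identifier_first and cset_identifier_nth (the Latin-1
ranges are left out), and identifier_length() is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define LETTERS "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

/* simplified stand-ins for the default identifier character sets */
#define IDENT_FIRST       LETTERS "_"
#define IDENT_NTH         LETTERS "_0123456789"
/* the tweaked variant discussed above: '-' allowed inside identifiers */
#define IDENT_NTH_HYPHEN  IDENT_NTH "-"

/* hypothetical helper (not a GLib function): length of the identifier
 * at the start of text, or 0 if the first character may not start one.
 */
size_t
identifier_length (const char *text,
                   const char *first,
                   const char *nth)
{
  assert (text != NULL && first != NULL && nth != NULL);

  if (*text == '\0' || strchr (first, *text) == NULL)
    return 0;

  /* one "first" character, followed by any number of "nth" characters */
  return 1 + strspn (text + 1, nth);
}
```

with IDENT_NTH, "Trigger-Delay" yields 7 (just "Trigger"); with
IDENT_NTH_HYPHEN it yields 13, but "x-2;" then yields 3 instead of 1, so
the subtraction is swallowed by the identifier.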



also, you have to adapt parse_symbol() so that it parses tokens other than
floats as well. instead of:

  /* expect a valid symbol */
  g_scanner_get_next_token (scanner);
  symbol = scanner->token;
  if (symbol < SYMBOL_DATE ||
      symbol > SYMBOL_T)
    return G_TOKEN_SYMBOL;

  /* expect ':' */
  g_scanner_get_next_token (scanner);
  if (scanner->token != ':')
    return ':';

  /* expect a float (ints are converted to floats on the fly) */
  g_scanner_get_next_token (scanner);
  if (scanner->token != G_TOKEN_FLOAT)
    return G_TOKEN_FLOAT;

  /* assign value, eat the semicolon and exit successfully */
  switch (symbol)
    {
    case SYMBOL_DATE:
      date = scanner->value.v_string;
      break;
    case SYMBOL_F_m:
      F_m = scanner->value.v_float;
      break;

which will only parse floats and then switch() on the symbols,
you need to do something like:

  /* expect a valid symbol */
  g_scanner_get_next_token (scanner);
  symbol = scanner->token;
  if (symbol < SYMBOL_DATE ||
      symbol > SYMBOL_T)
    return G_TOKEN_SYMBOL;

  /* expect ':' */
  g_scanner_get_next_token (scanner);
  if (scanner->token != ':')
    return ':';

  /* fetch the value token expected for this symbol and assign it */
  switch (symbol)
    {
    case SYMBOL_DATE:
      /* expect a string */
      g_scanner_get_next_token (scanner);
      if (scanner->token != G_TOKEN_STRING)
        return G_TOKEN_STRING;
      date = g_strdup (scanner->value.v_string);
      break;
    case SYMBOL_F_m:
      /* expect a float (ints are converted to floats on the fly) */
      g_scanner_get_next_token (scanner);
      if (scanner->token != G_TOKEN_FLOAT)
        return G_TOKEN_FLOAT;
      F_m = scanner->value.v_float;
      break;


to ensure you get the correct token type for each symbol. also, if you
retrieve strings from the scanner, you have to copy them, as a GScanner
will of course free its values on the fly again (i.e. when the next value
is put into scanner->value).
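
the ownership rule can be sketched with a toy model in plain C. this is
only an illustration of why the copy is needed: ToyScanner,
toy_scanner_set_value() and toy_scanner_dup_value() are invented for this
example and are not part of GLib (in real code, g_strdup() as used above
does the copying).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* toy stand-in for a scanner that owns its current string value */
typedef struct
{
  char *v_string;
} ToyScanner;

/* hypothetical helper: return a freshly allocated copy of s */
char *
copy_string (const char *s)
{
  size_t len;
  char *copy;

  assert (s != NULL);
  len = strlen (s) + 1;
  copy = malloc (len);
  assert (copy != NULL);
  memcpy (copy, s, len);
  return copy;
}

/* store a new value; the previous one is freed, so any pointer to it
 * that the caller kept around becomes invalid.
 */
void
toy_scanner_set_value (ToyScanner *scanner, const char *text)
{
  free (scanner->v_string);
  scanner->v_string = copy_string (text);
}

/* what the caller should do instead of keeping the raw pointer:
 * take a private copy that survives the next set_value call.
 */
char *
toy_scanner_dup_value (const ToyScanner *scanner)
{
  return copy_string (scanner->v_string);
}
```

a caller that duplicates the value right after scanning still holds a valid
string once the scanner has moved on, and is responsible for freeing it.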

> 
> Kind regards
> 
>      Andreas.
> 

---
ciaoTJ


