glib / String handling difficult



Hello. I have always found string handling quite difficult in C.
Tools such as sed, awk and perl do not seem to cover all of my
simple needs either. GLib could be improved here.

I often need an extractor that finds something in arbitrary
data. I have two routines: file2buffer(), which loads a file
into a buffer, and bufferfind(), which searches the buffer
for a string. The file could be an HTML page downloaded with
wget, or a file that mixes binary and ASCII data.
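
To make the two routines concrete, here is a minimal sketch of what
they might look like. The post does not show their signatures, so the
ones below are assumptions, not the author's actual code. The search
is length-based rather than NUL-terminated, so it should also work on
the mixed binary and ASCII files mentioned above.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical signature: load a whole file into a malloc'd buffer.
 * Returns NULL on failure. On success the byte count is stored in
 * *len, and one extra NUL byte is appended after the data so that
 * pure-text callers can still use the str*() functions. */
char *file2buffer(const char *path, size_t *len)
{
    FILE *f = fopen(path, "rb");
    if (!f)
        return NULL;

    size_t cap = 4096, used = 0, n;
    char *buf = malloc(cap + 1);
    if (!buf) {
        fclose(f);
        return NULL;
    }

    while ((n = fread(buf + used, 1, cap - used, f)) > 0) {
        used += n;
        if (used == cap) {              /* buffer full: double it */
            char *tmp = realloc(buf, cap * 2 + 1);
            if (!tmp) {
                free(buf);
                fclose(f);
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
    }
    if (ferror(f)) {                    /* read error, not end of file */
        free(buf);
        fclose(f);
        return NULL;
    }
    fclose(f);

    buf[used] = '\0';
    *len = used;
    return buf;
}

/* Hypothetical signature: find the first occurrence of needle in
 * buf[0..len). Embedded NUL bytes in buf do not stop the search.
 * Returns a pointer into buf, or NULL if there is no match. */
const char *bufferfind(const char *buf, size_t len, const char *needle)
{
    size_t nlen = strlen(needle);
    if (nlen == 0 || nlen > len)
        return NULL;
    for (size_t i = 0; i + nlen <= len; i++) {
        if (memcmp(buf + i, needle, nlen) == 0)
            return buf + i;
    }
    return NULL;
}
```

The caller owns the buffer returned by file2buffer() and releases it
with free(). The linear memcmp() scan is the simplest portable choice;
a faster search could be swapped in without changing the interface.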

Now a more difficult example. I need to parse "12 pages" in
the middle of an HTML page. sscanf("%i pages",), for example,
yields nothing even though there is only one "<number> pages"
in the file. For now I first find "pages" and then move back
over the "12", but this is not simple.
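
sscanf() finds nothing here because it starts matching at the first
character of its input and does not search forward, so %i fails as
soon as it meets the leading HTML. The "find the word, then move back
over the number" approach can be sketched roughly as below. The
function name find_count_before() is made up for this illustration,
and the sketch assumes NUL-terminated text and a plain decimal number.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: look for "<digits><optional spaces><word>"
 * anywhere in text. Returns 1 and stores the number in *out when a
 * match is found, otherwise returns 0 and leaves *out untouched. */
int find_count_before(const char *text, const char *word, long *out)
{
    if (*word == '\0')
        return 0;

    const char *hit = text;
    while ((hit = strstr(hit, word)) != NULL) {
        const char *p = hit;

        /* Step back over any whitespace between number and word. */
        while (p > text && isspace((unsigned char)p[-1]))
            p--;
        const char *end = p;

        /* Step back over the digits themselves. */
        while (p > text && isdigit((unsigned char)p[-1]))
            p--;

        if (p < end) {                  /* at least one digit found */
            *out = strtol(p, NULL, 10);
            return 1;
        }
        hit++;                          /* no number here, keep looking */
    }
    return 0;
}
```

A call such as find_count_before(html, "pages", &n) then skips any
earlier "pages" that has no number in front of it. Negative numbers
and overflow are not handled in this sketch.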

The lexical scanner in GLib is nice, but a similar parser is
needed as well: a parser to which the rules are fed at
runtime, not compiled in as with flex/bison. For example:
  p = parser_new();
  parser_add_rule(p,"rule : token otherrule", callback);
  parser_add_rule(p,"rule : alttoken anotherrule", altcallback);
  <etc.>
  parser_prepare(p); // converts rules to efficient execution data structure
  parser_input(p,text);

Then in my application I could mix bufferfind() with several
miniparsers, and I could define miniparsers as needed in the
app's dialog without compiling anything.

Juhana
-- 
  http://music.columbia.edu/mailman/listinfo/linux-graphics-dev
  for developers of open source graphics software


