Re: [Vala] Manipulating HTML tag soup in Vala
- From: "marcin saepia net" <marcin saepia net>
- To: Michael Gratton <mike vee net>
- Cc: vala-list <vala-list gnome org>
- Date: Tue, 2 Aug 2016 01:04:13 +0200
Hello,
How about two-stage processing? Load the HTML into WebKitGTK, get the DOM (
https://webkitgtk.org/reference/webkit2gtk/stable/WebKitWebPage.html#webkit-web-page-get-dom-document),
which already contains the parsed structure, sanitize that DOM, and then
serialize the modified DOM for display and future use.
It should be more secure, too.
m.
2016-08-01 10:01 GMT+02:00 Michael Gratton <mike vee net>:
Hey all,
I'm looking for an HTML tag soup library for Geary, that can load tag soup
HTML (i.e. possibly malformed) from a stream, allow some manipulation of
it, and re-serialise it for display in WebKitGTK. Ideally, a pull-parser
API like libxml2's TextReader or StAX[0] would be great, so the whole
document does not need to be kept in memory as it is processed.
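To make the streaming requirement above concrete, here is a minimal sketch in Python rather than Vala, using only the standard library's html.parser. It is not one of the libraries discussed in this thread, and it is an event-driven (push) parser rather than a true pull parser, but it shows the same idea: tolerate malformed markup, rewrite it on the fly, and re-serialise it without building a tree. The names StreamSanitizer, BLOCKED and sanitize, and the policy of dropping script/style elements and on* attributes, are assumptions made up for this illustration.

```python
import io
from html import escape
from html.parser import HTMLParser

# Hypothetical policy for this sketch: drop these elements and their content.
BLOCKED = {"script", "style"}


class StreamSanitizer(HTMLParser):
    """Re-serialise HTML events to `out` as they arrive, without building a tree."""

    def __init__(self, out):
        # Keep character references as separate events so they pass through as written.
        super().__init__(convert_charrefs=False)
        self.out = out
        self.skip_depth = 0  # > 0 while inside a blocked element

    def _write_start(self, tag, attrs):
        self.out.write("<" + tag)
        for name, value in attrs:
            if name.startswith("on"):
                continue  # drop inline event handlers such as onclick
            if value is None:
                self.out.write(" " + name)
            else:
                # The parser hands over unescaped values, so escape them again.
                self.out.write(' %s="%s"' % (name, escape(value, quote=True)))
        self.out.write(">")

    def handle_starttag(self, tag, attrs):
        if tag in BLOCKED:
            self.skip_depth += 1
        elif not self.skip_depth:
            self._write_start(tag, attrs)

    def handle_startendtag(self, tag, attrs):
        # Self-closing syntax such as <br/>: emit one start tag, no end tag.
        if tag not in BLOCKED and not self.skip_depth:
            self._write_start(tag, attrs)

    def handle_endtag(self, tag):
        if tag in BLOCKED:
            self.skip_depth = max(0, self.skip_depth - 1)
        elif not self.skip_depth:
            self.out.write("</%s>" % tag)

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.write(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.write("&%s;" % name)

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.write("&#%s;" % name)


def sanitize(html, chunk_size=7):
    """Feed `html` in small chunks, as if it were being read from a stream."""
    out = io.StringIO()
    parser = StreamSanitizer(out)
    for i in range(0, len(html), chunk_size):
        parser.feed(html[i:i + chunk_size])
    parser.close()
    return out.getvalue()
```

Unbalanced tags are passed through as they were found, since the final renderer is expected to cope with tag soup anyway. Only the unparsed tail of the current chunk is buffered, so memory use does not grow with document size.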
These are the ones I know about:
libxml2:
- Pros: Has a pull parser API, has a HTML4 tag soup parser, installed
everywhere
- Cons: Pull parser doesn't work with HTML parser without reading whole
document into memory, HTML parser out of date(?)
GXml:
- Pros: Nice Vala API, uses libxml2 under the hood
- Cons: Not a pull parser, loads whole document into memory, doesn't seem
to be packaged for any distros, doesn't use the libxml HTML parser(?)
Others:
- WebKitGTK+: Great tag soup parser, no pull API, doesn't allow
manipulating the markup before displaying it (which is the main reason I
need to parse the HTML beforehand)
- XML Bird: Nice Vala API, but not a pull parser or a HTML parser
So none of these seem to completely fit the bill. Are there any other
options out there that I have missed? Has anyone else had to parse tag soup
in Vala?
Ta!
//Mike
[0] - <https://en.wikipedia.org/wiki/StAX>
--
⊨ Michael Gratton, Percept Wrangler.
⚙ <http://mjog.vee.net/>