Re: non-gui usage of Gtk2::MozEmbed



I'd like to share the final solution; maybe it will save someone some time.

For a demonstration of how it works, see here.

===================
#!/usr/bin/perl

use strict;
use Glib qw(TRUE FALSE);
use Gtk2;
use Gtk2::MozEmbed '0.04';
use Mozilla::DOM '0.18';   # for NSHTMLElement

Gtk2->init_check();


my $url = "";   # set this to the URL of the page to render

my $window = Gtk2::Window->new ('toplevel');
my $embed = Gtk2::MozEmbed->new();
my $listener = Mozilla::DOM::EventListener->new(\&dom_complete_listener);
my $error_listener = Mozilla::DOM::EventListener->new(\&dom_complete_listener);

my $listener_added = 0;

$embed->signal_connect(net_stop => sub {
   
    my $browser = $embed->get_nsIWebBrowser;
    my $jswindow = $browser->GetContentDOMWindow;
    my $doc = $jswindow->GetDocument;
   
    my $dom_complete = $doc->GetElementById('domcomplete');
   
    if ($dom_complete) {
        dom_complete_listener();
    } elsif (!$listener_added) {
        $listener_added = 1;
       
        my $iid = Mozilla::DOM::EventTarget->GetIID;
        my $target = $jswindow->QueryInterface($iid);
        $target->AddEventListener('domcomplete', $listener, 0);
        $target->AddEventListener('error', $error_listener, 0);
    }

    return FALSE;
});

$window->add($embed);
$embed->realize();

$embed->load_url($url);

# Fallback: dump the page after 15 seconds even if no 'domcomplete' event arrives
Glib::Timeout->add(15000, \&dom_complete_listener );

Gtk2->main();


sub dom_complete_listener {
    # Get <html> element
    my $browser = $embed->get_nsIWebBrowser;
    my $window = $browser->GetContentDOMWindow;

    my $selection = $window->GetSelection;
    my $doc = $window->GetDocument;
   
    #    my $diid = Mozilla::DOM::NSDocument->GetIID;
    #    my $nsdoc = $doc->QueryInterface($diid);
    #    print "charset=", $nsdoc->GetCharacterSet, $/;
    #    print "location=", $nsdoc->GetLocation->ToString, $/;
    #    print "contenttype=", $nsdoc->GetContentType, $/;
    #    print "title=", $nsdoc->GetTitle, $/;
    #    print "lastmod=", $nsdoc->GetLastModified, $/;
    #    print "referer=", $nsdoc->GetReferrer, $/;
   
    my $docelem = $doc->GetDocumentElement;

    my $html_tag = '<' . $docelem->GetNodeName() . ' ';
    if ($docelem->HasAttributes) {
        my $attrs = $docelem->GetAttributes;
        for (my $i = 0; $i < $attrs->GetLength; $i++) {
            my $attr = $attrs->Item($i);
            $html_tag .= $attr->GetNodeName() . '="' . $attr->GetNodeValue() . '" ';
        }
    }
    $html_tag .= ">\n";

    # Switch that element to NSHTMLElement interface
    my $eiid = Mozilla::DOM::NSHTMLElement->GetIID;
    my $nshtmlelement = $docelem->QueryInterface($eiid);

    # Print out innerHTML.
    # Unfortunately Mozilla doesn't support outerHTML,
    # so we can't print the whole thing (not to mention
    # the DOCTYPE thing at the top)
    print($html_tag);
    print($nshtmlelement->GetInnerHTML());
    print("\n</html>\n");
    Gtk2->main_quit;
}
===================
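
One caveat about the tag-building loop in dom_complete_listener: attribute values are concatenated as-is, so a value containing a double quote or an ampersand would produce a malformed opening tag. Below is a minimal sketch of an escaping helper in plain Perl. The names escape_attr and build_open_tag are my own (hypothetical) and are not part of Mozilla::DOM; the helper simply takes the node name and a flat list of attribute name/value pairs.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Mozilla::DOM): escape the characters
# that would break a double-quoted HTML attribute value.
sub escape_attr {
    my ($value) = @_;
    $value = '' unless defined $value;
    $value =~ s/&/&amp;/g;    # must come first, the other entities contain '&'
    $value =~ s/"/&quot;/g;
    $value =~ s/</&lt;/g;
    $value =~ s/>/&gt;/g;
    return $value;
}

# Hypothetical helper: build an opening tag from a node name and a flat
# list of attribute name/value pairs, keeping the attribute order.
sub build_open_tag {
    my ($name, @pairs) = @_;
    my $tag = '<' . $name;
    while (@pairs) {
        my ($attr, $value) = splice(@pairs, 0, 2);
        $tag .= ' ' . $attr . '="' . escape_attr($value) . '"';
    }
    return $tag . ">\n";
}

# prints: <HTML lang="en" title="say &quot;hi&quot; &amp; bye">
print build_open_tag('HTML', lang => 'en', title => 'say "hi" & bye');
```

In the script, the loop over $attrs could collect GetNodeName/GetNodeValue into such a list and pass it to build_open_tag instead of appending to $html_tag directly; I have not tested that variant against a live page.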

The rendering script above:
  - must be run with an X server present; Xvfb works. See also the xvfb-run source for how to set up the required X authentication (you may also try xhost +)
  - requires the experimental features of Mozilla::DOM to be enabled (see the README in the distribution)
  - requires a small addition on the client side (in the page's JavaScript), see below:

===============
// Create the custom event
if ("createEvent" in document) {
    DomCompleteEvent = document.createEvent("Events");
    DomCompleteEvent.initEvent("domcomplete", true, false);
}


// Once the page is ready: append a new <div id="domcomplete"/> element to the
// document's body and fire the "domcomplete" event
Ext.DomHelper.append(Ext.getBody(), {
    tag : 'div',
    id : 'domcomplete'
});

if (typeof DomCompleteEvent != 'undefined') window.dispatchEvent(DomCompleteEvent);
===============
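
For the X server requirement, one way to call the renderer from other Perl code (for example a web-application controller) is to launch it through xvfb-run and capture its output. This is only a sketch under assumptions: it assumes xvfb-run is installed and on the PATH, that the rendering script has been saved to a file whose path you pass in, and the function names xvfb_command and render_under_xvfb are hypothetical.

```perl
use strict;
use warnings;

# Build the command line. '-a' asks xvfb-run to pick a free display
# number; xvfb-run itself creates the X authority file for the child.
# $^X is the path of the currently running perl interpreter.
sub xvfb_command {
    my ($script) = @_;
    return ('xvfb-run', '-a', $^X, $script);
}

# Run the script under Xvfb and return everything it printed to STDOUT.
sub render_under_xvfb {
    my ($script) = @_;
    my @cmd = xvfb_command($script);
    open(my $out, '-|', @cmd) or die "cannot start @cmd: $!";
    local $/;                      # slurp mode: read all output at once
    my $html = <$out>;
    close($out) or die "@cmd exited with status $?";
    return $html;
}

# Only run when a script path is given on the command line.
print render_under_xvfb($ARGV[0]) if @ARGV;
```

The list form of open avoids going through a shell, so the script path needs no quoting.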

On Fri, Dec 12, 2008 at 11:29 PM, Nickolay Platonov <nickolay8 gmail com> wrote:
Oops, it seems I was replying to each person directly rather than via the mailing list, sorry.


My progress so far: I have a small separate Perl script (http://rafb.net/p/IRsq5r59.html) which I use to retrieve the innerHTML of a particular URL.
This script works only in the presence of an X server (you were right, Tadej), so I'm using Xvfb (many thanks to Dave for this clue).

But it behaves differently on my development machine (which makes requests to a remote DB) and on my production server (where requests to the DB are local).
On the production server the "net_stop" event seems to fire a bit too early, when the page is not fully complete yet.

It seems I'll need to replace the "net_stop" event with some kind of custom JavaScript event, as described here:
https://developer.mozilla.org/En/Mozilla_Embedding_FAQ/How_do_I...
(the "I need the JavaScript inside the browser window to talk to my embedding client. How do I do it?" section)


Many thanks to everyone for the help already provided.

Regards, Nickolay

On Wed, Dec 10, 2008 at 6:39 PM, Tadej Borovšak <tadeboro gmail com> wrote:
Hi.

I'm guessing now, but what happens if you call the realize method on your
embed object?


2008/12/10 Nickolay Platonov <nickolay8 gmail com>:
> Hello,
>
> Can someone please advise me on Gtk2::MozEmbed: can I use it in a
> "standalone" way, without adding it to a window?
> I'm experimenting with the simplest case, like here:
>
> #!/usr/bin/perl
>
>
> use Glib qw(TRUE FALSE);
> use Gtk2 -init;
> use Gtk2::MozEmbed '0.04';
> use Mozilla::DOM '0.18';   # for NSHTMLElement
>
>
>
> my $embed = Gtk2::MozEmbed->new();
>
> $embed->signal_connect(net_stop => sub {
>
>     print "Callback called";
>     Gtk2->main_quit;
>     return FALSE;
> });
>
>
> $embed->load_url("http://ya.ru");
>
> print "Starting mainloop\n";
>
> Gtk2->main();
>
> and it seems the callback never gets called, whereas this variant works:
>
> #!/usr/bin/perl
>
>
> use Glib qw(TRUE FALSE);
> use Gtk2 -init;
> use Gtk2::MozEmbed '0.04';
> use Mozilla::DOM '0.18';   # for NSHTMLElement
>
> my $window = Gtk2::Window->new ('toplevel');
> my $embed = Gtk2::MozEmbed->new();
>
> $embed->signal_connect(net_stop => sub {
>     print "Callback called";
>     Gtk2->main_quit;
>     return FALSE;
> });
>
> $embed->load_url("http://ya.ru");
>
> print "Starting mainloop\n";
>
> $window->add($embed);
> $window->show_all;
>
> Gtk2->main();
>
> I'm trying to create a server-side rendering service for search-engine
> spiders, which cannot execute JavaScript. And I need to somehow extract
> the innerHTML of the page (including HTML produced by JavaScript) in a
> "non-gui" way, in the web application's controller.
>
> Best regards, Nickolay



--
Tadej Borovšak
00386 (0)40 613 131
tadeboro gmail com
tadej borovsak gmail com



