Re: Opera backend for Beagle



Wow. Just updated to the latest revision. Now Opera indexing works
much better with 9.50 Beta. Thank you very much.

The only page that is still not indexed is kino.local.pp.ru, which I
mentioned earlier. I looked through the page source and tried running
beagle-extract-content on the saved file. It seems to me that the
problem is that the page's content is not static HTML: the names of
the films are printed with JavaScript, and beagle-extract-content
ignores that part of the page. My explanation is clumsy, so maybe this
will describe the situation better:

$ grep "Diamonds" kinolocal.html | wc -l
1
$ beagle-extract-content --mimetype text/html kinolocal.html | grep "Diamonds" | wc -l
0
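
To make the suspected cause concrete, here is a minimal Python sketch of a text extractor that drops everything inside <script> and <style>. This is not Beagle's actual HTML filter, and I have not checked how that filter is implemented. The class name VisibleTextExtractor, the helper visible_text and the SAMPLE page are made up for illustration. It only assumes that the extractor discards script content, which would explain why grep on the raw file finds the string while the extracted text does not.

```python
from html.parser import HTMLParser


class VisibleTextExtractor(HTMLParser):
    """Collects text nodes, skipping anything inside <script> or <style>.

    Hypothetical stand-in for an indexer's HTML filter; not Beagle code.
    """

    def __init__(self):
        super().__init__()
        self._skip_depth = 0   # > 0 while inside a <script> or <style> element
        self.chunks = []       # text fragments that would reach the index

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Script bodies arrive here as plain data; drop them.
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html):
    """Return the text a script-skipping extractor would keep."""
    parser = VisibleTextExtractor()
    parser.feed(html)
    parser.close()
    return "".join(parser.chunks)


# Made-up sample page: the film name exists only inside a script.
SAMPLE = """<html><body>
<h1>Films</h1>
<script type="text/javascript">
document.write("Diamonds");
</script>
</body></html>"""

if __name__ == "__main__":
    print("Diamonds" in SAMPLE)                # raw source, like grep
    print("Diamonds" in visible_text(SAMPLE))  # extracted text only
```

If this is what happens, the film names could only be indexed by actually running the page's JavaScript, or by indexing the rendered page rather than the saved source.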

I can attach this .html file, but it's huge; even bzipped it is 128 KB. Should I?

