Jon Trowbridge writes:

> On Tue, 2005-10-04 at 23:24 +0100, Henry S. Thompson wrote:
> > Probably - I wasn't aware Beagle was vulnerable in this way -- where
> > can I find out more about 'objectionable' files?
> Sometimes files that are corrupted or malformed can trigger bugs in the
> filters, causing us to consume more memory and/or CPU than we'd like.
> This can also happen with certain types of extremely large files, and
> is usually a problem with the more complex file formats.  For example,
> indexing a very large (e.g. 500-page) Word document tends to cause a nasty
> spike in CPU usage, and very large HTML files (like auto-generated
> tables with thousands of rows) require a lot of memory to index.
> These sorts of things are generally related to not-easily-fixable issues
> with third-party libraries that we use to process these file formats.
> But even if they aren't easily fixable, we'd like to try.  If you have a
> document that causes these sorts of problems, please let us know.
> Offending documents can be attached to bug reports at
>, or can be e-mailed directly to us if they contain
> private/sensitive information.

OK, that helps me understand, thanks.

I'm going to continue sending reports about the state of my efforts to
get this running, as I know from past experience that other newcomers
welcome evidence that they are not alone. . .

I'll file detailed bug reports via bugzilla.

Two main areas of difficulty and one minor at the moment:

1) Not all directories are indexed:

   If I start with one root, I get repeated messages that that root is
   done, but nothing gets indexed.

   If I start with one of its subdirectories, _some_, but not all, of
   the sub-subdirectories then get scanned and indexed.

   Any idea what I can look at to help debug this?

2) Null key exception crashes the indexer, and although beagled is
   still running, it's no longer indexing.  Details in Bugzilla.

3) The log fills up with 

   INFO: NetBeagleConfigurationChanged EventHandler invoked
   INFO: WebServicesConfigurationChanged EventHandler invoked

   every minute.  I suppose this may be because my .beagle directory
   is on NFS -- I'll try moving it.
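
For anyone who wants to check whether they are seeing the same thing
in (3), here is a minimal sketch that tallies those messages.  It
assumes only the message text shown above; the function name and the
sample lines are made up for illustration, and you would feed it the
lines of whatever log file your installation actually writes.

```python
from collections import Counter

# Assumption: this marker is copied from the two log lines quoted above.
MARKER = "ConfigurationChanged EventHandler invoked"


def tally_config_events(lines):
    """Count how often each '...ConfigurationChanged' message appears."""
    counts = Counter()
    for line in lines:
        text = line.strip()
        if MARKER in text:
            # Keep only the handler name, e.g. "NetBeagleConfigurationChanged".
            name = text.split("INFO:", 1)[-1].split()[0]
            counts[name] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample; replace with the lines read from a real log.
    sample = [
        "INFO: NetBeagleConfigurationChanged EventHandler invoked",
        "INFO: WebServicesConfigurationChanged EventHandler invoked",
        "INFO: NetBeagleConfigurationChanged EventHandler invoked",
    ]
    for name, count in sorted(tally_config_events(sample).items()):
        print(name, count)
```

Running it before and after moving .beagle off NFS, over log excerpts
covering a similar length of time, should show whether the counts drop.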

-- 
 Henry S. Thompson, HCRC Language Technology Group, University of Edinburgh
                     Half-time member of W3C Team
    2 Buccleuch Place, Edinburgh EH8 9LW, SCOTLAND -- (44) 131 650-4440
            Fax: (44) 131 650-4587, e-mail: ht inf ed ac uk
[mail really from me _always_ has this .sig -- mail without it is forged spam]
