On Wed, 2006-07-12 at 16:19 -0700, D Bera wrote:
> By default, beagle tries to index every file under your home-directory,
> _except_ dot-dirs. Files and subdirs under .dir won't be indexed.

Ahhh. Perhaps some of the files I am referring to are under dot-dirs. How about a flag to toggle that?

> It's hard to extract data from binary files.

How about running "strings" on them first?

> I am not even sure how to
> extract all words from a db file.

Strings does a pretty decent job.

> But anyway, beagle relies on a huge
> collection of filters to extract data from various types of files.

Right. I had gathered that.

> The
> filters in beagle cover nearly all the possible formats from which
> data extraction is possible e.g. html, doc, comments from jpeg. There
> is no filter for 'binary db' files as of now; hence beagle would
> ignore them.

How about a "default" filter for binaries that simply runs "strings"?

> Similar to the way you examine Google's index and see what webpages
> are in the index :)

Maybe beagle does something similar, but if not, I think my touché is coming...

http://www.google.ca/search?hl=en&q=site%3Abeagle-project.org&btnG=Google+Search&meta=

:-)

But that is not even really apples to apples. If I had Google's database, like I have Beagle's, I probably could do exactly what I mean.

> Jokes aside, the recommended way to examine if a file is indexed is to
> query for the filename. Put the whole name in quotes and you should
> get it in the results.

Yeah, cool! So a manual search does indicate that a given file I am thinking of is indeed in a dot-dir. :-( And it is a .db file, about which "file" says:

$ file .icq.old/history/6000006.db
.icq.old/history/6000006.db: GNU dbm 1.x or ndbm database, little endian

$ strings .icq.old/history/6000006.db
b ssage throug rver Hi! I found them! URL: http://artistic.device.sh

(Apart from being spam.) Obviously there is useful information in those files, even filtered through "strings".
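
To make the "default filter for binaries" idea a bit more concrete, here is a rough Python sketch of what a strings-style fallback could do. This is not Beagle code, and the function names are made up for illustration. It only pulls runs of printable ASCII out of a blob of bytes, roughly the way "strings" does, and the minimum length of 4 is an assumption borrowed from the usual default of strings.

```python
import re
from typing import List


def extract_strings(data: bytes, min_len: int = 4) -> List[str]:
    """Return runs of at least min_len printable ASCII characters.

    Hypothetical helper, loosely imitating strings(1); not a Beagle API.
    """
    # Printable ASCII (space through tilde) plus tab, repeated min_len+ times.
    pattern = re.compile(rb"[\x20-\x7e\t]{%d,}" % min_len)
    return [m.group().decode("ascii") for m in pattern.finditer(data)]


def extract_strings_from_file(path: str, min_len: int = 4) -> List[str]:
    """Read a file in binary mode and extract its printable runs."""
    with open(path, "rb") as fh:
        return extract_strings(fh.read(), min_len)
```

Something like extract_strings_from_file(".icq.old/history/6000006.db") would then yield text fragments that could be handed to an indexer as plain text, presumably with more noise than a format-aware filter would produce.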

Aside from the file being in a dot-dir, it would be nice if Beagle could give me this.

b.

--
My other computer is your Microsoft Windows server.

Brian J. Murrell