Re: stuck in file crawl task loop



OK.  As promised, an update.  Got a file that is getting indexed over
and over again.  Nothing interesting in the log though:

20070127 11:06:46.5566 29766 Beagle DEBUG: Adding directory '/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir/0009'
...

Crawler found the directory.

20070127 11:06:46.6872 29766 Beagle DEBUG: PostAddHook for uid:P_xriUY8Fk2GObz2aETlPw (/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir/0009) and receipt uri=uid:P_xriUY8Fk2GObz2aETlPw
...

Indexing of the directory is complete and the data is in Lucene. This is
stage 1 of post-processing.

20070127 11:06:46.6875 29766 Beagle DEBUG: PostChildrenIndexedHook for uid:P_xriUY8Fk2GObz2aETlPw (/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir/0009) and receipt uri=uid:P_xriUY8Fk2GObz2aETlPw

This is the final stage of post-processing. Directory information will be
added to the filesystem backend's state. Indexing information will be
written to the xattr or fileattr.db.

20070127 11:06:46.6888 29766 Beagle DEBUG: Registered directory '/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir/0009' (896bfc3f-3c46-4d16-8639-bcf66844e53f)
20070127 11:06:46.6890 29766 Beagle DEBUG: Created model '/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir/0009' with parent '/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir'
...

The files backend registers the new directory as part of the filesystem.
The directory will now be watched and will be scheduled for crawling.

??? - After this, the indexing bookkeeping info should be written to
the xattr (which will fail since it's NFS) or sqlite.
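
To see whether extended attributes are usable on a given mount at all, a
quick probe along these lines might help. This is only a sketch, not what
beagle itself does: the function name and the "user.probe" attribute are
made up for illustration, and os.setxattr/os.getxattr are Linux-only.

```python
import os
import tempfile

def supports_user_xattrs(directory):
    """Return True if a user.* xattr can be written and read back in `directory`."""
    if not hasattr(os, "setxattr"):   # os.setxattr only exists on Linux
        return False
    fd, probe = tempfile.mkstemp(dir=directory)
    os.close(fd)
    try:
        # "user.probe" is a hypothetical attribute name used only for this test
        os.setxattr(probe, "user.probe", b"1")
        return os.getxattr(probe, "user.probe") == b"1"
    except OSError:
        # Typical on filesystems without user xattr support
        return False
    finally:
        os.remove(probe)
```

Calling supports_user_xattrs("/data") on the share in question would tell
whether the sqlite fallback is the only place the bookkeeping can end up.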

20070127 11:06:46.7006 29766 Beagle DEBUG: Running file crawl task
20070127 11:06:46.7044 29766 Beagle DEBUG: Starting crawl of '/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir/0009'
...
20070127 11:06:46.7278 29766 Beagle DEBUG: Done crawling '/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir/0009'
20070127 11:06:46.7325 29766 Beagle DEBUG: Running file crawl task
20070127 11:06:46.7935 29766 Beagle DEBUG: Starting crawl of '/data/jennifer_pc/c.bak/WINDOWS/TEMP/pft53A2~TMP/disk1/setupdir/0009'

If the bookkeeping info is not in the xattr or sqlite, this is bound to happen.
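
A toy model of why that happens, under the assumption that the crawler
decides "up to date or not" purely from the stored bookkeeping. This is not
beagle's code; count_crawls, store and persist are hypothetical names.

```python
def count_crawls(directory, store, persist, max_rounds=5):
    """Count how often `directory` is crawled before it is seen as up to date."""
    crawls = 0
    for _ in range(max_rounds):
        if directory in store:      # bookkeeping found: nothing left to do
            break
        crawls += 1                 # "Starting crawl of ..." / "Done crawling ..."
        if persist:
            store.add(directory)    # the write to xattr or sqlite succeeded
    return crawls

print(count_crawls("0009", set(), persist=True))   # 1
print(count_crawls("0009", set(), persist=False))  # 5 (stopped only by max_rounds)
```

When the write never lands, nothing ever marks the directory as done, so
only the artificial max_rounds cap ends the loop.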

$ echo 'select * from file_attributes where filename = "0009";' | sqlite3 ~/.beagle/Indexes/FileSystemIndex/FileAttributesStore.db
...
Notice the file it's looping on is not in the database?  /data is an nfs
share BTW.
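
The same check can be scripted, for instance to run it against many
filenames. A minimal sketch: the table and column names are taken from the
query above, while the function name is made up, and the real schema may
have more columns than this assumes.

```python
import sqlite3

def has_attributes_row(db_path, filename):
    """Return True if file_attributes contains a row for `filename`."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            # Bound parameter instead of a quoted literal in the SQL text
            "SELECT 1 FROM file_attributes WHERE filename = ? LIMIT 1",
            (filename,),
        ).fetchone()
        return row is not None
    finally:
        conn.close()
```

Note that sqlite3.connect creates an empty database if db_path does not
exist, in which case the query raises sqlite3.OperationalError because the
table is missing.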

Huh! And there are no sqlite exceptions anywhere?

Joe recently fixed one set of sqlite-related problems. I am trying to
work out whether that will fix this problem. I will post another set of
binaries sometime later today or tomorrow with Joe's fix and some
additional sqlite debugging to verify that the data is indeed being
written. It will still be based on 0.2.14 so that I can confirm the
precise cause of the bug.

Thanks again,
- dBera

--
-----------------------------------------------------
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user


