Advisory: beagle-0.3.0 goes into a loop while indexing certain HTML emails

From: Debajyoti Bera <dbera web gmail com>
To: Beagle <dashboard-hackers gnome org>
Subject: Advisory: beagle-0.3.0 goes into a loop while indexing certain HTML emails
Date: Thu, 6 Dec 2007 09:24:29 -0500

Problem: For some HTML emails, the index-helper will use 100% CPU for a long 
time and might end up using a lot of memory. This can happen for emails from 
any of the email backends or for email files in the file system. Saving such 
an email to disk and running 
"beagle-extract-content --mimetype=message/rfc822 /path/to/saved/email" 
will reproduce the problem.
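
To check a saved email without leaving a runaway process behind, the command 
above can be wrapped in a timeout. The following is only a rough sketch in 
Python: the function name extraction_hangs and the 60 second limit are my own 
assumptions, and only the command line itself comes from this advisory.

```python
import subprocess

# Command line as given in the advisory. The timeout used below is an
# arbitrary assumption; a healthy extraction may need more or less time.
BEAGLE_CMD = ("beagle-extract-content", "--mimetype=message/rfc822")


def extraction_hangs(email_path, timeout_s=60.0, command=BEAGLE_CMD):
    """Return True if the command is still running after timeout_s seconds.

    The child process is killed when the timeout expires, so a looping
    extraction does not keep burning CPU after the check.
    """
    try:
        subprocess.run(
            [*command, email_path],
            stdout=subprocess.DEVNULL,  # extracted text is not needed here
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
            check=False,  # a non-zero exit status is not a hang
        )
    except subprocess.TimeoutExpired:
        return True
    return False
```

Calling extraction_hangs("/path/to/saved/email") returns True only when the 
timeout was hit. A slow but finite extraction can also trip it, so a True 
result is a hint to look closer, not proof of the loop.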

Fix: Use the patch r4262. It is a temporary workaround. The final fix is being 
discussed in 
http://bugzilla.gnome.org/show_bug.cgi?id=501803 (see the attached patch in 
this bug for the likely final fix).

Explanation: Earlier the email filter used to treat all non-text message parts 
as attachments, save them to temporary files on the disk and index them as 
attachments. The email filter was slightly modified in 0.3.0 to treat HTML 
message parts also as message body (with the side effect of not having to 
create a temporary file). A problem was detected in how the HTML filter, which 
parses the HTML message part, uses the GMime.Stream for that message part 
(see gnome-bug #501803 for the detailed explanation). The workaround stores 
that message part in a memorystream and gives it to the HTML filter. The 
memorystream behaves similarly to a filestream and so there should be no 
problem.
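
The idea behind the workaround can be illustrated outside of beagle. The 
sketch below is Python, not the actual r4262 patch (beagle itself is C# and 
uses GMime); the names buffer_stream, extract_html_text and TextCollector are 
made up for this illustration. It copies the whole message part into an 
in-memory buffer first and only then hands the buffered data to an HTML 
parser, so the parser never reads from the original stream directly.

```python
import io
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Hypothetical stand-in for an HTML filter: collects text nodes."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        text = data.strip()
        if text:  # skip whitespace-only nodes
            self.chunks.append(text)


def buffer_stream(part_stream, chunk_size=4096):
    """Copy a readable binary stream fully into a seekable memory buffer."""
    buffered = io.BytesIO()
    while True:
        chunk = part_stream.read(chunk_size)
        if not chunk:  # empty read means end of stream
            break
        buffered.write(chunk)
    buffered.seek(0)  # rewind so the consumer starts at the beginning
    return buffered


def extract_html_text(part_stream, encoding="utf-8"):
    """Buffer the message part in memory, then parse the buffered copy."""
    buffered = buffer_stream(part_stream)
    parser = TextCollector()
    parser.feed(buffered.read().decode(encoding, errors="replace"))
    parser.close()
    return " ".join(parser.chunks)
```

The cost of this approach is that the whole part is held in memory at once, 
which is the trade-off against writing a temporary file as the older code did.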

I have tested the workaround against my 19K emails (but I had also tested the 
unpatched version against my emails and that too passed!) and have not seen 
any hangs, crashes, memory problems or open file problems.

My sincere apologies for the trouble,
- dBera

-- 
-----------------------------------------------------
Debajyoti Bera @ http://dtecht.blogspot.com
beagle / KDE fan
Mandriva / Inspiron-1100 user

Follow-Ups:
- Re: Advisory: beagle-0.3.0 goes into a loop while indexing certain HTML emails
  - From: Kevin Kubasik
