[Evolution] Re: mbox parsing



Ok I had some more thoughts on this ...


One of the problems is a line of this format:

From: email@host.com <email@host.com>

in camel-mime-utils.c

It claims it is invalid (it looks like it parses the first address and
dislikes the second)

I thought about this, and well, it is invalid.  It gets parsed (correctly)
as

  atom @ atom . atom  + some rubbish

Which is a mailbox specification of the form

 addr-spec: local-part @ domain

See Appendix D of rfc822, rule mailbox, (from rule address, from
rule destination).

In any event, it is only a warning, and the expected behaviour will
still happen (it will detect an address with no real name).
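To make the "atom @ atom . atom + some rubbish" reading concrete, here is a
minimal sketch of how a scanner can consume an addr-spec and hand back
whatever is left over.  This is not the camel-mime-utils.c code; the
function names are made up for illustration, and it assumes a simplified
grammar (dot-separated atoms only, no quoted strings, comments, domain
literals or leading whitespace).

```c
#include <stddef.h>
#include <string.h>

/* Simplified RFC 822 atom character: printable ASCII that is neither
 * a space nor one of the "specials".  (Hypothetical helper.) */
int is_atom_char(unsigned char c)
{
    return c > 32 && c < 127 && strchr("()<>@,;:\\\".[]", c) == NULL;
}

/* Skip one atom (one or more atom characters).
 * Returns NULL if there is no atom at p. */
const char *skip_atom(const char *p)
{
    const char *start = p;

    while (is_atom_char((unsigned char)*p))
        p++;
    return p == start ? NULL : p;
}

/* Skip atom *("." atom).  Returns NULL if there is no leading atom
 * or if a dot is not followed by another atom. */
const char *skip_dotted_atoms(const char *p)
{
    p = skip_atom(p);
    while (p != NULL && *p == '.')
        p = skip_atom(p + 1);
    return p;
}

/* Skip local-part "@" domain.  Returns a pointer to the first
 * character after the addr-spec, or NULL if none starts at s. */
const char *skip_addr_spec(const char *s)
{
    const char *p = skip_dotted_atoms(s);

    if (p == NULL || *p != '@')
        return NULL;
    return skip_dotted_atoms(p + 1);
}
```

Given the header value "email@host.com <email@host.com>", skip_addr_spec()
stops right after the first address, so the remainder is
" <email@host.com>".  A parser that has already accepted the bare
addr-spec form has no rule left for that remainder, which is where the
warning comes from.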

The parser also thinks this line is a From line (which it clearly
isn't):

From the back of the rear balcony, a voice cries, "Give him
some chicken soup!"

Hmmm, that doesn't appear to be the killer either.

If you have an mbox-like file with non-munged From lines, then it
is just that, an mbox-like file.  mbox files must munge the From
line (camel for instance doesn't yet, and as such is quite broken).
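For reference, a rough sketch of what munging means when writing a
message body into an mbox file.  It assumes the common convention of
prefixing '>' to any body line that starts with "From "; the function
names are hypothetical and this is not what camel does (as noted, it
doesn't munge yet).

```c
#include <stdio.h>
#include <string.h>

/* A body line needs munging if it starts with "From " (note the
 * trailing space), since that is how an mbox message separator
 * begins.  A "From:" header does not match. */
int needs_from_munging(const char *line)
{
    return strncmp(line, "From ", 5) == 0;
}

/* Copy line into dst, prefixing '>' when it would otherwise look
 * like a separator.  Returns the number of characters written
 * (excluding the NUL), or -1 if dst is too small. */
int munge_from_line(const char *line, char *dst, size_t dstlen)
{
    int n = snprintf(dst, dstlen, "%s%s",
                     needs_from_munging(line) ? ">" : "", line);

    return (n < 0 || (size_t)n >= dstlen) ? -1 : n;
}
```

With this applied on write, the balcony line above would be stored as
">From the back of the rear balcony, ..." and a reader scanning for
"From " at the start of a line would no longer mistake it for the start
of a new message.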

I'll have another poke around (my system has some strange stuff on it) and
see where the segfault is coming from.

Note that a couple were fixed yesterday.  A missing Date header hit a typo
in the code, and incorrectly formed rfc2047 (internationalised) header
strings could cause libunicode to segfault (I totally missed that you
said it segfaulted).

We've tested successfully with mailboxes on the order of 6K and 20K
messages, and with special 'problem messages'.

The backtraces from the segfault were very useful in finding these
problems.




