Re: [gmime-devel] Parsing of invalid files
- From: Jeffrey Stedfast <fejj gnome org>
- To: Vitaliy Didik <ariman creiac com>
- Cc: gmime-devel-list gnome org
- Subject: Re: [gmime-devel] Parsing of invalid files
- Date: Fri, 10 Jun 2011 20:08:19 -0400
Hi Vitaliy,
On 06/10/2011 04:46 AM, Vitaliy Didik wrote:
Hello,
I have an issue with GMime. I'm not sure that this is a bug because
what I'm trying to accomplish is probably out side if the scope of the
library.
Currently GMime considers any file that is supplied to parser as MIME
message. Even jpeg images, zip archives, etc. All those files treated
as MIME messages with text/plain content type.
Upon further investigation I've found that in function
'parser_step_headers' in 'gmime-parser.c' there is a check for invalid
input characters but reaction on them is rather limited.
I'm talking about following code:
if (! valid) {
if (priv->scan_from && (inptr - start) == 4
&& !strncmp (start, "From ", 5))
goto next_message;
if (priv->headers != NULL || *inptr == ':') {
/* probably the start of the content,
* a broken mailer didn't terminate the
* headers with an empty line. *sigh* */
goto content_start;
}
}
If I supply binary file to parser then first 2 conditions are not met
and despite valid flag still FALSE parser continues to process input
and subsequently creates new message object.
Is it an intended behavior or indeed a bug?
I've added 2 lines in code above (before last bracket):
priv->state = GMIME_PARSER_STATE_ERROR;
return 0;
and it seems to fix my problems. But may be this is not a right
solution and you can point me to more appropriate fix.
I'm not 100% sure this is "safe", but the more I look at that code, the
more I think it should be okay to do something like this. I'll keep
thinking about this for the weekend and probably commit something along
these lines.
I originally didn't do this because I didn't want to error out ever
under the assumption that it was best to do anything we could to parse
the input stream as a message/mbox file.
If I do add a ERROR fix for this case, I may need to relax it a bit in
the future (if I get any bug reports asking me to make the parser more
liberal in what it accepts), but I think it should be ok for now.
One possible change to make it a bit more relaxed is maybe only return
error if *inptr is a ctrl character? Could you check that for me? Just
check if is_type (*inptr, IS_CTRL) returns non-zero.
Jeff
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]