Re: [Evolution-hackers] Loading really large E-mails on devices with not enough Vm



Something like the attached patch might work, tho it is untested.

If this doesn't work, then I suspect the problem is that the seek
position might get changed out from under the mime parser (assuming it
is using either a CamelStreamFs or an fd).

Note that camel_stream_fs_new_with_fd[_and_bounds]() calls lseek() on
the fd passed in.

>From the dup() man page:

       After  a  successful  return from dup() or dup2(), the old and new file
       descriptors may be used interchangeably.  They refer to the  same  open
       file description (see open(2)) and thus share file offset and file sta‐
       tus flags; for example,  if  the  file  offset  is  modified  by  using
       lseek(2)  on one of the descriptors, the offset is also changed for the
       other.

So my guess is that this will break the parser :(

It might break in the stream case as well, you'd have to follow the code
paths a bit to know for sure. For instance, even if creating the
seekable substream doesn't perform an underlying seek on the original
stream, setting it in a data wrapper might call camel_stream_reset()
which /might/ do an lseek() on the source fs stream.

Not an insurmountable problem to solve, but it does make things a little
more difficult and possibly touchy.

Jeff



On Sat, 2008-01-26 at 22:48 -0500, Jeffrey Stedfast wrote:
> On Sat, 2008-01-26 at 22:12 -0500, Jeffrey Stedfast wrote:
> > On Sat, 2008-01-26 at 13:44 +0100, Philip Van Hoof wrote:
> > > This is what happens if you try to open a truly large E-mail on a device
> > > that has not as much memory available:
> > > 
> > > Is there something we can do about this? Can we change the MIME parsing
> > > algorithm to be less memory demanding for example?
> > > 
> > > Note that GArray is not really very sparse with memory once you start
> > > having a really large array. Perhaps we can in stead change this to a
> > > normal pointer array of a fixed size (do we know the size before we
> > > start parsing, so that we can allocate an exact size in stead, perhaps?)
> > 
> > eh, why would you change it to a GPtrArray? It doesn't hold pointers, it
> > holds message part content.
> > 
> > Unfortunately we don't know the size ahead of time.
> > 
> > I suppose you could use a custom byte array allocator so that you can
> > force it to grow by larger chunks or something, dunno.
> >
> >
> > The way GMime handles this is by not loading content into RAM, but 
> > that may be harder to do with Camel, especially in the mbox case.
> 
> er, I should probably explain this:
> 
> - writing the code should be relatively easy to do, but in the mbox
> case, the mbox may end up getting expunged or rewritten for some other
> reason which may cause problems, not sure how that would work.
> 
> I think in Maildir, as long as the fd remains open, the file won't
> actually disappear after an unlink() until the fd gets closed, so that
> might work out ok assuming you can spare the fd (which might be the
> other problem with Evolution?).
> 
> Jeff
> 
> 
> _______________________________________________
> Evolution-hackers mailing list
> Evolution-hackers gnome org
> http://mail.gnome.org/mailman/listinfo/evolution-hackers
Index: ChangeLog
===================================================================
--- ChangeLog	(revision 8425)
+++ ChangeLog	(working copy)
@@ -1,3 +1,8 @@
+2008-01-26  Jeffrey Stedfast  <fejj novell com>
+
+	* camel-mime-part-utils.c (simple_data_wrapper_construct_from_parser):
+	If possible, keep the content on disk.
+
 2008-01-24  Matthew Barnes  <mbarnes redhat com>
 
 	* camel-object.c (camel_object_cast):
Index: camel-mime-part-utils.c
===================================================================
--- camel-mime-part-utils.c	(revision 8425)
+++ camel-mime-part-utils.c	(working copy)
@@ -57,25 +57,47 @@
 static void
 simple_data_wrapper_construct_from_parser (CamelDataWrapper *dw, CamelMimeParser *mp)
 {
+	GByteArray *buffer = NULL;
+	CamelStream *stream;
+	off_t start, end;
+	int fd = -1;
+	size_t len;
 	char *buf;
-	GByteArray *buffer;
-	CamelStream *mem;
-	size_t len;
-
+	
 	d(printf ("simple_data_wrapper_construct_from_parser()\n"));
-
-	/* read in the entire content */
-	buffer = g_byte_array_new ();
+	
+	if (!(stream = camel_mime_parser_stream (mp)))
+		fd = camel_mime_parser_fd (mp);
+	else if (!CAMEL_IS_SEEKABLE_SUBSTREAM (stream))
+		stream = NULL;
+	
+	if ((stream || fd != -1) && (start = camel_mime_parser_tell (mp)) != -1) {
+		/* we can keep content on disk */
+	} else {
+		/* need to load content into memory */
+		buffer = g_byte_array_new ();
+	}
+	
 	while (camel_mime_parser_step (mp, &buf, &len) != CAMEL_MIME_PARSER_STATE_BODY_END) {
-		d(printf("appending o/p data: %d: %.*s\n", len, len, buf));
-		g_byte_array_append (buffer, (guint8 *) buf, len);
+		if (buffer != NULL) {
+			d(printf("appending o/p data: %d: %.*s\n", len, len, buf));
+			g_byte_array_append (buffer, (guint8 *) buf, len);
+		}
 	}
-
-	d(printf("message part kept in memory!\n"));
-
-	mem = camel_stream_mem_new_with_byte_array (buffer);
-	camel_data_wrapper_construct_from_stream (dw, mem);
-	camel_object_unref (mem);
+	
+	if (buffer == NULL) {
+		end = camel_mime_parser_tell (mp);
+		
+		if (stream != NULL)
+			stream = camel_seekable_substream_new ((CamelSeekableStream *) stream, start, end);
+		else
+			stream = camel_stream_fs_new_with_fd_and_bounds (dup (fd), start, end);
+	} else {
+		stream = camel_stream_mem_new_with_byte_array (buffer);
+	}
+	
+	camel_data_wrapper_construct_from_stream (dw, stream);
+	camel_object_unref (stream);
 }
 
 /* This replaces the data wrapper repository ... and/or could be replaced by it? */


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]