Re: Followup: opinions on Search services

From: Manuel Amador <rudd-o amautacorp com>
To: Miguel de Icaza <miguel ximian com>
Cc: Joe Shaw <joeshaw novell com>, gnome-devel-list gnome org, "John \(J5\) Palmieri" <johnp martianrock com>, Jamie McCracken <jamiemcc blueyonder co uk>
Subject: Re: Followup: opinions on Search services
Date: Tue, 26 Apr 2005 18:02:43 -0500

You guys probably know better than I do: how does Beagle manage to watch
so many directories at once?  I've resorted to using inotify as well, of
course, but what I do is rather kludgy: once setting a dir watch fails, I
try to double the inotify limit via sysfs, then go on and retry the
operation.  While this does let me watch tons of dirs (167,000 at last
count, primarily due to my mp3 collection), I am not sure whether this
is actually a "bright idea" (TM).
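
For concreteness, here is a rough sketch of the retry-and-double approach
described above. It is an illustration under stated assumptions, not the
actual code: add_watch is a hypothetical callable standing in for whatever
wraps inotify_add_watch and raises OSError with ENOSPC when the per-user
watch limit is hit, and LIMIT_PATH is where current Linux kernels expose
that limit (the location on the kernel discussed here may differ). Writing
the limit normally requires root.

```python
import errno

# Assumption: current kernels expose the per-user watch limit at this path.
LIMIT_PATH = "/proc/sys/fs/inotify/max_user_watches"


def read_limit(path=LIMIT_PATH):
    """Return the current per-user inotify watch limit."""
    with open(path) as f:
        return int(f.read().strip())


def write_limit(value, path=LIMIT_PATH):
    """Set a new watch limit (normally needs root privileges)."""
    with open(path, "w") as f:
        f.write(str(value))


def add_watch_with_retry(add_watch, directory,
                         get_limit=read_limit, set_limit=write_limit,
                         max_doublings=4):
    """Try to watch `directory`; on ENOSPC, double the limit and retry.

    `add_watch` is a hypothetical callable that returns a watch descriptor
    or raises OSError(errno.ENOSPC) when the watch limit is exhausted.
    """
    for attempt in range(max_doublings + 1):
        try:
            return add_watch(directory)
        except OSError as exc:
            # Only the "out of watches" error is worth retrying, and only
            # a bounded number of times.
            if exc.errno != errno.ENOSPC or attempt == max_doublings:
                raise
            set_limit(get_limit() * 2)
```

Capping the number of doublings keeps a runaway directory tree from
raising the limit without bound, which is part of why the approach feels
kludgy in the first place.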

On Thu, 07-04-2005 at 12:13 -0400, Miguel de Icaza wrote:
> Hello,
> 
> > > Adding on to this, if one designs their programs correctly the actual
> > > call overhead is negligible.  The only reason one would optimize by
> > > using a lower level language is if a block of code, usually in some sort
> > > of long running loop, is taking too long to finish.  In that case most
> > > of the time is spent in the call itself rendering the overhead of making
> > > the call negligible.
> > 
> > The issue here is memory and the garbage collector rather than loops. 
> > The Boehm GC is particularly slow at allocating large objects on the 
> > managed heap and the resulting fragmentation causes both poor 
> > performance (the GC spends an inordinate amount of CPU time searching 
> > for free blocks) and excessive memory consumption.
> 
> Those statements in general make sense, but they do not apply to Mono or
> Java using Boehm GC.
> 
> The reason why this is not an issue with Mono/Java is because we use the
> "precise" framework of Boehm GC, where we explicitly register the types
> and layouts of objects allocated with it, so Boehm only scans the parts
> that actually can contain pointers instead of all the blocks (the
> default mode of execution).   
> 
> This has huge performance implications.  You are correct that naive use
> of Boehm is in general an underperformer, but the situation changes
> drastically when employed as a precise GC. 
> 
> Boehm still presents problems; the major one is the lack of a
> compacting GC.  This leads to a situation where you can fragment the
> heap, very much in the same way that every C and C++ application
> fragments the heap today.
> 
> The situation could get bad if you allocate large blocks (multi-megabyte
> blocks) that you do not use and depend on the GC to free them.  This
> problem can be fixed by assisting the GC (clear your variables:
> a = null) or by using the Dispose pattern for large objects (this in
> fact was the major source of issues in Beagle).
> 
> > Indexing large files requires dynamic allocation of large amounts of 
> > memory hence my opinion that garbage collected languages are not optimal 
> > for this situation. I'm not a luddite and I do like both Python and C#
> 
> The above is not true, you only need a few buffers to index it.
> 
> Let me illustrate with an example:
> 
> 	"To index a 1 gigabyte file, do I need 1 gigabyte of memory?"
> 
> Clearly if your answer is `yes', then you are not the most astute
> programmer, nor the sharpest knife in the drawer.
> 
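
To make the "few buffers" point concrete, here is a minimal sketch of
processing a file of any size through one fixed-size buffer. It is only an
illustration: count_terms and its whitespace tokenizer are hypothetical
stand-ins, not how Beagle or Lucene index. The working buffer stays at
buffer_size characters plus one carried-over partial word; the term table
grows with the vocabulary rather than with the file size.

```python
import io
from collections import Counter


def count_terms(stream, buffer_size=64 * 1024):
    """Count whitespace-separated terms, reading a fixed-size buffer at a time."""
    counts = Counter()
    carry = ""  # partial word left over from the end of the previous buffer
    while True:
        chunk = stream.read(buffer_size)
        if not chunk:
            break
        text = carry + chunk
        words = text.split()
        # If the buffer ended mid-word, hold the last piece back so it can
        # be joined with the start of the next buffer.
        if words and not text[-1].isspace():
            carry = words.pop()
        else:
            carry = ""
        counts.update(w.lower() for w in words)
    if carry:
        counts[carry.lower()] += 1
    return counts


# Tiny demonstration with a deliberately small buffer.
sample = io.StringIO("to be or not to be")
print(count_terms(sample, buffer_size=4).most_common(2))
```

The same loop works on a text-mode file object opened with open(), so a
1 gigabyte input never has to be resident in memory all at once.
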
> > and I would certainly use them for GUI stuff over C anyday. However for 
> > a back end service that is  both CPU and memory intensive I maintain 
> > that IMHO C in this particular case is a better choice.
> 
> Luckily, your ideology does not match reality.
> 
> As Beagle and the extensive set of applications built with Lucene in
> Java and .NET prove, they are adequate languages for the task (and there
> is now this distributed open source search engine built with Java as
> well).
> 
> Miguel.
-- 
Manuel Amador <rudd-o amautacorp com>
Amauta

