Reference counting fixes. Prepared an API (it has a summary.h now). On Sun, 2007-12-09 at 13:19 +0100, Philip Van Hoof wrote: > This latest version of the future summary store uses a hashtable to > identify the duplicates. As expected makes it the writing of the summary > a lot faster (hashtables make a hash of each added key-string, which > makes searching-for-dups a lot faster then my former quickly made doubly > linked list code for this). > > I'm still hoping to somehow avoid the g_strdup in the code that dumps > (writes) the summaryblock. Making copies of strings is not really fast. > > I added the necessary locking. The reference counting of the > summaryblocks is not 100% correct, yet. Although you should get the > idea: > > o. Not individual summaryitems but its container summaryblock gets a > reference added. > > o. Initial load of a summary, which is a container for summaryblocks, > creates all summaryitems which will each add one reference to their > container summaryblock > > o. Unload of a summary unloads each summaryblock, which merely means > taking the one reference of each of its contained summaryitems. > > o. Requesting a summaryitem from the summary adds a reference to the > container summaryblock of the item. > > o. Stopping usage of that summaryitem takes away the reference on the > container summaryblock > > o. A summaryblock can live without its container summary > *** > o. A summaryitem can't live without its container summaryblock > ***** > o. A summaryblock contains on average 1000 items > > o. A summary contains (Item-Amount / 1000) summaryblocks > > > I'll repeat the core ideas of this summary store implementation: > > o. Localisation of mapped memory: often needed strings are placed in > the beginning of the mmap()ed file (they are sorted on usage-count > by the summaryitems) > > -> Current summary storage scatters strings around, no matter how > often they are used (they are simply duplicated each time in > stead resulting in a larger VmSize and in more mapping/swapping > operations that have to be done by the kernel) > > o. Duplicate strings are made unique in the mmap > > -> Current summary storage scatters duplicate strings around in the > mapping. Making the mapping more random (less localised memory > access), more redundant (larger VmSize). > > o. Blocks of 1000 items that can function individually (per block) > > o. Quick at appending new material: a new writable block is created > and the current block is written (if 1000 is reached) > > -> Current summary storage needs to rewrite the entire summary.mmap > file each 5000th item that gets added. This gets hideously slow > starting at 15,000 items. Especially on mobile ARM devices. > > o. Quick at changing the flags of items: a separate file per block is > utilised to store the item's flags. Only this sequential file needs > to be rewritten when flags of the item(s) changed. > > -> This is equally slow once you are at 15,000 items. > > o. Fast searching: because of ideal localisation will often requested > strings be mapped in real memory modules (by the kernel). > > o. Fast but low-memory browsing: currently visible items will be > mapped in real memory modules (by the kernel), items that didn't > get requested aren't mapped in. Not even touched at initial scan > stage. > > > > _______________________________________________ > tinymail-devel-list mailing list > tinymail-devel-list gnome org > http://mail.gnome.org/mailman/listinfo/tinymail-devel-list -- Philip Van Hoof, freelance software developer home: me at pvanhoof dot be gnome: pvanhoof at gnome dot org http://pvanhoof.be/blog http://codeminded.be
Attachment:
mytest6.tar.gz
Description: application/compressed-tar