Re: mmap()ing .bse file binary appendixes



   Hi!

On Mon, Aug 02, 2010 at 12:10:32PM +0200, Tim Janik wrote:
> On Fri, 30 Jul 2010, Stefan Westerfeld wrote:
>> Tim, I've been working on the SpectMorph file format in the last days, and
>> during performance optimization one important improvement is achieved by
>> mmap()ing the SpectMorph files. It's faster, I assume, because no syscalls need
>> to be made to parse the file, and reading big chunks of data (such as the
>> parameters of one SpectMorph Audio object) can be done using memcpy().
>>
>> So I was wondering if it would be possible for beast to provide a mmap()ed
>> version of binary appendixes, in case SpectMorph models are included within
>> .bse files. Of course it would be possible, but not necessarily as efficient,
>> to fall back to normal read operations in that case.
>>
>> If you think it should be done using mmap(), I volunteer to provide a merge
>> request for this.
>
> Yes, .bse files should be mmap()ed when possible as an optimization.
> This is why we have padding zeros at the start of the binary appendixes,
> so the actual data is 4-byte aligned.
> We'll have to fall back to normal read()s though if:
> - mmap isn't available or doesn't succeed (e.g. if address space is used up);
> - the binary appendix data isn't 4-byte aligned (e.g. due to manual file
>   edits).

Good, so you have already thought of this. Right now I am working with test
data, because I don't have a really big SpectMorph instrument (the biggest is
the piano, at about 100 MB, which loads pretty fast). But I suppose real-world
instrument data, like a piano with three or more velocity layers, could
easily amount to 1 GB of SpectMorph data (more if more instruments are
involved). My benchmark results with test data so far are:

stefan quadcorn64:/big/stw/smtest$ ls -l ttt.smset 
-rw-r--r-- 1 stefan stefan 1221136270  3. Aug 19:54 ttt.smset        # 1,2G

stefan quadcorn64:/big/stw/smtest$ time SPECTMORPH_NOMMAP=1 smwavset list ttt.smset >/dev/null

real    0m5.175s
user    0m3.680s
sys     0m1.488s
stefan quadcorn64:/big/stw/smtest$ time smwavset list ttt.smset >/dev/null

real    0m3.662s
user    0m2.776s
sys     0m0.888s

This is with the file already in cache. With mmap() the wall-clock time drops
from about 5.2s to about 3.7s, roughly 29% less, and both user and sys time go
down. SpectMorph code has two variants of the loading code, one for mmap() and
one without.

Right now the SpectMorphOsc (beast plugin) only accepts a file path, which
is obviously a usability problem. I think something similar to what I've done
for sound fonts should be done, that is, a repo like the wave repo where
SpectMorph instruments are kept.

However, I'll have to deal with a few SpectMorph issues before improving BEAST
integration. In any case mmap() should be supported then, with read() as a
fallback of course. SpectMorph right now doesn't require any alignment, so
manual file edits would not be a problem.
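
To make the mmap()-with-read()-fallback idea concrete, here is a minimal
sketch in C. It is not the actual SpectMorph or beast code: BlobView,
blob_open() and blob_close() are names made up for illustration, and the
4-byte alignment check simply follows the condition Tim listed for .bse
binary appendixes (SpectMorph itself would not need it). The sketch assumes a
POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical view onto a byte range of a file (e.g. a binary appendix). */
typedef struct {
  const unsigned char *data;       /* first byte of the requested range */
  size_t               length;     /* number of valid bytes at data */
  void                *map_base;   /* non-NULL if the range is mmap()ed */
  size_t               map_length; /* size passed to mmap()/munmap() */
  void                *heap;       /* non-NULL if the read() fallback was used */
} BlobView;

/* Returns 0 on success, -1 on failure. Tries mmap() first when allowed and
 * when offset is 4-byte aligned, otherwise copies the range with pread(). */
static int
blob_open (const char *path, off_t offset, size_t length, int allow_mmap,
           BlobView *view)
{
  assert (path != NULL && view != NULL);
  memset (view, 0, sizeof *view);

  int fd = open (path, O_RDONLY);
  if (fd < 0)
    return -1;

  struct stat st;
  if (fstat (fd, &st) < 0 || offset < 0 || offset > st.st_size ||
      length > (size_t) (st.st_size - offset))
    {
      close (fd);
      return -1;
    }
  view->length = length;

  if (allow_mmap && length > 0 && offset % 4 == 0)
    {
      /* mmap() needs a page-aligned file offset, so map from the start of
       * the page containing offset and skip the leading bytes. */
      long page = sysconf (_SC_PAGESIZE);
      off_t map_start = offset - offset % page;
      size_t skip = (size_t) (offset - map_start);
      void *base = mmap (NULL, skip + length, PROT_READ, MAP_PRIVATE,
                         fd, map_start);
      if (base != MAP_FAILED)
        {
          view->map_base = base;
          view->map_length = skip + length;
          view->data = (const unsigned char *) base + skip;
          close (fd); /* the mapping stays valid after close() */
          return 0;
        }
      /* mmap() failed (e.g. address space used up): fall through to read */
    }

  unsigned char *buf = malloc (length ? length : 1);
  if (!buf)
    {
      close (fd);
      return -1;
    }
  size_t done = 0;
  while (done < length)
    {
      ssize_t n = pread (fd, buf + done, length - done, offset + (off_t) done);
      if (n < 0 && errno == EINTR)
        continue;
      if (n <= 0)
        {
          free (buf);
          close (fd);
          return -1;
        }
      done += (size_t) n;
    }
  close (fd);
  view->heap = buf;
  view->data = buf;
  return 0;
}

/* Releases whatever blob_open() acquired. */
static void
blob_close (BlobView *view)
{
  if (view->map_base)
    munmap (view->map_base, view->map_length);
  free (view->heap);
  memset (view, 0, sizeof *view);
}
```

A caller would only ever look at view.data and view.length, so the parsing
code does not need to know which path was taken. The allow_mmap flag is one
possible way to get a switch like SPECTMORPH_NOMMAP above, for instance by
passing getenv ("SPECTMORPH_NOMMAP") == NULL.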

   Cu... Stefan
-- 
Stefan Westerfeld, Hamburg/Germany, http://space.twc.de/~stefan

