Re: Beast Jack driver (Re: Lockfree ringbuffer review)



On Thu, 1 Jun 2006, Stefan Westerfeld wrote:

On Thu, Jun 01, 2006 at 05:14:15PM +0200, Tim Janik wrote:

well, what's the use case for a non-float ringbuffer?

I've just read my way through the new jack midi API, and we'll need a
ring buffer to get the midi events from the realtime thread, where they
will be delivered by jack, and our midi thread. So yes, there is a use
case, which will probably have unsigned char as data type, or even
something like struct MidiEvent.

ok, that's interesting indeed.

Finally, if we were to optimize the JACK driver for maximum performance,
a more important factor would probably be that the ringbuffer even has
this API.

sorry? what API are you talking about?
void    bse_block_copy_float    (guint          n_values,
                                 gfloat        *values,
                                 const gfloat  *ivalues);
copying?

It means that we need to deinterleave and reinterleave the
data JACK supplies on the stack, and then read/write it to the
ringbuffer. If the ringbuffer would offer an API to access its memory
directly, the extra copy would disappear. But it makes the code harder
to maintain and for now my first priority has been getting things
implemented correctly.

i don't quite understand this point.
are you saying that replacing memcpy() by bse_block_copy_float() can't
be done due to interleaving? :-O

No. It can be replaced, and the new version has it. However, our data
flow looks like this:

 input

jack thread process() callback supplies N separate buffers
  |
  V
interleave on the stack (manually implemented, in process())       *jack-thread*
  |
  V
copy into the input ringbuffer (input ringbuffer write)            *jack-thread*
  |
  V
copy in bse engine supplied buffer (input ringbuffer read)         *engine-thread*
  |
  V

[ beast processing ]                                               *engine-thread*

note that this is processing in BSE, not beast (the GUI).
and BSE does deinterleave its data, process it and then reinterleave
because standard audio drivers want that. for jack this doesn't
make sense though, so the data flow should instead be:
  n-jack-buffers -> ringbuffer-write -> ||
  || -> ringbuffer-read -> n-bse-channel-blocks

[...]
and look at RingBuffer_GetWriteRegions). However if we get everything we
want - with the number of copies we're making right now - I am perfectly
happy. Every sensible bse network will eat much more cycles than we can
possibly spend in those extra Bse::Block::copy() calls.

i'm not happy with this. with every unneccessary copy you're doing in the
audio driver, you reduce the number of cycles the engine can spend on
audio processing eventhough it needs so much, like you just stated.

namespace {

/**
* The FrameRingBuffer class implements a ringbuffer for the communication
* between two threads. One thread - the producer thread - may only write
* data to the ringbuffer. The other thread - the consumer thread - may
* only read data from the ringbuffer.
*
* Given that these two threads only use the appropriate functions, no
* other synchronization is required to ensure that the data gets safely
* from the producer thread to the consumer thread. However, all operations
* that are provided by the ringbuffer are non-blocking, so that you may
* need a condition or other synchronization primitive if you want the
* producer and/or consumer to block if the ringbuffer is full/empty.

s/if/on the event that/ ?

I think not (but I am no native speaker). What I want to say is this:
Suppose the ringbuffer is empty. The consumer will, when calling the
read function, just read 0 frames. A common scenario is that the
consumer will want to wait until data is available. To implement this,
the consumer will need to use a synchonization mechanism (i.e.
condition).

that would need the suggested phrase "on the event that" then.
hoever, the scenario you describe is absolutely *not* common in
our application of the ringbuffer. the BSE engine does precisely
*not* want to wait on audio. it'll just come back later and read or
write again. for that, it just needs to be able to figure from the
driver how many samples later that would ideally be.

*
* Implementation: the synchronization between the two threads is only
* implemented by two index variables (read_frame_pos and write_frame_pos)
* for which atomic integer reads and writes are required. Since the
* producer thread only modifies the write_frame_pos and the consumer thread
* only modifies the read_frame_pos, no compare-and-swap or similar
* operations are needed to avoid concurrent writes.
*/

good comment btw, i now have an idea on why/how things are supposed
to work and can scrutinize your implementaiton accordingly ;)

template<class T>
class FrameRingBuffer {
//BIRNET_PRIVATE_COPY (FrameRingBuffer);

what's this comment for?

I wanted to call BIRNET_PRIVATE_COPY, but it didn't work, so I put //
marks there, so it compiled. So can I use BIRNET_PRIVATE_COPY or do I
need to put the code BIRNET_PRIVATE_COPY would put there if it worked
here manually?

i've not seen it "not working" yet. if that is indeed the case, you
need to investigate the cause. i can't say give you advice just on the
notion of "it didn't work" though... ;)

/**
 * Returns the maximum number of frames that the ringbuffer can contain.
 *
 * This function can be called from any thread.
 */
guint
size() const
{
  // the extra element allows us to see the difference between an
  empty/full ringbuffer
  return (m_buffer.size() - 1) / m_elements_per_frame;
}

ugh this is mighty confusing.
you should get rid of this function or at least clearly rename it.
seeing it, i had to go back and read your code all over to figure whether
you
used size() in every place or m_buffer.size().
if you need a name, call it total_n_frames(), allthough it seems this
function
isn't used/needed at all.

Well, its not used for internal purposes. But its provided for the code
which uses the ringbuffer. Basically if I am implementing C++ code for
some data structure that has a size (as the ringbuffer does), then I
should also provide an accessor that can be used to determine the size.

Just as std::vector has a resize() and size() function.

i think it's still bad because its very confusing in this context.
so yes, it should still be renamed.
(and no, i don't think the libc or std:: APIs picked good method names
in every place; consequently they may very well do you a disservice as
API naming reference)

void
clear()
{
  Atomic::int_set (&m_read_frame_pos, 0);
  Atomic::int_set (&m_write_frame_pos, 0);
}

is this really needed outside of resize() though?

The jack driver needs it (to get into a sane state after a dropout).

how are you avoiding concurrent access when clearing due to a drop
out then?

hm, don't you want review on the rest of the driver as well?
(that'd prolly also be the part that is more interesting for Paul)

Definitely. So here is an updated version of the ringbuffer and the
whole driver source. I think the driver already works pretty well, its
just that there are quite a few FIXMEs in there.

As I am only posting the C++ source to the list, if anyone is so brave
and wants to compile the driver (you need beast CVS for this), I've put
the whole driver to:

http://space.twc.de/~stefan/download/20060601-preview-bse-jack-0.7.0.tar.gz

(sorry that it extracts to the wrong directory, but I just used make
dist...).

Anyway, here is the C++ source (the interesting part, for the review):

well, i think this is better handled by another thread.
it doesn't help to make this email even longer ;)

  Cu... Stefan

---
ciaoTJ



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]