Re: (Re: Beast Jack driver)

On Mon, 5 Jun 2006, Stefan Westerfeld wrote:


Same comment as for the other reply: I'll post fixed code later, so I'll
just answer to discussion stuff.

On Fri, Jun 02, 2006 at 01:45:01AM +0200, Tim Janik wrote:

minor cosmetic, we usually align '='-symbols in enums.


you should provide a type name for this enum, which
can be used above to tell apart valid states of your
callback_state variable.

The problem is that callback_state is an atomic integer, so I think it
needs to be volatile int, and I can't declare it of the enum type; or am
I wrong?

nope, that's a good point, C++ doesn't guarantee that enums are stored
with int size, so you should indeed use a volatile int there to use the
atomic functions. however, the decl line should then also have a comment,
outlining the real type.

vector<jack_port_t *>	  input_ports;
vector<jack_port_t *>	  output_ports;

guint			  buffer_frames;	/* input/output
ringbuffer size in frames */

Cond			  cond;
Mutex			  cond_mutex;

erk, what exactly is protected by this cond and this mutex?
something like this definitely needs to be outlined in code
(in comments or by naming prefixes, look for BirnetMutex
in bseengineutils.c to see examples for that)

Well, I'll put a comment there, but if you want to know what it does
right now: the condition is signalled whenever the realtime callback has
been called and is thus be used to wait for events that depend on the
realtime callback (like waiting for termination or new data).

The mutex is mostly useless - its there because conditions require it
(you can't wait on a condition without one), but all code needs to work
without the mutex, because the realtime thread can not use lock() (only
trylock()) due to possible priority inversion.

ok. note that beast doesn't 100% perfectly prevent priority inversion in
other scenarios either. this issue is slowly being worked on (the idea
is to move all communication to atomic ops guarded by BirnetGuard, but
that's an entire new thread ;)
until then, the jack RT thread should use a pipe write() to notify the
engine and sequencer threads, which both create their own pipe fds that
they also poll() on.

jack_status_t status;

self->jack_client = jack_client_open ("beast", JackNoStartServer,
&status); /* FIXME: translations(?) */

what's this "beast" being used for? (especially since you
quesiton translation)
e.g. if it's similar to window titles, it should be "BEAST",
if it's going to be used as a program name in continueous text,
it should prolly be _("Beast") and if its meant as a technical
identifier (e.g. for serialization), it should be "beast".

maybe Paul can shed light on this?

Its the name of the client, which means that for the command line utils
for instance you'll use


as port name. So this suggests that you may not want to translate it.
However, for qjackctl, its also the label that is displayed in the GUI,
in the widget where the clients (and ports) are listed.

ok, it has to be untranslated "beast" then.

DEBUG ("attaching to JACK server returned status: %d\n", status);
/* FIXME: check status for proper error reporting

doesn't teh jack API have something like:
  const char* jack_status_to_string (jack_status_t status);


fine, then we need to roll our own function that does this.
for existing example code that stringifies foreign errors,
you can take a look at gslvorbis-cutter.c:ov_error_blurb().

it might be agood idea to submit this to jack once done btw.

static map<string, DeviceDetails>
query_jack_devices (BsePcmDeviceJACK *self)
map<string, DeviceDetails> devices;

/* FIXME: we need to make a difference here between
 *  - jackd is running, but no output ports can be found

curious, why exactly could that be the case?

I am not sure it can happen, as you need to load some driver to be able
to start jack at all, and the driver will probably always register some
terminal ports. However, in theory it could be possible to start jack
only to connect two applications; for instance one that produces audio
(beast) and another one that sends it over the net somewhere. Then,
which application you start first is arbitary, and it might as well be
that beast gets started first.

But... well... I don't know if jack is ever going to support starting
without a driver (it would need to do something to get the timing,
sample rate and so on setup; normally probably the driver does this).

ok, then the querying logic should simply treat a running jackd without
output ports as if no jackd was running.

 *  - jackd is not running

that should/will actually be determined by jack_client_open(), right?

Right. Then self->jack_client will be NULL. Or not, if jackd was once
running but crashed later...

that should be taken care of by the cleanup handler then (afaiu).

    free (jack_ports);

return devices;

ugh, isn't this a full fledged recursive map+strings+DeviceDetails copy?

Yes. So? Its not in any time critical code, so I don't bother to
optimize it.

well, it's not exactly "hard to optimize" here. just return a pointer.

static SfiRing*
bse_pcm_device_jack_list_devices (BseDevice *device)
BsePcmDeviceJACK *self = BSE_PCM_DEVICE_JACK (device);
map<string, DeviceDetails> devices = query_jack_devices (self);

SfiRing *ring = NULL;
for (map<string, DeviceDetails>::iterator di = devices.begin(); di !=
devices.end(); di++)
    const string &device_name = di->first;
    DeviceDetails &details = di->second;
    BseDeviceEntry *entry;
    entry = bse_device_group_entry_new (device, g_strdup
                                                g_strdup ("JACK Devices"),
                                                g_strdup_printf ("%s
                                                (%d*input %d*output)%s",
								   details.physical_ports == details.ports ?
								     " [
								     hardware ]" : ""));

hm, can you please elaborate the
  details.physical_ports == details.ports ?  " [ hardware ]" : ""
condition for me? ;)

Well... if all ports of a client are physical, then its got to be a
hardware device. For instance alsa_pcm has only physical ports here.

I am not sure what a client is that has *some* physical ports... I've
yet to see one.

a combined virtual alsa device? (combing a virtual and a hw device)

     * normal operation:
     * [bse thread]      writes some data to output_ringbuffer
     * [bse thread]      checks remaining space, there is no room left
     * [bse thread]      sleeps on the condition

BSE never sleeps, waiting for audio drivers. take a look at
alsa_device_write() and alsa_device_check_io() in bsepcmdevice-alsa.c.
i.e. allow full buffer writes if at all possible and provide
timeout information on how long it takes for enough buffer space
to be available.
that's the kind of semantic that needs to be implemented.

Yes, it does block in the ALSA driver. However, I think I'll come back
to it in a seperate mail, since you said it shouldn't or maybe should,
so I'll investigate.

the rationale provided here:
for why writes shouldn't block applies also to the bse-alsa driver.
the fact that bse-alsa could indeed block in its write function
since we set the device to nonoblock=0 is merely the guarding
i'm talking about at the end of the above email.

     * [bse thread]      writes some data to output_ringbuffer
     * [bse thread]      checks remaining space, there is no room left
     * [jack thread]     reads some data from output_ringbuffer
     * [jack thread]     signals the condition
     * [bse thread]      sleeps on the condition

such races are only possible when not using conditions properly.
you always have to follow the sequence: lock(); check(); wait(); unlock();
and on the pther side: lock(); update(); signal(); unlock();
this ensures that no update/signal goes by unnoticed by the check. see:

Cool. Just that I can't lock() in the RT thread, because of priority

I've read a little kernel code today, and fs/pipe.c has a
PIPE_MUTEX(...) which is being locked. So I am not sure I can use
write() to wakeup the master thread either.

well, forget about priority inversion for the moment. it's very hard
to trigger anyway. a much stronger reason to use a pipe is that the
engine and synthesizer threads use poll() and not cond_wait.

However pthread_cond_signal in glibc also seems to have a kind of a
mutex (so its also a problem), although I am not sure if I interpret the
code correctly. But if I do, then I am running out of ideas how to get
this right at all.

Well, I think I have to think/read/ask others more about it until I know
whats the right thing to do, to get the synchronization between the
engine thread and the jack realtime callback right.

just use a pipe wirte() and start to concentrate on other stuff ;)

    if (have_cond_lock)

when exactly is this branch tirggered? (lost track due to the excessive
inlined commenting above.)

There are different activation states. Thats the case where the callback
has marked to be inactive (shouldn't touch ringbuffers and such), so it
will just write out zero blocks.

this (and similar things) really need to go into a state machine description
comment at the start of the source file.

    port = jack_port_register (self->jack_client, port_name,
    JACK_DEFAULT_AUDIO_TYPE, JackPortIsInput, 0);

what is JACK_DEFAULT_AUDIO_TYPE? shouldn't we request floats here (i
dunno the jack API though).

JACK_DEFAULT_AUDIO_TYPE is float. There is no other audio type at all,
i.e. there is no negotaition or anything.

odd, should have a comment like:
  /* request the _only_ audio type jack has, which is float */
so rationale is preserved for future maintenance.

/* connect ports */

map<string, DeviceDetails> devices = query_jack_devices (self);

querying the devices could have been done before registering
the ports above.

So what? I can only connect the ports anyway.

"So what?" ? there's no point in connectiong ports if the
devices can't be found.

map<string, DeviceDetails>::const_iterator di;

di = devices.find (dname);
if (di != devices.end())
    const DeviceDetails &details = di->second;

    for (guint ch = 0; ch < handle->n_channels; ch++)
	  if (details.output_ports > ch)
	    jack_connect (self->jack_client,
	    details.output_port_names[ch].c_str(), jack_port_name
	  if (details.input_ports > ch)
	    jack_connect (self->jack_client, jack_port_name
	    (jack->output_ports[ch]), details.input_port_names[ch].c_str());

you're not setting error in case you failed to find
the required devices to connect the channels.

I don't want to. If you for instance start beast with


to use jamin for mastering, and later the user chooses to terminate
jamin, then we can still run. Its just that then the ports will be
initially unconnected, which is not a problem, because it can be easily
fixed in qjackctl.

sorry, would you care to enlighten me here please?
what's -pjack=jamin supposed to mean?
should the beast output ports be auto-connected to input ports named "jamin"?
and, if input ports named "jamin" are not present, that'll silently be
will the jack callback thread still properly run and request BSE engine data
if the beast output port is unconnected?

if that's the case, i think at least the behaviour of silently ignoring
unsuccessfull connections has to be changed. we also throw proper errors
in other drivers if unconectable devices are being adressed.
(if unsuccessful connections really are common usage with jack, we can
still add device flags, like we do with OSS, so you could say -pjack=jamin,auto for automated connections)

min_buffer_frames += BSE_PCM_DEVICE (device)->req_block_length;

couldn't you instead just align min_buffer_frames to 2 * bse buffer size,
to still fullfil the requirements you outlined, and yet yield a smaller
ring buffer (thus optimize latency)?

Don't know. Trouble comes when we miss the - racy - condition. That
produces one period of extra latency if things go badly wrong. Thus a
stateful synchronization which can't miss something (such as a pipe())
would be superior.

sorry? what do you mean by "stateful synchronization"?

Then I suppose we could do as you suggested.


// honor the user defined latency specification
// FIXME: should we try to tweak local buffering so that it corresponds
// to the user defined latency, or global buffering (including the jack
// buffer)? we do the former, since it is easier

we should do the latter. that shouldn't be too hard, Paul said that you
could query (and inform) jack about the latency on a port.
that then just needs to be subtracted from BSE_PCM_DEVICE

It can change. If the beast output is jamin, that'd be adding latency.
If the user reconnects jamin to a convolution which adds latency, then
the total output latency query would change while we're running. Thats
why I am a bit reluctant to do it, as its results are not well-defined.

i see. at the very least the latency tooltip has to be adapted then
to reflect this.

It could even happen that we are not connected initially (latency = 0),
and only get connected afterwards by a patchbay, where the user is
loading a session. Thus in such a case we would always misbehave.

what's that about being non-connected.
does jack still request audio from beast when unconnected?
and, you said jack clients would always auto connect to alsa_pcm
and that alsa_pcm should always be present. so why would we start
up disconnected?

guint user_buffer_frames = BSE_PCM_DEVICE (device)->req_latency_ms *
handle->mix_freq / 1000;
guint buffer_frames = max (min_buffer_frames, user_buffer_frames);

jack->input_ringbuffer.resize (buffer_frames, handle->n_channels);
jack->output_ringbuffer.resize (buffer_frames, handle->n_channels);

the way this is written, you're doubling the latency because you apply
it to the input and the output buffer. you should just apply it to the
output buffer (and also subtract the input buffer latency before doing

The input buffer should not produce latency. Ideally, the filled frames
of the input buffer and the filled frames of the output buffer should
add up to exactly buffer_frames. Note that the input_buffer is *empty*
initially while the output_buffer is *full* initially, and the jack
callback methodically only adds as many frames to the input_buffer as it
takes from the output_buffer, whereas the engine process cycle does the

that means input latency increases with every output underrun, because
in that case you're faking extra output values. basically, in duplex
scenrios under latency constraints you have to properly resync input
and output and can't just pad output values.

jack->buffer_frames  = jack->output_ringbuffer.get_writable_frames();

what is jack->buffer_frames good for? and why is that different from
total_n_frames() here?

Its not different. Don't know. Maybe I'll get rid of it in the new

"Don't know."? ;)
ok, if you need help here, take my word that if it's useless you should
get rid of it ;)

	       * FIXME: we need a way to indicate an error here; beast
	       should provide
	       * an adequate reaction in case the JACK server is down (it
	       should stop
	       * playing the file, and show a dialog, if JACK can not be

we don't need error indication here (see alsa_device_read()). it's
good enough if we return gracefully here.
error checking code (and returns) can be executed by check_io()
and/or retrigger().

Yes, we do. If JACK dies, I want to know as user. Otherwise I'll just
see that nothing happens. Why? And why is beast continuing to play if
still nothing happens. Why should it...?

i already said that. but i can say it again if that helps you.
you should gracefully return and ignore the error in write().
you can check for error conditions in check_io() and there we
can decide and undertake appropriate measures to deal with the
error. e.g. by resyncing the stream (that's at least interesting
for ALSA/OSS or by notifying the user, etc.)

	       * if we have a way to abort processing, this if can be moved
	       * the condition wait; however, right now moving it there
	       means that
	       * beast will render the output as fast as possible when jack
	       dies, and
	       * this will look like a machine lockup

no, endless loops don't look like lockups ;)
in the first case, your CPU meter runs at 100% and toggling Num-lock/etc.
keys still work (usually also the X server etc.) while in the latter case
*nothing* goes anymore.

But it isn't a situation I would want to throw the user into either.

right. simply fix check_io() then, beast won't busily write audio data
if check_io() doesn't tell it so.

	      Block::fill (frames_left * handle->n_channels, values, 0.0);
	      /* <- remove me once we can indicate errors */

nope should stay (the same way alsa_device_read() copeswith errors).

Its uncommon for a hardware device to go away while the application
runs. Its unfortunately common for the jack server to go away. That
might not be sloppy programming. The jack server might as well go away
intentionally because we missed a deadline. For instance because the
user was running jack and beast without rt prio enabled but had a
semi-low latency. Just stopping in such cases is rather uninformative,
and people may think its a beast bug or something.

We should be able to present a reason why this happened to the user
inside a nice dialog and then stop trying to play audio. Its pointless
after jackd terminated our connection or terminated itself or crashed or

ok, then put an informative g_printerr() containing all the relevant
error scenario information into check_io() with a FIXME next to it. we can then start discussing ways to treat processing errors in the engine
and possible recovery from such scenrios. (basically processing errors
are currently not supported by the engine design)

const gchar *info = g_intern_printf (/* TRANSLATORS: keep this text to 70
chars in width */
                                     _("JACK Audio Connection Kit PCM
                                     driver (\n"));

jack version printing? again, see what the alsa driver does here.

There is no jack version I can ask for in the API, so I can't print one.
I could print the version of libjack I was compiled against (i.e.
something that pkg-config produced or so), but the server version might
be different, as long as the protocol version of the server and the
protocol version of libjack match.

ok, if jack doesn't provide us more, we'll use the compiled-against
version then. after all, the runtime libjack should be compatible with
that (ELF ensures that) and also compatible with the running jackd
(libjack checks that).

the single most important issue that has to be fixed here is
the mutex and condition. neither the BSE nor the JACK thread
can afford acquiring it and wait for the other thread.
that's why we need a ringbuffer based only on atomic ops. we
got that now, so the next step is to modify the code to use
it up to its fullest potential.

basically, to get there, the jack thread needs to be modified
to act more similarl to e.g. the kernel ALSA driver wrg to
xruns. the kernel driver also has fixed-size ring buffers and
still deals with drop outs perfectly well. that kind of API
semantic has to be provided for the BSE thread in the bse-alsa
driver. then, the implementations of check_io(), retrigger(),
read() and write() can be much closer to the bse-alsa driver
implementation. also, this will eliminate the need for the
mutex and the condition.

I think I don't know enough yet to give a qualified answer on what to
do: especially I want to know, if

* write to pipe
* pthread_condition_signal
* (maybe semaphores)
* (maybe something else I forgot)

are equally capable in their realtimeness properties, and suitable for
notifying the bse master thread from the jack realtime thread, and so
which is the one to choose to let the bse thread know when to compute

well, let me simply tell you that the only viable way is to use pipe.
if not for any of the various reasons you've already heared from me on the
topic, just for the simple fact that the BSE engine thread and the
BSE sequencer thread *already* support them.
that is not the case for any of the "alternatives" you list.

Timeouts are a bad solution because they either occur too soon (in which
case the engine thread may wake up, but can't do the computation), or
too late, because there is always jitter. In any case you loose precious
time or waste precious CPU cycles or both.

note that the BSE audio driver model *and* the BSE engine model are based
on timeouts. that's what check_io() provides and what the engine needs.
you won't get any other programming model with BSE, regardless of whether
you like it or not ;)

I know that for practical purposes, there is no other solution the bse
framework is capable than write-to-pipe, but I have seen in kernel code
that the implentation is not lock-free. Can this lead to priority

you can ask on lkml to figure if/what liunx prevents priority inversion
for pipe access.

Can conditions do the same?

yeah, signaling via conditions also involves locks/syncronization
*somewhere* in the path between two threads.

What is safe and what is not?
Right now, I don't know.

simply stop worrying, like i recommended above ;)

  Cu... Stefan


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]