Re: [Ekiga-list] A comparison ALSA-PULSE ( long)

Andrea wrote:
Alec Leamas wrote:
Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 2
  rate         : 44100
  exact rate   : 44100 (44100/1)
  msbits       : 16
  buffer_size  : 22050
  period_size  : 5512
  period_time  : 125000
  tstamp_mode  : NONE
  period_step  : 1
  avail_min    : 5512
  period_event : 0
  start_threshold  : 22050
  stop_threshold   : 22050
  silence_threshold: 0
  silence_size : 0
  boundary     : 1445068800

I attach more output.
What you have seen so far was the log when ekiga plays the ring tone (by far the most damaged sound).
When running the echo test the setup is different

  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 1
  rate         : 8000			<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  exact rate   : 8000 (8000/1)		<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  msbits       : 16
  buffer_size  : 800
  period_size  : 160
  period_time  : 20000
  tstamp_mode  : NONE
  period_step  : 1
  avail_min    : 160
  period_event : 0
  start_threshold  : 1
  stop_threshold   : 800
  silence_threshold: 0
  silence_size : 0
  boundary     : 1677721600

And I cannot see any underrun at all. My echo test uses PCMA. It is possible that with a better
codec (i.e. a higher rate than 8000) we would see them again. I don't really know how to test that.

I would say that the quality of the echo test with or without pulse is the same (but being only 8000
Hz, it is already not perfect and more difficult to judge).

All my tests so far have been run in debug, so the speed of ekiga/opal/ptlib is already lower than in
release. The quality of the ring tone is more or less the same, though. I will try to rerun
everything in release.

I have 2 points

1) Is the following true: ekiga-pulse gives bad audio quality because there are underruns.
So, if for some connection there are no underruns (e.g. my echo test) then, the quality is not
expected to be lower than alsa-direct, and we should not complain about pulse.
Yes. And, whatever the problems are, I don't really think it's pulse. I think it's a problem with how we handle alsa, one which is just not that visible today.
2) If underruns are (the) evil (or at least the biggest problem), then it would be good to print
some indication of how close to the underrun we are. Does alsa provide that? Is it already part of
my log?
Yes, in the max_avail, see below.

But bear in mind that it's not only a question of whether underruns happen, it's also a question of how they are handled. Actually, a correctly working upper layer (opal...) should never allow alsa underruns; it should rebuffer (send previous data) if nothing else is available. That sounds much better than an actual underrun.
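
To make the "rebuffer" idea concrete, here is a minimal sketch. It is not Opal's code; next_period, pending and last are hypothetical names, and a period is assumed to be a plain bytes object (320 bytes for the 8 kHz mono case). It only shows the fallback order: fresh data first, then the previous period again, then silence.

```python
from collections import deque
from typing import Deque


def next_period(pending: Deque[bytes], last: bytes, period_bytes: int) -> bytes:
    """Pick the data for the next write (hypothetical helper, not an Opal API)."""
    if pending:
        return pending.popleft()   # normal case: fresh audio is available
    if last:
        return last                # nothing new: send the previous period again
    return bytes(period_bytes)     # nothing at all yet: send silence


pending = deque([b"\x01" * 320])
first = next_period(pending, b"", 320)       # fresh data from the queue
second = next_period(pending, first, 320)    # queue is empty, previous data is reused
third = next_period(pending, b"", 320)       # no history either, so silence
```

Whatever this returns is what would be written to alsa, so the hw buffer never runs dry just because the network was late.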
I still have not fully understood your comments about the values printed in the log. I need to get
familiar with the terminology.

And I have not yet checked for overruns when reading from the microphone.


OK, as long as you don't feel I occupy your territory, I'll give it a try. After some reading my memories are coming back. But don't take what I say for granted, this *is* complicated. And if anyone who really knows alsa could review this, I would be more than happy...

First of all: Alsa is basically, in all interfaces, concerned with frames. A frame is what the hardware handles in parallel. So in a mono stream, a frame is the same as a sample. In a stereo stream, a frame is two samples. The sample is S16_LE (signed 16 bit little endian), i.e. two bytes. So a frame is four bytes when sending the sound (stereo) and two bytes when talking as above (one channel, mono).

The next entity is a period. A period is (in this context) a chunk of data transferred from user space to the alsa driver's hw ringbuffer. The ringbuffer is normally a whole number of periods. In the case above the period size is 160 frames. Since a frame is one sample (mono) of two bytes, it's actually 320 bytes. But it's better to stick to frames, that's what alsa is all about.
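
The frame and period sizes can be checked with a few lines of Python. This is only a sketch of the arithmetic; the BYTES_PER_SAMPLE table and the function names are made up for the illustration, and only S16_LE (the format in the logs) is covered.

```python
# Bytes per sample; only S16_LE appears in the logs (illustrative table).
BYTES_PER_SAMPLE = {"S16_LE": 2}


def frame_bytes(fmt: str, channels: int) -> int:
    """One frame holds one sample per channel."""
    return BYTES_PER_SAMPLE[fmt] * channels


def frames_to_bytes(frames: int, fmt: str, channels: int) -> int:
    """Size in bytes of a chunk given in frames."""
    return frames * frame_bytes(fmt, channels)


stereo_frame = frame_bytes("S16_LE", 2)            # ring tone: 4 bytes per frame
mono_frame = frame_bytes("S16_LE", 1)              # echo test: 2 bytes per frame
mono_period = frames_to_bytes(160, "S16_LE", 1)    # one echo-test period: 320 bytes
print(stereo_frame, mono_frame, mono_period)       # prints: 4 2 320
```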

Last we have the hw buffer. It's actually a ring buffer, where the application stores data, and the driver/interrupt routines fetch it and transfer it to the sound card. The overrun/underrun condition is really what happens when the two ringbuffer pointers become equal.

The period size is 160 frames. 1 frame takes 1/8000 seconds => 1000/8000 ms => period time 160/8 ms = 20 ms.

The buffer above is actually 800/8 ms => 100 ms. This is quite a large buffer; with added network latency it might be too large. A goal is to keep the overall latency below 150 ms.
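
The same arithmetic as a small sketch, using the numbers from the echo test log. frames_to_ms and the constant names are just for illustration, and the "headroom" assumes the hw buffer is the only local source of delay, which is a simplification.

```python
def frames_to_ms(frames: int, rate_hz: int) -> float:
    """Duration of a number of frames at the given sample rate."""
    return frames * 1000.0 / rate_hz


RATE_HZ = 8000                 # echo test rate from the log
LATENCY_GOAL_MS = 150.0        # overall goal mentioned above

period_ms = frames_to_ms(160, RATE_HZ)        # period_size 160
buffer_ms = frames_to_ms(800, RATE_HZ)        # buffer_size 800
headroom_ms = LATENCY_GOAL_MS - buffer_ms     # what is left for network, codec, ...
print(period_ms, buffer_ms, headroom_ms)      # prints: 20.0 100.0 50.0
```

The period time matches the period_time of 20000 (microseconds) in the log.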

The "avail" reflects the number of frames free to write in the hw buffer. When the buffer is full it's 0. When it's empty, it's the buffer size. The normal behaviour is that the application writes a period as soon as "avail" is >= 1 period. Sending routines should somehow (blocking I/O, event-driven) make sure that "avail" is indeed >= 1 period before they write data.

The avail_min is a threshold (for avail) where alsa activates the application. That is, an application which has done blocking I/O, or is waiting for an event, is activated at this point. Here avail_min is 160, so it happens when it's possible to write a period, which is perfectly reasonable. In your logs this should normally mean that "avail" is roughly 160-170 frames, since there is some delay before the actual transfer takes place.

The max_avail reflects indeed the maximum value of avail. It's really a quality measure. If it becomes as big as the buffer size (empty buffer) we are close to an underrun. If it's always much less than the buffer size, it should be possible to decrease the hw buffer.
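
A toy model may help to see how avail and max_avail behave. This is a pure Python simulation, not alsa code: PlaybackRing and its methods are invented for the illustration, the hardware is assumed to consume exactly one period per step, and an underrun is counted when the hardware wants frames that are not there.

```python
class PlaybackRing:
    """Toy model of a playback ring buffer, counted in frames."""

    def __init__(self, buffer_size: int, period_size: int) -> None:
        self.buffer_size = buffer_size
        self.period_size = period_size
        self.filled = 0       # frames written but not yet played
        self.max_avail = 0    # largest avail seen after the hardware consumed data
        self.underruns = 0

    @property
    def avail(self) -> int:
        """Frames free to write: 0 when full, buffer_size when empty."""
        return self.buffer_size - self.filled

    def write_period(self) -> bool:
        """Application side: write one period, but only if it fits."""
        if self.avail < self.period_size:
            return False
        self.filled += self.period_size
        return True

    def play(self, frames: int) -> None:
        """Hardware side: consume frames; running dry counts as an underrun."""
        if frames > self.filled:
            self.underruns += 1
            frames = self.filled
        self.filled -= frames
        self.max_avail = max(self.max_avail, self.avail)


ring = PlaybackRing(buffer_size=800, period_size=160)
while ring.write_period():        # prefill: 5 periods fit in 800 frames
    pass

for _ in range(10):               # healthy loop: refill after every period
    ring.play(160)
    ring.write_period()
healthy_max = ring.max_avail      # stays at one period

for _ in range(5):                # the application stalls for 5 periods (100 ms)
    ring.play(160)
stalled_max = ring.max_avail      # reaches the buffer size: the buffer is empty

ring.play(160)                    # nothing left to play
print(healthy_max, stalled_max, ring.underruns)   # prints: 160 800 1
```

In the healthy loop max_avail never goes above one period, which is the "it should be possible to decrease the hw buffer" case. After the stall it equals the buffer size, and the next hardware request is the underrun.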

I think we can leave the other stuff aside, it's not important as I see it. Let me know if you think different.

The strange thing in the alsa logs when sending the sound was that the "avail" was approaching 0 when the EPIPE error was triggered. Following the definitions above, this indicates an *overrun*. This is more than strange; it might very well be something I don't understand. Or something really weird somewhere. Perhaps the fact that this is a special case (stereo, sample rate) creates a mess?

The "direct" log indicated that the application was about to write when avail was just 10-20 frames, on every second attempt. This should not happen; there should be at least a period free in the buffer before the application makes an attempt to write. Overall, the application seems to write at moments when it shouldn't.

How does the receiving party know the sample rate if the sounds are 16 kHz and the normal data 8 kHz? Looks odd to me, but I'm just a newbie in this... perhaps the RTP layer can handle this?

Note the Skype settings: 48 kHz, buffer 2560 frames, stereo. 1 frame 1/48000 s == 1000/48000 ms => buffer time 2560/48 = 53 ms. Half the latency, and a better sound. Skype also uses the start_threshold. It means that the hardware starts automatically when start_threshold frames are ready to be sent to the sound card. It's a better way to start the thing than to start it explicitly as opal seems to do, it's a question of race conditions in the very beginning.

Qutecom/Wengophone works with 16 kHz and a 60 ms buffer. Hey, we have some possibilities to improve Ekiga!
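
To put the numbers side by side, here is a small sketch that converts the buffer sizes mentioned in this thread to milliseconds. The figures are the ones from the logs and from the text above; the function names are only for illustration, and the Qutecom frame count is simply derived from the 60 ms / 16 kHz figures.

```python
def buffer_latency_ms(frames: int, rate_hz: int) -> float:
    """Time it takes to play a full hw buffer."""
    return frames * 1000.0 / rate_hz


def ms_to_frames(ms: float, rate_hz: int) -> int:
    """Buffer size in frames for a given duration."""
    return round(ms * rate_hz / 1000)


# (name, buffer_size in frames, rate in Hz), taken from this thread
configs = [
    ("Ekiga ring tone", 22050, 44100),
    ("Ekiga echo test", 800, 8000),
    ("Skype", 2560, 48000),
]

for name, frames, rate in configs:
    print(f"{name:16s} {frames:6d} frames @ {rate:5d} Hz = "
          f"{buffer_latency_ms(frames, rate):6.1f} ms")

qutecom_frames = ms_to_frames(60, 16000)   # 60 ms buffer at 16 kHz
print("Qutecom buffer:", qutecom_frames, "frames")
```

Note that the ring tone buffer comes out at half a second, five times the echo test buffer.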

Now, this was a long message :-) It's partly self-education, forgive me...

