Re: [Ekiga-list] A comparison ALSA-PULSE ( long)

From: Alec Leamas <leamas alec gmail com>
To: Ekiga mailing list <ekiga-list gnome org>
Subject: Re: [Ekiga-list] A comparison ALSA-PULSE ( long)
Date: Mon, 23 Feb 2009 21:21:47 +0100

Andrea wrote:

Alec Leamas wrote:

Its setup is:
  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 2
  rate         : 44100
  exact rate   : 44100 (44100/1)
  msbits       : 16
  buffer_size  : 22050
  period_size  : 5512
  period_time  : 125000
  tstamp_mode  : NONE
  period_step  : 1
  avail_min    : 5512
  period_event : 0
  start_threshold  : 22050
  stop_threshold   : 22050
  silence_threshold: 0
  silence_size : 0
  boundary     : 1445068800


I attach more output.
What you have seen so far was the log when ekiga plays the ring tone (by far the most damaged sound).
When running the echo test the setup is different

  stream       : PLAYBACK
  access       : RW_INTERLEAVED
  format       : S16_LE
  subformat    : STD
  channels     : 1
  rate         : 8000			<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  exact rate   : 8000 (8000/1)		<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
  msbits       : 16
  buffer_size  : 800
  period_size  : 160
  period_time  : 20000
  tstamp_mode  : NONE
  period_step  : 1
  avail_min    : 160
  period_event : 0
  start_threshold  : 1
  stop_threshold   : 800
  silence_threshold: 0
  silence_size : 0
  boundary     : 1677721600

And I cannot see any underrun at all. My echo test uses PCMA. It is possible that with a better
codec (i.e. a higher rate than 8000), we would see them again. I don't really know how to test that.

I would say that the quality of the echo test with or without pulse is the same (but being only 8000
Hz, it is already not perfect and more difficult to judge).

All my tests so far have been run in debug, so the speed of ekiga/opal/ptlib is already lower than
in release. The quality of the ring tone is, though, more or less the same. I will try to rerun
everything in release.

I have 2 points

1) Is the following true: ekiga-pulse gives bad audio quality because there are underruns.
So, if for some connection there are no underruns (e.g. my echo test) then, the quality is not
expected to be lower than alsa-direct, and we should not complain about pulse.

Yes. And, whatever the problems are, I don't really think it's pulse. I think it's a problem with how we handle alsa which is just not that visible today.

2) If underruns are (the) evil (or at least the biggest problem), then it would be good to print
some indication of how close to the underrun we are. Does alsa provide that? Is it already part of
my log?

Yes, in the max_avail, see below.

But bear in mind that it's not only a question of whether underruns happen, it's also a question of how they are handled. Actually, a correctly working upper layer (opal...) should never allow alsa underruns; it should rebuffer (send previous data) if nothing else is available. It sounds much better than an actual underrun.
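
To make the "rebuffer" idea concrete, here is a minimal sketch in Python. It is not how opal or alsa actually do it; the function name next_period and the 320-byte period (160 mono S16_LE frames) are just assumptions for illustration. When no fresh data has arrived, the previous period is sent again instead of letting the buffer run dry.

```python
from collections import deque

def next_period(incoming: deque, last: bytes, period_bytes: int) -> bytes:
    """Pick the data for the next period (hypothetical helper, not an opal/alsa API).

    Prefer fresh data; if none has arrived, repeat the previous period;
    if there is no previous period either, fall back to silence.
    """
    if incoming:
        return incoming.popleft()
    if last:
        return last
    return bytes(period_bytes)

# Two fresh periods arrive, then the source stalls for two periods.
incoming = deque([b"A" * 320, b"B" * 320])
sent = []
last = b""
for _ in range(4):
    last = next_period(incoming, last, 320)
    sent.append(last)

print([chunk[:1] for chunk in sent])
```

The last two periods repeat the "B" data, so the sound card always has something to play.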

I still have not fully understood your comments about the values printed in the log. I need to get
familiar with the terminology.

And I have not yet checked for overruns when reading from the microphone.

Andrea

OK, as long as you don't feel I occupy your territory, I'll give it a try. After some reading my memories are coming back. But don't take what I say for granted, this *is* complicated. And if anyone who really knows alsa could review this, I would be more than happy...

First of all: Alsa is basically, in all interfaces, concerned with frames. A frame is what the hardware handles in parallel. So in a mono stream, a frame is the same as a sample. In a stereo stream, a frame is two samples. The sample is S16_LE (signed 16 bit little endian), i.e. two bytes. So a frame is four bytes when sending the sound (stereo) and two bytes when talking as above (one channel, mono).
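
As a small worked example of the frame arithmetic (a sketch only; the table and function names are made up here, and S16_LE is the only format that shows up in the logs):

```python
# Bytes per sample for a few ALSA sample formats.
# Only S16_LE appears in the logs; the others are listed for comparison.
BYTES_PER_SAMPLE = {"U8": 1, "S16_LE": 2, "S32_LE": 4}

def frame_bytes(fmt: str, channels: int) -> int:
    """Size of one frame in bytes: one sample per channel."""
    return BYTES_PER_SAMPLE[fmt] * channels

def frames_to_bytes(frames: int, fmt: str, channels: int) -> int:
    """Convert a frame count to a byte count."""
    return frames * frame_bytes(fmt, channels)

print(frame_bytes("S16_LE", 2))           # stereo ring tone
print(frame_bytes("S16_LE", 1))           # mono echo test
print(frames_to_bytes(160, "S16_LE", 1))  # one 160-frame mono period
```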

The next entity is a period. A period is (in this context) a chunk of data transferred from user space to the alsa driver's hw ringbuffer. The ringbuffer is normally a whole number of periods. In the case above the period size is 160 frames. Since a frame is one two-byte sample (mono), it's actually 320 bytes. But it's better to stick to frames, that's what alsa is all about.

Last we have the hw buffer. It's actually a ring buffer, where the application stores data, and the driver/interrupt routines fetch it and transfer it to the sound card. The overrun/underrun conditions are really what happens when the two ringbuffer pointers become equal.

The period size is 160 frames. 1 frame takes 1/8000 seconds => 1000/8000 ms => period time 160/8 ms = 20 ms.

The buffer above is actually 800/8 ms => 100 ms. This is quite a large buffer; with added network latency it might be too large. A goal is to keep the overall latency below 150 ms.
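
The same arithmetic as a small Python sketch, using the numbers from the two setups in the logs. The helper name frames_to_ms is made up, and the 150 ms figure is only the rough overall goal mentioned above, not an alsa limit.

```python
def frames_to_ms(frames: int, rate_hz: int) -> float:
    """Duration of `frames` frames at `rate_hz` frames per second, in ms."""
    return 1000.0 * frames / rate_hz

# Echo test setup: 8000 Hz, period 160 frames, buffer 800 frames.
period_ms = frames_to_ms(160, 8000)
buffer_ms = frames_to_ms(800, 8000)

# Ring tone setup: 44100 Hz, buffer 22050 frames.
ringtone_buffer_ms = frames_to_ms(22050, 44100)

# What is left for network and codec delay if the overall goal is 150 ms.
LATENCY_GOAL_MS = 150.0
remaining_ms = LATENCY_GOAL_MS - buffer_ms

print(period_ms, buffer_ms, ringtone_buffer_ms, remaining_ms)
```

With a 100 ms hw buffer only about 50 ms remains for everything else, which is why the buffer looks large.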

The "avail" reflects the number of frames free to write in the hw buffer. When the buffer is full it's 0. When it's empty, it's the buffer size. The normal behaviour is that the application writes a period as soon as "avail" is >= 1 period. Sending routines should somehow (blocking I/O, event-driven) be sure that "avail" is indeed >= 1 period before they write data.

The avail_min is a threshold (for avail) where alsa activates the application. That is, an application which has done blocking I/O, or is waiting for an event, is activated at this point. Here (avail_min = 160), it happens when it's possible to write a period, which is perfectly reasonable. In your logs this should normally mean that "avail" is roughly 160-170 frames, since there are some delays before the actual transfer takes place.

The max_avail reflects indeed the maximum value of avail. It's really a quality measure. If it becomes as big as the buffer size (empty buffer) we are close to an underrun. If it's always much less than the buffer size, it should be possible to decrease the hw buffer.
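
To illustrate how avail and max_avail behave, here is a toy model of the playback ring buffer in Python. It only counts frames and is not real alsa code; the class and method names are invented, and max_avail is sampled each time the "hardware" consumes data, which is an assumption about how the log value is collected.

```python
class PlaybackRing:
    """Toy model of the hw ring buffer, counted in frames (not real alsa)."""

    def __init__(self, buffer_size: int, period_size: int):
        self.buffer_size = buffer_size
        self.period_size = period_size
        self.filled = 0      # frames queued for the sound card
        self.max_avail = 0   # largest avail seen while running
        self.underruns = 0

    @property
    def avail(self) -> int:
        """Frames free to write: 0 when full, buffer_size when empty."""
        return self.buffer_size - self.filled

    def write_period(self) -> bool:
        """Application side: write one period only if it fits."""
        if self.avail < self.period_size:
            return False  # a real application would block or wait here
        self.filled += self.period_size
        return True

    def hw_consume(self, frames: int) -> None:
        """Driver side: the card plays `frames` frames."""
        if frames > self.filled:
            self.underruns += 1  # nothing left to play
            self.filled = 0
        else:
            self.filled -= frames
        self.max_avail = max(self.max_avail, self.avail)

ring = PlaybackRing(buffer_size=800, period_size=160)
while ring.write_period():   # prefill the buffer
    pass

for _ in range(10):          # healthy operation: one period out, one in
    ring.hw_consume(160)
    ring.write_period()
healthy_max = ring.max_avail

for _ in range(6):           # the application stalls, the card keeps playing
    ring.hw_consume(160)

print(healthy_max, ring.max_avail, ring.underruns)
```

In the healthy phase max_avail stays at one period. During the stall it climbs to the full buffer size, and the next period the card asks for is an underrun.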

I think we can leave the other stuff aside, it's not important as I see it. Let me know if you think differently.

The strange thing in the alsa logs when sending the sound was that the "avail" was approaching 0 when the EPIPE error triggered. Following the definitions above, this indicates an *overrun*. This is more than strange; it might very well be something I don't understand. Or something really weird somewhere. Perhaps the fact that this is a special case (stereo, sample rate) creates a mess?

The "direct" log indicated that the application was about to write when avail is just 10-20 frames every second attempt. This should not happen; there should be at least a period free in the buffer before the application makes an attempt to write. Overall, the application seems to write at moments when it shouldn't.

How does the receiving party know the sample rate if the sounds are 16 kHz and the normal data 8 kHz? Looks odd to me, but I'm just a newbie in this... perhaps the RTP layer can handle this?

Note the Skype settings: 48 kHz, buffer 2560 frames, stereo. 1 frame takes 1/48000 s == 1000/48000 ms => buffer time 2560/48 = 53 ms. Half the latency, and a better sound. Skype also uses the start_threshold. It means that the hardware starts automatically when start_threshold frames are ready to be sent to the sound card. It's a better way to start the thing than to start it explicitly as opal seems to do; it's a question of race conditions in the very beginning.
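
A rough sketch of what start_threshold does, again as a toy Python model rather than real alsa code. The class name is made up, and the threshold of 320 frames is a hypothetical value for illustration; the value 1 is what the echo test log shows.

```python
class AutoStartStream:
    """Toy model: playback starts by itself once enough frames are queued."""

    def __init__(self, start_threshold: int):
        self.start_threshold = start_threshold
        self.queued = 0
        self.running = False

    def write(self, frames: int) -> None:
        """Queue frames; start the stream when the threshold is reached."""
        self.queued += frames
        if not self.running and self.queued >= self.start_threshold:
            self.running = True

# Hypothetical threshold of two periods: the first write does not start playback.
stream = AutoStartStream(start_threshold=320)
stream.write(160)
started_after_first = stream.running
stream.write(160)
started_after_second = stream.running

# start_threshold = 1, as in the echo test log: the first write starts playback.
eager = AutoStartStream(start_threshold=1)
eager.write(160)

print(started_after_first, started_after_second, eager.running)
```

The application never has to decide when to start the stream, so there is no window between "data written" and "start called" to get wrong.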

Qutecom/Wengophone works with 16 kHz and a 60 ms buffer. Hey, we have some possibilities to improve Ekiga!


Now, this was a long message :-) It's partly self-education, forgive me...

--a

