Re: [Ekiga-list] A comparison ALSA-PULSE ( long)
- From: Alec Leamas <leamas alec gmail com>
- To: Ekiga mailing list <ekiga-list gnome org>
- Subject: Re: [Ekiga-list] A comparison ALSA-PULSE ( long)
- Date: Mon, 23 Feb 2009 21:21:47 +0100
Andrea wrote:
Alec Leamas wrote:
Its setup is:
stream : PLAYBACK
access : RW_INTERLEAVED
format : S16_LE
subformat : STD
channels : 2
rate : 44100
exact rate : 44100 (44100/1)
msbits : 16
buffer_size : 22050
period_size : 5512
period_time : 125000
tstamp_mode : NONE
period_step : 1
avail_min : 5512
period_event : 0
start_threshold : 22050
stop_threshold : 22050
silence_threshold: 0
silence_size : 0
boundary : 1445068800
I attach more output.
What you have seen so far was the log when ekiga plays the ring tone (by far the most damaged sound).
When running the echo test, the setup is different:
stream : PLAYBACK
access : RW_INTERLEAVED
format : S16_LE
subformat : STD
channels : 1
rate : 8000 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
exact rate : 8000 (8000/1) <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
msbits : 16
buffer_size : 800
period_size : 160
period_time : 20000
tstamp_mode : NONE
period_step : 1
avail_min : 160
period_event : 0
start_threshold : 1
stop_threshold : 800
silence_threshold: 0
silence_size : 0
boundary : 1677721600
And I cannot see any underrun at all. My echo test uses PCMA. It is possible that with a better
codec (i.e. a higher rate than 8000), we see them again. Don't really know how to test.
I would say that the quality of the echo test with or without pulse is the same (but being only 8000
Hz, it is already not perfect and more difficult to judge).
All my tests so far have been run in debug, so the speed of ekiga/opal/ptlib is already lower than
in release. The quality of the ring tone is more or less the same, though. I will try to rerun
everything in release.
I have 2 points
1) Is the following true: ekiga-pulse gives bad audio quality because there are underruns.
So, if for some connection there are no underruns (e.g. my echo test) then, the quality is not
expected to be lower than alsa-direct, and we should not complain about pulse.
Yes. And, whatever the problems are, I don't really think it's pulse. I
think it's a problem in how we handle alsa, which is just not that
visible today.
2) If underruns are (the) evil (or at least the biggest problem), then it would be good to print
some indication of how close to the underrun we are. Does alsa provide that? Is it already part of
my log?
Yes, in the max_avail, see below.
But bear in mind that it's not only a question of whether underruns
happen, it's also a question of how they are handled. Actually, a
correctly working upper layer (opal...) should never allow alsa
underruns, it should rebuffer (send previous data) if nothing else is
available. It sounds much better than an actual underrun.
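To make the rebuffering idea concrete, here is a minimal sketch in
Python. It is not how opal does (or should do) it; next_period() and
the constants are hypothetical names, with the period size and sample
format taken from the echo-test setup above.

```python
from typing import Optional

PERIOD_FRAMES = 160   # period_size from the echo-test setup
BYTES_PER_FRAME = 2   # S16_LE, one channel

def next_period(fresh: Optional[bytes], previous: bytes) -> bytes:
    """Pick the data to hand to the sound layer for this period.

    Use the fresh period if one arrived in time and has the right size;
    otherwise repeat the previous period so the hw buffer never runs dry.
    """
    if fresh is not None and len(fresh) == PERIOD_FRAMES * BYTES_PER_FRAME:
        return fresh
    return previous

silence = bytes(PERIOD_FRAMES * BYTES_PER_FRAME)
first = b"\x01\x00" * PERIOD_FRAMES
out1 = next_period(first, silence)  # fresh data arrived: play it
out2 = next_period(None, out1)      # nothing arrived: repeat the last period
```

Repeating the last period is only the simplest possible choice; the
point is just that something is always written.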
I still have not fully understood your comments about the values printed in the log. I need to get
familiar with the terminology.
And I have not yet checked for overruns when reading from the microphone.
Andrea
OK, as long as you don't feel I occupy your territory, I'll make a try.
After some reading my memories are coming back. But don't take what I
say for granted, this *is* complicated. And if anyone who really knows
alsa could review this, I would be more than happy...
First of all: Alsa is basically, in all interfaces, concerned with
frames. A frame is what the hardware handles in parallel. So in a mono
stream, a frame is the same as a sample. In a stereo stream, a frame is
two samples. The sample is S16_LE (signed 16 bit little endian), i.e.
two bytes.
So a frame is four bytes when sending the sound (stereo) and two bytes
when talking as above (one channel, mono).
The next entity is a period. A period is (in this context) a chunk of
data transferred from user space to the alsa drivers' hw ringbuffer. The
ringbuffer is normally a whole number of periods. In the case above the
period size is 160 frames. Since a frame is a sample (mono), it's
actually 320 bytes. But it's better to stick to frames, that's what alsa
is all about.
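The frame and period sizes above are easy to check with a few lines of
Python. bytes_per_frame() is just a helper name made up for this
sketch; the numbers come from the two setups in the log.

```python
def bytes_per_frame(channels: int, bits_per_sample: int = 16) -> int:
    """Size of one frame: one sample per channel."""
    return channels * bits_per_sample // 8

stereo_frame = bytes_per_frame(2)    # ring tone: S16_LE, 2 channels
mono_frame = bytes_per_frame(1)      # echo test: S16_LE, 1 channel
period_bytes = 160 * mono_frame      # echo-test period of 160 frames

print(stereo_frame, mono_frame, period_bytes)
```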
Last we have the hw buffer. It's actually a ring buffer, where the
application stores data, and the driver/interrupt routines fetch it
and transfer it to the sound card. The overrun/underrun conditions are
really what happens when the two ringbuffer pointers become equal.
The period size is 160 frames. 1 frame takes 1/8000 seconds => 1000/8000
ms => period time 160/8 ms = 20 ms.
The buffer above is actually 800/8 ms => 100 ms. This is quite a large
buffer, with added network latency it might be too large. A goal is to
keep the overall latency below 150 ms.
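The same arithmetic as a small Python sketch, so it can be rerun for
other setups. frames_to_ms() is a made-up helper; the 150 ms figure is
the latency goal mentioned above, and the budget line only shows what
is left once the hw buffer has been counted.

```python
def frames_to_ms(frames: int, rate_hz: int) -> float:
    """Duration of `frames` frames at `rate_hz`, in milliseconds."""
    return frames * 1000.0 / rate_hz

period_ms = frames_to_ms(160, 8000)         # echo-test period
buffer_ms = frames_to_ms(800, 8000)         # echo-test hw buffer
ring_period_ms = frames_to_ms(5512, 44100)  # ring-tone period

LATENCY_GOAL_MS = 150.0
budget_left_ms = LATENCY_GOAL_MS - buffer_ms  # left for network, codec, ...

print(period_ms, buffer_ms, round(ring_period_ms, 1), budget_left_ms)
```

The ring-tone period comes out just under 125 ms, which matches the
period_time of 125000 in the first setup.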
The "avail" reflects the number of frames free to write in the hw
buffer. When the buffer is full it's 0. When it's empty, it's the buffer
size. The normal behaviour is that the application writes a period as
soon as "avail" is >= 1 period. Sending routines should somehow
(blocking I/O, event-driven) be sure that "avail" is indeed >= 1 period
before it writes data.
The avail_min is a threshold (for avail) where alsa activates the
application. That is, an application which has done blocking I/O, or is
waiting for an event, is activated at this point. Here, it happens when
it's possible to write a period (avail_min is 160 frames), which is
perfectly reasonable. In your logs this should normally mean that
"avail" is roughly 160-170 frames, since there are some delays before
the actual transfer takes place.
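A toy model may help with "avail" and avail_min. This is plain Python
and does not talk to alsa at all; PlaybackBuffer and its methods are
invented for the illustration, and only the numbers (buffer 800,
avail_min 160) come from the echo-test setup.

```python
class PlaybackBuffer:
    """Toy model of a playback hw ring buffer, counted in frames."""

    def __init__(self, buffer_size: int, avail_min: int) -> None:
        self.buffer_size = buffer_size
        self.avail_min = avail_min
        self.queued = 0  # frames written but not yet played

    @property
    def avail(self) -> int:
        """Frames free to write: 0 when full, buffer_size when empty."""
        return self.buffer_size - self.queued

    def can_write(self) -> bool:
        """True once a waiting writer would be woken (avail >= avail_min)."""
        return self.avail >= self.avail_min

    def write(self, frames: int) -> int:
        """Queue up to `frames` frames; return how many actually fit."""
        accepted = min(frames, self.avail)
        self.queued += accepted
        return accepted

    def play(self, frames: int) -> bool:
        """The hardware consumes frames; return False on underrun."""
        if frames > self.queued:
            self.queued = 0
            return False
        self.queued -= frames
        return True

buf = PlaybackBuffer(buffer_size=800, avail_min=160)
while buf.can_write():
    buf.write(160)            # the application fills the buffer
full_avail = buf.avail        # buffer is full now
buf.play(100)
blocked = buf.can_write()     # only 100 frames free: writer stays asleep
buf.play(60)
woken = buf.can_write()       # a whole period is free: writer is woken
```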
The max_avail reflects indeed the maximum value of avail. It's really
a quality measure. If it becomes as big as the buffer size (empty
buffer) we are close to an underrun. If it's always much less than the
buffer size, it should be possible to decrease the hw buffer.
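As a sketch of how max_avail could be turned into a "how close to an
underrun are we" number: the avail values below are made up for the
example, they are not taken from any of the logs, and
underrun_headroom() is a hypothetical helper.

```python
from typing import Iterable, Tuple

def underrun_headroom(avail_log: Iterable[int],
                      buffer_size: int) -> Tuple[int, int]:
    """Return (max_avail, frames still queued at the worst moment)."""
    max_avail = max(avail_log)
    return max_avail, buffer_size - max_avail

# Made-up avail readings for a buffer of 800 frames.
sample_log = [165, 170, 322, 168, 640, 171]
max_avail, headroom = underrun_headroom(sample_log, 800)
print(max_avail, headroom)
```

A headroom of 0 would mean the buffer was empty at some point. A
headroom that never drops below a few periods suggests the buffer
could be made smaller.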
I think we can leave the other stuff aside, it's not important as I see
it. Let me know if you think differently.
The strange thing in the alsa logs when sending the sound was that the
"avail" was approaching 0 when the EPIPE error triggered. Following the
definitions above, this indicates an *overrun*. This is more than
strange, it might very well be something I don't understand. Or
something really weird somewhere. Perhaps the fact that this is a
special case (stereo, sample rate) creates a mess?
The "direct" log indicated that the application was about to write when
avail is just 10-20 frames every second attempt. This should not happen,
there should be at least a period free in the buffer before the
application makes an attempt to write. Overall, the application seems to
write at moments it shouldn't.
How does the receiving party know the sample rate if the sounds are 16
kHz and the normal data 8 kHz? Looks odd to me, but I'm just a newbie
in this... perhaps the RTP layer can handle this?
Note the Skype settings: 48 kHz, buffer 2560 frames, stereo. 1 frame
takes 1/48000 s == 1000/48000 ms => buffer time 2560/48 = 53 ms. Half
the latency, and a better sound. Skype also uses the start_threshold. It
means that the hardware starts automatically when start_threshold frames
are ready to be sent to the sound card. It's a better way to start the
thing than to start it explicitly as opal seems to do, it's a question
of race conditions in the very beginning.
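A small model of what start_threshold does, again in plain Python
without alsa. write_index_at_start() is an invented helper. The
threshold of 1 is the echo-test value from the log; the threshold of
320 is a hypothetical value picked for the example, it is not Skype's
actual setting.

```python
from typing import List, Optional

def write_index_at_start(writes: List[int],
                         start_threshold: int) -> Optional[int]:
    """Index of the write at which playback would start by itself.

    Returns None if the queued frames never reach start_threshold.
    """
    queued = 0
    for index, frames in enumerate(writes):
        queued += frames
        if queued >= start_threshold:
            return index
    return None

periods = [160, 160, 160]
echo_start = write_index_at_start(periods, 1)      # echo-test setting
two_periods = write_index_at_start(periods, 320)   # hypothetical setting
never = write_index_at_start(periods, 800)         # threshold not reached
print(echo_start, two_periods, never)
```

With a threshold of 1 the stream starts on the very first write; a
higher threshold lets a little data pile up before the hardware runs.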
Qutecom/Wengophone works with 16 kHz and a 60 ms buffer. Hey, we have
some possibilities to improve Ekiga!
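To get a feeling for what a smaller buffer would mean in frames, here
is the calculation turned around: from a target buffer time to a
buffer size. It assumes the period stays at 160 frames and the rate at
8000 Hz as in the echo test; buffer_frames_for() is a made-up helper,
and whether opal lets us set this is a separate question.

```python
def buffer_frames_for(target_ms: int, rate_hz: int,
                      period_frames: int) -> int:
    """Largest whole number of periods that fits within target_ms."""
    max_frames = target_ms * rate_hz // 1000
    return (max_frames // period_frames) * period_frames

current = buffer_frames_for(100, 8000, 160)      # today's 100 ms buffer
like_qutecom = buffer_frames_for(60, 8000, 160)  # a 60 ms target
like_skype = buffer_frames_for(53, 8000, 160)    # a 53 ms target
print(current, like_qutecom, like_skype)
```

So a 60 ms target would be 3 periods (480 frames), and a 53 ms target
rounds down to 2 periods (320 frames, 40 ms).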
Now, this was a long message :-) It's partly self-education, forgive me...
--a