Re: Pulseaudio
- From: "Gustavo J. A. M. Carneiro" <gjc inescporto pt>
- To: Lennart Poettering <mztabzr 0pointer de>
- Cc: matteo member fsf org, desktop-devel-list gnome org
- Subject: Re: Pulseaudio
- Date: Sat, 13 Oct 2007 00:23:26 +0100
Hi,
Thanks for the explanations, but I still have one remaining question.
But I would be interested to know why PA hogs the sound device. No
one has explained this yet, and it is the #1 question in my mind. As far as I
can tell, though I admit I'm no expert and could be wrong, PA should be
able to access the sound device without locking it, like any normal
application. To put it simply: PA over ALSA/dmix. If dmix were
underneath PA, couldn't you still keep doing all those neat things
anyway? The way I see it, only applications using ALSA directly wouldn't
benefit from PA's features, but at least audio would work.
Every time I bring up this issue the reply is usually "but you don't need
to do that because of this and that", not an actual explanation of why
it can't be done.
One use case for the above would be the (non-free) Flash plugin running as
a 32-bit program with nspluginwrapper attached to an amd64
epiphany/firefox. Unless you also have the PA libraries and the ALSA plugin
replicated in /usr/lib32, audio won't work if you are redirecting
everything from ALSA to PA... Yes, I know the non-free Flash plugin is
evil, but you can't easily escape it these days... :-(
Thanks in advance.
On Fri, 2007-10-12 at 21:20 +0200, Lennart Poettering wrote:
> On Mon, 08.10.07 20:02, Matteo Settenvini (matteo-ml member fsf org) wrote:
>
> Hi!
>
> Ok, a little bit late because I was travelling, but here's my reply to
> the whole PA thread on desktop-devel (as the
> maintainer of PA).
>
> This is a long reply, so you might want to grab yourself a cup
> of coffee (or mango lassi?) before you start reading through it.
>
> I'd like to thank davidz, matteo, hadess, and jan for jumping into the
> discussion in PA's defense and replying more quickly than I did. Thanks,
> dudes!
>
> > It has been a while since esound has received some attention - releases
> > are almost stalled. Looking at the GNOME wiki, it seems that Pulseaudio
> > is the stronger candidate between alternatives, and that it allows for
> > quite a lot of nifty things.
> >
> > I've been running pulseaudio for four or five months now on two of my
> > desktop systems, both x86 and PPC, and I must say that I'm really
> > satisfied with it.
> > It's quite stable and has very few significant bugs for the normal user
> > (e.g. when used as an esound replacement on a machine with more than
> > one logged-in user it doesn't share the esd socket, and similar).
> >
> > It also seems to be actively developed, and is shipped by default with
> > Fedora 8.
> >
> > Can it be eligible for inclusion in GNOME 2.22?
>
> Coincidentally we discussed just this during the GNOME Summit on
> Sunday. Here are my 10¢ on this and all the issues raised in the whole
> thread following Matteo's proposal.
>
> I am not sure that PA should become "part" of GNOME. A blessed
> dependency sure, but really a new module of GNOME? Probably not.
>
> Fedora now ships PA by default, and SuSE is moving to PA as
> well. (Of the big distros, only that spaceboy distro doesn't love us
> anymore, it seems, as I haven't heard from them in a while.) There
> are still a couple of rough edges, such as the fact that we ship two volume
> controls: one is the native PA volume control, which can do all
> kinds of nifty things like per-stream volumes and moving streams
> between devices. The other is gnome-volume-control, which is much sexier UI-wise
> (i18n, ...), but exposes a lot of cruft we'd prefer to get rid of
> (i.e. all kinds of stupid alsa mixer tracks and checkboxes nobody
> really understands, and it shows every device thrice, ...). Resolving this
> duplication probably needs somewhat tighter integration of PA and
> GNOME: either the volume control tool in GNOME would need to link
> directly against PA -- or we'd have to wrap all the special PA
> features in GST's mixer interfaces -- which I think doesn't make that
> much sense. Too many abstraction layers are bad, especially if there'd
> only be a single backend driver implementing most of it.
>
> A couple of direct replies to what people brought up in their emails:
>
> Martin Meyer suggested that PA was "heavy-weight". This is quite
> frankly bullshit. It depends on how you compile PA. Sure, PA is a little
> bit bigger than ESD, but not that much. It of course becomes bigger
> if you compile *all* the modules we ship. But you don't have to do
> this -- if you just want the core, then just compile the core and PA
> is tiny. A lot of embedded people are now starting to adopt PA -- people
> with a lot stronger constraints than we generally have on GNOME as a
> desktop for a PC. So the "bloat"/"heavy-weight" issue is
> nonsense. You can compile the SVN PA fine with just two external
> dependencies (ALSA and liboil -- both libraries are nowadays installed
> on all distros anyway -- so they don't really count) and it works
> fine. Everything else is optional, and can be split off into separate
> packages. And even without those extra modules PA is still very
> useful.
>
> Regarding GST vs. PulseAudio: there is just no "vs."! GStreamer does
> muxing/demuxing/decoding/encoding of media streams. PA is a low-level
> PCM-only sound server. They're two different things. You could
> compare this to X11 and GTK: X11 just does a bit of windowing and
> drawing for you; GTK does all those UI things on top. PA does just a
> bit of buffering, mixing, and filtering for you; GST does all those nice
> decoding/encoding/muxing/demuxing things on top.
>
> Regarding the PA vs. dmix issue Sven Neumann brought up: yes, if you
> only care about the simplest form of mixing, then dmix is sufficient
> for you. However, if we want to provide anything that comes even remotely
> near what Vista or MacOS X provides -- then we need some kind of
> sound server, just like they ship one. (MS likes to call the
> sound server a "userspace sound system", though, but that's just
> terminology. The important fact is that they have a real-time process
> which serializes access to the PCM devices.) So what does PA offer you
> beyond dmix right now? From a user perspective: moving streams
> on-the-fly between devices; distributing audio to multiple audio
> devices at the same time; per-stream volumes; fast-user-switching
> support; automatic saving/restoring of per-application devices and
> volumes; sensible hotplug support; "rescuing" streams to another
> audio device if you accidentally pull your USB cable; network support;
> ... the list goes on and on. Also, ALSA is Linux-specific
> (though personally I think this doesn't really matter).
>
> Gustavo brought up the issue that PA "hogs" the sound device. Sure we
> do. The idea is having everything go through PA, so that we can treat
> everything the same. However, since there are some APIs that are
> notoriously hard to virtualize (e.g. OSS with mmap) and some areas
> where you don't want the extra context switching PA adds (pro audio,
> for now), there's now a tool called "pasuspender" which, when passed a
> command line, will execute it, but before doing so suspend PA's
> sound card access, and afterwards resume it again. So, prefix your
> "quake2" invocation with "pasuspender" and everything should be
> fine. Also, we now close all audio devices after 1s of idle time by
> default. We do this mostly to save power. However, this also has the
> side effect of releasing the audio device quickly for other apps. The
> drawback of course is that many sound cards generate pops and clicks
> every time you open/close the device (some Intel HDA chips, for example), but
> that can probably be worked around in the drivers (according to
> Takashi), and I guess you cannot have everything at the same time, so
> power saving is more important for now. In practice you probably
> shouldn't notice PA's presence at all -- unless you try to play an ALSA
> stream to hw:0 and a PA stream at the same time. And last but not
> least, we have been shipping a PA plugin for libasound for a while
> now. It's enabled by default in F8 and redirects all ALSA audio to
> PA -- unless some borked app hard-codes "hw:0" as the device name.
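>
> In ALSA configuration terms, that redirection boils down to roughly
> this snippet -- which you could also put in ~/.asoundrc yourself, if
> your distro doesn't ship it globally (assuming the alsa-plugins
> "pulse" plugin is installed):

```
# ~/.asoundrc -- route the default ALSA PCM and control devices
# through PulseAudio instead of straight to the hardware
pcm.!default {
    type pulse
}
ctl.!default {
    type pulse
}
```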
>
> Regarding Flash and PA: As Bastien pointed out, in F8 we ship a plugin
> for the flash player which makes it compatible with PA. With that
> plugin Flash and PA are perfectly compatible.
>
> Gustavo repeatedly brought up compatibility with current
> (closed-source) stuff: PA is also "the compatible sound server". We
> provide compatibility with OSS, ALSA, ESD, GST, LIBAO, Xine, MPlayer,
> ... (to varying degrees, but mostly pretty high quality). Right now
> Quake2 is the only relevant app I know of that doesn't really work on top
> of PA, but for those cases we have pasuspender. Basically, I think
> this is a non-issue these days. And for almost all of the remaining
> apps we have compat problems with, we can fix our compat layers for
> them. Most of the time the applications are misusing the APIs, but
> we're happy to try to add the necessary stuff to our compat layers to
> get them working.
>
> Regarding hardware mixing support: this is bullshit. You know, a while
> back all sound cards had wavetable stuff built into hardware. And then this
> became obsolete -- because it could be done with less effort and
> without problems in software, on faster CPUs. Then there were MPEG
> decoder cards, which soon became obsolete -- because it could be
> done with less effort in software, on faster CPUs. And then some
> vendors added hw mixing to their cards. But that was 6 years ago -- if
> you look at current sound card designs (HDA) you'll notice that they
> only support a single stream. They are high-quality but very
> feature-limited DACs. HW mixing is dead technology; it's out of
> fashion, made redundant by stuff that is nowadays available in the
> CPU: MMX, SSE. Using hw mixing imposes a greater burden on your USB
> and PCI busses and might generate more IRQs. The place to do mixing
> nowadays is the CPU -- it's one of the reasons MMX and SSE were added to
> the CPU in the first place. Accelerating mixing in hw is really not
> what you want to do these days. But if you really insist that you
> want to use this obsolete technology in your sound system, then you're
> welcome to send me a patch or add a module to PA. But honestly, the
> next one who comes up with the hw mixing issue should please do his
> homework and read up on what happened in sound card design in the last 10
> years, thank you very much. Asking for hw mixing in PA is like asking
> for support for MPEG decoder cards in GST.
>
> Also, never forget: PA does much more than just mixing audio. That's
> just the tiniest part of it.
>
> Gustavo then played the latency card: yes, PA increases the latency
> over direct hw access. But so does dmix, because it enforces fixed
> fragment settings for all apps. What you really want (which
> right now is only partially implemented in PA) is to allow
> per-stream fragment settings, by scheduling audio based on timer
> interrupts instead of sound IO interrupts (which are based on fixed
> fragment settings). Those timer interrupts can be changed dynamically,
> so we can move the wakeup points during playback without too much
> effort. However, this needs some kind of kernel support (hrtimers,
> HPET), which has only become available very recently and on x86 only
> (not even amd64 yet), so a few months will pass until we get this fully
> implemented. Once we have that, however, we basically get the same
> PCM pipeline that Vista and MacOS have: a huge mixing buffer managed
> by a real-time userspace sound server which allows rewriting at any
> time and notifies clients dynamically, scheduled via timer
> interrupts. In essence, in the long run we really *need* something
> like PA if we want to provide low latencies (i.e. short fragments ==
> frequent interrupts) and low power consumption (i.e. few interrupts ==
> huge fragments) at the same time and switch between them
> dynamically. Yes, right now PA increases your achievable latencies a
> bit (but just a bit), but in the end we *need* a process that does the
> audio scheduling based on timers -- something that PA will then do. Of
> course, PA doesn't fully implement this yet, which is partially PA's fault
> and partially the fault of the kernel, which sucks when it comes to timers
> right now. We're getting there.
>
> Then Gustavo played the stability card: yep, sure, PA is relatively
> new code. But I mean, esd is more than ten years old these days. And
> you'd call it stable? Come on! PA is stable enough for inclusion in
> F8, and it is actively maintained. And that should be all that
> counts. Oh, and sound is not really life-critical, is it? If you
> lose audio on your desktop, all you lose is a bit of background music;
> it's not as if PA eats all your files for breakfast. The "stability"
> argument is just a trick to block innovation.
>
> Gustavo, PA in F8 is very much different than PA 0.9.6. As suggested
> by Matthias, please try it in F8. You know, Gustavo, that RH did a lot
> of work on PA before we included it in F8, to make it as seamless and
> bug-free as possible? Sure, there might be an issue left here and
> there. But that's true of every piece of software.
>
> So, on to the next big technical issue Gustavo found in PA: he thinks its
> developers are stubborn. Thank you very much, Gustavo, I love you
> too. Maybe it is you who is being stubborn here, spreading all this
> FUD?
>
> (Just as a side note: did you know that Takashi, the upstream ALSA
> maintainer, also maintains PA in SuSE? Maybe you're more Catholic than
> the Pope in your insistence on ALSA dmix?)
>
> Regarding CPU load: the version of PA that ships in F8 uses exactly
> 0.00% CPU when idle -- unless some stupid app polls for the volume all
> the time, which might raise it a bit -- but that should be fixed in
> the app.
>
> Frederic still loves ESD. ESD is bad: in latency, in features, in
> code, in everything. I am not sure if you, Frederic, have noticed that ESD
> only supports 2ch, 16bit, 44kHz audio. Have you noticed all those 5.1
> sound systems popping up all around you? Have you noticed that everyone
> hates esd? And that the best-known trick to get your audio
> working on your Linux desktop is called "killall esd"? No one wants to
> maintain ESD -- do you? There are just so many reasons why ESD should
> be obsoleted... Dude, for the next one who seriously suggests ESD as our
> path to the future of desktop audio, I will personally buy a ticket
> for a time machine, so he can fast-forward 10 years or so and join
> the rest of us in 2007!
>
> Regarding cross-desktop support: I personally don't care too much
> about KDE, but apparently you can set it up just fine as described
> here: http://pulseaudio.org/wiki/PerfectSetup Xine (which I think is
> what Amarok -- or whatever that awful media player everyone but me
> loves so much is called -- uses for the hard stuff) also ships a native
> PA driver.
>
> Ronald, you say: "Userspace daemons are out." This is completely
> bogus. Just have a look at other OSes, like MacOS X, like Vista. One of
> the new Vista features is the new "userspace sound system". In Unix
> nomenclature this translates to "daemon". A userspace sound system is the
> way to go; it's how the systems do it which currently
> ship more powerful and useful sound systems than we do. As mentioned
> earlier, the PCM pipeline you really want is one RT thread per device
> that drives all streams based on timers, not on IO IRQs, managing a
> large, rewritable playback buffer.
>
> HW mixing is dead, and the lock-free magic dmix does is not really
> powerful enough for what is required from a sound system these days.
>
> PA is an implementation of the aforementioned ideal audio server
> design. (Not complete, as mentioned above, though).
>
> This is a very good read about the design of CoreAudio, which is basically
> what we want to do in PA as well:
>
> http://developer.apple.com/DOCUMENTATION/DeviceDrivers/Conceptual/WritingAudioDrivers/AudioFamilyDesign/chapter_3_section_3.html#//apple_ref/doc/uid/TP30000731-CJBIDABE
>
> Ronald, you claim: "a sound daemon is the right solution _only_ for
> networked audio". This is also bogus. There's a lot of stuff you want
> to do in a sound server. For example: policy decisions like "every time
> I plug in my USB headset I want all VoIP playback streams to
> automatically switch to it, and every time I start my VoIP app I want
> its stream to go through the USB headset". Then there's all this kind
> of "compiz for audio" stuff. For example, what I will probably make
> available in PA pretty soon is the ability to do "spatial" event
> sounds, i.e. if you press a button on the left side of your screen, its
> event sound comes out of the left speaker, and vice versa. Or stuff
> like automatically sliding down the volume of all windows that are
> not currently in the foreground. (I.e. you start two totems and only
> the one in the foreground is at 100% volume, the other one at 30% or
> so. And when you switch windows, the volumes automatically slide to
> the opposite values.) Right now PA basically just provides the
> infrastructure for this kind of thing, but now that the groundwork is
> done, I can focus on the "earcandy" part.
>
> In short: there are both user-visible (like these effects, moving
> streams between devices, per-stream volumes) and technical (doing
> low-latency and low power-consumption at the same time) reasons why a
> userspace sound daemon is the way forward.
>
> Ronald, the "alsa-plugin" package ships an OSS backend, just as a side note.
>
> Regarding GSmartMix: some parts of gsm live on, like the new
> sound preferences dialog which allows per-class devices and such. The
> problem I saw with gsm is that it was limited to GST. And yeah, not
> all apps use GST, and many apps never will. I hope to work with
> Marc-André to get the remaining ideas of gsm into PA, as soon as I
> export the necessary meta information for all streams in PA.
>
> Ronald, in a way PA is just a reimplementation of dmix. You can
> autolaunch it via libasound, and you shouldn't notice much of a
> difference, except that you suddenly can do device aggregation,
> per-stream volumes with just a few clicks, and so on.
>
> Jan: dmix doesn't involve a daemon anymore. They now do some
> atomic-ops magic, mixing everything lock-free into a single mix
> buffer plus a couple of saturation buffers. It's a technically
> brilliant solution, though probably not the best for your CPU
> caches, and it falls back to a locking mode on multicore and non-x86.
>
> Gustavo: PA by default uses pretty large playback buffers which apps
> can rewrite at any time. This is the very definition of what MS calls
> "GlitchFree", and it is the way to go to provide never-drop-out
> guarantees and quick reaction when seeking. We don't really pass those
> large buffers down to the hw yet, but that's mostly because of the
> hrtimer mess mentioned above. PA in F8 should not drop out unless
> you manually configure it with some strange settings. If you ship a
> shitty HZ=100-with-no-preemption kernel, then yes, that increases the
> chance of a drop-out. But really, if you want to shoot yourself in the
> foot then go for it; just don't blame PA for it, don't do it the ESR
> way. In any reasonable setup PA shouldn't drop out.
>
> The way forward to get something like "GlitchFree" on Linux is called
> "PulseAudio", and contrary to what you are claiming, ALSA dmix is not.
>
> Gustavo: as I tried to make clear above, the way to go is a userspace
> sound server. And once we have that, it's perfectly fine to do network
> support in it as well.
>
> And again: no modern sound card supports hw mixing anymore. That's the
> past, get over it.
>
> Gustavo: OSS is only dead as an implementation of a kernel sound
> system (though some people from 4Front might even claim the contrary
> here). OTOH it is very much alive as an API, and (unfortunately) it is
> going to stay around for a long time still. It's a much smaller API
> than ALSA, it's portable, and it's used in a lot of commercial apps. That's
> why we support it for compatibility in PA.
>
> Regarding RT support in PA: right now on F8, RT for PA is not enabled
> by default, for security reasons. I'd really love to enable it by default,
> which we could do if we had a safe process-babysitter daemon that
> supervised PA while running at a higher rtprio than
> PA itself. Hopefully someone will eventually replace init/gnome-session with
> something that can babysit processes really well, and this thing should
> then do RT supervision as well.
>
> Also, contrary to what Gustavo says, you don't need to be root to do
> RT; all you need is RLIMIT_RTPRIO set to something > 0.
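>
> On a typical pam_limits setup, that amounts to a line like the
> following in /etc/security/limits.conf -- the group name and the
> priority value here are just example choices:

```
# /etc/security/limits.conf -- let members of the audio group
# request a real-time priority of up to 9 (example values)
@audio   -   rtprio   9
```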
>
> Regarding event sounds: yes, I disable them by default too; I think
> everyone reasonable (except davidz, maybe :-)) does that. But why do
> we do that? Partly because the sounds we have right now in GNOME suck
> big time and are annoying as hell. And partly because they are
> triggered far too often. If you have ever used a MacOS machine you probably
> know that the event sounds there are a lot more subtle and ... useful. I
> can think of a couple of places where sound events make a lot of
> sense, if they are high-quality:
>
> - when you get an email, a human voice should say something like
> "You've got mail", instead of some stupid "ding" sound no one knows
> the meaning of.
>
> - when long-running actions complete, you might also want a human voice
> saying "CD burning finished" or "download finished".
>
> - for incoming IMs you should have a subtle "ping" sound. Having a
> human voice every time is probably too much, given their frequency.
>
> - some UI actions like workspace switching, fast-user switching, and
> minimizing/maximizing might be good candidates for event sounds too.
>
> So basically, what I'm trying to say is: just because current sound events
> suck, that's no reason they *have to* suck. I hope someone will
> eventually give the sound theming spec another shot and provide us
> with more useful, internationalized default sound samples.
>
> OK, so much for defending PA. I hope I have answered every single
> question, comment, and piece of FUD. If not, just give me a ping!
>
> So, where do we go from here?
>
> At the Summit and internally at RH we discussed a little how we should
> go on with PA and GNOME. So, here's basically what I plan:
>
> There are basically three areas where GNOME currently interfaces with
> PA via compat layers only and where we should replace the relevant
> code with something newer:
>
> 1. Currently esd is explicitly started via gnome-session. In F8 we
> provide a compat script called "esd" that starts up PA. So
> g-s thinks it starts esd, while it actually starts PA. This is OK,
> but this hard-coded dependency on a binary called "esd" should go away. Instead, PA
> should be started via XDG autostart or suchlike. This would require some serialization of
> sound events to fix the race we get when one app wants to play a
> sound event and PA is not fully started yet. Not too difficult. This
> removes the hard dep on ESD and doesn't even replace it with a
> PA-specific one. Gustavo, Ronald, I hope you rejoice?
>
> 2. Sound events are generated directly via libesd from libgnome. This
> hard dep sucks as well. What I propose instead is this: I will
> introduce a new sound event API called "libcanberra", which is
> intended to be cross-platform, cross-toolkit and well supported on
> PA. It basically exports just a single variadic function:
>
> cbr_play(c, id,
>          CBR_META_ROLE, "event",
>          CBR_META_NAME, "click-event",
>          CBR_META_SOUND_FILE_WAV, "/usr/share/sounds/foo.wav",
>          CBR_META_DESCRIPTION, "Button has been clicked",
>          CBR_META_ICON_NAME, "clicked",
>          CBR_META_X11_DISPLAY, ":0",
>          CBR_META_X11_XID, "4711",
>          CBR_META_POINTER_X, "46",
>          CBR_META_POINTER_Y, "766",
>          CBR_META_LANGUAGE, "de_DE",
>          -1);
>
> When that function is called, the caller should pass as many
> properties as possible. libcanberra will then try to find the right
> sound file for the event, and contact the sound server for
> playback. The meta information is passed in order to do transparent i18n, for
> a11y, and for sound effects (i.e. the spatial sound effects I mentioned
> earlier, via the POINTER_X and POINTER_Y props).
>
> (In reality the API will probably have a couple more functions,
> for caching, and for predefining properties so that you don't have
> to specify them again for each event. So maybe 5 functions or so.)
>
> As soon as I have a version of this library, I will write a small
> module for gtk (the kind you can load into every gtk app with
> --gtk-module) which will basically do what libgnome currently does:
> hook into a couple of signals -- but instead of making direct calls to
> libesd it will call the aforementioned libcanberra function with the
> appropriate parameters.
>
> Advantages: suddenly sound events work for non-GNOME apps (i.e. apps
> that merely use gtk) too. We can remove yet another part from libgnome,
> and, last but not least, yet another hard dep on ESD is gone, and
> not even replaced by one on PA. Not even libcanberra becomes a hard
> dep of Gtk. Gustavo, Ronald, this is where you should rejoice, again.
>
> 3. Mixer APIs. There are three mixer control tools right now: the OSD
> that is shown when you press your volume-up/volume-down keys; the
> mixer applet; and gnome-volume-control. The OSD is supported fine
> through gst-pulse (our rocking PA plugin for gst), but for the
> applet and the standalone mixer I'd like to see a replacement. Right
> now both use the gst mixer abstraction API, which only exposes a
> very limited subset of what our PA mixer can do and which quite frankly
> is a big mess. We have two options here: fix the gst mixer API so
> that it exports the whole functionality that PA offers, or just
> make the mixer depend directly on the PA libs. I'd vote for the
> latter. Why? Because abstraction APIs suck in most cases, and
> especially so if a large part of the API is only implemented in a single
> backend (which would be PA). That's why in F9 we will probably drop
> g-v-c and replace it with PA's specific mixer tool called
> "pavucontrol", which we already ship. (I mentioned this already
> above.) So, what I'd like to see is that pavucontrol eventually becomes
> part of GNOME proper, and for that to work PA would need
> to become a blessed dependency. While I don't see much worth in
> developing two volume control tools in parallel, we could even keep
> g-v-c around for those who prefer to stick with their bare
> 90s-style audio systems. (Ronald, Gustavo, that's
> again where you should rejoice.) The question of course remains
> which mixer app to maintain in GNOME. My own pavucontrol is quite
> featureful, but I think it's not the best thing UI-wise (though some
> people seem to disagree with me -- and do like it). I'd be happy if
> someone would pick this up. If no one picks it up, I will probably hack up a
> PA-specific applet, stick it together with pavucontrol in GNOME
> SVN, and then suggest it for inclusion into GNOME proper.
>
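> To illustrate the XDG autostart approach from point 1: it could be a
> desktop file dropped into /etc/xdg/autostart. This is a sketch only --
> the file name and keys are hypothetical, nothing we ship today
> (pulseaudio's -D flag just means "daemonize"):

```
# /etc/xdg/autostart/pulseaudio.desktop (hypothetical)
[Desktop Entry]
Type=Application
Name=PulseAudio Sound Server
Exec=pulseaudio -D
```
>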
> So far my plans. When we have dealt with these three issues, GNOME
> should work fine both with PA and without PA. It will take some time to
> implement all of them. But I hope that even people like Gustavo and
> Ronald can live with that.
>
> Oh, and I hope that my comments on Gustavo's and Ronald's positions
> didn't sound too harsh. It's just that I consider your positions
> badly informed and a bit FUDish; none of it is meant personally.
>
> Any questions?
>
> Yours,
> the stubborn Lennart
>
--
Gustavo J. A. M. Carneiro
<gjc inescporto pt> <gustavo users sourceforge net>
"The universe is always one step beyond logic" -- Frank Herbert