Just my $0.02 on sound and stuff...




This is maybe not really related to Gnome itself, but since the problem
has to be solved somewhere, I guess this is as good a place to start
as anywhere... :-) (I hope)

I have been thinking about how GNU/Linux sound should work. I believe
that the best approach is to borrow ideas from the design of the
X Window System. This is not new; many people have tried to do this for
some time now, NAS and Xaudio being two examples.

The following comments describe how I think the sound system should
work and why. Sorry if this is old stuff.

* Networked sound

	This is a necessity. We have had networked graphics
	for a looong time, and that is probably much more difficult to
	implement. Networked sound is somewhat more sensitive to
	network lag, but is in my opinion still easier to implement.
	There are a lot of users running Linux boxes as X terminals who
	would like to have sound support, and I often run remote programs.

* The sound should be managed by a sound server

	All the different graphics cards are managed by the X server,
	and the server can be extended to handle new cards. No need to
	modify the kernel for this. I believe the same should apply to
	sound. Today we have to rebuild the kernel or its modules to get
	sound. It would also be much easier for a company that wants to
	offer a separate server for a special sound card for proprietary,
	non-disclosure and other such horrible reasons. Of course some
	support in the kernel is necessary, for example to support
	switching of virtual terminals, so that the sound is tied to a
	terminal and not global. Maybe I am totally lost here and it
	turns out that sound must live in the kernel, but we can still
	build a server. (In fact the first server would probably use the
	standard kernel interfaces.)

* The sound server also handles multiple clients

	Today only one user at a time can write to /dev/dsp. There
	have been several suggestions for solving this. A sound server
	would be designed from the beginning to serve several clients,
	just as the X server serves several programs. I believe a sound
	server is the cleaner solution.

* Base it on the X philosophy of few round trips

	Design the protocol so that requests can be buffered and shipped
	blockwise. You might wonder how this works with audio, but
	wait, read on. A sound server should not primarily be based
	on streaming data, just as the X server is not based on streaming
	bitmapped graphics. Of course there is a need for streamed
	sound, just as there is for streamed graphics, so we should have
	something similar to XImages and shared-memory XImages for sound
	too.

* Which features to implement first?

	For some reason NAS and Xaudio have concentrated on streaming
	audio. Xaudio (even though the project is slightly in a coma) has
	also stated that it should not be for games! Huh? Almost all
	the sound I have seen (heard) has been in games. Other
	applications are synth programs and sequencers, which, just
	like games, require some kind of realtime approach.

	1> So I propose that the simplest sound server should work
	   like a remotely controllable modplayer. We can have a table of
	   samples on the server side. The samples have names, just as
	   colors on the X server have names. (Yep, it sure looks like we
	   should take a look at the General MIDI tables when designing
	   these.) Some new sample names would also be good to have, like:
		Error, Ok, WaitAMoment, Start, Stop

	   We connect to the sound server and request that a certain
	   sample be played at a certain time, volume, pitch and
	   panning.
	   With this we can do quite a lot of things, and it cuts away
	   network traffic, since it is not streamed audio but just
	   sample names, which are much smaller, and a playing sequence
	   can be buffered. Of course the play-sample command can skip
	   the timing and request that the sample be played as soon as it
	   arrives at the server. This would be used for effects, and the
	   buffered, timed approach for music.

	2> Once we have this working, we can add the possibility to
	   upload client-specific samples, just as XCreatePixmap
	   creates a pixmap that resides on the server and can be used
	   several times. Now the server is getting really useful.

	3> The audio server should also be able to control the mixer.
	   This sort of corresponds to the window manager. It manages
	   the volumes of different devices and different programs.

	4> Then we can add support for streaming data over the
	   network or through shared memory. Getting sound back from the
	   microphone could be interesting. 

	5> Remote control of music CDs is probably also interesting,
	   and hard-disk recording and all the other things in the
	   universe...

* The protocol is language-independent, just like X's

	The sound server, and the first sound access libraries, will
	most likely be written in C. Then we can use these raw, or build
	a GDK-like wrapper around them for use higher up in the
	hierarchy, the way GTK sits above GDK.
	
* The protocol can be extended just like the X protocol.

	In the beginning, though, the protocol is likely to change
	before it stabilizes. A normal modplayer would only require
	a simple protocol, but if we use TiMidity as a base we could use
	some sort of modified MIDI over the wire. Whatever.

	TiMidity can be a little bit heavy on the processor, so a given
	server might only support certain features (corresponding to
	X visuals): the attack/decay part of a sample, the sustain part
	of a sample, reverb, panning, the number of simultaneously
	playing samples and so on.
  
	I have not spoken of FM-synthesized music because it is boring.
	But maybe someone wants to write a sound server for an AdLib
	card that does its best (or worst) to simulate the
	different standard instruments with the FM synth.

Just my 2 cents. Please drop me a line if you have comments, want to
code this or have coded this already! :-)

Fredrik Ohrstrom
d92-foh@nada.kth.se









