formats for Free sounds



I did a bit more research into this -- seeing what formats support
embedded textual data (and specific annotations such as Copyright
information), and also which free sound tools actually let you
manipulate this.

While most popular PCM audio formats allow extremely large amounts
of textual data to be embedded, support for actually setting or even
retrieving this in existing free software tools is pretty much
nonexistent.

The details are attached, and also available at:

  http://www.cse.unsw.edu.au/~conradp/sounds/textual_data.txt

Conrad.


Manipulating textual data in sound file formats

Conrad Parker <conrad vergenet net>
Last modified: Thu Oct  5 2000
http://www.cse.unsw.edu.au/~conradp/sounds/textual_data.txt


Introduction
============

This information has been assembled in order to aid creators of sound
data in applying copyright attribution and licensing information to
sound files. In particular, this document is intended for creators of
free sound files, and was assembled after a suggestion from Dan Mueth
for the Gnome Sound Project.

This document examines some popular sound file formats to see which have
the ability to usefully store textual data. It also surveys some available
free sound tools for the manipulation of these data.


1. Support for textual data in sound file formats
=================================================

Much of this information has been extracted from the Audio File Formats
FAQ [1].

Notes:

  1. the following large constants [2] are used:

  USHRT_MAX = 65535
  LONG_MAX  = 2147483647
  ULONG_MAX = 4294967295

  2. a "null-terminated string" is a block of text of any length whose
  end is marked by a zero byte. Typically it can be no longer than
  LONG_MAX characters due to programming conventions.
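
  As a minimal sketch of these two notes (the helper name read_cstring
  is made up for illustration, and the Latin-1 decoding is an assumption
  rather than something any of the formats below mandates), the constants
  can be written as fixed 16- and 32-bit limits and a null-terminated
  string read out of a byte buffer like this:

```python
# Fixed 16- and 32-bit limits as used in the file format specs,
# independent of the machine the code runs on.
USHRT_MAX = 2**16 - 1   # 65535
LONG_MAX = 2**31 - 1    # 2147483647
ULONG_MAX = 2**32 - 1   # 4294967295


def read_cstring(buf: bytes, offset: int = 0) -> str:
    """Return the text from offset up to (not including) the first zero byte.

    Hypothetical helper for illustration; Latin-1 is assumed so that every
    byte value decodes to some character.
    """
    end = buf.find(b"\x00", offset)
    if end == -1:
        # No terminator found: take the rest of the buffer.
        end = len(buf)
    return buf[offset:end].decode("latin-1")
```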

1.1 AIFF-C
----------

  AIFF-C [3] is an IFF chunk-based format. Various types of chunk exist
  for holding textual information.

  The standard text chunks are for Name, Author, Copyright, and
  Annotation. These can each store up to LONG_MAX characters.

  A COMT chunk can store up to USHRT_MAX comments, each of which contains
  a timestamp and up to USHRT_MAX characters of text.
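
  The standard text chunks can be pulled out of a file with a short
  chunk walk. The following is only a sketch: the function name
  aiff_text_chunks is invented here, the chunk IDs ("NAME", "AUTH",
  "(c) ", "ANNO") and the big-endian 32-bit size field are taken from my
  reading of the AIFF-C specification [3], and the text is assumed to
  decode as Latin-1. COMT chunks are not handled.

```python
import struct

# Chunk IDs of the four standard text chunks; "(c) " is the Copyright chunk.
TEXT_CHUNKS = {
    b"NAME": "Name",
    b"AUTH": "Author",
    b"(c) ": "Copyright",
    b"ANNO": "Annotation",
}


def aiff_text_chunks(data: bytes) -> list:
    """Return a list of (kind, text) pairs for the standard text chunks.

    `data` is the whole AIFF or AIFF-C file as bytes. Hypothetical helper.
    """
    if len(data) < 12 or data[:4] != b"FORM" or data[8:12] not in (b"AIFF", b"AIFC"):
        raise ValueError("not an AIFF or AIFF-C file")
    found = []
    pos = 12  # skip "FORM", the form size and the form type
    while pos + 8 <= len(data):
        # Each chunk starts with a 4-byte ID and a big-endian signed size.
        ckid, size = struct.unpack(">4sl", data[pos:pos + 8])
        if size < 0:
            break  # corrupt size field; stop rather than loop backwards
        body = data[pos + 8:pos + 8 + size]
        if ckid in TEXT_CHUNKS:
            found.append((TEXT_CHUNKS[ckid], body.decode("latin-1")))
        # Chunk bodies are padded to an even number of bytes.
        pos += 8 + size + (size & 1)
    return found
```

  A list is returned rather than a dictionary because a file may carry
  more than one Annotation chunk.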

1.2 NeXT/Sun audio
------------------

  An "optional text information" field is included as the last element of
  the header and can be of arbitrary size (between 4 and LONG_MAX-24
  characters).
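
  Reading this field only requires the first two header words. The
  sketch below assumes the usual layout of a 24-byte header of six
  big-endian 32-bit words beginning with the magic ".snd" and the offset
  of the sound data; the function name au_info_text and the Latin-1
  decoding are my own choices for illustration.

```python
import struct

AU_HEADER_SIZE = 24  # six 32-bit big-endian words


def au_info_text(data: bytes) -> str:
    """Return the optional text information field of a NeXT/Sun audio file.

    `data` is the file contents as bytes. Hypothetical helper.
    """
    if len(data) < AU_HEADER_SIZE or data[:4] != b".snd":
        raise ValueError("not a NeXT/Sun audio file")
    # The second header word is the offset at which the sound data starts.
    (data_offset,) = struct.unpack(">L", data[4:8])
    if data_offset < AU_HEADER_SIZE:
        raise ValueError("data offset points inside the header")
    # Everything between the fixed header and the sound data is the info field.
    info = data[AU_HEADER_SIZE:data_offset]
    # The field is zero-padded; keep only the text before the first zero byte.
    return info.split(b"\x00", 1)[0].decode("latin-1")
```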

1.3 RIFF (WAV)
--------------

  RIFF audio files [4] can contain a chunk of "associated data", which
  can be one of:

    * a 'label' or 'note' chunk which holds only a null-terminated
      string
    * a 'text with data length' chunk which can contain a "Purpose",
      a "Country Code", "Language" and "Dialect" information, and the
      "CodePage" (which names a Windows character set mapping) to use.
    * an embedded file, of any other RIFF-compliant type, including
      text.
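
  The 'label' and 'note' chunks can be collected with a two-level chunk
  walk, sketched below. Several details here are assumptions that are
  not stated above and should be checked against [4]: that the
  associated data lives in a "LIST" chunk of type "adtl", that the chunk
  IDs are "labl" and "note", and that each of these chunks begins with a
  4-byte cue point identifier before its string. The names _chunks and
  wav_labels are invented for illustration.

```python
import struct


def _chunks(data: bytes, pos: int, end: int):
    """Yield (id, body) for each RIFF chunk found in data[pos:end]."""
    while pos + 8 <= end:
        # RIFF uses a 4-byte ID followed by a little-endian unsigned size.
        ckid, size = struct.unpack("<4sL", data[pos:pos + 8])
        yield ckid, data[pos + 8:pos + 8 + size]
        pos += 8 + size + (size & 1)  # chunks are aligned to 2 bytes


def wav_labels(data: bytes) -> list:
    """Return (chunk_id, cue_id, text) for each 'labl'/'note' chunk.

    `data` is the whole RIFF WAVE file as bytes. Hypothetical helper.
    """
    if len(data) < 12 or data[:4] != b"RIFF" or data[8:12] != b"WAVE":
        raise ValueError("not a RIFF WAVE file")
    found = []
    for ckid, body in _chunks(data, 12, len(data)):
        if ckid != b"LIST" or body[:4] != b"adtl":
            continue
        # Walk the sub-chunks that follow the 4-byte list type.
        for sub_id, sub in _chunks(body, 4, len(body)):
            if sub_id in (b"labl", b"note") and len(sub) >= 4:
                # Assumed layout: cue point identifier, then the string.
                (cue_id,) = struct.unpack("<L", sub[:4])
                text = sub[4:].split(b"\x00", 1)[0].decode("latin-1")
                found.append((sub_id.decode("ascii"), cue_id, text))
    return found
```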

1.4 IFF/8SVX
------------

  This is an IFF chunk-based format with support for an Annotation
  chunk.

1.5 Creative Voice (VOC)
------------------------

  Has "data blocks" of various types, one of which is for text up to
  a length of 16K characters.

1.6 SampleVision
----------------

  Can contain up to 60 characters of text.

1.7 MPEG Audio (MP3)
--------------------

  The MPEG Audio format is a streaming data format containing minimal
  header information per frame. (The common term MP3 refers to the
  storage of MPEG Audio Layer III data). No textual data is stored in
  an MPEG Audio stream.

  However, a conventional annotation format known as ID3 exists for MP3
  data files. This allows for information including Songname, Artist, 
  and Album, each up to 30 characters, a Year (4 characters), and a
  Comment of up to 28 characters to be stored.
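
  Reading such a tag can be sketched as follows. The field widths are
  the ones listed above; the rest is assumed from the common ID3
  convention rather than stated in this document: the tag occupies the
  last 128 bytes of the file and starts with the marker "TAG", unused
  space is padded with zero bytes or spaces, and the three bytes after
  the comment are ignored here. The function name read_id3v1 is invented
  for illustration.

```python
from typing import Optional


def read_id3v1(data: bytes) -> Optional[dict]:
    """Return the ID3 fields found at the end of `data`, or None if absent.

    `data` is the whole MP3 file as bytes. Hypothetical helper.
    """
    tag = data[-128:]
    if len(tag) < 128 or tag[:3] != b"TAG":
        return None

    def text(raw: bytes) -> str:
        # Fields are fixed width; drop zero padding and trailing spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").rstrip()

    return {
        "songname": text(tag[3:33]),    # 30 characters
        "artist": text(tag[33:63]),     # 30 characters
        "album": text(tag[63:93]),      # 30 characters
        "year": text(tag[93:97]),       # 4 characters
        "comment": text(tag[97:125]),   # 28 characters
    }
```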

1.8 Ogg Vorbis
--------------

  Ogg Vorbis [5] is a free streaming audio data format. No textual data
  is stored in an Ogg Vorbis bitstream. No conventional annotation format
  exists, though it is likely that ID3 formatted annotations would be
  used precisely as they are with MP3 files.


2. Support for manipulating textual data in free sound file tools
=================================================================

2.1 Sox
-------

  Sox [6] is a command-line tool for playing, converting and applying
  effects to sound files. As of version 12.17 it does not allow the user
  to edit textual comments within a sound file; however, it does contain
  the code required for writing comments to various types of sound file.

2.2 Audio File Library
----------------------

  libaudiofile [7] is an implementation of SGI's Audio File Library API [8].
  The API provides a means for accessing "Miscellaneous Data Chunks", which
  can be of various types including copyright, author, name and annotation.
  As of version 0.1.10 libaudiofile supports miscellaneous data chunks for
  the AIFF-C format only.

  The 'sfinfo' tool which comes with libaudiofile is able to print out
  the Copyright chunk if present in a file. I am not aware of any other
  sound tools using libaudiofile which are able to manipulate Miscellaneous
  Data Chunks.

2.3 libsndfile
--------------

  libsndfile [9] is "a C library for reading and writing files containing
  sampled sound". As of version 0.0.21, it does not have the ability to
  manipulate textual comments in sound files.

2.4 ID3 manipulation tools
--------------------------

  Many tools exist which can manipulate ID3 tags. A number of these are
  listed at Freshmeat.net [10].


References
==========

[1] Audio File Formats FAQ, Section 11 "File Formats"
v4.0, 14 Nov 1998, Chris Bagwell <cbagwell sprynet com>
http://home.sprynet.com/~cbagwell/AudioFormats-11.html

[2] ISO C Standard: 4.14/2.2.4.2 Limits of integral types <limits.h>
See /usr/include/limits.h on most systems. Note that as we are talking
about file formats, these limits are _not_ processor specific. For
example, the <limits.h> shipped with GCC has a different definition for
ULONG_MAX on Alpha (64-bit) processors, which should be ignored when
reading these file format specs.

[3] AIFF-C specification, available at
http://home.sprynet.com/~cbagwell/aiff-c.txt
Apple Computer, Inc. 1991

[4] ftp://ftp.cwi.nl/pub/audio/RIFF-format, excerpted from "Multimedia
Programming Interface and Data Specification v1.0"

[5] Ogg Vorbis, http://www.xiph.org/ogg/vorbis/index.html
Xiphophorus

[6] SoX (Sound eXchange), http://home.sprynet.com/~cbagwell/sox.html
Chris Bagwell <cbagwell sprynet com>

[7] Audio File Library, http://www.68k.org/~michael/audiofile/
Michael Pruett <michael 68k org>

[8] Audio File Library specification, available at
http://ask.ii.uib.no/ebt-bin/nph-dweb/dynaweb/SGI_Developer/DMediaDev_PG/@Generic__BookTextView/11986
SGI

[9] libsndfile, http://www.zip.com.au/~erikd/libsndfile/
Erik de Castro Lopo <erikd zip com au>

[10] Freshmeat.net, http://freshmeat.net/search/?q=id3

