Re: Party Monster audio test
- From: Stefan Westerfeld <stefan space twc de>
- To: Tim Janik <timj gtk org>
- Cc: beast gnome org
- Subject: Re: Party Monster audio test
- Date: Thu, 25 May 2006 16:21:15 +0200
Hi!
On Thu, May 25, 2006 at 01:58:54PM +0200, Tim Janik wrote:
> On Thu, 25 May 2006, Stefan Westerfeld wrote:
> >Index: plugins/bsenoise.cc
> >===================================================================
> >RCS file: /cvs/gnome/beast/plugins/bsenoise.cc,v
> >retrieving revision 1.2
> >diff -u -p -r1.2 bsenoise.cc
> >--- plugins/bsenoise.cc 7 Mar 2005 07:40:39 -0000 1.2
> >+++ plugins/bsenoise.cc 24 May 2006 21:51:44 -0000
> >@@ -53,7 +53,8 @@ class Noise : public NoiseBase {
> > {
> > g_return_if_fail (n_values <= block_size()); /* paranoid */
> >
> >- ostream_set (OCHANNEL_NOISE_OUT, &(*noise_data)[rand() %
> >(noise_data->size() - n_values)]);
> >+ //ostream_set (OCHANNEL_NOISE_OUT, &(*noise_data)[rand() %
> >(noise_data->size() - n_values)]);
> >+ ostream_set (OCHANNEL_NOISE_OUT, &(*noise_data)[0]);
> > }
> > };
> >public:
> >@@ -66,8 +67,12 @@ public:
> > const int N_NOISE_BLOCKS = 20;
> > noise_data.resize (block_size() * N_NOISE_BLOCKS);
> >
> >+ GRand *random_generator = g_rand_new_with_seed (0xdeadbeaf);
> >+
> > for (vector<gfloat>::iterator ni = noise_data.begin(); ni !=
> > noise_data.end(); ni++)
> >- *ni = 1.0 - rand() / (0.5 * RAND_MAX); // FIXME: should have class
> >noise
> >+ *ni = g_rand_double_range (random_generator, -1, 1);
> >+
> >+ g_rand_free (random_generator);
> > }
>
> to do something like this, we need:
> - a new bse option --static-random (or similar name) to switch off real
> random
> numbers and use deterministic streams instead
> - a new framework that allowes to open up several new random data streams
> and get data from those
> - use seperate random data streams everywhere we could possibly need
> deterministic results at some point (i.e. bsenoise and davxtalstrings.c,
> possibly other modules)
> - possible future randomized behaviour (e.g.l implementation of a humanizer)
> will also have to honour --static-random
Agreed. I'd like to add that the g_random style functions are very
convenient (that is, for instance if you need a random value between -1
and 1 the code is much more readable when written with g_random than
when written without), so I'd suggest that the bse random streams should
have a similar API.
Of course I would be happy with a C++ API like
http://www.gtkmm.org/docs/glibmm-2.4/docs/reference/html/classGlib_1_1Rand.html
except of course that seeding should go away (the framework decides
whether to seed or not and how) and be replaced by a "rewind" or "reset"
function, which - if the --static-random option was given - resets the
random generator to an initial state.
> > void
> > reset1()
> >Index: plugins/davsyndrum.c
> >===================================================================
> >RCS file: /cvs/gnome/beast/plugins/davsyndrum.c,v
> >retrieving revision 1.29
> >diff -u -p -r1.29 davsyndrum.c
> >--- plugins/davsyndrum.c 16 May 2006 10:20:47 -0000 1.29
> >+++ plugins/davsyndrum.c 24 May 2006 21:51:44 -0000
> >@@ -259,7 +259,7 @@ dmod_process (BseModule *module,
> > /* trigger drum */
> > dmod_trigger (dmod,
> > freq_in ? BSE_FREQ_FROM_VALUE (freq_in[i]) :
> > dmod->params.freq,
> >- ratio_in ? ratio_in[1] : 1.0);
> >+ ratio_in ? ratio_in[i] : 1.0); /* FIXME: <- bug?
> >*/
> > spring_vel = dmod->spring_vel;
> > env = dmod->env;
> > freq_rad = dmod->freq_rad;
>
> could you be bothered to elabortate here?
> all i can gather from your mail is "bug?" and "I think I found a bug here,
> and the fix can be committed right away". i'm sorry, but for me that isn't
> particularly telling, so i'd apprechiate an explanation of why you think
> this is buggy (and possible investigation of source history to figure how/
> if a bug was introduced here). and why you think you know what the fix would
> be.
Well, the point is that we're receiving a trigger event here, and the
event is at position "i" in the stream. However, the code in the CVS
doesn't use ratio_in at position "i" to determine the strength of the
trigger event, but ratio_in at position "1". Thus it doesn't use the
right ratio associated with the trigger event (and possibly even
accesses memory beyond the buffer size, although accessing the "1"
element shouldn't be particularily bad).
When it comes to my patch, what is confusing is maybe that I left the
FIXME comment there. With "ratio_in[i]" (instead of "ratio_in[1]"), as
of after applying the patch, there is no problem.
> >Index: slowtests/audio/Makefile.am
> >===================================================================
> >RCS file: /cvs/gnome/beast/slowtests/audio/Makefile.am,v
> >retrieving revision 1.20
> >diff -u -p -r1.20 Makefile.am
> >--- slowtests/audio/Makefile.am 21 May 2006 18:04:12 -0000 1.20
> >+++ slowtests/audio/Makefile.am 24 May 2006 21:51:45 -0000
> >@@ -80,3 +80,13 @@ velocity-test:
> > $(BSEFEXTRACT) $(@F).wav --cut-zeros --channel 1 --avg-spectrum
> > --spectrum --avg-energy >> $(@F).tmp
> > $(BSEFCOMPARE) $(srcdir)/velocity.ref $(@F).tmp --threshold 99.99
> > rm -f $(@F).tmp $(@F).wav
> >+
> >+# the BEAST demo song
> >+FEATURE_TESTS += partymonster-test
> >+EXTRA_DIST += partymonster.ref
> >+partymonster-test:
> >+ $(BSE2WAV) $(top_srcdir)/library/demo/partymonster.bse $(@F).wav
> >+ $(BSEFEXTRACT) $(@F).wav --cut-zeros --channel 0 --avg-spectrum
> >--spectrum --avg-energy --end-time > $(@F).tmp
> >+ $(BSEFEXTRACT) $(@F).wav --cut-zeros --channel 1 --avg-spectrum
> >--spectrum --avg-energy --end-time >> $(@F).tmp
> >+ $(BSEFCOMPARE) $(srcdir)/partymonster.ref $(@F).tmp --threshold 99.99
> >+ rm -f $(@F).tmp $(@F).wav
>
> below, you're saying 99.99% is too tight to be matched in the presence of
> real noise. wouldn't something like 99.8% make sense here then, until
> something like --static-random is implemented?
Yes, it should work. I took 99.9% now, which should pass. Of course,
since we're relying on random data here, I can not guarantee that the
test will always pass, as for some obscure random sequence it may sound
more different than for those sequences I tested it with. But three test
runs give me:
average similarity rating: 99.951312% => good match.
average similarity rating: 99.952625% => good match.
average similarity rating: 99.955167% => good match.
Which doesn't look like there is much danger of this average value
dropping below 99.9%.
> >Index: tools/bsefextract.cc
> >===================================================================
> >RCS file: /cvs/gnome/beast/tools/bsefextract.cc,v
> >retrieving revision 1.28
> >diff -u -p -r1.28 bsefextract.cc
> >--- tools/bsefextract.cc 21 May 2006 01:24:11 -0000 1.28
> >+++ tools/bsefextract.cc 24 May 2006 21:51:54 -0000
> >@@ -176,16 +176,25 @@ struct Feature
> > const char *description;
> > bool extract_feature; /* did the user enable this feature
> > with --feature? */
> >
> >+ string
> >+ double_to_string (double data) const
> >+ {
> >+ char *x = g_strdup_printf ("%.17g", data);
> >+ string s = x;
> >+ g_free (x);
> >+ return s;
> >+ }
> >+
> > void print_value (const string& value_name, double data) const
> > {
> >- fprintf (options.output_file, "%s = %f;\n", value_name.c_str(), data);
> >+ fprintf (options.output_file, "%s = %s;\n", value_name.c_str(),
> >double_to_string (data).c_str());
> > }
> >
> > void print_vector (const string& vector_name, const vector<double>&
> > data) const
> > {
> > fprintf (options.output_file, "%s[%ld] = {", vector_name.c_str(),
> > data.size());
> > for (vector<double>::const_iterator di = data.begin(); di !=
> > data.end(); di++)
> >- fprintf (options.output_file, " %f", *di);
> >+ fprintf (options.output_file, " %s", double_to_string
> >(*di).c_str());
> > fprintf (options.output_file, " };\n");
> > }
> >
> >@@ -210,7 +219,7 @@ struct Feature
> > const vector<double>& line = *mi;
> >
> > for (vector<double>::const_iterator li = line.begin(); li !=
> > line.end(); li++)
> >- fprintf (options.output_file, " %f", *li);
> >+ fprintf (options.output_file, " %s", double_to_string
> >(*li).c_str());
> > fprintf (options.output_file, " }\n");
> > }
> > fprintf (options.output_file, "};\n");
>
> i think you can commit that right away.
Ok, in CVS now (with modifications, see below).
> >Some comments on the patch:
> >* eliminating random data in bsenoise.cc doesn't change the similarity
> > score too much, but is of course necessary for 100% matches
> >* davsyndrum.c: I think I found a bug here, and the fix can be
> > committed right away
> >* the main randomness problem really is in the davxtalstrings.c module,
> > after eliminating this source of random, my compare runs were much
> > more similar than before
> >
> >As for the last change, I think it might be useful to increase the
> >precision with which bsefextract outputs its extracted features. As a
> >user trying to read the feature file with an editor, I dislike using the
> >%.17g format, because its much harder to read than %f.
>
> i can't say i agree here. i much prefer 1.3e-69 (%g) over
> 0.0000000000000000000000000000000000000000000000000000000000000000000013
> (%f). and for figures with smaller exponent, %g equals %f anyway.
As for what bsefextract and bsefcompare do right now, such a small
number would usually be meaningless anyway; in a frequency spectrum for
instance, a frequency with this amplitude couldn't be percieved by
anybody.
So what I am saying is that for your number storing 0.0 would be
perfectly all right. But yes, planning for any possible future use, I
can't guarantee that we won't have doubles with such exponents which do
have a meaning (although right now I can't think of any), so %g is
probably superior.
> >The other thing
> >that I dislike about it as a casual user is that matrix features
> >(--spectrum) are not aligned any more, that is, the numbers of two
> >different lines no longer start at the same column.
>
> huh? but that can easily be fixed, have you tried something like %15.9g?
Ok, I use it now. Actually I use %-15.9g for the matrix and vector
formats and %.9g for the normal format; I think the files look somewhat
readable, although of course the lines became much longer.
> >However, as a developer, I see that %f and %.17g features are so
> >different that changing the bsefextract output from one to another
> >actually breaks some existing tests. This leads me to the conclusion
> >that %f's implicit quantization could have the same effect in some
> >real world scenarios (i.e. 0.000003 vs. 0.000002 in a spectrum feature
> >could bring similarity below the threshold, although the "real" distance
> >- the "double" distance before writing the feature file - between both
> >extracted features might be much smaller than 0.000001).
>
> %f can't be used to properly represent a double. at least not with much
> less than 600 character strings. that's why our serialization functions
> which can't allow for data loss are using:
> if (g_option_check (hints, "f")) /* float hint */
> gstring_puts (gstring, g_ascii_formatd (numbuf,
> G_ASCII_DTOSTR_BUF_SIZE, "%.7g", sfi_value_get_real (value)));
> else
> gstring_puts (gstring, g_ascii_formatd (numbuf,
> G_ASCII_DTOSTR_BUF_SIZE, "%.17g", sfi_value_get_real (value)));
> and why we have extra API to write out flots to data streams:
> sfi_wstore_putf();
> sfi_wstore_putd();
>
> speaking of which, you need to change your code to use those anyway,
> because currently the reference files will break in non-C locales.
Ok, I am using g_ascii_format_d now, as you suggested.
Cu... Stefan
--
Stefan Westerfeld, Hamburg/Germany, http://space.twc.de/~stefan
[
Date Prev][
Date Next] [
Thread Prev][
Thread Next]
[
Thread Index]
[
Date Index]
[
Author Index]