[Easytag-mailing] Easytag Major Performance Fix



Greetings,

I've patched src/base64.c. There was a major problem when it was being
called to decode very large buffers in orders of magnitude of a megabyte
or more (e.g. cover art in a FLAC / Vorbis / etc. tag is frequently this
size for some people). The base64_decode() routine had a cubic running
time, since every time the decode pointer shifted forward in the stream,
token_decode() would recompute the buffer length every time. e.g. If the
buffer length was n originally...

n = 10 -> 10(10 + 9 + 8 + ... + 1) = 100 + 90 + 80 + ... + 10

Where the 10 is example buffer length and the descending sequence are
the number of things token_decode() scale wise does on each iteration
within base64_decode() of the stream.

This is the same as...
n( n(n + 1) / 2) = (n^2 (n + 1) / 2) = (n^3 + n^2) / 2 ϵ Θ(n^3)

Before when I was loading a directory containing, say, some K-OS tunes
with album art in them it took almost 8 minutes to load a track that had
large album art in it. It takes less than a second or two now.

I've also made some minor changes in the interest of better portability.

Hope this helps.

PS Is there no subversion server? It seems weird that I'm generating a
diff on stuff in a tarball.

-- 
Kip Warner -- Software Developer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com
--- base64.c.upstream	2009-06-13 01:10:37.000000000 -0700
+++ base64.c	2009-06-13 01:07:13.000000000 -0700
@@ -104,9 +104,18 @@
 #include <string.h>
 #include "base64.h"
 
+/* Useful for encoding */
 static char base64_chars[] = 
     "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
 
+/* For constant time lookups */
+static int
+is_base64_char(char c)
+{
+    return (('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z') || 
+            ('0' <= c && c <= '9') || (c == '+' || c == '/'));
+}
+
 static int 
 pos(char c)
 {
@@ -156,16 +165,16 @@ base64_encode(const void *data, int size
     return strlen(s);
 }
 
-#define DECODE_ERROR 0xffffffff
+#define DECODE_ERROR ((unsigned) -1)
 
 static unsigned int
-token_decode(const char *token)
+token_decode(const char *token, const size_t size)
 {
     int i;
     unsigned int val = 0;
     int marker = 0;
     
-    if (strlen(token) < 4)
+    if (size < 4)
         return DECODE_ERROR;
     for (i = 0; i < 4; i++)
     {
@@ -185,13 +194,15 @@ token_decode(const char *token)
 int
 base64_decode(const char *str, void *data)
 {
-    const char *p;
-    unsigned char *q;
+    const char     *p;
+    unsigned char  *q;
+    unsigned int    size;
 
     q = data;
-    for (p = str; *p && (*p == '=' || strchr(base64_chars, *p)); p += 4)
+    size = strlen(str);
+    for (p = str; *p && (*p == '=' || is_base64_char(*p)); p += 4, size -= 4)
     {
-        unsigned int val = token_decode(p);
+        unsigned int val = token_decode(p, size);
         unsigned int marker = (val >> 24) & 0xff;
         
         if (val == DECODE_ERROR)
@@ -204,3 +215,4 @@ base64_decode(const char *str, void *dat
     }
     return q - (unsigned char *) data;
 }
+

Attachment: signature.asc
Description: This is a digitally signed message part



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]