[Easytag-mailing] Easytag Major Performance Fix



Hey list,

I am reposting this because I checked git source and it still hasn't
been merged, despite having been accepted since 2009-08-25. Also, the
debianization needs to be updated. The changelog is still for 2.1.4.
Whoever is maintaining the git repository, please apply this patch when
you get a chance.

I've patched src/base64.c. There was a major problem when it was being
called to decode very large buffers in orders of magnitude of a megabyte
or more (e.g. cover art in a FLAC / Vorbis / etc. tag is frequently this
size for some people). The base64_decode() routine had a cubic running
time, since every time the decode pointer shifted forward in the stream,
token_decode() would recompute the buffer length every time. e.g. If the
buffer length was n originally...

n = 10 -> 10(10 + 9 + 8 + ... + 1) = 100 + 90 + 80 + ... + 10

Where the 10 is example buffer length and the descending sequence are
the number of things token_decode() scale wise does on each iteration
within base64_decode() of the stream.

This is the same as...
n( n(n + 1) / 2) = (n^2 (n + 1) / 2) = (n^3 + n^2) / 2 ϵ Θ(n^3)

Before when I was loading a directory containing, say, some K-OS tunes
with album art in them it took almost 8 minutes to load a track that had
large album art in it. It takes less than a second or two now.

I've also made some minor changes in the interest of better portability.

-- 
Kip Warner -- Software Developer
OpenPGP encrypted/signed mail preferred
http://www.thevertigo.com

--- base64.c.upstream	2009-06-13 01:10:37.000000000 -0700
+++ base64.c	2009-06-13 01:07:13.000000000 -0700
@@ -104,9 +104,18 @@
 #include <string.h>
 #include "base64.h"
 
+/* Useful for encoding */
 static char base64_chars[] = 
     "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
 
+/* For constant time lookups */
+static int
+is_base64_char(char c)
+{
+    return (('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z') || 
+            ('0' <= c && c <= '9') || (c == '+' || c == '/'));
+}
+
 static int 
 pos(char c)
 {
@@ -156,16 +165,16 @@ base64_encode(const void *data, int size
     return strlen(s);
 }
 
-#define DECODE_ERROR 0xffffffff
+#define DECODE_ERROR ((unsigned) -1)
 
 static unsigned int
-token_decode(const char *token)
+token_decode(const char *token, const size_t size)
 {
     int i;
     unsigned int val = 0;
     int marker = 0;
     
-    if (strlen(token) < 4)
+    if (size < 4)
         return DECODE_ERROR;
     for (i = 0; i < 4; i++)
     {
@@ -185,13 +194,15 @@ token_decode(const char *token)
 int
 base64_decode(const char *str, void *data)
 {
-    const char *p;
-    unsigned char *q;
+    const char     *p;
+    unsigned char  *q;
+    unsigned int    size;
 
     q = data;
-    for (p = str; *p && (*p == '=' || strchr(base64_chars, *p)); p += 4)
+    size = strlen(str);
+    for (p = str; *p && (*p == '=' || is_base64_char(*p)); p += 4, size -= 4)
     {
-        unsigned int val = token_decode(p);
+        unsigned int val = token_decode(p, size);
         unsigned int marker = (val >> 24) & 0xff;
         
         if (val == DECODE_ERROR)
@@ -204,3 +215,4 @@ base64_decode(const char *str, void *dat
     }
     return q - (unsigned char *) data;
 }
+

Attachment: signature.asc
Description: This is a digitally signed message part



[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]