Fish slow uploads - Final patch



  Several days ago I have sent a patch which tried to solve too slow FISH uploads problem. The idea also has been discussed in RedHat bugzilla . Now the solution has changed, and it is the best variant which I could find.
 
  Full description of the issue:

  Midnight Commander has a FISH virtual filesystem (FIle transfer over SHell). There is one old problem in MC implementation of it -- uploads onto remote system.
  To do such an upload, MC should invoke on the remote side a command, which reads appropriate number of bytes from stdin and store them in the target file. Initially, `dd' command had been chosen for this purpose.

  The decision was such:
( dd bs=4096 count=<size/4096> ; dd bs=<size%4096> count=1 ) | ( cat 
>target_file ; cat >/dev/null )
  (Note: additional cat to /dev/null is needed to flush input on write errors).
  But unfortunately, this variant appeared unreliable. The design of `dd' does not assume full filling of input buffers on read (see, for example, "conv=sync" description in `dd' manual). The design of `ssh' is those that sometimes data can be written into pipe by portions of different size. Therefore the part of data could remain not read by `dd' ...
   In the current version, the problem is solved as:
dd bs=1 count=<size> | ..... 
i.e., `dd' reads input byte-by-byte. It is robast, but is very slow and grabs a lot of cpu time on the remote system.
                                                                              
  The better decision is possible.
  There is `head' command. It has `-c <number>' option, which does what we really need. The necessary amount of data is read from stdin reliably.
  Unfortunately, the remote system (any *NIX-like) may not have `head' command, or have another implementation of it. But for such systems we can still use `dd' .

  The first solution was:
( head -c <size> -q - || dd bs=1 count=<size> ) | ( cat >target_file ; 
cat >/dev/null )
  Either fast `head' is used, or `dd' as a fallback. Using `-q' and `-'  we separate GNU `head' from another `head' implementations, which might be unreliable (for example, AIX4 has `-c' option but add extra newline byte after EOF).

  But "dd bs=1" seems to be too slow, even for fallback...

  The better solution has been found:
1. First, try GNU `head', as the fastest variant:
 ( head -c %lu -q - ) | ( cat >/%s ; cat >/dev/null )
2. Second, use `dd'-based fallback:
  rest=%lu
  while [ $rest -gt 0 ]
  do
    cnt=`expr \( $rest + 255 \) / 256`
    n=`dd bs=256 count=$cnt | tee -a /%s | wc -c`
    rest=`expr $rest - $n`
  done
  Notes:
    - `bs=256' is used, because more large buffer does not give real effect.
    - we do not read tail (dd bs=<remaining> count=1) explicitly, because MC never send more bytes until receives a status code.
    - `expr' is used as most portable way for expressions.
  In ideal world, only one `dd' invokation is needed. Unfortunately, `dd'  can receive less bytes than `bs' is, and does not report exact byte number (only full/partial block counts are reported).
Therefore we use `tee' and then `wc' to count bytes properly.

   `dd'-based variant is fast enough, but using of `head' (when possible) can reduce remote cpu time in several times. Therefore it is better to try `head' first, anyway.

   The final script is as this:
  >/%s
  res=`exec 3>&1; ( head -c %lu -q - || echo DD >&3 ) | ( cat >/%s; cat >/dev/null)`
  [ "$res" = DD ] && {
     rest=%lu
     while [ $rest -gt 0 ]
     do
       cnt=`expr \( $rest + 255 \) / 256`
       n=`dd bs=256 count=$cnt | tee -a /%s | wc -c`
       rest=`expr $rest - $n`
     done
  }
  and similar for file append case, with some little changes.

  `echo DD' etc. looks ugly, but it is portable :-)

 
    I have tested both head and fallback successfully with remote host under Linux, and fallback with remote AIX4. I also checked GNU  and some old (1984) UNIX sources of `dd', `tee', `wc' -- all things should be OK.

-- 
		Dmitry Butskoj <buc AT odusz.so-cdu.ru>
		Saint-Petersburg, Russia
		Red Hat Certified Engineer 809003662809495
diff -Nrbu mc-4.6.1-20041201-pre1a/vfs/fish.c mc-4.6.1-20041201-OK/vfs/fish.c
--- mc-4.6.1-20041201-pre1a/vfs/fish.c	2004-12-08 16:19:25.000000000 +0300
+++ mc-4.6.1-20041201-OK/vfs/fish.c	2004-12-08 17:03:28.000000000 +0300
@@ -502,7 +502,31 @@
 	close (h);
 	ERRNOR (EIO, -1);
     }
-    /* Use this as stor: ( dd block ; dd smallblock ) | ( cat > file; cat > /dev/null ) */
+
+    /* First, try this as stor:
+     *
+     *     ( head -c number ) | ( cat > file; cat >/dev/null )
+     *
+     *  If `head' is not present on the remote system, `dd' will be used.
+     * Unfortunately, we cannot trust most non-GNU `head' implementations
+     * even if `-c' options is supported. Therefore, we separate GNU head
+     * (and other modern heads?) using `-q' and `-' . This causes another
+     * implementations to fail (because of "incorrect options").
+     *
+     *	Fallback is:
+     *
+     *	   rest=<number>
+     *	   while [ $rest -gt 0 ]
+     *	   do
+     *	      cnt=`expr \( $rest + 255 \) / 256`
+     *	      n=`dd bs=256 count=$cnt | tee -a <target_file> | wc -c`
+     *	      rest=`expr $rest - $n`
+     *	   done
+     *
+     *	`dd' was not designed for full filling of input buffers,
+     *	and does not report exact number of bytes (not blocks).
+     *	Therefore a more complex shell script is needed.
+     */
 
     print_vfs_message(_("fish: store %s: sending command..."), name );
     quoted_name = name_quote (name, 0);
@@ -513,25 +537,45 @@
 		 "#STOR %lu /%s\n"
 		 "> /%s\n"
 		 "echo '### 001'\n"
+                 "res=`exec 3>&1\n"
 		 "(\n"
-		   "dd bs=1 count=%lu\n"
+		   "head -c %lu -q - || echo DD >&3\n"
 		 ") 2>/dev/null | (\n"
 		   "cat > /%s\n"
 		   "cat > /dev/null\n"
-		 "); echo '### 200'\n",
+		 ")`; [ \"$res\" = DD ] && {\n"
+			"rest=%lu\n"
+			"while [ $rest -gt 0 ]\n"
+			"do\n"
+			"    cnt=`expr \\( $rest + 255 \\) / 256`\n"
+			"    n=`dd bs=256 count=$cnt | tee -a /%s | wc -c`\n"
+			"    rest=`expr $rest - $n`\n"
+			"done\n"
+		 "}; echo '### 200'\n",
 		 (unsigned long) s.st_size, name, quoted_name,
+		 (unsigned long) s.st_size, quoted_name,
 		 (unsigned long) s.st_size, quoted_name);
     else
 	n = fish_command (me, super, WAIT_REPLY,
 		 "#STOR %lu /%s\n"
 		 "echo '### 001'\n"
+		 "res=`exec 3>&1\n"
 		 "(\n"
-		   "dd bs=1 count=%lu\n"
+		   "head -c %lu -q - || echo DD >&3\n"
 		 ") 2>/dev/null | (\n"
 		   "cat >> /%s\n"
 		   "cat > /dev/null\n"
-		 "); echo '### 200'\n",
+		 ")`; [ \"$res\" = DD ] && {\n"
+			"rest=%lu\n"
+			"while [ $rest -gt 0 ]\n"
+			"do\n"
+			"    cnt=`expr \\( $rest + 255 \\) / 256`\n"
+			"    n=`dd bs=256 count=$cnt | tee -a /%s | wc -c`\n"
+			"    rest=`expr $rest - $n`\n"
+			"done\n"
+		 "}; echo '### 200'\n",
 		 (unsigned long) s.st_size, name,
+		 (unsigned long) s.st_size, quoted_name,
 		 (unsigned long) s.st_size, quoted_name);
 
     g_free (quoted_name);


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]