Re: [PATCH] Add --disable-fsync option to both commit and pull (non-local) commands.



Thanks a lot for looking at this and for the patch!

There are clear uses for disabling fsync().  I wrote up some of
that rationale in the commit message and tweaked the whitespace a bit:
https://git.gnome.org/browse/ostree/commit/?id=f22fa92aef0cf7334d09addacfdfc70c9f10e075

We should add a configuration file option for this somewhere, right?
Otherwise people doing "ostree admin upgrade" (or via a higher level
tool like "rpm-ostree upgrade", or an eventual PackageKit backend)
won't have a way to set this.  (Or even if we made it a command
line option, it'd be annoying to re-specify).

We have per-remote configs now, perhaps a disable-fsync=true
option?  Or avoid double negatives and fsync=false?
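
To make the per-remote idea concrete, here is a rough sketch of how a tool might read such a setting. The `fsync` key is only the option proposed above, not an existing ostree setting, and the remote names, URLs, and the helper `remote_wants_fsync` are made up for illustration.

```python
import configparser

# Hypothetical repo config. The "fsync" key is the proposed option,
# not something ostree is known to support.
SAMPLE = """
[remote "origin"]
url=https://example.com/repo
fsync=false

[remote "other"]
url=https://example.com/other
"""

def remote_wants_fsync(config_text, remote):
    """Return True unless the named remote explicitly sets fsync=false."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = 'remote "%s"' % remote
    # Default to True so a missing key (or section) keeps the safe behaviour.
    return parser.getboolean(section, "fsync", fallback=True)
```

With the positive spelling, an absent key simply means "fsync as usual", which avoids the double negative of disable-fsync=false.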

Below are my timings.  My laptop has a fairly high-end
Samsung 840 Pro 512GB SSD, with the RHEL7 defaults of
XFS on LVM.  No thin provisioning.

  ostree default (fsync after every write):
./write-files.py sync = ~7.5 minutes

real    0m45.190s
user    0m0.703s
sys     0m3.028s
 
  no fsync, similar to --disable-fsync
./write-files.py fast = 0.3 of a second (yes)

real    0m0.484s
user    0m0.149s
sys     0m0.335s

  batch fsync calls up into groups of 222
./write-files.py gsync = ~7:30

real    0m38.885s
user    0m0.517s
sys     0m2.114s
 
  batch fsync calls up into groups of 222, and dir sync first
./write-files.py dsync = ~7:30

real    0m38.830s
user    0m0.461s
sys     0m2.137s

./write-files.py psync = ~6 seconds

real    0m7.926s
user    0m0.866s
sys     0m9.704s

...git pull/commit defaults to the --disable-fsync behaviour (this
should be obvious from just timing it ;). However, while the latter is
still ~18x slower than no fsync, it's a huge improvement and might be
possible to implement if we need the fsyncs to stay around by default.

I think it'd very much be worth investigating how to improve the
fsync path performance.  Hmm.  This is a good thread:

http://oldblog.antirez.com/post/fsync-different-thread-useless.html

There's also of course:
http://shaver.off.net/diary/2008/05/25/fsyncers-and-curveballs/

Your "async_num" is an interesting approach.  It's implementing
something similar to what Firefox ended up doing where they
have an internal queue and only call fsync() after a certain
amount of change.

Basically we don't need to fsync() immediately after you add a bookmark.
But we do want to eventually.  After you add a hundred?  After 5
minutes?
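
A minimal sketch of the "queue up changes, fsync after N of them" idea, to make the trade-off concrete. This is not the patch's "async_num" code or Firefox's implementation; the class name `BatchedSyncWriter` and the default batch size are assumptions for illustration.

```python
import os

class BatchedSyncWriter:
    """Write files, deferring fsync() until `batch_size` files are pending."""

    def __init__(self, batch_size=100):
        self.batch_size = batch_size
        self.pending = []     # paths written but not yet fsync'd
        self.sync_calls = 0   # number of batch flushes, for inspection

    def write(self, path, data):
        with open(path, "wb") as f:
            f.write(data)
        self.pending.append(path)
        if len(self.pending) >= self.batch_size:
            self.flush()

    def flush(self):
        """fsync every pending file, then empty the queue."""
        if not self.pending:
            return
        for path in self.pending:
            fd = os.open(path, os.O_RDWR)
            try:
                os.fsync(fd)
            finally:
                os.close(fd)
        self.pending.clear()
        self.sync_calls += 1
```

A caller would still need to call flush() at the end of a pull or commit (or on a timer) so the tail of the queue is not left unsynced, which is exactly the "after a hundred?  After 5 minutes?" question.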

The other option is to implement a custom journal - record which objects
have been sync'd, and on recovering from a crash, delete and re-fetch
the rest.  Or re-checksum, as you suggested.
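
Roughly what I have in mind for the journal, as a sketch only. The one-name-per-line journal format and the helpers `record_synced` and `recover` are hypothetical, and it assumes the journal file lives outside the objects directory and that the caller has already fsync'd the objects before recording them.

```python
import os

def record_synced(journal_path, names):
    """Append object names to the journal and fsync the journal itself.

    Assumes the named objects were already fsync'd by the caller.
    """
    with open(journal_path, "a") as j:
        for name in names:
            j.write(name + "\n")
        j.flush()
        os.fsync(j.fileno())

def recover(objects_dir, journal_path):
    """Delete objects not listed in the journal; return the removed names."""
    synced = set()
    if os.path.exists(journal_path):
        with open(journal_path) as j:
            # A final line without a newline is treated as a torn write.
            synced = {line.rstrip("\n") for line in j if line.endswith("\n")}
    removed = []
    for name in sorted(os.listdir(objects_dir)):
        if name not in synced:
            os.remove(os.path.join(objects_dir, name))
            removed.append(name)
    return removed
```

The names returned by recover() are the ones that would need to be re-fetched.  The re-checksum variant would verify those objects instead of deleting them.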




