Alexander Larsson wrote:
> On Sat, 2009-03-14 at 19:21 +0100, Alexander Larsson wrote:
>> We all understand that it is per-spec to not guarantee data-before-metadata on rename, we're not stupid and able to read a manpage as well as you. But we still think it's a bad idea and not a sign of robust software.
>
> Additionally, I'm not saying glib should not fsync in the rename case. We should of course follow the spec so that glib apps don't lose data on such filesystems. But that doesn't make such filesystem behaviour a good idea.

Let's be clear about what you think is a bad idea. You think it's a bad idea for a file system to optimize a situation that is legal under the specification, but not well known by application developers. You think that because it is commonly done, the operating system and/or file system should guess what your intent is, and disable the optimization for this situation.

Let's take this away from file systems for a second and look at a similar situation. A lot of Java designers still don't know about the Java memory model, and how changes to a variable in one thread may not be visible to another thread, or, if visible, may not appear in the same order unless synchronization primitives are used. This is the exact same situation. It's an allowance in the specification that is not well known by application developers, and that requires careful use of synchronization primitives in order to function reliably and portably. fsync() is a synchronization primitive.

In case people don't like Java here: similar things can happen in C, which is why compiler instructions like 'volatile' become necessary. It's a newbie mistake of sorts.

Here's another one: people who don't check the return result of close()/fclose(). I bet if you look through many of the people who do close()/rename(), you'll find that a lot of them don't check the result of close() before they call rename(). I note that the glib/gio code referenced in the patch properly checks the result of fclose().
It's very common for applications to not check this. Will you also say that any file system where the last write() succeeds, but the final close() fails, is somehow broken? You used the argument that if every application must do it, shouldn't the file system implement it? Well, why doesn't this apply to close()? Why stop at rename()?

I'm not calling you stupid or saying you can't read a manpage. I am saying that any expectation on your part that file systems everywhere will one day universally implement your particular expectations is unreasonable, and not everybody agrees with you that the behaviour is wrong. I don't agree with you. The specification does not agree with you.

I think it is absolutely necessary to call fsync() explicitly in this situation, because the application needs to assert that all data is written *before* using rename to perform the atomic-change-in-place effect. I think that anybody who thinks fsync() is unnecessary is failing to see the principle that fsync() exists solely for the purpose of guaranteeing this state, and that if you think fsync() should be unnecessary here, you should also think fsync() should be unnecessary anywhere else. Why have an fsync() at all? Why shouldn't all operations be synchronous by nature? Change the specification to force all I/O operations to be ordered; that way no application developer will ever have to be surprised or ever call a synchronization primitive again. Right?

I value asynchronous operations, and see synchronization primitives as a necessary evil that makes asynchronous operations possible. write() and close() never promise to store content to disk. rename() has nothing to do with content. The only way to guarantee the operation is safe is to call fsync() before close(). Relying on the file system to guess your intent is unreasonable.

Cheers,
mark

-- 
Mark Mielke <mark mielke cc>
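
For concreteness, the sequence argued for above (write the new contents to a temporary file, fsync(), check the result of close(), then rename() over the original) might be sketched in C roughly as follows. This is only an illustration under stated assumptions: replace_file() and its parameters are hypothetical names, not taken from glib/gio or the patch under discussion, and the temporary file is assumed to live on the same file system as the target.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: replace `path` with `len` bytes from `data`,
 * going through `tmp_path`. Returns 0 on success, -1 on failure. */
int replace_file(const char *path, const char *tmp_path,
                 const void *data, size_t len)
{
    int fd = open(tmp_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    /* write() may be short or interrupted, so loop until done. */
    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = write(fd, p, left);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            goto fail;
        }
        p += n;
        left -= (size_t)n;
    }

    /* The synchronization primitive: ask for the data to reach
     * stable storage before the name is switched over. */
    if (fsync(fd) != 0)
        goto fail;

    /* close() can fail too; check it before calling rename(). */
    if (close(fd) != 0) {
        fd = -1;
        goto fail;
    }
    fd = -1;

    /* Only now swap the new file into place. */
    if (rename(tmp_path, path) != 0)
        goto fail;
    return 0;

fail:
    if (fd >= 0)
        close(fd);
    unlink(tmp_path);   /* best-effort cleanup of the temporary file */
    return -1;
}
```

A caller would use it as replace_file("settings.conf", "settings.conf.tmp", buf, buflen) and treat -1 as "the old file is still in place". The sketch does not fsync() the containing directory, so it makes no claim about when the rename itself becomes durable.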