Re: [Deja-dup-hackers] [Duplicity-talk] (Feature Request) inotify integration?
- From: Aaron Whitehouse <lists whitehouse org nz>
- To: kenneth loafman com, duplicity-talk nongnu org, deja-dup-hackers lists launchpad net
- Subject: Re: [Deja-dup-hackers] [Duplicity-talk] (Feature Request) inotify integration?
- Date: Wed, 03 Feb 2010 21:51:21 +1300
Jenn Hulley-Miller wrote:
>> Is it possible to provide duplicity with a list of changed files
>> [...]built via linux's inotify subsystem.
> I have looked at this before and it would be a good solution for
> detecting changed files. [...]
> Please enter a request on the Launchpad site and we'll look at
> integrating this and retiring the old scanning method.
I have been giving this a lot of thought over the last few days and am
cross-posting it to the Deja Dup list, as it is relevant to them as well.
I apologise that it has become longer than intended!
Conceptually, inotify (http://en.wikipedia.org/wiki/Inotify) is a dream
addition to a backup program. As Jenn says, scanning for changes is
often the vast majority of the work for an incremental backup.
TimeVault talks about the move to event-driven backup here:
"A snapshot delay is set for each directory to be watched (say, 1 min.).
When a file changes in that directory, a snapshot is taken after the
configured delay time. If the file changes before the snapshot is taken,
then the timer is reset repeatedly until you're done fiddling with the
file, or some other specified time runs out (say, 60 min.) and a
snapshot is forced. This mini-snapshot is called a delta. It only
involves the file in question and so should be relatively fast (my
benchmarks put it at around 50-150ms for a 1MB file, less for smaller
files), and can be done in the background."
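The timer logic in that description can be sketched roughly as below. This is only an illustration of the quoted behaviour, not TimeVault's code: the class name SnapshotTimer, its method names, and the default values (60 s delay, 3600 s maximum wait) are assumptions chosen to match the "1 min." and "60 min." examples in the quote.

```python
class SnapshotTimer:
    """Decides when a changed file is due for a snapshot (hypothetical helper)."""

    def __init__(self, delay=60.0, max_wait=3600.0):
        self.delay = delay        # quiet period required after the last change (seconds)
        self.max_wait = max_wait  # a snapshot is forced this long after the first change
        self._first = {}          # path -> time of the first change not yet snapshotted
        self._last = {}           # path -> time of the most recent change

    def file_changed(self, path, now):
        """Record a change event; every new event restarts the quiet period."""
        self._first.setdefault(path, now)
        self._last[path] = now

    def due(self, now):
        """Return (and forget) the paths that should be snapshotted at time `now`."""
        ready = []
        for path, last in self._last.items():
            quiet = now - last >= self.delay
            forced = now - self._first[path] >= self.max_wait
            if quiet or forced:
                ready.append(path)
        for path in ready:
            del self._first[path]
            del self._last[path]
        return sorted(ready)
```

A daemon would call file_changed() from its inotify event handler and poll due() periodically, passing the current time in both cases.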
This really could change the way that backups are done, as the notion of
"scheduling" backups would become redundant if we could rely on events
showing that files had changed.
However, if duplicity moved to inotify and it failed to notify the
backup daemon that a file had changed, that file would never get backed
up (unless it was later changed again and noticed). The key ways in
which I can see it failing are:
(a) if files to be backed up are on removable drives and changed on a
different computer;
(b) if files to be backed up are on a partition modified by another OS
install on the same computer (for example, one of our computers
dual-boots Windows and Deja Dup monitors one folder on the Windows
partition);
(c) if we hit some bug in inotify (I have no idea how reliable inotify
is in cases, say, of power cuts or files changed over Samba etc.).
The best generic option that I have been able to come up with would be
to use inotify for changes to partitions used by the system (/, /home
etc.) and to still do a system scan for removable drives and other
partitions. This still would not solve the use-case of someone who
mounts and edits their /home partition files from another (say,
GNU/Linux) install on the same machine.
I would guess that inotify would be a better way to determine changes in
99% of cases for 99% of users, but a backup program really needs to
cater for those 1% cases. If there are times that we cannot trust
inotify, then duplicity really needs to be doing a system scan -- at
least occasionally to pick up anything missed. Potentially duplicity
could run scans (as it does now) and compare the results to inotify
results for the first x backups to try and profile the system and user.
If inotify is often wrong, then full scans could be done each time; if
inotify always matches the scan, then duplicity could rely on it, with a
full scan every y backups/months.
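A minimal sketch of that profiling idea follows. Nothing here is duplicity code: the names InotifyTrust and reconcile, the default values for x (trial_backups) and y (rescan_every), and the rule that a single missed file is enough to stop trusting inotify are all assumptions made for illustration.

```python
def reconcile(scan_changed, inotify_changed):
    """Return the files a full scan found changed that inotify never reported."""
    return set(scan_changed) - set(inotify_changed)


class InotifyTrust:
    """Tracks whether inotify has agreed with full scans so far (hypothetical)."""

    def __init__(self, trial_backups=10, rescan_every=20):
        self.trial_backups = trial_backups  # "x": backups that always run a full scan
        self.rescan_every = rescan_every    # "y": scan-free backups allowed in a row
        self.backups_done = 0
        self.misses = 0                     # scans that caught something inotify missed
        self.since_scan = 0                 # backups since the last full scan

    def needs_full_scan(self):
        if self.backups_done < self.trial_backups:
            return True                     # still profiling the system
        if self.misses > 0:
            return True                     # assumption: one miss means keep scanning
        return self.since_scan >= self.rescan_every

    def record_backup(self, scan_changed=None, inotify_changed=()):
        """Record one backup. Pass scan_changed=None if no full scan was run."""
        self.backups_done += 1
        if scan_changed is None:
            self.since_scan += 1
            return set()
        self.since_scan = 0
        missed = reconcile(scan_changed, inotify_changed)
        if missed:
            self.misses += 1
        return missed
```

A real implementation would probably want a softer rule than "one miss", and would need to decide whether y counts backups or elapsed time.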
A different approach, to obtain some of the benefit of inotify with no
risk, would be to follow TimeVault's lead and instantly (after some
delay to prevent 100 versions of a file while you are editing it) back up
files noticed by inotify. This would be better than the current
approach as files would always be backed up, even if they changed
between scheduled backups. However, the current approach to scheduled
backups (full scans) could be maintained and this would mop up anything
missed by inotify -- though for 99% of users this scan would likely
yield no changed files.
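The mop-up step could look something like the sketch below. It assumes, purely for illustration, that the daemon keeps a dict mapping each path to the modification time it last backed up; duplicity's real scan compares against its own stored metadata, and the function name scheduled_mop_up is made up.

```python
import os


def scheduled_mop_up(candidates, backed_up_mtimes):
    """Return the candidate paths whose current mtime was not already backed up.

    `backed_up_mtimes` maps path -> st_mtime_ns recorded at the last backup of
    that path, whether it was triggered by inotify or by an earlier scan.
    """
    missed = []
    for path in candidates:
        try:
            mtime = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            continue  # deletions would need separate handling
        if backed_up_mtimes.get(path) != mtime:
            missed.append(path)
    return missed
```

If inotify caught every change, this returns an empty list and the scheduled run has nothing to upload, which is the "99% of users" case described above.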
I think it would be great if we could come up with an option that would
be best in nearly all cases, rather than adding additional configuration
options. This is something that a lot of users would not be able to
make an informed decision about.
I thought it was worth discussing these issues on the list, so that the
best approach can be embodied in a useful feature request.
Thanks again to all who have made such an excellent program.