Re: [Evolution] Errors receiving mail via POP

On Thu, 2014-09-11 at 02:16 +0200, Ángel González wrote:
Patrick O'Callaghan wrote:
Any distributed system is subject to partial failure, i.e. one part
stops working or loses communication with another part. If data is in
the process of being moved between the two parts when that happens, the
designer has to decide between allowing data to be lost and opening the
possibility of duplicating it. Systems which aim for reliability
invariably choose the latter option, because the mess can always be
cleaned up later. Sometimes the cleanup is automatic (e.g. transactional
databases, many remote filesystems) and sometimes it's "manual" (i.e.
visible to the end user). AFAIK all email systems fall into the latter
category because a) it's not that big a deal, and b) doing it
automatically would be complicated and could introduce other errors.
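
As a rough illustration of the trade-off described above, here is a minimal Python sketch of "duplicate rather than lose, clean up later". Everything in it is an assumption made for the example: the mailbox is a plain list, the function names `deliver_at_least_once` and `dedupe` are hypothetical, and lost acknowledgements are simulated by a counter rather than by a real network.

```python
def deliver_at_least_once(message, mailbox, ack_lost_first_n):
    """Resend until an acknowledgement arrives (hypothetical helper).

    The receiver stores the message on every attempt, so each lost
    acknowledgement leaves one extra copy behind.
    """
    attempts = 0
    while True:
        mailbox.append(message)            # receiver stores the message
        attempts += 1
        if attempts > ack_lost_first_n:    # this time the ack gets through
            return attempts


def dedupe(mailbox):
    """The later cleanup: keep the first copy of each Message-ID."""
    seen = set()
    unique = []
    for msg in mailbox:
        if msg["Message-ID"] not in seen:
            seen.add(msg["Message-ID"])
            unique.append(msg)
    return unique


mailbox = []
msg = {"Message-ID": "<1@example.org>", "Subject": "hello"}
deliver_at_least_once(msg, mailbox, ack_lost_first_n=2)
cleaned = dedupe(mailbox)
```

Nothing is lost in this sketch, but the mailbox holds three copies until `dedupe` runs. In an email client that cleanup step is typically the user deleting the extra copies by hand.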


It's not hard to think of an IMAP server where the client is moving the
emails using MOVE, the server is storing the files in maildir and thus
it simply performs a rename() and the underlying filesystem is journaled
(nothing fancy, just ext3 would do) and makes rename(2) atomic even in
case of a server crash.
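
The single-server case described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the folder layout loosely imitates maildir, the function name `move_message` and the file name are made up, and the atomicity depends on both folders living on the same filesystem.

```python
import os
import tempfile


def move_message(root, src_folder, dst_folder, filename):
    """Move one message file between two maildir-style folders (hypothetical helper)."""
    src = os.path.join(root, src_folder, "cur", filename)
    dst = os.path.join(root, dst_folder, "cur", filename)
    # os.rename() maps to rename(2) on POSIX systems; it is atomic only
    # when source and destination are on the same filesystem.
    os.rename(src, dst)
    return dst


# Build a throwaway two-folder store in a temporary directory.
root = tempfile.mkdtemp()
for folder in ("INBOX", "Archive"):
    os.makedirs(os.path.join(root, folder, "cur"))

name = "1410394560.M1P1.host"  # made-up message file name
original = os.path.join(root, "INBOX", "cur", name)
with open(original, "w") as f:
    f.write("Subject: hello\n\nbody\n")

moved = move_message(root, "INBOX", "Archive", name)
```

After the call the message exists in exactly one folder, which is the property the rename gives you for free on a single machine.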

Sometimes it's not hard to make things that work properly. But then
nobody would notice :)

It's relatively easy to make a single system withstand crashes up to a
given level of severity (we're excluding hardware failure, terrorist
attacks and meteor strikes here). Now make it work when the client
crashes, when the client and server are disconnected for an
indeterminate time, when the user is moving mail between accounts held
on different servers using different backend implementations with no
central administration and which of course may become disconnected from
each other, ...

In other words, on the Internet as it actually is.
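
To make the multi-server case concrete, here is a small Python sketch of a move between two independent stores done as copy-then-delete. It is a simplification, not how any particular client does it: the two servers are plain dicts keyed by Message-ID, the name `move_between_servers` is hypothetical, and the client crash is simulated with an exception.

```python
class CrashBeforeDelete(Exception):
    """Simulates the client dying between the copy and the delete."""


def move_between_servers(msg_id, source, destination, crash_after_copy=False):
    """Move a message by copying first and deleting second (hypothetical helper).

    Copy-then-delete can never lose the message; the price is that an
    interruption between the two steps leaves a copy on both sides.
    """
    destination[msg_id] = source[msg_id]   # step 1: copy to the other server
    if crash_after_copy:
        raise CrashBeforeDelete(msg_id)
    del source[msg_id]                     # step 2: delete the original


source = {"<1@example.org>": "body"}
destination = {}

try:
    move_between_servers("<1@example.org>", source, destination,
                         crash_after_copy=True)
except CrashBeforeDelete:
    pass

# The message now exists on both servers: duplicated, not lost.
duplicated = "<1@example.org>" in source and "<1@example.org>" in destination

# Retrying finishes the move. The retry is harmless here only because the
# dict overwrites by key; a real server may simply store a second copy.
move_between_servers("<1@example.org>", source, destination)
```

Reversing the two steps would avoid the duplicate, but a crash in the middle would then lose the message outright.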

It's useful to remind ourselves that no service on the Internet
guarantees absolute reliability. We tend to forget this because on the
whole it is incredibly reliable for something so complex. A large part
of that is due to it not being over-designed, and to the notion that we
can live with the occasional error because we'll notice it and do
something to correct it. I recommend Saltzer's seminal 1984 paper
"End-to-End Arguments in System Design" (

