Re: (missing) pre-up and pre-down



On Wed, 2009-08-05 at 11:30 +0100, Marc Herbert wrote:
> Dan Williams wrote:
> > 
> > There are two reasons I've not yet added pre-up and pre-down.  They are:
> > 
> > 2) appropriateness
> 
> Hmmm, the good old "just do not do this" answer... the best answer to
> any feature request ever ;-) Especially to people who have been using
> this feature for ages and are suddenly deprived of it.

Please note I didn't say *all* uses were inappropriate.  Just that
having done something the same way forever doesn't *necessarily* mean
that it should always be done that way until the end of time.

> 
> >     b) by the time any pre-down script will run, often the connection
> > has already gone away (the AP is out of range, the cable has been
> > unplugged already, etc) so any operation a pre-down script does *cannot*
> > depend on the interface being up; it must gracefully fail.  Common
> > things people wanted to do here included unmounting network shares;
> > but since the script must always handle unexpected disconnects (which
> > not all network file systems do well), you might as well just run this
> > from post-down anyway.
> 
> I think "pre-down" cleanup scripts could (should?) simply NOT be run on
> unexpected disconnects (as opposed to explicit disconnection
> requests). Simply because they are called PRE-down, not AT-down.

I did think about this a lot while composing the mail, and couldn't come
up with a good reason to not run pre-down scripts on unexpected
disconnect.  I don't really care either way.
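To make the "must gracefully fail" point concrete, here is a minimal
sketch of a cleanup helper that a pre-down or post-down script could
call.  It is an illustration only, not anything NetworkManager ships:
the function name, the `run` parameter, and the 5 second timeout are all
assumptions.  It tries a normal `umount`, falls back to a lazy unmount
(`umount -l`) if the server can no longer be reached, and never raises
or blocks indefinitely.

```python
import subprocess


def unmount_share(mount_point, timeout=5.0, run=subprocess.run):
    """Best-effort unmount that tolerates the network already being gone.

    Hypothetical helper: returns True if an unmount succeeded, False
    otherwise. It never raises and never waits longer than about
    2 * timeout seconds in total.
    """
    # Try a normal unmount first, then a lazy one (-l) for the case
    # where the file server is unreachable and a plain umount would fail.
    attempts = (
        ["umount", mount_point],
        ["umount", "-l", mount_point],
    )
    for cmd in attempts:
        try:
            result = run(cmd, timeout=timeout, capture_output=True)
        except (subprocess.TimeoutExpired, OSError):
            # Hung command or missing umount binary: try the next attempt.
            continue
        if result.returncode == 0:
            return True
    return False
```

Because the helper behaves the same whether or not the interface is
still up, it does not matter much whether it runs before or after the
connection goes away.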

> 
> 
> > Basically, allowing arbitrary "pre-up" and "pre-down" scripts opens the
> > door to more bug reports and requires more diagnostics when stuff goes
> > wrong.  Thus, the requirement to *do it right* and ensure that when
> > somebody writes these scripts incorrectly, that the user does not suffer
> > the consequences, and that the guilty party (the script) is correctly
> > identified.
> > 
> > And, as always happens with timeouts, somebody will inevitably ask for
> > the timeout to be extended because "my use-case just takes a second
> > longer" without thinking about the actual impact of their request for
> > everyone else (e.g. DHCP timeouts), and without fixing the actual root
> > cause of why they need a longer timeout.  That means yet more time spent
> > writing mails and replying to bug reports.
> 
> This looks like a storm in a teacup... there is an infinitely simpler
> solution: just blame the actual culprit. Implement pre-up and pre-down without
> any parallel execution or timeouts, or anything complicated. Dead
> simple, except for this: when any script hangs for more than one
> second, just hang with it, and print its name prefixed with "ERROR
> FROM:" capital letters all over the place: in the logs, in pop-ups,
> etc. Then trust me, not you but the author of this script will receive
> the bug reports and the angry emails.

Haha.  Wrong.  That's actually not the way it works.  Users don't
actually care, they just want stuff to work.  Most of the bug reports
will go to me, *not* to the actual script author.  Time and again,
that's the way it happens.

I get bug reports for things that Ubuntu custom-patched NetworkManager
to do (back in 0.6 days with "managed" versus "roaming" mode).  I get
bug reports for binary wifi drivers that aren't in the upstream kernel
that have never been guaranteed to work.  I get bug reports for problems
with NTP because I have something to do with networking.  I get bugs
from people who misconfigure DHCP servers and expect everything to work
just fine.

By and large, users don't actually investigate the real source of the
problem.  They leave that to me, or to the distributor.  And I can see
why, they often don't have the expertise or the knowledge or the time to
figure out where the problem really is.

> And to even further reduce the chance of receiving bug reports you can
> also make this "pre-" feature disabled by default and flag it (again) as
> "experimental" in the logs every time NM starts with it explicitly
> enabled.

Again, that doesn't actually help.  People often find advice on forums
and just turn stuff on blindly.  This also happens quite often in my
experience, more often than you may expect.  If you add an option to a
program, you *have* to expect that people will use it.  And you have to
account for that.

> Then you can always plan to implement fancy parallel execution and
> configurable timeouts later in the long term, but at least knowledgeable
> people recently deprived of pre-up and pre-down have another solution
> than dumping NetworkManager and using something else (which admittedly
> does reduce the amount of feedback you get...)
> 
> 
> By the way, speaking of reducing the flow of bug reports and angry
> emails, the current approach of not providing the full set of features
> and transparency of the tools NM is meant to replace does not seem to
> work very well either :-) The distributions are probably more to blame
> than NM on this (by rushing things through the door as they usually do),
> but well, it seems the angriness unfortunately trickles down here,
> doesn't it?

That depends on the people that want these features; they are fewer in
number than the total number of NetworkManager users, and they are often
more knowledgeable about the system as a whole, and we can figure out some
solution or workaround.  That's not the case if some feature gets in and
tons of people start using it, regardless of whether they know they
are using it or not.

All I'm saying is that whenever somebody requests a feature, there are
often more considerations in adding that feature than the requestor
thinks about.  Often, they are thinking of only their own use-case, and
not how it will impact others.  That doesn't mean the feature shouldn't
be added.  It means that after careful consideration, the feature should
be added the *right* way, that minimizes the risk of user confusion and
general chaos should something go wrong :)
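As a rough illustration of what "the right way" could look like for
hook scripts, the sketch below runs each script with a hard timeout and
logs the script's own name whenever it hangs, fails, or cannot be
started, so the guilty party is identified without blocking everything
else.  This is an assumption-laden sketch, not NetworkManager code: the
`run_hooks` name, the 3 second default, the "pre-up" action string, and
the (interface, action) argument order are all hypothetical.

```python
import logging
import subprocess

log = logging.getLogger("hooks")


def run_hooks(scripts, iface, timeout=3.0):
    """Run hook scripts one after another; return (script, reason) failures.

    Hypothetical sketch: each script is invoked as
    `script <iface> pre-up` and is killed if it exceeds `timeout`.
    """
    failures = []
    for script in scripts:
        try:
            # subprocess.run kills the child and raises TimeoutExpired
            # if it is still running when the timeout elapses.
            proc = subprocess.run([script, iface, "pre-up"], timeout=timeout)
        except subprocess.TimeoutExpired:
            reason = f"timed out after {timeout}s"
        except OSError as exc:
            reason = f"could not start: {exc}"
        else:
            if proc.returncode == 0:
                continue
            reason = f"exited with status {proc.returncode}"
        # Name the offending script explicitly in the log.
        log.error("ERROR FROM: %s (%s)", script, reason)
        failures.append((script, reason))
    return failures
```

Even with logging like this, somebody still has to pick the timeout
value, which is exactly where the "just one second longer" requests come
in.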

Dan



