Re: [ck] [ANNOUNCE] Staircase Deadline cpu scheduler version 0.46



On Fri, 4 May 2007 19:56:00 +0100
Ash Milsted <thatistosayiseenem gawab com> wrote:

> On Mon, 30 Apr 2007 20:11:24 +0100
> Ash Milsted <thatistosayiseenem gawab com> wrote:
> 
> > On Mon, 23 Apr 2007 01:03:14 +1000
> > Con Kolivas <kernel kolivas org> wrote:
> ...
> > > 
> > > http://ck.kolivas.org/patches/staircase-deadline/2.6.21-rc7-sd-0.46.patch
> > > http://ck.kolivas.org/patches/staircase-deadline/2.6.20.7-sd-0.46.patch
> > > 
> ...
> > 
> > Hello,
> > I've been using sd-0.46 on 2.6.21 and seem to have encountered a
> > strange regression. The symptom appears to be a communication problem
> > between dhcdbd - a dbus dhcp client frontend - and networkmanager, in
> > that with SD patched in dhcdbd can't seem to persuade networkmanager
> > that it has received a lease. e.g.
> > 
> > Apr 30 19:55:35 joker NetworkManager: <info>  Activation (eth0) Beginning DHCP transaction.
> > Apr 30 19:55:35 joker NetworkManager: <info>  Activation (eth0) Stage 3 of 5 (IP Configure Start) complete.
> > Apr 30 19:55:35 joker NetworkManager: <info>  DHCP daemon state is now 12 (successfully started) for interface eth0
> > Apr 30 19:55:36 joker NetworkManager: <info>  DHCP daemon state is now 1 (starting) for interface eth0
> > Apr 30 18:55:36 joker avahi-daemon[11150]: Registering new address record for fe80::20e:2eff:fe66:4560 on ra0.*.
> > Apr 30 19:55:37 joker dhclient: DHCPDISCOVER on eth0 to 255.255.255.255 port 67 interval 3
> > Apr 30 19:55:37 joker dhclient: DHCPOFFER from 129.234.219.254
> > Apr 30 19:55:37 joker dhclient: DHCPREQUEST on eth0 to 255.255.255.255 port 67
> > Apr 30 19:55:37 joker dhclient: DHCPACK from 129.234.219.254
> > Apr 30 19:55:37 joker dhclient: bound to 129.234.219.68 -- renewal in 6111 seconds.
> > Apr 30 19:56:20 joker NetworkManager: <info>  Device 'eth0' DHCP transaction took too long (>45s), stopping it.
> > *****networkmanager doesn't realise eth0 got bound to an IP, and falls back to zeroconf...******
> > Apr 30 19:56:20 joker dhclient: There is already a pid file /var/run/dhclient-eth0.pid with pid 11827
> > Apr 30 19:56:20 joker dhclient: killed old client process, removed PID file
> > Apr 30 19:56:20 joker dhclient: DHCPRELEASE on eth0 to 129.234.4.86 port 67
> > Apr 30 19:56:20 joker dhclient: send_packet: Network is unreachable
> > Apr 30 19:56:20 joker dhclient: send_packet: please consult README file regarding broadcast address.
> > Apr 30 19:56:21 joker NetworkManager: <info>  Activation (eth0) Stage 4 of 5 (IP Configure Timeout) scheduled...
> > Apr 30 19:56:21 joker NetworkManager: <info>  DHCP daemon state is now 14 (normal exit) for interface eth0
> > Apr 30 19:56:21 joker NetworkManager: <info>  DHCP daemon state is now 11 (unknown) for interface eth0
> > Apr 30 19:56:21 joker NetworkManager: <info>  DHCP daemon state is now 14 (normal exit) for interface eth0
> > Apr 30 19:56:21 joker NetworkManager: <info>  Activation (eth0) Stage 4 of 5 (IP Configure Timeout) started...
> > Apr 30 19:56:21 joker NetworkManager: <info>  No DHCP reply received.  Automatically obtaining IP via Zeroconf.
> > 
> > This does not happen every time networkmanager starts, 
> > but certainly often enough that I have to flip 'enable networking'
> > on and off a few times after a resume from hibernation more often
> > than not with SD. The problem doesn't seem to occur on a vanilla kernel.
> > I have tried running the dbus test-suite but I'm not really able
> > to interpret the results properly.. certainly both vanilla and SD
> > get 4 'test passed' messages...
> > 
> > Ash
> > 
> > PS: I have tested both kernels close together timewise so as to eliminate
> > network problems as far as possible.
> 
> 
> Hello,
> 
> I've now tested with 0.48 and hit the problem as soon as the system
> started (so it's nothing to do with suspend.. or the fact that I had
> some modules compiled against the vanilla kernel last time). I can also
> confirm it does not occur with CFS (as well as the vanilla kernel).
> 
> I have CCed the NetworkManager list (anyone who replies from there,
> please CC me). Could this be a sched_yield related issue? I will test
> loading NM with sched_yield disabled when I get a chance...
> 
> Ash
> 
> PS: Apart from this problem SD is silky-smooth to use. Under CPU load,
> desktop-behaviour is just like using a slower PC - no nasty spikes or
> latency blips.

OK, some more info. I tried disabling sched_yield with the LD_PRELOAD
technique, but that didn't shed much light on the problem. I did find,
though, that running dbus-monitor (dumping system bus message info to
the console) during the inter-daemon communication 'fixes' it. I guess
that's consistent with this being a timing issue.
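
For anyone who wants to repeat the sched_yield experiment, the preload
shim looks roughly like the sketch below. This is a minimal sketch, not
necessarily the exact code used; the file name noyield.c is only
illustrative. It assumes the daemon reaches sched_yield() through the
normal dynamically linked libc symbol, since a preload cannot intercept
a direct syscall.

```c
/* noyield.c: illustrative LD_PRELOAD shim (file name is an assumption).
 * It overrides the libc sched_yield() symbol so that callers never
 * actually give up the CPU. */
#include <sched.h>

int sched_yield(void)
{
    /* Report success without entering the scheduler. */
    return 0;
}
```

Something like "gcc -shared -fPIC -o noyield.so noyield.c" should build
it, and the daemon can then be started with LD_PRELOAD pointing at the
resulting noyield.so. If the behaviour doesn't change with the shim
loaded, that only shows that yields made through libc aren't the
trigger; it doesn't rule out other timing effects.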


