Re: Automatic rollback in case of boot failure?





On Wed, Jul 1, 2015, at 10:19 AM, Leandro Santiago wrote:
> I wonder if there is any kind of automatic recovering system on ostree
> and if not, how to implement it.

There isn't one built in, and there's not likely to be one because it's
highly dependent on the OS content.  I see OSTree as a shared infrastructure
piece here.

The first issue here is defining what a boot failure is. A dependency
loop is clearly one kind, but there are for sure more.

That kind of thing seems like it should be quickly caught in e.g. a virtualization-based
testing workflow prior to deployment on bare metal.  

But say that an updated kernel's network driver fails on one of the hardware
variants you support.  That's something not easily caught in VM testing,
and would mean you couldn't manually log in remotely to roll back.

How to detect that?  Well...it'd have to be part of the "rollback scenario"
that is owned by the person creating the OS.  If the network driver is oopsing
on this boot, and the previous boot wasn't emitting any kernel oopses, that
could be a rollback trigger.
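
A rough sketch of that trigger, assuming a systemd journal with persistent
storage so the previous boot's kernel log is still readable.  The marker
strings and the function names are illustrative assumptions, not anything
OSTree provides, and a real OS would tune them to its own hardware:

```python
import subprocess

# Assumption: substrings that typically show up in a kernel oops report.
OOPS_MARKERS = ("Oops:", "kernel BUG at", "BUG: unable to handle")


def kernel_log(boot):
    """Return kernel messages for a boot: 0 is the current one, -1 the previous."""
    result = subprocess.run(
        ["journalctl", "-k", "-b", str(boot), "--no-pager"],
        capture_output=True, text=True, check=False,
    )
    return result.stdout


def has_oops(log_text):
    """True if any line of the log contains one of the oops markers."""
    return any(
        marker in line
        for line in log_text.splitlines()
        for marker in OOPS_MARKERS
    )


def is_rollback_trigger(current_log, previous_log):
    """Trigger only when this boot oopses and the previous boot did not."""
    return has_oops(current_log) and not has_oops(previous_log)
```

It would be called as `is_rollback_trigger(kernel_log(0), kernel_log(-1))`.
Comparing against the previous boot keeps an oops that was already present
before the update from counting as a regression.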

As for actually implementing it, I think it would be a wrapper around
`rpm-ostree rollback` (which I should also copy to ostree...it's just not done).
To avoid a potential rollback loop, the wrapper would create a stamp file at
/var/lib/myos/rollback-stamp and refuse to roll back if that file already
exists.  Then clear the stamp file once the system has been up for long enough.
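
Something like the following is what I have in mind, as a minimal sketch
rather than a finished tool.  The stamp path is the one above; the 15 minute
threshold, the function names, and the `boot_failed` input (whatever check
the OS vendor defines) are assumptions for illustration:

```python
import subprocess
import time
from pathlib import Path

STAMP = Path("/var/lib/myos/rollback-stamp")
STABLE_UPTIME_SECONDS = 15 * 60  # assumption: "up for long enough"


def uptime_seconds(proc_uptime=Path("/proc/uptime")):
    """Seconds since boot, read from the first field of /proc/uptime."""
    return float(proc_uptime.read_text().split()[0])


def maybe_rollback(boot_failed, stamp=STAMP, run=subprocess.run):
    """Roll back at most once; return True if a rollback was requested."""
    if not boot_failed:
        return False
    if stamp.exists():
        # We already rolled back once; doing it again would just loop.
        return False
    # Write the stamp first so it survives into the next boot (/var is shared).
    stamp.parent.mkdir(parents=True, exist_ok=True)
    stamp.write_text(f"{time.time()}\n")
    run(["rpm-ostree", "rollback", "--reboot"], check=True)
    return True


def clear_stamp_if_stable(stamp=STAMP, uptime=None,
                          threshold=STABLE_UPTIME_SECONDS):
    """Remove the stamp once the system has stayed up past the threshold."""
    up = uptime_seconds() if uptime is None else uptime
    if up >= threshold and stamp.exists():
        stamp.unlink()
        return True
    return False
```

`maybe_rollback` would run from a unit early in boot, and
`clear_stamp_if_stable` from a timer later on.  The `run` parameter is only
there so the logic can be exercised without actually rebooting a machine.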



