Re: Announcement/RFC: jhbuild continuous integration testing

First, this sounds like really interesting stuff, great news.

On Tue, Feb 12, 2013 at 3:43 PM, Martin Pitt <martin pitt ubuntu com> wrote:
> Hello fellow GNOME developers,
> this already came up as a side issue recently[1], but now we are at a
> point where we have reasonably stabilized our GNOME jhbuild continuous
> builds/integration test server to become actually useful:
> This is building gnome-suites-core-3.8.modules, which currently
> consists of 160 modules. Builds are updated every 15 minutes, and
> triggered whenever there was a new commit in a module or any of its
> dependencies. This mostly uses the smarts of jhbuild, we just have
> some extra scripts around to pick the results apart for Jenkins and
> drive the whole thing [2]. You can click through all the modules, all
> their builds, and get their build logs.
> Right now there are 151 successes (blue), 5 modules fail to build
> (red), and 4 modules build but fail in "make check" (yellow). It's
> been like that for a week or two now, so I'd say we are doing
> reasonably well for now. Some details:
> Build failures:
>  - colord: recently started depending on libsystemd-login, which we
>    don't have yet; that's a fault on the Ubuntu side
>  - e-d-s: calls an undeclared g_cond_timed_wait(), not sure what this
>    is about
>  - folks: this started failing very recently, and thus is a perfect
>    example why this is useful (unqualified ambiguous usage of
>    HashTable)
>  - gst-plugins-bad: unknown type GStaticRecMutex; this might be due to
>    recent changes in streamer? That smells like a case of "broken by
>    change in dependency, needs updating to new API"
>  - mutter: worked until Jan 7, now failing on unknown XIBarrierEvent;
>    that might be a fault in Ubuntu's packages or upstream, I
>    haven't investigated this yet
> Test failures:
>  - gst-plugins-good, empathy: one test failure, the other tests work
>  - realmd: This looks like the test suite is making some assumptions
>    about the environment which aren't true in a headless server?
>  - webkit: I don't actually see an error in the log; we'll investigate
>    this closer on our side
> This was set up by Jean-Baptiste Lallement, I mostly help out with
> reviewing the daily status and cleaning up after some build/test
> failures which are due to broken checkouts, stale files, new missing
> build dependencies, and so on. It's reasonably maintenance intensive,
> but that's something which the two of us are willing to do if this
> actually gets used.
> The main difference to Colin's ostree builds is that this also runs
> "make check", which is one of the main points of this: We want to know
> as soon as possible if e. g. a new commit in glib breaks something in
> gvfs or evolution-data-server. Where "soon" is measured in minutes
> instead of days/weeks, so that the knowledge what got changed and why
> is still fresh in the developer's head. That's also why I recently
> started to add integration tests to e. g. gvfs or
> gnome-settings-daemon, so that over time we can cover more and more
> functionality tests in these.
> To make this really useful, we can't rely on developers checking this
> every hour or every day, of course; instead we need push notifications
> as soon as a module starts failing. That's the bit which needs broader
> discussion and consent.
> I see some obvious options here what to do when the status of a module
> (OK/fails tests/fails build) changes:
>  (1) mail the individual maintainers, as in the DOAP files
>    (1a) do it for everyone, and let people who don't want this filter
>    them out on a particular mail header (like "X-GNOME-QA:")
>    (1b) do this as opt-in
>    This most often reaches the people who can do something about the
>    failure. Of course there are cases where it's not the module's fault, but a
>    dependency changed/got broken. There is no way we can automatically
>    determine whether it was e. g. a deliberate API break which modules
>    need to adjust to, or indeed a bug in the depending library, so we
>    might actually need to mail both the maintainers of the module that
>    triggered the rebuild, and the maintainers of the module which now
>    broke.

Reading this particular part (and having noticed earlier that you are
mostly using jhbuild mechanics) makes me wonder: how granular exactly
are these rebuilds?

I think ideally builds would be triggered per commit. In other words,
commits across all modules are serialized chronologically, and each
and every commit triggers its own full rebuild of everything in the
moduleset "up to that commit"... separately, one after the other.
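
To make the idea a bit more concrete, here is a rough Python sketch of
the serialization step. Everything in it is made up for illustration
(the Commit record, the snapshots() helper, ordering by commit
timestamp); none of it is something jhbuild provides today, and I am
assuming the commit timestamps are comparable across repositories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    # Hypothetical record; field names are illustrative only.
    module: str
    sha: str
    timestamp: int  # seconds since the epoch


def snapshots(initial_heads, commits):
    """Yield (commit, heads) pairs in chronological order.

    `heads` maps every module to the revision that should be checked
    out when building "up to" that commit, so each commit gets exactly
    one build with a fully pinned moduleset.
    """
    heads = dict(initial_heads)
    # Tie-break on module and sha so the order is deterministic.
    for commit in sorted(commits, key=lambda c: (c.timestamp, c.module, c.sha)):
        heads[commit.module] = commit.sha
        yield commit, dict(heads)


if __name__ == "__main__":
    start = {"glib": "g0", "gvfs": "v0"}
    pending = [Commit("gvfs", "v1", 20), Commit("glib", "g1", 10)]
    for commit, heads in snapshots(start, pending):
        print(commit.module, commit.sha, heads)
```

Each yielded snapshot would then be handed to whatever actually drives
the build, with every module pinned to the listed revision instead of
pulling the current tip.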

I know, it sounds like some CPU will be melting quickly
at the rate GNOME-wide commits are made... but it would be
simply awesome if we could automatically pull out the exact
commit which introduced a given build failure in a given
module (and then, as you mentioned, we probably need
to notify both the author of the commit and the maintainer
of the affected module).

The way I imagine this works now (and this is a big assumption,
correct me if I'm wrong), is that a commit in a given module triggers
a jhbuild build, which would mean that:

   a.) Several commits could have been made in a given module
        by the time jhbuild actually runs... meaning we don't know
        which of the commits in that lapse of time actually
        caused the fault.

   b.) Module "foo" triggers a rebuild... and while jhbuild builds,
        it also pulls in new changes from module "bar". In this
        case it's possible that a recent commit in module "bar"
        caused another module "baz" to be affected, but in the
        end it's module "foo" that gets blamed (since module "foo"
        essentially /triggered the rebuild/).

Don't get me wrong, this is a great thing to be working on,
but at which level can we be sure about which commit breaks
which module?
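
If building every single commit turns out to be too expensive, a
cheaper variant might be to bisect only after a failure shows up.
Below is a small Python sketch of that, under some loud assumptions:
the candidate list is already ordered chronologically, the state just
before the first candidate is known to build, the last candidate is
known to fail, and the failure persists once introduced. Both
first_bad() and the builds_ok callback are hypothetical names; in
practice the callback would have to run a build with all modules
pinned to the given snapshot.

```python
def first_bad(candidates, builds_ok):
    """Return the index of the first candidate that fails to build.

    Assumes candidates[-1] fails and that every candidate after the
    first failing one fails as well, so a binary search is valid.
    """
    lo, hi = 0, len(candidates) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if builds_ok(candidates[mid]):
            lo = mid + 1  # breakage was introduced later
        else:
            hi = mid      # breakage is here or earlier
    return lo


if __name__ == "__main__":
    # Stand-in for real builds: pretend snapshot 6 introduced the break.
    snapshots = list(range(10))
    print(first_bad(snapshots, lambda s: s < 6))
```

This needs roughly log2(N) builds for N commits since the last good
build, instead of N, at the price of only finding the culprit after
the fact.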


>  (2) one big mailing list with all failures, and machine parseable
>      headers for module/test
>    This might be more interesting for e. g. the release team (we can
>    CC: the release team in (1) as well, of course), but will be rather
>    high-volume, and pretty much forces maintainers to carefully set up
>    filters.
> My gut feeling is that we might start with (2) for a while, see how it
> goes, and later switch to (1) when we got some confidence in this?
> Opinions most welcome!
> Also, I'll gladly work with the developers of the currently failing
> modules to get them succeeding. I have full access to the build
> machine in case errors aren't reproducible.
> Thanks,
> Martin
> [1]
> [2]
> --
> Martin Pitt                        |
> Ubuntu Developer (  | Debian Developer  (
> _______________________________________________
> desktop-devel-list mailing list
> desktop-devel-list gnome org
