Re: Discussion on source mirroring (with counter proposal)



Hi Tristan,

On Tue, Mar 20, 2018 at 8:34 AM Tristan Van Berkom <tristan vanberkom codethink co uk> wrote:
> Hi Sander,
>
> First I have responded to your replies, below this I have some
> extension to this proposal.

Let me try to step back a bit to what we are trying to achieve.  It seems useful for us to fully agree on that.
I'll reply to your replies in a separate post.

> On Mon, Mar 19, 2018 at 6:47 AM Tristan Van Berkom <tristan vanberkom codethink co uk> wrote:
>
> [...]
> > What are we trying to achieve ?
> > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> > Before talking about how to achieve this, I want to pause and think
> > about what exactly we want to achieve - I feel that we are not all on
> > the same page about this.
> >
> > I can see two separate ideas of what "mirror" means here:
> >
> >   A.) Because a specific third party server proves to be unreliable,
> >       we want to be able to have a fallback for that server, which we
> >       can rollover to in the case that the upstream doesnt work.
> >
> >       So this is a quick fix / bandaid for a specific pain point that a
> >       given organization experiences while using BuildStream; this
> >       allows us to have a tarball server for unreliable tarballs, or
> >       rollover to a github mirror for an unreliable upstream github.

I think we can summarize A as "alternative locations".  The alternative locations do not necessarily have to be full mirrors of each other.  An example:
I have a p.bst which uses a git source with the following properties:
  ...
  track: master
  ref: 9585191f37f7b0fb9444f35a9bf50de191beadc2

Now as an organization I need to retain this specific ref.  To do so I take a copy of the ref and store it in a separate git repo.
Imagine that the project owners now decide to rewrite history (it's git after all), and/or destroy the ref.
I am still perfectly fine with using the normal location for tracking.  But if I want to recover the ref I had, I need to go to my alternate location.
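
To make that concrete, here is a rough sketch of how a client could look up a pinned ref across alternative locations.  This is only an illustration under my own assumptions: resolve_ref, has_ref and the example URLs are hypothetical names, not anything that exists in BuildStream, and the check for whether a location holds a ref is passed in as a callable so the sketch stays independent of git.

```python
from typing import Callable, Iterable, Optional


def resolve_ref(
    ref: str,
    locations: Iterable[str],
    has_ref: Callable[[str, str], bool],
) -> Optional[str]:
    """Return the first location that can provide `ref`, or None.

    `has_ref(url, ref)` is a hypothetical probe supplied by the caller;
    it should return True when the location at `url` still holds `ref`.
    """
    for url in locations:
        try:
            if has_ref(url, ref):
                return url
        except OSError:
            # Treat an unreachable location like a missing ref and move on.
            continue
    return None


# Hypothetical setup: upstream rewrote history, the alternate kept the ref.
REF = "9585191f37f7b0fb9444f35a9bf50de191beadc2"
UPSTREAM = "https://upstream.example.com/p.git"
ALTERNATE = "https://git.example.org/retained/p.git"

available = {UPSTREAM: set(), ALTERNATE: {REF}}


def fake_has_ref(url: str, ref: str) -> bool:
    # Stand-in for a real probe of the remote repository.
    return ref in available.get(url, set())


chosen = resolve_ref(REF, [UPSTREAM, ALTERNATE], fake_has_ref)
```

Note that the locations are tried in order, and tracking would still only ever talk to the normal location; the alternate is consulted purely to recover a ref that is already pinned.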

> >   B.) For the same reasons, an organization may just never want to
> >       experience unreliable access to source code ever again.
> >
> >       The cost of hosting all sources which their BuildStream projects
> >       require on a single server; or even mirrored in some
> >       strategically placed locations (so that one can choose a mirror
> >       that is geographically closer when building), is a relatively low
> >       cost.
> >
> >       Instead of many points of failure on various servers scattered
> >       across the globe, a single point of failure *that is under the
> >       control of the organization in question*, is much more desirable.

I wouldn't want to describe it as a single point of failure, because that would be a deliberate design flaw :).  But I agree, provided these mirrors are set up in a resilient fashion and your client is able to fall back to a different instance in case of failure (which you are alluding to with "mirrored in some strategically placed locations").

I am not convinced about the need for a single concentrated mirror for all sources.  For instance, if you are hosting your own version control systems, then those sources do not need to be included in the mirror.

Considering that this concentrated mirror is accumulating all history over time, I can see some scalability concerns.  A sharding approach is not an option in this case.  If you do want to shard, you're back to custom "mirroring" solutions anyway.

I think that B is about implementing a mirror server as well as a client.  I think that A is just looking at a [more generic] client.

Both have merit, but I feel that we're probably being too optimistic about the investment needed in BuildStream to have B be useful long term.

Make sense?

Cheers,

Sander

