Discussion on source mirroring



I've been giving some thought to source mirroring recently, after reading the discussion at https://gitlab.com/BuildStream/buildstream/issues/179.

Source mirroring will be valuable for us because:
* The canonical upstream may disappear without warning
* The canonical upstream may be slow to access due to limited infrastructure or geographical distance.

I briefly considered whether it's possible to do a "one size fits all" source mirror, but I don't think it's doable. Each kind of source is permitted to store files in the local source cache in whatever format it finds appropriate. As a result, how the remote and local caches are merged depends on which methods are suitable for each kind of source.

Since we will have to do separate work for each source anyway, we have the opportunity to make fetching from a mirror use the same protocol as we use for fetching sources normally, so I suggest the following:

In the project.conf, an entry in the aliases dict can map to a list of URLs instead of just a single URL, e.g.

aliases:
  github:
  - https://mirrorsrv.example.com/github
  - https://github.com
  sourceforge: http://downloads.sourceforge.net
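
To illustrate how the loader might treat both forms uniformly, here is a minimal sketch in Python. The function name normalize_aliases is hypothetical and not part of BuildStream; it assumes the aliases dict has already been parsed from project.conf into plain Python strings and lists, with the first URL in a list tried first.

```python
from typing import Dict, List, Union


def normalize_aliases(raw: Dict[str, Union[str, List[str]]]) -> Dict[str, List[str]]:
    """Return a copy of the aliases dict in which every value is a list of URLs.

    Hypothetical helper: a single URL becomes a one-element list, so later
    code only has to deal with one shape.
    """
    normalized: Dict[str, List[str]] = {}
    for name, value in raw.items():
        if isinstance(value, str):
            normalized[name] = [value]
        else:
            # Copy the list so the caller's data is not shared or mutated.
            normalized[name] = list(value)
    return normalized
```

With this, existing projects that use a single URL per alias would keep working unchanged.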

The implementation of being able to fetch from multiple sources is not trivial, however. At its simplest, we update all sources' fetch and track methods to use multiple repo aliases.

To reduce the amount of complexity that we expect plugin authors to write, we might do one of the following:
* Create a method that takes an aliased URL and yields every URL it can generate from the aliases it knows.
* Where we currently call fetch and track, iterate over every possible URL and keep calling fetch/track as long as they return an appropriate return value / exception to indicate that it failed because it couldn't access that URL.
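
As a rough sketch of how the two options above could fit together, here is some Python. Everything in it is an assumption made for illustration: the names expand_alias, fetch_with_mirrors and MirrorUnreachable are hypothetical, aliased URLs are assumed to look like "alias:path", every alias is assumed to map to a list of base URLs, and the base and path are assumed to be joined with a single "/".

```python
from typing import Callable, Dict, Iterator, List


class MirrorUnreachable(Exception):
    """Hypothetical error a plugin raises when it cannot access a URL."""


def expand_alias(url: str, aliases: Dict[str, List[str]]) -> Iterator[str]:
    """Yield every concrete URL that an aliased URL can expand to."""
    alias, sep, rest = url.partition(":")
    if not sep or alias not in aliases:
        # Not an aliased URL, or an alias we do not know: use it unchanged.
        yield url
        return
    for base in aliases[alias]:
        # Assumption: base and path are joined with exactly one slash.
        yield base.rstrip("/") + "/" + rest.lstrip("/")


def fetch_with_mirrors(
    url: str,
    aliases: Dict[str, List[str]],
    fetch: Callable[[str], None],
) -> str:
    """Call fetch on each candidate URL in order until one succeeds.

    Returns the URL that worked. Only MirrorUnreachable moves on to the
    next candidate; any other exception propagates immediately.
    """
    failures = []
    for candidate in expand_alias(url, aliases):
        try:
            fetch(candidate)
            return candidate
        except MirrorUnreachable as exc:
            failures.append(f"{candidate}: {exc}")
    raise MirrorUnreachable("; ".join(failures))
```

The point of the dedicated exception is that a plugin only needs to signal "I couldn't reach that URL"; the iteration over mirrors stays in the core. The same loop would apply to track.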

Known issues:
* Are we likely to see a URL that uses multiple repo aliases?
* We are likely to see one mirror alias per type of source.
  Users who mix many kinds of source with multiple mirrors will have a lot of boilerplate configuration.

Does anyone have a better idea of what we could do?

Best regards,

Jonathan

