Re: Downloading from generated mirror



On Tue, 2018-05-01 at 17:46 +0100, Jonathan Maw wrote:

On 2018-04-30 12:56, Valentin David wrote:

I see two specific things required for this to work, and I would like
to make sure I am correct. So please tell me if I am wrong.

Hi Valentin,

Can you confirm that my understanding of these issues is correct
(when I paraphrase them below)?


 1. Path
 -------

Paths from the mirror root are generated from the original URL, but
the plugins have some freedom in how they do it. The path should
therefore be generated by the plugin, so that the correct URL is
produced.

Since URL handling for #328 is done at a higher level, I suppose
plugins should provide a method returning the generated path.

Currently, by convention, we cache sources to
{cache_dir}/sources/{source_type}/{URL_directory_name}

This is aided by Source.get_mirror_directory() and
utils.url_directory_name().

In your current MR, you have similar functionality with
Source.find_mirror_directory() as a helper, and a given source's
_get_mirror_file() can be relied upon to return the correct
directory. This is close to what we want, but the numbered component
that gets appended by find_mirror_directory() makes it less useful
for fetching.

We can of course provide another method to get the base path without
the last component, and make Source.find_mirror_directory() use it.


An important note for fetching, which goes beyond the usual logic of
finding the correct URL by replacing the alias with a different URI:
the name of the repository after the colon is also different
(e.g. github:foo gets transformed into either github_foo or
http___github_com_foo, I'm not sure which).
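
To make the two candidate names above concrete, here is a minimal
sketch of the cache path convention. The helper names
sketch_url_directory_name() and sketch_mirror_directory() are
hypothetical, and the rule that every character outside letters,
digits, "%" and "_" becomes "_" is an assumption for illustration,
not a statement of what utils.url_directory_name() actually does.

```python
import os
import string

# Assumption: any character outside this set is replaced by "_".
_VALID_CHARS = string.digits + string.ascii_letters + "%_"


def sketch_url_directory_name(url):
    """Flatten a URL (or an alias:name pair) into a single directory name."""
    return "".join(c if c in _VALID_CHARS else "_" for c in url)


def sketch_mirror_directory(cache_dir, source_type, url):
    """Build {cache_dir}/sources/{source_type}/{URL_directory_name}."""
    return os.path.join(
        cache_dir, "sources", source_type, sketch_url_directory_name(url)
    )


# The unexpanded alias and the expanded URL give different names.
alias_name = sketch_url_directory_name("github:foo")
expanded_name = sketch_url_directory_name("http://github.com/foo")
```

Under that assumption, alias_name is "github_foo" and expanded_name is
"http___github_com_foo", so which one ends up on the mirror depends on
whether the alias is expanded before the name is generated.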


 2. Multiple directories
 -----------------------

For some plugins, in #330, there is an extra component in the path:
a number incrementing from 0. Some plugins might get new,
incompatible repositories or files for the same URL. When we detect
this, we create a new directory.

 - ostree and git should not suffer from this problem, because we can
   just compose repositories and collisions are unlikely.
 - zip and tar need it when the file is changed.
 - bzr needs it when we detect that the remote repository does not
   contain at least all the commits we have locally.

The loop that tries mirrors should retry, incrementing the last
path component, for those plugins requiring it.

Source plugins must now raise two distinct exceptions:
 - one that says the URL works but the reference is not available,
 - another one for any other error.
The loop should keep counting until it hits the second kind of error.
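
The counting loop described above could look roughly like the sketch
below. Everything in it is hypothetical: the two exception classes are
local stand-ins, fetch is an assumed callable supplied by the plugin,
and max_tries is an arbitrary safety bound that the proposal does not
mention.

```python
class SourceError(Exception):
    """Stand-in for 'any other error' (hypothetical, defined locally)."""


class RefNotFoundError(Exception):
    """Stand-in for 'URL works but the reference is not available'."""


def fetch_from_mirror(base_url, fetch, max_tries=100):
    """Try base_url/0, base_url/1, ... and return the URL that worked.

    fetch(url) is assumed to return normally on success, raise
    RefNotFoundError when the URL works but lacks the ref, and raise
    SourceError for anything else.
    """
    for index in range(max_tries):
        url = "{}/{}".format(base_url, index)
        try:
            fetch(url)
            return url
        except RefNotFoundError:
            # This directory exists but lacks the ref: try the next one.
            continue
        # A SourceError is deliberately not caught here, so it
        # propagates to the caller and stops the counting.
    raise SourceError("gave up after {} directories".format(max_tries))
```

A plugin that never needs more than one directory would only ever be
asked for index 0 and would simply never raise RefNotFoundError.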

Here, I'd be expecting sources to exist at URLs like:

http://mirror.com/github_foo/0
http://mirror.com/github_foo/1

and I should be trying to fetch from them, expecting a
RefNotFoundError when fetching from http://mirror.com/github_foo/0,
using that as an instruction to try http://mirror.com/github_foo/1,
and if that doesn't work, trying with http://mirror.com/github_foo/2
(which will return a SourceError instead of a RefNotFoundError, which
is a clear indicator to stop trying).


Correct.

It should be clear from the configuration whether I should be
fetching as if it's a BuildStream mirror or a non-BuildStream mirror.

It seems required, yes.

Would I expect a value set by the source to indicate whether sources
are mirrored with numbers appended, or is it safe to assume a number
is always appended, even if needing multiple repositories to store
all the incompatible refs is unlikely to ever happen (so there'd be
"github_foo/0", but never "github_foo/1")?


If it makes things simpler, we could make it so that there is always
such a component; plugins that do not need multiple paths will then
always use 0 and never raise RefNotFoundError.


