Re: [BuildStream] Proposal: Allowing download-only sources to work on local files



Nice write-up, very readable!

I am a bit biased, as I have been fighting the same battle, but trying to be impartial, this seems sensible to me. I have also added some extra context at the end.

On 02/12/2019 10:19, Tom Mewett via buildstream-list wrote:
Hi list,

[...]

- All source plugins which operate on files are unified as subclasses of
   a single base class, say FileBasedSource
- This base class handles the 'url' and 'ref' keys of the source config
- First it checks whether 'url' is a fully-qualified URL or is just a
   relative path. If it's the former, it is fetched as necessary and
   stored in the source cache
- If the URL is a relative path, specifying a ref is optional. If it is
   given and is different from what is calculated, an error is thrown

This seems like the path of least disruption to existing users. I would be tempted to require a ref for local files unless project.conf explicitly says otherwise, but does that sound too disruptive to others?
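
For what it's worth, the rules in the quoted list could be sketched roughly as below. This is only an illustration in plain Python, not BuildStream plugin code: `is_fully_qualified`, `check_local_ref` and `RefMismatchError` are names I made up, the `"://"` test is a simplification, and sha256 is an assumption about how the ref would be calculated.

```python
import hashlib
from pathlib import Path
from typing import Optional


class RefMismatchError(Exception):
    """Hypothetical error type, not a real BuildStream exception."""


def is_fully_qualified(url: str) -> bool:
    # Simplification: anything containing "scheme://" counts as a full URL,
    # everything else is treated as a path relative to the project.
    return "://" in url


def sha256_of(path: Path) -> str:
    # Hash the file in chunks so large files are not loaded into memory.
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_local_ref(project_dir: Path, url: str, ref: Optional[str]) -> str:
    """Return the calculated ref for a relative path.

    The ref is optional; if one is given and differs from the
    calculated value, an error is raised.
    """
    if is_fully_qualified(url):
        raise ValueError("full URLs would go through the fetch path instead")
    actual = sha256_of(project_dir / url)
    if ref is not None and ref != actual:
        raise RefMismatchError(f"{url}: expected {ref}, calculated {actual}")
    return actual
```

Requiring a ref by default, as I suggest above, would just mean treating `ref is None` as an error unless the project opts out.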


The 'local' and 'remote' sources could then also be unified to some
kind of 'copy' source, as 'copy' would act as either one depending on
whether a local path or full URL was given.

If desirable, the 'url' key could be split into mutually-exclusive
'url' and 'path' keys which would decide the behaviour.

Having two keys seems more explicit, and therefore more sensible, but I don't have strong feelings here: `path` would mean `relative to project`, and url/uri would work as it does now.

For the single url key, would it work like `uri: file://../bob.txt` for a relative path versus `uri: file:///bob.txt` for an absolute one? That seems workable to me.
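
One thing worth checking with the single-key form is how a generic URL parser reads those two strings. A small sketch using Python's standard `urllib.parse` (the `describe` helper is just a name for this example):

```python
from urllib.parse import urlsplit


def describe(uri: str) -> tuple:
    # Split a URI into the parts a standard parser sees.
    parts = urlsplit(uri)
    return (parts.scheme, parts.netloc, parts.path)


# The ".." lands in the host (netloc) position, not in the path.
relative = describe("file://../bob.txt")

# Empty host, absolute path.
absolute = describe("file:///bob.txt")

print(relative)
print(absolute)
```

So if we went with a single key, the plugin would need its own rule for treating `..` in the host position as "relative to project", which is another small point in favour of a separate `path` key.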


----

In my opinion, this could improve the usability of the plugins in
question. The user thinks of plugins as taking a "location" of either
local or remote files, only requiring refs/checksums for remote ones,
and caching them for convenience. It provides a sort of network
transparency.

Another advantage would be that local files imported into the project
can be given refs, meaning that they would not need to be present to
compute cache keys of depending elements. (This is not possible with
the current 'local' source.)

Not needing the files to be present is useful, especially in CI, but for big files the real advantage for me is that you do not have to go through the CPU- and IO-intensive step of hashing the file on every invocation of bst. Running show on an already-built top-level element can require hashing many large files, and that can end up taking a large chunk of the time show needs.

I'm not 100% sure, but on a recent project it looks like hashing large local files is mostly responsible for taking our `Resolving cached state` step from 1 second to 1 minute and 30 seconds with bst 1.4.
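
To illustrate why a stored ref helps here, a minimal sketch (again plain Python, not how BuildStream actually computes cache keys; `key_material` is a made-up name and sha256 is an assumption):

```python
import hashlib
from pathlib import Path
from typing import Optional, Tuple


def key_material(path: Path, ref: Optional[str] = None) -> Tuple[str, int]:
    """Return (digest, bytes_read) for a local file source."""
    if ref is not None:
        # A recorded ref is used as-is: the file is never opened.
        return ref, 0
    # Without a ref, the whole file has to be read and hashed every time.
    digest = hashlib.sha256()
    bytes_read = 0
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1024 * 1024), b""):
            digest.update(chunk)
            bytes_read += len(chunk)
    return digest.hexdigest(), bytes_read
```

With a ref the cost is independent of file size, whereas without one every invocation reads every byte of every local file.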

Regards
Will





