Re: [BuildStream] Proposal: bst artifact subcommand group



On Tue, 2018-07-24 at 13:51 +0100, Paul Sherwood wrote:
On 2018-07-24 11:48, Tristan Van Berkom via Buildstream-list wrote:
<snip>
It was recently raised that we cannot determine the build date of an
existing artifact either[1], as it's not a part of artifact metadata,
this discussion evolved to proposing that we have a subgroup of CLI
commands for dealing with specific artifacts.

Perhaps a dumb question, but why isn't creation date of the artifact 
file/archive good enough?

Because an artifact is defined as a directory which is rather
independent of its storage unit. It would be possible to export an entire
artifact as a tarball, but no such feature exists at this time; in any
case we still need a way to know the creation date.

This could be built into CAS now that we won't need any alternative
artifact cache implementations (CAS being multi-platform), but I still
think it's preferable to continue storing all relevant data in the
content, not the container.

As such I'd like to propose the following, probably uncontroversial
enhancements:

  o Add the build date to the artifact metadata, as a timezone neutral
    unix timestamp (seconds since epoch in UTC).

Don't forget to bump the 'cache-key algorithm' (does bst have a
different name for this?) to avoid future bst implementations making
incorrect assumptions about what's in a given artifact.

Yes, I did not mention this but it is certainly implied.
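
To make the timestamp idea concrete, here is a minimal Python sketch. It is
not BuildStream code: the "build_timestamp" key and both helper names are
hypothetical. The point is only that seconds since the epoch are stored as
is, and conversion to a readable date happens at display time.

```python
import time
from datetime import datetime, timezone


def make_metadata(now=None):
    # Hypothetical metadata dict; "build_timestamp" is an assumed key name.
    # time.time() counts seconds since the epoch, so the stored value does
    # not depend on the local timezone of the build machine.
    ts = int(time.time()) if now is None else int(now)
    return {"build_timestamp": ts}


def format_build_date(metadata):
    # Convert the stored timestamp to an ISO 8601 string in UTC for display.
    ts = metadata["build_timestamp"]
    return datetime.fromtimestamp(ts, tz=timezone.utc).isoformat()
```

Passing an explicit value, e.g. make_metadata(0), makes the helper easy to
test; a real implementation would also need the cache key bump mentioned
above.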

  o Standardize on the name of an artifact, this is not really exposed
    in the UX except the user may see "artifact names" at times.

I agree this has to be done.

    An artifact name is essentially:

       ${project-name}/${element-name}/${cache-key}

I believe that ${project-name} is not relevant. If you encode that, (it
will at least seem like) artifacts can't be shared between projects.

This is an interesting argument; I think however as soon as you are
working with multiple projects you might prefer using an interface
which allows such namespacing and tab completion of artifact names.

Is there a reason for choosing "/" ?

For ybd I chose "." so that artifacts can be stored in a single 
directory. Maybe you are expressly trying to establish that the 
artifacts should all sit under one subdir for the name, but that also 
seems to be a wrong turn to me.

Given that I don't understand the ostree or CAS implementations I may be 
conflating filesystem names with artifact names, though.

I think you are, and the artifact name itself is rather arbitrary (there
is no real reason to prefer "/", but we have been expressing them this
way for some time so I doubt it's necessary to change).
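
As a rough illustration of the ${project-name}/${element-name}/${cache-key}
form, the Python sketch below builds and parses such names. The ArtifactName
type and parse_artifact_name() are hypothetical, and the sketch assumes that
an element name may itself contain "/", so the first component is taken as
the project and the last as the cache key.

```python
from typing import NamedTuple


class ArtifactName(NamedTuple):
    # Hypothetical container for the three parts of an artifact name.
    project: str
    element: str
    cache_key: str

    def __str__(self):
        # Render as project/element/cache-key.
        return "/".join(self)


def parse_artifact_name(name):
    # Assumption: the project name and cache key contain no "/", while
    # the element name in the middle may contain any number of them.
    parts = name.split("/")
    if len(parts) < 3 or not all(parts):
        raise ValueError("expected project/element/cache-key: %r" % name)
    return ArtifactName(parts[0], "/".join(parts[1:-1]), parts[-1])
```

Round-tripping through str() and parse_artifact_name() gives back the same
name, which is the property a command line syntax would rely on.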

    Exposing this to the user essentially means that we have a handy
    syntax for the user to express an artifact on the command line.

 From a user POV, the name.key standard (with creation time on the 
artifact itself) has proved useful in practice

  o Add the `bst artifact` subgroup with the following commands:

    o bst artifact list <artifact name glob pattern>

As you know I'm rather against adding lots of commands and options since 
it increases the documentation and learning surface. Also, I'm not 
convinced that people (maybe you included) are using bst in enough 
scenarios to be clear about what functionality will really be useful.

Indeed, this resonates with me too.

If you follow https://gitlab.com/BuildStream/buildstream/issues/416,
where this idea was brewed up, you can see some of the motivations, but
we should be careful to not add surface that was not requested.

Remember that with BuildStream, the fact that we might have a directory
where we extract artifacts at ~/.cache/buildstream/artifacts/extract is
a rather arbitrary internal thing, which might be changed with the
upcoming on-demand staging approaches.

We do not guarantee:
  o That the shas in that directory are cache keys
  o That an extract directory even exists if we have an artifact
    for a given cache key.

Because this is all internal implementation details, I think that
having a way to peruse the artifacts in your cache (i.e. bst artifact
list) can be useful.
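
To sketch what matching an artifact name glob pattern could look like, here
is a small Python example using the standard fnmatch module. The
list_artifacts() helper and the sample names are hypothetical, and this is
not how bst resolves names internally.

```python
from fnmatch import fnmatchcase


def list_artifacts(names, pattern):
    # Return the artifact names matching a shell-style glob pattern.
    # Note that with fnmatch, "*" also matches "/", so "core/*" selects
    # every artifact of the "core" project regardless of depth.
    return sorted(n for n in names if fnmatchcase(n, pattern))


# Hypothetical names in project/element/cache-key form.
names = [
    "core/base.bst/aaa111",
    "core/gcc.bst/bbb222",
    "apps/hello.bst/ccc333",
]
core_only = list_artifacts(names, "core/*")
base_any_project = list_artifacts(names, "*/base.bst/*")
```

Because the pattern is applied to the artifact name rather than to a path
in the extract directory, this kind of listing stays independent of the
internal cache layout.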

Aside from this, the most popular requests have been:

  o Ability to delete an artifact (mostly this is from developers
    who want to test/retest something, but occasionally by users
    who want to retry a build).

  o Ability to view the log of a specific artifact (really, we
    store the logs in the artifact but provide no way for the user
    to ever see them, so this should really be addressed).

I agree that listing contents and diffing really fall into the "nice
to have" territory, but they would be nice to have :)

Cheers,
    -Tristan


