Re: [BuildStream] Error consistency



Hi William :)

On Thu, 2019-07-11 at 10:17 +0100, William Salmon via buildstream-list wrote:
For a number of patches that I have worked on I have had to tweak our 
errors to make various errors more consistent.

Why are our Errors so inconsistent?

I can answer this, for some cases it's important, for others it's just
a consequence of how the code base evolved.

Let me break this series of questions into some sections:


Why are some errors defined in _exceptions.py and others elsewhere ?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
There are two good reasons for this:

  (A) Some errors are public API and related to the plugin classes
      which plugin authors are supposed to raise.

      For documentation purposes, we prefer to declare SourceError()
      and ElementError() in source.py and element.py respectively,
      so that they appear in documentation related to the classes
      where plugin authors are supposed to raise them.

  (B) Not all exceptions are irrecoverable user facing errors, or
      at least some of them really should not inherit from BstError.

      For instance the `SkipJob` is a special semantic for our internal
      jobs to signal to the scheduler that the job was skipped in
      execution.

      This is currently in _exceptions.py but my personal thoughts
      are that it should not be - instead it should be hidden inside
      the _scheduler/ codebase and that implementation detail should
      not be exposed outside of the _scheduler/ subdirectory at all
      (it's better to keep implementation details as contained in
      their modules as possible, and not cloud up the global namespace
      with knowledge it should not have).


Asides from placement of the errors themselves, I'd like to highlight
two issues where errors are deriving from BstError and should not be:

   We should really be handling CASError and only raising a user facing
   ArtifactError where appropriate, as explained in detail here:

        https://gitlab.com/BuildStream/buildstream/issues/1016

   We should be doing approximately the same thing for PluginError,
   these are mostly not user facing errors but programming errors on
   the part of plugin authors, they should incur BUG messages with
   stack traces:

        https://gitlab.com/BuildStream/buildstream/issues/319

   Unfortunately for PluginError, I believe there may be some cases
   where we raise a PluginError and expect it to really be a user
   facing BstError, so some untangling is required to be able to
   sensibly split this up.


Why are some of the constructors of BstError derived exceptions
declared differently from others ?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I'll generalize here with your few separate questions about some
which use the `*` separator, and some which expose more or less of the
full set of BstError common properties.

The main answer here is that there is no good reason for this, this is
just how the codebase historically evolved and nobody considered it a
priority to make this consistent (i.e. it would be totally
uncontroversial to make these consistent).

Consider that the errors existed from almost the beginning of time, but
then we later added machine readable error domains and codes to the
base BstError, and we later added the "detail" string to add extra
explanations (to be consistent with how the logger displays Message
objects in the frontend).

Over time, we have added a parameter to the errors which already
existed to expose one of the underlying BstError properties, and we
really only bothered to do this when, for instance, it became
interesting to add a "detail" string to a specific error being raised.



And do we really want `LoadError` to have such different syntax to
every other error?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is a more tricky question, because with LoadError we do have a
case where many different code blocks have a valid reason to raise an
error with the same machine readable error code (e.g. INVALID_DATA),
or rather the "reason" as we call it in the codebase.

In this case, I would not like to lose the definition of
LoadErrorReason, because it's reused so much, it provides some real
value.

Asides from this, I think it would be fine to change the LoadError()
signature to match the others and to just have callers continue to use
the LoadErrorReason definitions, but supply them in a more consistent
order as we do with other errors.

We can even consider declaring things like ArtifactErrorReason if it
turns out that we are reusing reason strings elsewhere.



Why do we have temporary as a constructor argument and then ignore it 
when we call the base class constructor? I presume this is a bug that 
should be fixed but is there some hidden magic i am missing?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is the kind of question that is best answered with some research
with `git blame`, I didn't do the research to answer this question but
my best guess is that:

   (A) Someone decided that "All artifact errors stem from pull and
       push, and as such should be retried" (which *might* be correct,
       for now, or at least at the time which (B) happened).

   (B) Someone went ahead and addressed the problem by hardcoding
       the temporary parameter, without cleaning up the calling
       signature and related calling code blocks.

In any case, the answer here is that it is clearly a bug and should be
fixed.


Does this cover all of your questions ?

Cheers,
    -Tristan




[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]