A lot of discussion has happened around https://bugzilla.gnome.org/show_bug.cgi?id=640667 and I would like to weigh in my point of view, to see where people agree/disagree, and get the API change done. For that matter, I have written a small document trying to specify the problem clearly and completely enough, and then explain my idea, which is basically what Juan and Iago proposed, with some little changes. Basically, I think the biggest/most controversial one is the idea of getting rid of supported_keys(). I think the other points have been discussed. Here it is, in text form so that it's easy to reply to, and attached in HTML for easier reading. Note that in the html version, section 3.2, the higher level enumerated list should be alphabetical (couldn't be bothered fixing that). Thanks in advance for your comments Guij ------8<-------- =========== Present API =========== :: const GList * (*supported_keys) (GrlMetadataSource *source); const GList * (*key_depends) (GrlMetadataSource *source, GrlKeyID key_id); There is also a ->slow_keys() with the same prototype as ->supported_keys(), but the speed issue (fast resolution against slow resolution) is considered out of scope here. ======================= Issues we want to solve ======================= (1) avoid calling $PLUGIN.resolve() if we can know in advance $PLUGIN will not be able to find that type of metadata (2) handle conditional abilities: a source may be able to provide a given metadata only if a given condition (other than the presence of another metadata key) is true. For instance: - a cover art source can only provide the THUMBNAIL key for a GrlMediaAudio - a provider using an API on the local file system (such as local-metadata) can only provide metadata for a GrlMedia that is on the local file system (url in file://) Also, dependencies (see (3)) can be conditional. (3) handle dependencies: suppose the following hypotheses: - we want to know metadata A for a media M of type T - source S1 can provide metadata A if metadata B is present - source S2 can provide metadata B - M already contains the following metadata: C, D, E then the core should be able to know all this information through the API, preferably before any call to resolve(), so that it can call S2 asking for B first, and then call S1 asking for A. The current API helps solving (1) with ->supported_keys() and (3) with ->key_depends(), but does not allow (2) to be solved. ======== Analysis ======== Let's try to formalise things a bit to be able to identify the different case that might happen. Parameters of a metadata resolution ----------------------------------- (that is, data we have when grl_metadata_source_resolve() is called) :M: a GrlMedia object, containing the following information: :M.T: the type of M (Box, Video, Audio or Image) :M.D: dictionary of key->values of metadata on M :K: key we want to resolve Only one key to resolve, and not a list is considered here. For a list, one should be able to just repeat the algorithm for each member of the list. Not sure it is optimal in speed, but it should guarantee to get the result in all cases where it is possible. We should note that core's resolution algorithm can potentially expand M.D (through calls to other sources' resolve()), meaning we can be inclined to try the query with that source again with an expanded M.D. M.T and K are fixed. .. note:: We suppose we are in the case with GRL_RESOLVE_FULL flag is set, else we just want a call to the source's ->resolve(). Considering the above, the ability of S to resolve K can depend on: - K - M.T - the set of keys available in M.D - the value of certain keys in M.D - external conditions that cannot be known without blocking (such as data available on a remote server) - any combination of the above The present API only takes K into account. An optimal solution would take K, M.T and M.D into account, and provide a way to resolve issues through more resolution (filling M.D) Algorithm Idea -------------- For source S, given the data above (M and K), here is how I see the resolution algorithm: Is resolution possible? (a) YES: let's call S->resolve() (or plan to) (b) NO: could it be possible with more keys available in M.D? (1) NO: give up (2) YES: what data should be added in M.D? -> a set of keys L need to be filled in M.D. Try to resolve these before resolving K We are necessarily in either one of the cases (a), (b1) or (b2). In cases (a) and (b1), we don't need more information than the fact we are in said case. In case (b2), we need S to give us a list of keys L. It should be noted that if we are in case (b2), after obtaining the needed metadata (supposing we can), we can still be either in case (a) or (b1), since the ability to resolve can depend on the _value_ of a given metadata, and not only on its presence. Likewise, if we are in case (a), the call to resolve() might still fail (dependency on external conditions). ================= Proposed solution ================= - remove supported_keys() and key_depends() - add:: gboolean (*may_resolve) (GrlMetadataSource *source, GrlMedia *media, GrlKeyID key_id, GList **missing_keys) How this solves our issues -------------------------- (1) if may_resolve() return FALSE, we know in advance $PLUGIN can't find that metadata, we won't call resolve() for nothing. (2) both media and needed key are passed as parameters, therefore may_resolve() can have its result depend on these. (3) if missing_keys is not NULL, a list of keys on which this resolution would depend is provided there. Representation of different cases --------------------------------- :: (a) <=> may_resolve() returns TRUE (b.1) <=> may_resolve() returns FALSE and *missing_keys == NULL after the call (b.2) <=> may_resolve() returns FALSE and *missing_keys is a non-empty list of GrlKeyID. Why no supported_keys()? ------------------------ Not needed ~~~~~~~~~~ supported_keys() is not strictly needed in a resolution algorithm, and could only provide a marginal performance improvement if its result is cached. Rationale: The resolution algorithm, with supported_keys(), would do something like:: for each source: key_list = supported_keys() if key_id in key_list: possible = may_resolve() if possible: resolve() else: /* try to resolve missing_keys first * / The alternative using only may_resolve() would do:: for each source: possible = may_resolve() if possible: resolve() else: /* try to resolve missing_keys first * / In the case where key_id is not in key_list, a clever enough may_resolve() implementation would return FALSE as fast as the first solution will take to call supported_keys() and check for key_id in it. In the case where key_id is in key_list, the first implementation will cost one function call more. If we nevertheless really want supported_keys() (i.e. we think this thing is worth optimising by caching these results into a dictionary key->list of sources that support it), an alternative could be to pass NULL as media and missing_keys to may_resolve(), but I feel this would make the semantic of may_resolve() too complex, and that in that case we should stick with supported_keys(). Less code for plugin ~~~~~~~~~~~~~~~~~~~~ That would be one less method to implement, knowing that what it does, or something very similar, has to be done in may_resolve() anyway.
const GList * (*supported_keys) (GrlMetadataSource *source); const GList * (*key_depends) (GrlMetadataSource *source, GrlKeyID key_id);
There is also a ->slow_keys() with the same prototype as ->supported_keys(), but the speed issue (fast resolution against slow resolution) is considered out of scope here.
avoid calling $PLUGIN.resolve() if we can know in advance $PLUGIN will not be able to find that type of metadata
handle conditional abilities: a source may be able to provide a given metadata only if a given condition (other than the presence of another metadata key) is true. For instance:
- a cover art source can only provide the THUMBNAIL key for a GrlMediaAudio
- a provider using an API on the local file system (such as local-metadata) can only provide metadata for a GrlMedia that is on the local file system (url in file://)
Also, dependencies (see (3)) can be conditional.
handle dependencies: suppose the following hypotheses:
- we want to know metadata A for a media M of type T
- source S1 can provide metadata A if metadata B is present
- source S2 can provide metadata B
- M already contains the following metadata: C, D, E
then the core should be able to know all this information through the API, preferably before any call to resolve(), so that it can call S2 asking for B first, and then call S1 asking for A.
The current API helps solving (1) with ->supported_keys() and (3) with ->key_depends(), but does not allow (2) to be solved.
Let's try to formalise things a bit to be able to identify the different case that might happen.
(that is, data we have when grl_metadata_source_resolve() is called)
M: | a GrlMedia object, containing the following information:
|
||||
---|---|---|---|---|---|
K: | key we want to resolve |
Only one key to resolve, and not a list is considered here. For a list, one should be able to just repeat the algorithm for each member of the list. Not sure it is optimal in speed, but it should guarantee to get the result in all cases where it is possible.
We should note that core's resolution algorithm can potentially expand M.D (through calls to other sources' resolve()), meaning we can be inclined to try the query with that source again with an expanded M.D. M.T and K are fixed.
Note
We suppose we are in the case with GRL_RESOLVE_FULL flag is set, else we just want a call to the source's ->resolve().
The present API only takes K into account. An optimal solution would take K, M.T and M.D into account, and provide a way to resolve issues through more resolution (filling M.D)
For source S, given the data above (M and K), here is how I see the resolution algorithm:
We are necessarily in either one of the cases (a), (b1) or (b2). In cases (a) and (b1), we don't need more information than the fact we are in said case. In case (b2), we need S to give us a list of keys L. It should be noted that if we are in case (b2), after obtaining the needed metadata (supposing we can), we can still be either in case (a) or (b1), since the ability to resolve can depend on the _value_ of a given metadata, and not only on its presence. Likewise, if we are in case (a), the call to resolve() might still fail (dependency on external conditions).
remove supported_keys() and key_depends()
add:
gboolean (*may_resolve) (GrlMetadataSource *source, GrlMedia *media, GrlKeyID key_id, GList **missing_keys)
- if may_resolve() return FALSE, we know in advance $PLUGIN can't find that metadata, we won't call resolve() for nothing.
- both media and needed key are passed as parameters, therefore may_resolve() can have its result depend on these.
- if missing_keys is not NULL, a list of keys on which this resolution would depend is provided there.
(a) <=> may_resolve() returns TRUE (b.1) <=> may_resolve() returns FALSE and *missing_keys == NULL after the call (b.2) <=> may_resolve() returns FALSE and *missing_keys is a non-empty list of GrlKeyID.
supported_keys() is not strictly needed in a resolution algorithm, and could only provide a marginal performance improvement if its result is cached.
Rationale: The resolution algorithm, with supported_keys(), would do something like:
for each source: key_list = supported_keys() if key_id in key_list: possible = may_resolve() if possible: resolve() else: /* try to resolve missing_keys first * /
The alternative using only may_resolve() would do:
for each source: possible = may_resolve() if possible: resolve() else: /* try to resolve missing_keys first * /
In the case where key_id is not in key_list, a clever enough may_resolve() implementation would return FALSE as fast as the first solution will take to call supported_keys() and check for key_id in it. In the case where key_id is in key_list, the first implementation will cost one function call more.
If we nevertheless really want supported_keys() (i.e. we think this thing is worth optimising by caching these results into a dictionary key->list of sources that support it), an alternative could be to pass NULL as media and missing_keys to may_resolve(), but I feel this would make the semantic of may_resolve() too complex, and that in that case we should stick with supported_keys().
That would be one less method to implement, knowing that what it does, or something very similar, has to be done in may_resolve() anyway.