Re: [Tracker] [RFC] Adding CUE sheet support



On Wed, Sep 14, 2011 at 8:35 AM, Martyn Russell <martyn lanedo com> wrote:
On 14/09/11 01:35, Sam Thursfield wrote:

I did like the idea of the URL with a fragment, but now I realise how
much extra data the FS miner attaches when it adds nfo:FileDataObject
information - it seems wrong for the tracks of a larger file to also
have type FileDataObject (necessary to set nie:url) but not have any
of the useful file-specific metadata that the FS miner would attach
(because we create the tracks as part of the preupdate).

It now seems more "correct" to me to make the tracks have no nie:url,
but relate them to the container file with nie:isStoredAs and add a
nmm:containerOffset property which gives the time offset in the
container. Is this logical?
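
To make the proposed shape concrete, here is a rough sketch of the
triples a CUE sheet with three tracks might produce. It models triples
as plain Python tuples rather than using any Tracker API. The URN
scheme, the helper name track_triples and the use of seconds for the
offset are all assumptions made for illustration.

```python
# Sketch only: triples are plain (subject, predicate, object) tuples.
# The URNs and the choice of seconds for the offset are assumptions.

def track_triples(file_urn, offsets_seconds):
    """Describe each CUE track as a resource stored in one container file."""
    triples = []
    for number, offset in enumerate(offsets_seconds, start=1):
        track = f"urn:example:track:{number}"  # hypothetical URN scheme
        triples.append((track, "rdf:type", "nmm:MusicPiece"))
        triples.append((track, "nie:isStoredAs", file_urn))
        triples.append((track, "nmm:containerOffset", offset))
        # Deliberately no nie:url here: only the container file has one.
    return triples


triples = track_triples("urn:example:file:1", [0, 215, 431])
```

Each track points at the container through nie:isStoredAs, so a rename
of the file would touch only the file's own nie:url.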

I would be careful when referring to file names too. Currently, renaming or
moving files on the disk only requires 1 update in the database and not
re-extraction of data. If we start storing path / filename information in
other places, it means this can be out of step with rename/move operations.

Thanks, that's an even better reason against :)

I've got this working now and pushed it to the cuesheets-0.12 branch.
As I mentioned on IRC, it required adding a 'postupdate' parameter for
extractors, which will be generally useful if anyone else wants to
break the InformationElement==DataObject rule and associate multiple
logical resources with one nfo:FileDataObject.

There's one outstanding issue which I'm open to ideas on - how to
delete the track resources when the file is re-crawled. The obvious
solution isn't supported by the query parser (unbound predicate):

DELETE { ?track ?p ?o } WHERE { ?track nie:isStoredAs ?file . ?file
nie:url "file:///test.flac" . ?track ?p ?o }

If we could get a stable URN for the track resources we could delete
them, but that's difficult too - our only guaranteed metadata is track
# and file URL, and the latter can change.

The only robust way seems to be querying each existing track from the
store with a SELECT. This might be practical if done by the FS miner
once the extractor has returned, and only if the extractor returns a
non-empty postupdate field, to minimise the performance hit (and it
only needs to be done on updates, too). We could then run
SELECT ?ie WHERE { ?ie nie:isStoredAs <file-urn> } and go through
deleting the results.
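
The select-then-delete idea can be sketched as follows. This is not
Tracker code: the store is just an in-memory set of tuples, and the
function names and URNs are made up for illustration. It only shows the
two steps, first finding every resource stored in the file, then
dropping all triples whose subject is one of those resources.

```python
# Sketch only: an in-memory stand-in for the store, not the Tracker API.

def select_stored_in(store, file_urn):
    """Step 1: subjects ?ie matching { ?ie nie:isStoredAs <file-urn> }."""
    return {s for (s, p, o) in store
            if p == "nie:isStoredAs" and o == file_urn}


def delete_resources(store, resources):
    """Step 2: drop every triple whose subject is a selected resource."""
    return {t for t in store if t[0] not in resources}


# Hypothetical URNs; the file itself keeps its nie:url.
store = {
    ("urn:example:file:1", "nie:url", "file:///test.flac"),
    ("urn:example:track:1", "nie:isStoredAs", "urn:example:file:1"),
    ("urn:example:track:1", "nmm:containerOffset", 0),
    ("urn:example:track:2", "nie:isStoredAs", "urn:example:file:1"),
    ("urn:example:track:2", "nmm:containerOffset", 215),
}

stale = select_stored_in(store, "urn:example:file:1")
store = delete_resources(store, stale)
```

After this runs, only the file's own triple is left, and the extractor's
fresh track resources can be inserted without duplicates.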

I realise that's a pretty big change, although at least it's again
reusable for any other files with multiple logical resources inside.
Since the worst case scenario anyway is that we have multiple
nmm:MusicPiece resources referring to the same track, maybe there's a
simpler way that I'm missing. I appreciate your input anyway.

Sam


