On 14/06/2013 19:36, Ivan Frade wrote:
Hi Ivan,

Yep. I noticed it.
A problem that I see with FD passing over a pipe is that we can't know for sure whether a library that we depend on for metadata extraction won't use seek() and assume it got a real file's fd. I'm even afraid that most do. The others are probably buffer or mmap based. This means that libstreamanalyzer's approach of rewriting all extractors to be stream based is probably the only way to end up with a consistent and sensible solution for in-archive metadata extraction.

You're right if there is only one such kind of extractor. As soon as you want to select libstreamanalyzer for one archive/mime-type combination and another extractor for another archive/mime-type combination, this won't work.

But I agree that we could have a tracker-extract-container.c and .rule for application/x-tgz, among other container types, that then dispatches to tracker-extract-container-streamanalyzer.cpp and tracker-extract-container-somethingelse.c based on logic defined not in the top-level .rule system but in tracker-extract-container.c itself. At first this can simply hand everything to a streamanalyzer.cpp one (which will likely look a lot like what tracker-topanalyzer.cpp is now).

Note that there's no reason to keep tracker-topanalyzer.cpp's filename. With the .rule based system the filename topanalyzer.cpp makes no sense anymore.

If Jos isn't working on it anymore, surely we can look into it ourselves. I doubt that Jos would reject patches that conditionally make libstreamanalyzer spit out a better ontology than the broken upstream Nepomuk ontologies for multimedia.

A bit of stream and decorator in C++ will do us good.

Kind regards,

Philip
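
To illustrate the seek() concern raised above, here is a minimal C sketch, assuming a POSIX system. The helper names (fd_is_seekable, pipe_read_end_is_seekable, regular_file_is_seekable) are hypothetical and not part of Tracker or libstreamanalyzer. It only shows that lseek() on a pipe's read end fails with ESPIPE while the same call on a regular file succeeds, which is why an extractor library that seeks internally cannot be handed a pipe fd.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if fd supports seeking, 0 if lseek()
 * fails with ESPIPE (pipe, FIFO or socket), -1 on any other error. */
static int fd_is_seekable(int fd)
{
    if (lseek(fd, 0, SEEK_CUR) != (off_t) -1)
        return 1;
    return errno == ESPIPE ? 0 : -1;
}

/* Hypothetical helper: creates a pipe, probes its read end, and
 * closes both ends again. Returns the result of fd_is_seekable(). */
static int pipe_read_end_is_seekable(void)
{
    int fds[2];
    int result;

    if (pipe(fds) != 0)
        return -1;
    result = fd_is_seekable(fds[0]);
    close(fds[0]);
    close(fds[1]);
    return result;
}

/* Hypothetical helper: runs the same probe on an anonymous
 * temporary regular file for comparison. */
static int regular_file_is_seekable(void)
{
    FILE *tmp = tmpfile();
    int result;

    if (tmp == NULL)
        return -1;
    result = fd_is_seekable(fileno(tmp));
    fclose(tmp);
    return result;
}
```

A container extractor could run a probe like fd_is_seekable() before handing an fd to a third-party library, but that only detects the problem; it does not make a seeking library work on a pipe.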