Re: [Tracker] Indexing PowerPoint



On 08/12/08 19:16, David H. Vree wrote:
Hello,

Hi,

My company and I create many PowerPoint files and I would like to index
them in tracker.

I thought we were doing that already. The content is extracted using filters which are kept in filters/ in the source tree. The filters which I thought we were using for this are called:

  application/vnd.oasis.opendocument.presentation*

In those we use odt2txt, which converts an OpenDocument or OpenOffice document to raw text.

The question, however, is whether that includes Microsoft Office documents.

If not, then your filter would be well received.

I have discovered that the package "catdoc" provides a program called
"catppt" which will dump the contents of a PowerPoint file to ASCII
text. I installed and tested it on my system and it seems to work fairly
well. Is there a way to wire this capability into tracker... or should it
just work?

Sure, the source currently uses this to know which filter to run to extract file content:

  str = g_strconcat (mime, "_filter", NULL);

  text_filter_file = g_build_filename (LIBDIR,
                                       "tracker",
                                       "filters",
                                       str,
                                       NULL);

These are installed into /usr/lib/tracker/filters. These filters are effectively those found in the filters/ directory in the source tree. Look there for more examples.
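
To make the lookup concrete, here is a minimal sketch in plain C (no GLib) of the same path construction. It assumes LIBDIR is /usr/lib, as the install path above suggests, and build_filter_path is a hypothetical helper name, not something from the tracker source. It also assumes tracker reports the usual MIME type for .ppt files, application/vnd.ms-powerpoint.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Assumption: LIBDIR is /usr/lib, matching the install path mentioned above. */
#define LIBDIR "/usr/lib"

/*
 * Hypothetical helper (not part of tracker): writes the filter path for
 * `mime` into `buf`. It mirrors g_strconcat (mime, "_filter", NULL)
 * followed by g_build_filename (LIBDIR, "tracker", "filters", str, NULL).
 * Returns 0 on success, -1 if the path does not fit in `buf`.
 */
int
build_filter_path (const char *mime, char *buf, size_t len)
{
        int n;

        assert (mime != NULL && buf != NULL);

        n = snprintf (buf, len, "%s/tracker/filters/%s_filter", LIBDIR, mime);

        /* snprintf returns the length it wanted to write; detect truncation. */
        return (n < 0 || (size_t) n >= len) ? -1 : 0;
}
```

Called with "application/vnd.ms-powerpoint", this gives /usr/lib/tracker/filters/application/vnd.ms-powerpoint_filter, so a catppt wrapper would presumably need to be installed under that name. How the filter is invoked (its arguments and where it writes the text) is not shown here, so copy that from one of the existing filters.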

--
Regards,
Martyn


