Re: [Tracker] It doesn't index PHP files



On Thu, 2012-10-04 at 10:54 +0100, Martyn Russell wrote:
On 04/10/12 09:17, Ivan Frade wrote:
  I think python script contents are indexed because the mimetype is
"text/x-python" and it falls back to the "text/*" extractor. PHP files
have the mimetype "application/x-php" and there is no default option
for that.
  This can be solved by adding "application/x-php" to the .rule file of
the text extractor (check
/usr/local/share/tracker/extract-rules/90-text-generic.rule and other
rule files in the same folder).
  Note that generic text indexing means that the python code is treated
as plain text, a bunch of words. You could always write a specialized
extractor that takes into account the semantics of the file. For
example ignoring __init__.py files, or import statements, maybe
ignoring the code and indexing only function names.... depends on what
you want. Same applies to PHP.
  Writing an extractor module is not difficult with some rudiments of
programming in C and we can help via mailing list or IRC. Patches are
welcome ;)
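
To illustrate the "index only function names" idea above, here is a
minimal sketch in Python. It is not a Tracker extractor (those are C
modules); the helper name function_names is hypothetical and only shows
what such an extractor might choose to keep from a Python source file.

```python
import ast


def function_names(source: str) -> list[str]:
    """Return the names of all functions defined in a Python source string.

    Imports, class bodies, and other statements are ignored; only the
    function names are kept, as a specialized extractor might do.
    """
    tree = ast.parse(source)
    names = [
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    ]
    # Sort so the result does not depend on traversal order.
    return sorted(names)


if __name__ == "__main__":
    sample = (
        "import os\n"
        "\n"
        "def connect(url):\n"
        "    return url\n"
        "\n"
        "class Client:\n"
        "    def call(self, method):\n"
        "        pass\n"
    )
    print(function_names(sample))
```

A real extractor would hand these names to the indexer instead of
printing them; that part depends on the Tracker extract module API and
is left out here.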
I should add, you can use:
   tracker-control -m $MIME
or
   tracker-control --reindex-mime-type=$MIME
if you change the rules file, so that you do not have to reindex all
content again.

awilliam linux-nysu:~> cat /usr/share/tracker/extract-rules/90-text-generic.rule
[ExtractorRule]
ModulePath=/usr/lib64/tracker-0.14/extract-modules/libextract-text.so
MimeTypes=text/*;application/php

awilliam linux-nysu:~> tracker-control --reindex-mime-type="application/php"
Reindexing mime types was successful

awilliam linux-nysu:~> grep -c Vaccaro /home/awilliam/Documents/Development/PHP/jsonRPCClient.php
1

awilliam linux-nysu:~> tracker-search Vaccaro
Results:

Nope. :(

awilliam linux-nysu:~> tracker-info /home/awilliam/Documents/Development/PHP/jsonRPCClient.php
Querying information for
entity:'/home/awilliam/Documents/Development/PHP/jsonRPCClient.php'
  'urn:uuid:ccd602ad-60ea-faee-f4d3-6c8e54274fe3'
Results:
  'http://purl.org/dc/elements/1.1/date' = '2012-04-18T21:25:19Z'
  'http://purl.org/dc/elements/1.1/source' =
'urn:nepomuk:datasource:9291a450-1d49-11de-8c30-0800200c9a66'
  'tracker:added' = '2012-09-22T00:07:48Z'
  'tracker:modified' = '441117'
  'rdf:type' = 'http://www.w3.org/2000/01/rdf-schema#Resource'
  'rdf:type' =
'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#DataObject'
  'rdf:type' =
'http://www.semanticdesktop.org/ontologies/2007/01/19/nie#InformationElement'
  'rdf:type' =
'http://www.semanticdesktop.org/ontologies/2007/03/22/nfo#FileDataObject'
  'nie:byteSize' = '3977'
  'nie:dataSource' =
'urn:nepomuk:datasource:9291a450-1d49-11de-8c30-0800200c9a66'
  'nie:isPartOf' = 'urn:uuid:44f7deca-c907-266e-ee4c-1850b641f8a7'
  'nie:url' =
'file:///home/awilliam/Documents/Development/PHP/jsonRPCClient.php'
  'nfo:belongsToContainer' =
'urn:uuid:44f7deca-c907-266e-ee4c-1850b641f8a7'
  'tracker:available' = 'true'
  'nie:isStoredAs' = 'urn:uuid:ccd602ad-60ea-faee-f4d3-6c8e54274fe3'
  'nie:mimeType' = 'application/x-php'
  'nfo:fileLastAccessed' = '2012-04-18T21:25:19Z'
  'nfo:fileLastModified' = '2012-04-18T21:25:19Z'
  'nfo:fileName' = 'jsonRPCClient.php'
  'nfo:fileSize' = '3977'

Doing a touch on the file seems to cause nfo:fileLastModified to change,
but it still doesn't show up in a search.
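
As a sanity check on the rule file, here is a minimal sketch that tests
which mime types a MimeTypes line would cover. It assumes the entries
are matched glob-style, which has not been verified against the Tracker
source; the helper name rule_covers is hypothetical.

```python
from fnmatch import fnmatchcase


def rule_covers(mime_types_value: str, mime: str) -> bool:
    """Check whether a ';'-separated MimeTypes value covers a mime type.

    Assumption: each entry is a glob-style pattern such as "text/*".
    """
    patterns = [p.strip() for p in mime_types_value.split(";") if p.strip()]
    return any(fnmatchcase(mime, pattern) for pattern in patterns)


if __name__ == "__main__":
    # The MimeTypes value from the rule file shown above.
    rule = "text/*;application/php"
    for mime in ("text/x-python", "application/php", "application/x-php"):
        print(mime, rule_covers(rule, mime))
```

Under that assumption, "application/php" in the rule does not cover the
"application/x-php" that tracker-info reports as nie:mimeType for the
file, so the rule line and the reindex command may need the "x-" form.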



