tracker 1.5.1



About Tracker
=============

Tracker is a semantic data store for desktop and mobile devices.
It uses the W3C standards for RDF ontologies (via Nepomuk) and
SPARQL to query and update the data.

Tracker is a central repository of user information that provides two
big benefits for the user: data shared between applications, and
information that is related to other information (for example,
mixing contacts with files, locations, activities, etc.).

News
====

  * Many fixes to RSS miner:
    - Dumps more complete data on tracker-store.
    - Stability fixes.
    - Leak fixes.
    - Performs automatic maintenance of feed messages.
  * Bumped libgrss dependency to 0.7
  * Performance improvements on tracker-store delete operations
  * Performance improvements on tracker-miner-fs delete operation handling.
  * Fix main Resource table id/urn leaks
  * Fix unnecessary queries in tracker-extract

Translations: es, hu, pt


ChangeLog
=========
2015-07-21  Carlos Garnacho  <carlosg gnome org>

Release 1.5.1

libtracker-miner: Avoid full table scans on recursive sparql buffer queries
If MATCH_CHILDREN is specified for a TrackerTask, we use
tracker:uri-is-descendant(), it's however smarter to use fn:starts-with,
as that'll resort to sqlite tricks that will avoid full table scans.
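
The general SQLite behaviour behind this change can be illustrated
with a small sketch. It is not the SQL that Tracker generates: the
`files` table, the `idx_files_url` index and the `is_descendant()`
function are hypothetical stand-ins. The point is only that a prefix
test expressed as a range comparison lets SQLite seek in an index,
while an opaque function call makes it visit every row.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the 'detail' strings of EXPLAIN QUERY PLAN for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The human-readable detail is the last column of each plan row.
    return [row[-1] for row in rows]


def prefix_upper_bound(prefix):
    """Smallest string greater than every string starting with `prefix`.

    Assumes a non-empty ASCII prefix, which keeps the sketch simple.
    """
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (id INTEGER PRIMARY KEY, url TEXT)")
conn.execute("CREATE INDEX idx_files_url ON files (url)")

# Hypothetical stand-in for an opaque "is this URL below that one" function.
conn.create_function(
    "is_descendant", 2, lambda url, parent: int(url.startswith(parent))
)

prefix = "file:///home/user/linux/"

# Opaque function call: SQLite cannot use the index to narrow the rows.
scan_plan = query_plan(
    conn, "SELECT id FROM files WHERE is_descendant(url, ?)", (prefix,)
)

# Prefix expressed as a range: SQLite can seek directly in the index.
search_plan = query_plan(
    conn,
    "SELECT id FROM files WHERE url >= ? AND url < ?",
    (prefix, prefix_upper_bound(prefix)),
)

print(scan_plan)
print(search_plan)
```

The first plan is expected to report a SCAN and the second a SEARCH
using the index. The exact wording varies across SQLite versions.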

libtracker-miner: Remove operations on children on deleted folders
This is an optimization to reduce the number of queries that we
perform across the deletion of large directory trees.
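
The idea can be sketched as follows. This is an illustration only,
not the libtracker-miner code: `prune_deleted_descendants()` and the
sample paths are hypothetical, and the sketch assumes normalized
absolute POSIX paths. When a folder is deleted, pending operations on
anything below it are redundant and can be dropped.

```python
from pathlib import PurePosixPath


def prune_deleted_descendants(deleted_paths):
    """Drop paths whose ancestor directory is also in the deleted set.

    Assumes normalized absolute POSIX paths without trailing slashes.
    Input order is preserved for the paths that are kept.
    """
    deleted = set(deleted_paths)
    kept = []
    for path in deleted_paths:
        ancestors = (str(parent) for parent in PurePosixPath(path).parents)
        if not any(ancestor in deleted for ancestor in ancestors):
            kept.append(path)
    return kept


events = [
    "/src/linux",
    "/src/linux/kernel",
    "/src/linux/kernel/fork.c",
    "/src/notes.txt",
]
print(prune_deleted_descendants(events))
```

Checking real ancestors rather than raw string prefixes avoids
treating a sibling such as "/src/linux-old" as a child of "/src/linux".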

libtracker-miner: Only set MATCH_CHILDREN on tasks for directory files
It's a query that can be avoided for non-directories, so better do it.

libtracker-miner: Add tracker_file_notifier_get_file_type()
Just plug the hole from the internal TrackerFileSystem, will be
handy for fast file type checks at the TrackerMinerFS level.

libtracker-miner: Be smarter at not triggering TrackerDecorator activity
There are times when the tracker-extract GraphUpdated handler fires due
to its own inserts. Doing so is harmless, but each time it triggers a
query for the count of unhandled elements that effectively goes nowhere,
as it's already active.

So handle_updates() is now smarter at not triggering activity unless
a resource of the inspected classes is added, and graph_updated_cb()
no longer triggers the count query every time.

libtracker-data: delete elements from the Resource table
On deletion, items with a specific row ID are removed from all
tables but the Resource one, which holds the urn:uuid:... mappings.

Leaving those rows behind led to confusion in the fts_view view
and ultimately the FTS table, as both indirectly depend on the
elements stored there, so the deleted rows still had an FTS
representation, just filled with nulls.

This looks like it was simply forgotten. If it was there to cover
constraint errors, it'll be better to just open Pandora's box and
fix the bugs we receive. Anyhow, from testing the most common
scenarios it works alright.

parser: Optimize 0-length string parsing
We were still creating the ICU parser and trying to feed it with
data, which turned out surprisingly expensive on deletes, as
"deleting" on FTS just replaces the text with "nothing", so we're
creating a parser for each of these.

This reduces the timing of the sparql delete in the previous commit
further down to:
real 1m7.029s
user 0m0.023s
sys 0m0.009s

libtracker-data: Don't schedule all deletes only because of FTS
The FTS limitations that made the scheduled delete sensible no
longer apply since FTS4 and external content tables (or rather,
we don't need the previous values explicitly).

The scheduled delete is a lot more (if not extremely) thorough,
decomposing the properties and items to be deleted into individual
queries. This has quite an effect on deletes involving a large
number of elements, a query like

delete { ?u a rdfs:Resource; }
where { ?u nie:url ?url .
          FILTER (fn:starts-with (?url, ".../linux/"))}

on a linux git checkout indexed through tracker-miner-fs used
to involve 7M sqlite queries; with this fast path it's down to
1.6M (and far fewer sqlite3_stmt cache misses). As a result
the timing improves substantially: time(1) for that query
with the "tracker sparql" command went from:

real 2m33.377s
user 0m0.021s
sys 0m0.008s

Down to:

real 1m23.625s
user 0m0.021s
sys 0m0.009s
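
For reference, the ratios behind these figures can be worked out with
a short script. It only does arithmetic on the numbers quoted in this
changelog, including the 1m7.029s reported in the parser commit; the
`to_seconds()` helper is introduced here purely for illustration.

```python
import re


def to_seconds(stamp):
    """Convert a time(1) value such as '2m33.377s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognized time format: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


before = to_seconds("2m33.377s")       # scheduled delete for everything
fast_path = to_seconds("1m23.625s")    # with the fast path
parser_fix = to_seconds("1m7.029s")    # plus the 0-length parsing fix

print(f"fast path speedup:     {before / fast_path:.2f}x")
print(f"with parser fix:       {before / parser_fix:.2f}x")
print(f"sqlite query reduction: {7.0 / 1.6:.2f}x")
```

The wall-clock gain is smaller than the drop in query count, which
suggests that the remaining queries are not all equally cheap.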

libtracker-data: Add function to delete an entire row from the FTS table
This can be used as an optimization, instead of updating each column
individually as we currently do.

2015-07-19  Carlos Garnacho  <carlosg gnome org>

tracker-extract-msoffice: Avoid frequent errors when feeding wrong files
There are mimetypes whose detection is too weak (i.e. purely based on
filename extension matches), so it makes sense to avoid the frequent
errors we get when the module gets fed a random file.

tracker-extract-gstreamer: Avoid frequent errors when feeding wrong files
There are mimetypes whose detection is too weak (i.e. purely based on
filename extension matches), so it makes sense to avoid the frequent
errors we get when the module gets fed a random file.

Merge branch 'wip/GrssPerson'

2015-07-18  Igor Gnatenko  <ignatenko src gnome org>

configure: bump required libgrss version to 0.7
and now grss has unversioned pc name

2015-07-18  Carlos Garnacho  <carlosg gnome org>

rss: use tracker_sparql_builder_object_blank_open()/close()
Tested-by: Igor Gnatenko <ignatenko src gnome org>

2015-07-18  Igor Gnatenko  <ignatenko src gnome org>

rss: add extracting additional attrs for persons
GrssPerson is introduced in libgrss 0.7

2015-07-18  Carlos Garnacho  <carlosg gnome org>

rss: Extract copyright/contributors/categories from feed messages
These do get extracted as nie:copyright, nco:contributor and nie:keyword
respectively.

2015-07-18  Pedro Albuquerque  <palbuquerque73 gmail com>

Updated Portuguese translation

2015-07-17  Balázs Úr  <urbalazs gmail com>

Updated Hungarian translation

2015-07-17  Carlos Garnacho  <carlosg gnome org>

rss: Store html as nmo:htmlMessageContent
We get raw HTML content from the feed, and nie:plainTextContent
should be that, plain text. This change is twofold, we now store
the HTML content as nmo:htmlMessageContent (as the ontology
observes), and honor nie:plainTextContent (and FTS!) by storing
the plain text stripped of all tags.
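
A minimal sketch of the tag stripping, using Python's standard HTML
parser. This is not the code the RSS miner uses; `strip_tags()` and
the sample markup are assumptions made for illustration. Only the two
property names come from the change described above.

```python
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collect only the text nodes of an HTML fragment."""

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; for us.
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def strip_tags(html):
    """Return the text content of `html` with all tags removed."""
    collector = _TextCollector()
    collector.feed(html)
    collector.close()
    return "".join(collector.chunks)


raw = "<p>Tracker <b>1.5.1</b> is out &amp; tagged</p>"
message = {
    "nmo:htmlMessageContent": raw,               # raw HTML from the feed
    "nie:plainTextContent": strip_tags(raw),     # what FTS should index
}
print(message["nie:plainTextContent"])
```

Note that this simple collector also keeps the text inside script and
style elements, which a real implementation would likely want to skip.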

rss: Maintain references to GrssFeedChannels
We're currently leaking these, and recreating all from scratch
again each time we query the mfo:FeedChannels. Just keep the references,
and reuse GrssFeedChannels from previous additions.

rss: Account for the same feed message coming from different channels
Unfortunately the nmo:communicationChannel docs are very explicit about
the property cardinality. So we just create the mfo:FeedMessage for
the first channel, and make it bail out any next time it would be
added, from the same mfo:FeedChannel or another.

https://bugzilla.gnome.org/show_bug.cgi?id=752484
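
The "first channel wins" rule can be sketched like this. It is an
illustration under assumptions, not the miner's code: the
`add_feed_message()` helper, the dict-based store and the choice of
the message URL as the identity key are all hypothetical.

```python
def add_feed_message(store, message_url, channel_urn):
    """Record a message once; return False if it was already known.

    `store` maps a message URL to the single channel it was first
    seen on, mirroring a property that allows only one channel.
    """
    if message_url in store:
        # Already created for some channel (this one or another): bail out.
        return False
    store[message_url] = channel_urn
    return True


store = {}
print(add_feed_message(store, "http://example.org/post/1", "urn:channel:a"))
print(add_feed_message(store, "http://example.org/post/1", "urn:channel:b"))
print(store)
```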

rss: replace string comparison by boolean check
The cursor can return the correct type right away, no need to
retrieve the boolean value as a string and compare to anything.

rss: Handle mfo:FeedChannel deletes
If we receive a delete for one of those, we'll delete all feed messages
associated with it.

2015-07-16  Daniel Mustieles  <daniel mustieles gmail com>

Updated Spanish translation

2015-07-16  Carlos Garnacho  <carlosg gnome org>

rss: Make --title argument optional
We can fetch that now from the feed channel, so leave --add-feed
as the minimum required.

rss: Lower severity of frequent message
No need for a g_message() for something that's completely expected.

rss: Retrieve information around mfo:FeedChannels
We leave these mostly untouched, but we should have some info to fill
in when the GrssFeedChannel has been populated. We currently retrieve
title, url (although we should have gotten that in the first place),
feed type, description, image link, and last message date.

rss: Be more careful about updates
We just check all feeds on any hint from GraphUpdate, which means
we also update everything if we dare to modify the mfo:FeedChannel,
resulting in circular updates.

Actually, we should be just inspecting additions of mfo:FeedChannel
elements (or selected property changes in these), so modifications
on these don't trigger the extraction of all the feeds again.
If we happen to update all feeds on GraphUpdates, we also do so
when modifying FeedChannels

2015-07-15  Carlos Garnacho  <carlosg gnome org>

rss: Unset timeout source id
This timeout is meant to run once, but leaves the timeout_id behind,
which warns when we g_source_remove() it.

rss: Fix double free
The variant is not for us to free.

rss: fix typo in ontology
nco:fullname is all lowercase, this one caught me too...

rss: Fix compile error
has_author was not defined. It's been renamed to "author" too, the former
name makes more sense for booleans.

2015-07-15  Igor Gnatenko  <ignatenko src gnome org>

rss: author field should be nco:Contact, not string
Reference: https://bugzilla.gnome.org/show_bug.cgi?id=752398

rss: add author field
Reference: https://bugzilla.gnome.org/show_bug.cgi?id=752398

2015-07-14  Igor Gnatenko  <ignatenko src gnome org>

bump libgrss to latest 0.6
There is no API break since 0.5, but 0.5 doesn't work well and
isn't shipped in distros

Reference: https://bugzilla.gnome.org/show_bug.cgi?id=752371

Download
========
https://download.gnome.org/sources/tracker/1.5/tracker-1.5.1.tar.xz (4.67M)
  sha256sum: 24890ecf4edd320bca4e8334a8e25622b88a055b66b66604aba16e5bce82ec32
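
One way to check a downloaded tarball against the checksum above is
sketched below. The helper names are illustrative, and the script
assumes the file was saved as tracker-1.5.1.tar.xz in the current
directory.

```python
import hashlib

# Checksum as published in this announcement.
EXPECTED_SHA256 = (
    "24890ecf4edd320bca4e8334a8e25622b88a055b66b66604aba16e5bce82ec32"
)


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected=EXPECTED_SHA256):
    """Return True if the file's digest matches the expected checksum."""
    return sha256_of(path) == expected.lower()
```

Calling `verify("tracker-1.5.1.tar.xz")` after the download should
return True for an intact file.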

