Re: [Tracker] Fear and Loathing in Las Vegas



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 10/07/2014 12:35, Martyn Russell wrote:

> I have to say, I am a bit averse to this sort of thing, but mainly
> because I worry about performance.

nod.

> However, you can't escape requirements by some to have some level
> of protection over their data in the database when it's being
> shared/accessed by others putting their own content in there too.
>
> For a while, using multiple DBs was an idea but we've been down
> that path and it's ugly and complex at times.

Graphs and multiple DBs, yes.

PS. libtracker-sparql and tracker-store are designed to someday
support multiple instances and multiple ontologies. So filesystem-based
UNIX access control on a meta_domain.db, with a different database
per domain, wouldn't be a bad way to protect data per domain.

Making queries that combine data from different databases wouldn't be
easy, of course.
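
A rough sketch of the per-domain idea, in Python rather than in
tracker-store itself. The meta_<domain>.db naming, the one-column table
and both helper functions are made up for illustration. Each domain
gets its own SQLite file with mode 0600, and a combined query has to
ATTACH every file it wants to read, which only works for a process
that may open all of them:

```python
import os
import sqlite3
import stat

TABLE_SQL = ('CREATE TABLE IF NOT EXISTS "nie:InformationElement" '
             '(id INTEGER PRIMARY KEY, "nie:title" TEXT)')


def create_domain_db(directory, domain, titles=()):
    """Create meta_<domain>.db (hypothetical name), owner read/write only."""
    path = os.path.join(directory, f"meta_{domain}.db")
    # Create the file ourselves so it never exists with wider permissions.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    os.close(fd)
    os.chmod(path, 0o600)
    conn = sqlite3.connect(path)
    conn.execute(TABLE_SQL)
    conn.executemany(
        'INSERT INTO "nie:InformationElement" ("nie:title") VALUES (?)',
        [(t,) for t in titles])
    conn.commit()
    conn.close()
    return path


def titles_across_domains(paths):
    """Combine titles from several domain databases through ATTACH."""
    conn = sqlite3.connect(":memory:")
    selects = []
    for i, path in enumerate(paths):
        alias = f"d{i}"  # generated here, never taken from user input
        conn.execute(f"ATTACH DATABASE ? AS {alias}", (path,))
        selects.append(
            f'SELECT "nie:title" FROM {alias}."nie:InformationElement"')
    rows = conn.execute(" UNION ALL ".join(selects)).fetchall()
    conn.close()
    return sorted(title for (title,) in rows)


def file_mode(path):
    """Return the permission bits of a file, e.g. 0o600."""
    return stat.S_IMODE(os.stat(path).st_mode)
```

The sketch covers the SQLite level only. Mapping a SPARQL query onto
several attached databases is the part that wouldn't be easy.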

> Using a hash or secure method is certainly a nice alternative (if
> it works). My immediate thoughts are about how queries would
> actually work and what kind of performance hit this would have.

The performance hit is reduced by precalculating the hashes and
storing them in meta.db instead of the values. After that we let the
queries, joins and comparisons (of SQLite) work on the hashes instead
of on the real values.

The decrypted values are fetched afterwards, using the same hashes.

So for an insert, the query parser would first hash the values:

INSERT { <s> nie:title 'something' }

Becomes in SQL:

 INSERT INTO "nie:InformationElement" ("nie:title") VALUES
      ('437b930db84b8079c2dd804a71936b5f')
and a call to:

 put_in_values_store ('something')
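
As a minimal sketch of that insert path, in Python with two in-memory
SQLite databases standing in for meta.db and the protected values
store. The plain_values table, open_stores() and insert_title() are
invented for the example; only put_in_values_store comes from the
text above, and its real signature is not specified there:

```python
import hashlib
import sqlite3


def md5_hex(value):
    """Hex MD5 digest of a string, as stored in place of the value."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def open_stores():
    """Return (meta, values): the hashed store and the plain-value store."""
    meta = sqlite3.connect(":memory:")
    meta.execute('CREATE TABLE "nie:InformationElement" '
                 '(id INTEGER PRIMARY KEY, "nie:title" TEXT)')
    values = sqlite3.connect(":memory:")
    values.execute("CREATE TABLE plain_values "
                   "(hash TEXT PRIMARY KEY, value TEXT NOT NULL)")
    return meta, values


def put_in_values_store(values, plain):
    """Record hash -> plain value; the same value is stored only once."""
    digest = md5_hex(plain)
    values.execute(
        "INSERT OR IGNORE INTO plain_values (hash, value) VALUES (?, ?)",
        (digest, plain))
    return digest


def insert_title(meta, values, title):
    """What INSERT { <s> nie:title '...' } would turn into."""
    digest = put_in_values_store(values, title)
    # Only the hash reaches the shared database.
    meta.execute(
        'INSERT INTO "nie:InformationElement" ("nie:title") VALUES (?)',
        (digest,))
    return digest
```

Here the hash is computed once, at insert time, which is where the
precalculation saves work for later queries.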

The read queries do the same thing:

SELECT ?a ?b { ?s nie:title ?a ; nie:subject ?b .
         FILTER (?a = 'something') }

Becomes in SQL:

SELECT "nie:title" AS A, "nie:subject" AS B FROM
         "nie:InformationElement" WHERE "nie:title" =
                '437b930db84b8079c2dd804a71936b5f'

And then the query parser would walk the rows of the result and
reveal the values (in A and B you'd get these hashes):

once {
  values = get_all_plain_values(with_credentials)
  foreach (value in values)
      hashmap.add (MD5(value), value)
}

foreach (row in result)
  foreach (cell in row)
    lines.add (hashmap.get(cell.value))
return lines

The result would be that you can still discover the relationships, but
you can't discover the values without the credentials (with_credentials).
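
The read path could be sketched like this, again in Python with an
in-memory SQLite database. build_meta(), select_by_title() and reveal()
are illustrative names, and the plain_values argument stands in for
whatever get_all_plain_values(with_credentials) would return:

```python
import hashlib
import sqlite3


def md5_hex(value):
    """Hex MD5 digest of a string, as stored in place of the value."""
    return hashlib.md5(value.encode("utf-8")).hexdigest()


def build_meta(rows):
    """Store (title, subject) pairs as hashes only."""
    meta = sqlite3.connect(":memory:")
    meta.execute('CREATE TABLE "nie:InformationElement" '
                 '("nie:title" TEXT, "nie:subject" TEXT)')
    meta.executemany(
        'INSERT INTO "nie:InformationElement" VALUES (?, ?)',
        [(md5_hex(title), md5_hex(subject)) for title, subject in rows])
    return meta


def select_by_title(meta, title):
    """The FILTER literal is hashed, so SQLite only compares hashes."""
    return meta.execute(
        'SELECT "nie:title" AS A, "nie:subject" AS B '
        'FROM "nie:InformationElement" WHERE "nie:title" = ?',
        (md5_hex(title),)).fetchall()


def reveal(rows, plain_values):
    """Map each hashed cell back to its value; None when it is unknown."""
    hashmap = {md5_hex(value): value for value in plain_values}
    return [tuple(hashmap.get(cell) for cell in row) for row in rows]
```

A caller without the plain values still gets the rows, so it sees which
cells belong together, but reveal() has nothing to map them to.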

> With increasing processing power, perhaps this is less of an
> issue?
>
> Philip, while this is all great hot air we're producing here, is
> there an actual requirement for this from someone for something
> real? :)

No, not really. At this point it is, as you correctly point out, hot air.

Kind regards,

Philip

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.20 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQEcBAEBAgAGBQJTvnMUAAoJEEP2NSGEz4aDj6oH/i5rME52DRpCgRMub/G8KiWk
E1ZyUSnVR9byBsbDEZ8RNv4+4VnH1ZsUSjonPXnQ1e0qE1BWQHTD6UM5m0vaoa3D
0ktfSulM9bHlih91KUlJmO9FBKCwje8iXqiM3kV42zs11HfrwHVsDq9oTQkkICnD
YlJ2YS7vvXlXroskRuU8wSE7E6HwJFr8EngUw0xwVmhi/OZVxKEQ95EdIBfjfCKn
KSTmTdJY/x/IFiLVxpeEym83YqvAIP2kYGHX9/mO6198AWRdpPep2cununZHPASf
xLPbTJH0H8zSoug24WSsSjlTsy0tsukDjSFQrRnxWtFmxL84EILAyw7NFhtP68s=
=dCdS
-----END PGP SIGNATURE-----

