We store our data in a decomposed way. For single-value properties we create a table per class, with a column per property. Multi-value properties go in a separate table. For now I’ll focus on the single-value properties.
Imagine you have a MusicPiece. In Nepomuk that’s a subclass of InformationElement. InformationElement adds properties like title and subject. MusicPiece has performer, which is a Contact, and duration, an integer. A Contact has a fullname.
Alright, that looks like this in our internal storage.
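As a rough sketch of that layout, here is the MusicPiece example as SQLite tables driven from Python. The table names, column names and sample values are my own assumptions for illustration; Tracker's real schema differs in detail.

```python
import sqlite3

# Hypothetical table and column names, chosen to mirror the classes above.
SCHEMA = """
CREATE TABLE InformationElement (
    id      INTEGER PRIMARY KEY,  -- one row per resource
    title   TEXT,                 -- nie:title
    subject TEXT                  -- nie:subject
);
CREATE TABLE MusicPiece (
    id        INTEGER PRIMARY KEY,  -- same id as the InformationElement row
    performer INTEGER,              -- nmm:performer, the id of a Contact row
    duration  INTEGER               -- duration, an integer
);
CREATE TABLE Contact (
    id       INTEGER PRIMARY KEY,
    fullname TEXT                   -- nco:fullname
);
"""

def build_store():
    """Create an in-memory store holding one made-up music piece."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    db.execute("INSERT INTO Contact VALUES (1, 'Example Artist')")
    db.execute(
        "INSERT INTO InformationElement VALUES (10, 'Example Title', 'Example Subject')"
    )
    db.execute("INSERT INTO MusicPiece VALUES (10, 1, 240)")
    return db

def describe(db, piece_id):
    """Collect the properties of one piece from the per-class tables."""
    return db.execute(
        """
        SELECT ie.title, ie.subject, c.fullname, mp.duration
        FROM MusicPiece AS mp
        JOIN InformationElement AS ie ON ie.id = mp.id
        JOIN Contact AS c ON c.id = mp.performer
        WHERE mp.id = ?
        """,
        (piece_id,),
    ).fetchone()

print(describe(build_store(), 10))
```

A single resource is spread over one row per class it belongs to, so reading inherited properties like the title already means touching more than one table.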
Querying that in SPARQL goes like this. I’ll add the Nepomuk prefixes.
SELECT ?musicpiece ?title ?subject ?performer {
  ?musicpiece a nmm:MusicPiece ;
              nmm:performer ?p ;
              nie:title ?title ;
              nie:subject ?subject .
  ?p nco:fullname ?performer .
} ORDER BY ?title
The problem with ORDER BY on the title field is that Tracker has to join with the InformationElement table and do a full table scan on it.
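To make that cost visible, the sketch below asks SQLite for its query plan on a comparable native query. The schema, the sample titles and the SQL itself are assumptions of mine, not what Tracker actually generates; the point is only that sorting on a column that lives in another table, without an index, forces a join plus an explicit sort step.

```python
import sqlite3

def build_store():
    # Hypothetical names; only the columns needed for the sort are modelled.
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE InformationElement (id INTEGER PRIMARY KEY, title TEXT, subject TEXT);
        CREATE TABLE MusicPiece (id INTEGER PRIMARY KEY, performer INTEGER, duration INTEGER);
    """)
    for piece_id, title in [(1, "C song"), (2, "A song"), (3, "B song")]:
        db.execute("INSERT INTO InformationElement VALUES (?, ?, NULL)", (piece_id, title))
        db.execute("INSERT INTO MusicPiece VALUES (?, NULL, 0)", (piece_id,))
    return db

# The title is stored with the parent class, so sorting on it needs a join.
QUERY = """
    SELECT mp.id, ie.title
    FROM MusicPiece AS mp
    JOIN InformationElement AS ie ON ie.id = mp.id
    ORDER BY ie.title
"""

def ordered_titles(db):
    return [title for _id, title in db.execute(QUERY)]

def query_plan(db):
    # The human-readable step description is the last column of each plan row.
    return [row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + QUERY)]

db = build_store()
print(ordered_titles(db))
for step in query_plan(db):
    print(step)
```

The exact wording of the plan depends on the SQLite version, but with no index on the title column it includes a temporary B-tree that SQLite builds just to satisfy the ORDER BY.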
So we’re working on what we’ll call domain-specific indexes. For certain properties we’ll keep a redundant mirror column, and we’ll place the index on that column. The native SQL query will then be generated to use the mirror column instead. A good example is nie:title for nmm:MusicPiece.
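Here is a minimal sketch of the idea, again in SQLite through Python. I am assuming the mirror column sits in the MusicPiece table; the column name nie_title and the index name idx_musicpiece_title are made up for the example and are not Tracker's actual names.

```python
import sqlite3

def build_store():
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE InformationElement (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE MusicPiece (
            id        INTEGER PRIMARY KEY,
            duration  INTEGER,
            nie_title TEXT   -- redundant mirror of InformationElement.title
        );
        CREATE INDEX idx_musicpiece_title ON MusicPiece (nie_title);
    """)
    return db

def add_music_piece(db, piece_id, title, duration):
    # Each write stores the title twice so the mirror matches the original.
    db.execute("INSERT INTO InformationElement VALUES (?, ?)", (piece_id, title))
    db.execute("INSERT INTO MusicPiece VALUES (?, ?, ?)", (piece_id, duration, title))

# The sort now runs against the mirror column; no join is needed for the title.
QUERY = "SELECT id, nie_title FROM MusicPiece ORDER BY nie_title"

def ordered_titles(db):
    return [title for _id, title in db.execute(QUERY)]

def query_plan(db):
    # The human-readable step description is the last column of each plan row.
    return [row[-1] for row in db.execute("EXPLAIN QUERY PLAN " + QUERY)]

def demo_store():
    db = build_store()
    for piece_id, title in [(1, "C song"), (2, "A song"), (3, "B song")]:
        add_music_piece(db, piece_id, title, 180)
    return db

db = demo_store()
print(ordered_titles(db))
for step in query_plan(db):
    print(step)
```

The trade-off is the usual one for redundancy: the title is stored twice and has to be written twice, in exchange for a sort that can walk an index in order.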
PS: A normal triple store instead has one huge table with just three columns: subject, predicate and object. That wouldn’t help you much with optimizing, of course.