SPARQL, Nepomuk, StreamAnalyzer and Tracker

In my opinion, we at the Tracker team should blog more often about our progress on things like Tracker’s SPARQL and Nepomuk support.

Let’s start with the awesome SPARQL support that Jürg has been working on. Until just a few minutes ago, when you made a SPARQL query that had an unknown predicate, Tracker returned an empty array over D-Bus.

dbus-send --print-reply --dest=org.freedesktop.Tracker \
   --type=method_call /org/freedesktop/Tracker/Search \
   org.freedesktop.Tracker.Search.SparqlQuery \
   string:'SELECT ?title WHERE { ?s nie:ttle
      ?title FILTER regex(?title, ".*in.*") . }'

method return sender=:1.66 -> dest=:1.98 reply_serial=2
   array [
   ]

That left you in the dark about your query being in error. Jürg fixed this, and now you get something like this instead:

tracker-sparql --query="SELECT ?title WHERE { ?s nie:ttle
      ?title FILTER regex(?title, \".*in.*\") . }"
Could not query search, Unknown property `http://.../nie#ttle'

This way you can fix your query’s error and do something like this instead:

tracker-sparql --query="SELECT ?title WHERE { ?s nie:title
      ?title FILTER regex(?title, \".*in.*\") . }"
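
To make the difference concrete, here is a toy sketch in Python of the idea behind the fix: check the properties a query mentions against the known ontology and fail loudly instead of returning nothing. This is not Tracker's actual implementation. `KNOWN_PROPERTIES`, `UnknownPropertyError` and `check_properties` are hypothetical names, and the regular expression is a crude stand-in for a real SPARQL parser.

```python
import re

# Hypothetical, hand-written subset of the ontology. A real store would
# load its properties from the ontology files instead.
KNOWN_PROPERTIES = {"nie:title"}

# Crude match for prefixed names such as nie:title. A real engine would
# use a proper SPARQL parser, not a regular expression.
PREFIXED_NAME = re.compile(r"\b[A-Za-z][\w-]*:[A-Za-z][\w-]*")


class UnknownPropertyError(ValueError):
    """Raised when a query mentions a property the toy ontology lacks."""


def check_properties(query: str) -> None:
    """Raise instead of silently matching nothing on a misspelled property."""
    for name in PREFIXED_NAME.findall(query):
        if name not in KNOWN_PROPERTIES:
            raise UnknownPropertyError(f"Unknown property `{name}'")


if __name__ == "__main__":
    bad = 'SELECT ?title WHERE { ?s nie:ttle ?title FILTER regex(?title, ".*in.*") . }'
    try:
        check_properties(bad)
    except UnknownPropertyError as error:
        print("Could not query search,", error)
```

The point of the design is simply that a typo in a predicate is reported to the caller rather than being indistinguishable from a query that legitimately has no results.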

  The final metadata solution
  Tracker in gnome bugzilla

Today I migrated the code in Tracker that implements support for the metadata D-Bus API for E-mail to the Nepomuk Message Ontology (NMO). This means that Tracker will store the metadata it receives from E-mail clients like KMail and Evolution using NMO, and that it’ll make this metadata available to the SPARQL query engine.

Great news reached us this week: a developer has started implementing the metadata D-Bus API for E-mail in Thunderbird. He left a pointer to his git repository on the wiki page.

Meanwhile I have implemented the API in KMail. This patch is pending review. We are planning to add support for this in Modest soon too.

Next, we are migrating the indexers and extractors to Nepomuk. These tasks come with all sorts of extra work related to integrating with Nepomuk as the ontology.

I have also implemented integration with Strigi’s truly awesome StreamAnalyzer. I have rarely seen such a beautifully designed piece of code. In my opinion it outperforms whatever Tracker has at this moment for extracting metadata in several interesting ways.

I don’t know why we shouldn’t join Strigi on making StreamAnalyzer kick ass. I can find no reason why we shouldn’t integrate with it instead of trying to compete with it. I’m pushing our team to consider the integration option and so far they are enthusiastic about it.

StreamAnalyzer needs a migration from Xesam to Nepomuk as its ontology, but Evgeny Egorochkin and Jos van den Oever already told me that they have put this on their agenda. After that, with the integration that I did for Tracker, StreamAnalyzer can become the core analysis code that Tracker uses. Right now the plan is to let StreamAnalyzer run first and then let Tracker’s own extractors follow up.
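
The planned ordering can be sketched roughly like this in Python. Everything here is an assumption for illustration: the two extractor functions are stubs with hard-coded values, not bindings to libstreamanalyzer or to Tracker's extractors, and the merge rule (an earlier extractor's value wins, later ones only fill gaps) is my guess at what "follow up" could mean, not a decided policy.

```python
from typing import Callable, Dict, List

Metadata = Dict[str, str]
Extractor = Callable[[str], Metadata]


def stream_analyzer_stub(path: str) -> Metadata:
    # Stand-in for the StreamAnalyzer pass; the value is hard-coded.
    return {"title": "Holiday in Ghent"}


def tracker_extractor_stub(path: str) -> Metadata:
    # Stand-in for one of Tracker's own extractors; values are hard-coded.
    return {"title": "IMG_0042", "mime": "image/jpeg"}


def extract(path: str, extractors: List[Extractor]) -> Metadata:
    """Run the extractors in order and merge what they find.

    Assumed policy: the first extractor to supply a key wins, and later
    extractors only fill in keys that are still missing.
    """
    merged: Metadata = {}
    for extractor in extractors:
        for key, value in extractor(path).items():
            merged.setdefault(key, value)
    return merged


if __name__ == "__main__":
    # StreamAnalyzer runs first, Tracker's own extractor follows up.
    print(extract("photo.jpg", [stream_analyzer_stub, tracker_extractor_stub]))
```

With this rule the title comes from the first pass and the follow-up extractor only contributes the fields the first pass did not find.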

Let’s make some more bridges with KDE projects. Why not!

4 thoughts on “SPARQL, Nepomuk, StreamAnalyzer and Tracker”

  1. Just a small addition: libstreamanalyzer is not scary. It is plain C++ (no Qt or other large dependencies) and easy to use from C as Philip has demonstrated.

  2. Just wanted to say this post was a pleasure to read!
    Looking forward to more integration with libstreamanalyzer and nepomuk.

    Keep up the good work :-)
