Documentation. Documentation. Documentation!

As I was rereading Ivan Frade’s blog post about the class-signals feature that me and Ivan developed for Tracker last week I got reminded of something I wrote a long time ago:

In my opinion, there’s nothing as important as developer documentation for a framework or library.

Ivan decided to ~ dump our internal planning document on GNOME’s Live wiki service. He didn’t write the document so he’s of course not to blame, but the document was in my opinion not suitable as end user framework documentation. Application developers should not be required to understand what goes on in our team’s minds. So I rewrote this.

You can find the document at the SignalsOnChanges page. Let’s see if starting next week I can convince the other team members to document their new features in a similar way. Many new things in Tracker’s experimental branch could use more clarification and examples.

I happen to believe that undocumented libraries and frameworks aren’t meaningful. That’s because letting it remain undocumented the developer renders his work utterly irrelevant for a serious application developer. I also believe that undocumented infrastructure software is worse than no infrastructure software.

I strongly recommend to application developers to refrain from using any piece of undocumented library or framework. Don’t lower yourself to their standards. Demand documentation by refusing to use undocumented infrastructure. Doing it as free software is no excuse for delivering extremely low quality, like what undocumented infrastructure is (in my opinion).

Thanks to Ivan’s blog post for reminding me of what I once wrote, and today still believe in, myself. While passionately coding new features you sometimes forget about this.

Which is unacceptable.

Tracker’s class signals, live search capabilities

Ivan Frade already blogged this, so that that means that I can keep this one shorter.

But yeah, we have just finished what we decided to call class-signals for Tracker. You are looking at software development taking place as we speak, because we still need to do a peer-review of the branch and then merge it to our experimental branch. This will probably happen in a few minutes.

The new feature means that if the ontology of a rdfs:Class contains a rdfs:Property tracker:notify that is set to true, then Tracker will offer a DBus object for that class. Whenever a subject of that class is added, updated or removed you will receive a signal about it on that DBus object.

Ok, ok, I know. You don’t want to click links and search for it yourself. Here are some “screenshots”:

nmo:FeedMessage a rdfs:Class ;
         tracker:notify true ;
         rdfs:subClassOf nmo:Message .

The DBus API is split per rdfs:Class. That’s because we don’t want that a lot of applications will be listening for each and every change to each and every subject. Now they can instead select the classes that they are interested in. That way they can avoid getting woken up constantly by our damn signal.

This feature is a change signal mechanism. We will eventually implement support for full live-queries.

Supporting full live-queries means that we’ll need to store some state about your session to test whether your SPARQL query requires an update from Tracker to get your client’s previous result list synchronized. It’s quite a bit more tricky, especially if you want to support most of SPARQL to go together with the live-query notifications, and especially if you want to be accurate.

Finally you don’t want the live-query capability to be a huge performance burden each time material becomes updated.

This is why we are now putting in place this feature. We think that for 90% of the applications the capabilities of the class-signals feature will be sufficient for their live update needs.

Biweekly Tracker development planning

We realize

At the Tracker team we realize that we should not only tell you guys about Tracker’s new features. We should also involve you in the planning and realization of software development at the Tracker project.

I have discussed this with our team and we agreed that I will make a summary at each beginning of our biweekly planning.

The reason is fairly simple: we want to involve the community and inspire them to become active contributors. But how can they contribute unless they know what the plans of the development team of Tracker for the next few days are? You are right, they can’t.

Which is a situation that we want to change. Therefor, here’s the summary of the plans for the next two weeks:

Ontologies

  • Making tickets for an audio ontology that we made during a previous sprint to upstream Nepomuk.
  • Craft an ontology for images that will later be proposed to Nepomuk. We will convert the extractors in our experimental branch to use the new image ontology.
  • Applications need to be able to create arbitrary key value pairs to existing resources. We plan to add a custom ontology on top of Nepomuk’s NAO ontology that will make this possible.

Fixing things in the 0.6.9x series

  • Changing unique value sorting to use collations and we’re planning to change the unique value queries for better performance.
  • Making sure tracker-indexer and tracker-extract support detecting unmount events, and marking resources as unavailable, at a very early stage. This helps cleanly unmounting a removable device without keeping the device busy as the unmount request takes place.
  • Supporting listening for a signal that tells us to pause the indexer. This will be helpful on devices that want us to be silent while they are performing a higher priority task.
  • Evaluating and testing concurrent access.

Our experimental branch

  • Start discussing merging our experimental branch to trunk together with Jamie.
  • We have updated the new .ontology files in our experimental branch to include more specific rdfs:range information.
  • Making it possible to observe changes about resources that belong to a class. For example observing changes in all nfo:Document classes. The granularity will be at the level of a resource. You’ll know whether a resource is updated, deleted or created. This task might take more than a week to complete. Ivan Frade is planning to write a blog item dedicated to this subject.
  • To fulfill the requirements of certain use-cases we’re planning to make it possible to delete a complete resource (or a set of) with one function call.
  • We might start working on adding support for application domain (custom) ontologies.
  • Porting recent improvements on support for detecting file renames to our experimental branch.
  • Evaluating isLogicalPartOf or alternative link as cascading rule for deletes.

When people are interested in joining development of Tracker, they can ping us at the channel #tracker on GimpNet. They can also send a E-mail to Tracker’s mailing list.

The ontology descriptions of Tracker, now in Turtle

Thanks to Jürg is the experimental branch of Tracker storing its ontology descriptions using the Turtle format.

What is an ontology anyway?

Wikipedia sums it up pretty well: In computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts.

What is Turtle?

The w3 specification explains it as a textual syntax for RDF that allows RDF graphs to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes.

Turtle is the format that we want to standardize on. We for example plan to use it for some of our interprocess communication needs, we are already using it for backup and restore support and we used it as base format for persisting user metadata on removable devices.

An example snippet from the ontology in Turtle:

nie:InformationElement a rdfs:Class ;
     rdfs:label "Information Element" ;
     rdfs:subClassOf rdfs:Resource .
nie:title a rdf:Property ;
     rdfs:label "Title" ;
     rdfs:comment "The title of the document" ;
     rdfs:subPropertyOf dc:title ;
     nrl:maxCardinality 1 ;
     rdfs:domain nie:InformationElement ;
     rdfs:range xsd:string ;
     tracker:fulltextIndexed true .
nfo:Document a rdfs:Class ;
      rdfs:label "Document" ;
      rdfs:comment "A generic document. A common superclass for all documents on the desktop." ;
      rdfs:subClassOf nie:InformationElement .

Okay …

It only made sense for us to destroy and eliminate inifile formats like the ontology descriptions of the current non-experimental Tracker. Let me explain:

We have plans to add support for adding domain specific custom ontologies. By that I mean that it’ll be possible for an application to install and remove an ontology, to get the application specific metadata out and to restore this data as part of reinstalling a custom ontology.

As I already pointed out during my Tracker presentation at FOSDEM doesn’t this mean that we encourage application developers not to care about the base ontology. In fact we strongly recommend application developers to stick to- instead of diverting much from Nepomuk.

Meanwhile experimental Tracker’s indexer has started to work and is storing things in our decomposed RDF storage that uses Nepomuk as schema and can be queried using SPARQL.

Vision

My vision for metadata storage is that files are just one kind of resource. Tracker’s indexer collects the metadata about mostly those file-based resources. However. Metadata is everywhere: in META tags of websites and RSS feeds, in both locally stored and streamable remotely located multimedia resources, in E-mails, meeting requests and calendar items, in your roster and contacts list, in daily events, in computer events and installed applications.

Computer events? I hear you thinking. Well, for example a “hardware” event like your location changing just before you took a picture. That location can be harvested as metadata about the picture. There are many more examples imaginable.

For some datas is metadata the only thing really being stored. For example aren’t contact-resources having much more data than today’s ontologies describe. For most of the others will metadata describe the resource. And often, more importantly, its relationship with other resources.

We want Tracker to be your framework for metadata on both desktops and mobile devices. This is why we want to use w3 standards like Turtle over our own formats. We just happen to see things quite big.

No more old XML RDFQuery, but SPARQL. No more inifiles, but Turtle. No more home brewed ontology, but Nepomuk.

Merge-to-trunk plans

We are planning to start merging the experimental stuff to trunk, and start calling it the 0.7.x series of Tracker. Let’s see how many cocktails we’ll need for Jamie to get drunk enough to start accepting our insane, but already working, ideas.