Writeback, writing metadata back into your files

Today, I feel like exposing you to some bleeding edge development going on as we speak in the Tracker team. I know you’re scared of that and that’s precisely why I want to expose you! Hah.

We are prototyping writeback support for Tracker.

By writeback we mean writing metadata that the user passes to us via SPARQL Update into the file that is being described.

This means that the update must be about a resource that is stored as a file, that it must change a property that we want to write back, and that we must support the file’s format.

OK, that’s three requirements before we write anything back. Let’s explain how this stuff works in the prototype!

In our prototype you mark properties that are eligible for being written into the files using tracker:writeback.

It goes like this:

nie:title a rdf:Property ;
   rdfs:label "Title" ;
   rdfs:comment "The title of the document" ;
   rdfs:subPropertyOf dc:title ;
   nrl:maxCardinality 1 ;
   rdfs:domain nie:InformationElement ;
   rdfs:range xsd:string ;
   tracker:fulltextIndexed true ;
   tracker:weight 10 ;
   tracker:writeback true .

Next you need a writeback module for tracker-writeback. We implemented a prototype module that can only write the title of MP3 files. It uses ID3lib’s C API.

When the user is describing a file, the resource must have nie:isStoredAs. The tracker:writeback of the property being changed must be true. We want the value of the property too. That’s simple in SPARQL, right? Sure it is!

SELECT ?url ?predicate ?object {
    <$subject> ?predicate ?object ;
               nie:isStoredAs ?url .
    ?predicate tracker:writeback true
 }

You’ll find this query in the code, go look!
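If you'd rather see the query's logic spelled out, here is a rough Python sketch that mimics it over a handful of in-memory triples. Nothing in it is Tracker code: the triple list, the WRITEBACK_PROPERTIES set, the example URNs and the function name are all made up for illustration.

```python
# Hypothetical in-memory stand-in for the store; this is not Tracker's API.
# Properties whose tracker:writeback is true in the (imaginary) ontology.
WRITEBACK_PROPERTIES = {"nie:title"}

# Invented example data: (subject, predicate, object) triples.
TRIPLES = [
    ("urn:song:1", "nie:isStoredAs", "file:///music/song.mp3"),
    ("urn:song:1", "nie:title", "My Song"),
    ("urn:song:1", "nie:comment", "not written back"),
    ("urn:note:1", "nie:title", "Not stored anywhere"),
]


def writeback_rows(subject, triples=TRIPLES, writeback=WRITEBACK_PROPERTIES):
    """Return (url, predicate, object) rows, like the SELECT above would."""
    # The subject must be stored somewhere (nie:isStoredAs).
    urls = [o for s, p, o in triples if s == subject and p == "nie:isStoredAs"]
    rows = []
    for url in urls:
        for s, p, o in triples:
            # Only properties marked for writeback are returned.
            if s == subject and p in writeback:
                rows.append((url, p, o))
    return rows
```

A resource without nie:isStoredAs yields no rows at all, and a property that isn't marked with tracker:writeback never shows up in the result.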

Now it’s simple: using ID3lib we map the Nepomuk property to the matching ID3 field and write it into the file.
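The mapping step could look something like the sketch below. Mind that the real module is C code on top of ID3lib; this is Python, only meant to show the idea. The NEPOMUK_TO_ID3 table and the function name are assumptions of mine. TIT2 is the ID3v2 frame that holds the title. The actual writing of the tag is left out on purpose.

```python
# Hypothetical mapping table; the prototype only handles the title.
NEPOMUK_TO_ID3 = {"nie:title": "TIT2"}  # TIT2 is the ID3v2 title frame


def to_id3_frames(rows):
    """Translate (url, predicate, object) rows into {url: {frame_id: value}}."""
    frames = {}
    for url, predicate, value in rows:
        frame_id = NEPOMUK_TO_ID3.get(predicate)
        if frame_id is None:
            continue  # no mapping for this property, so nothing to write
        frames.setdefault(url, {})[frame_id] = value
    return frames
```

Properties without an entry in the table are silently skipped, which matches the idea that a module only writes what it knows how to write.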

No, don’t be afraid: we’re not going to write back metadata that we found ourselves. We’ll only write back data that the user provided in the form of a SPARQL Update on the default graph. No panic. Besides, using tracker-writeback is going to be completely optional (just don’t run it).
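Boiled down, the decision is a two-part check. The snippet below is just a sketch of that rule under my own naming; the function and its parameters are hypothetical and not taken from the branch.

```python
def triggers_writeback(graph, property_has_writeback):
    """Decide whether an update should be written back into the file.

    graph is None when the update targets the default graph,
    otherwise it is the name of the graph the update was sent to.
    """
    return graph is None and property_has_writeback
```

So an update on the default graph for a property marked with tracker:writeback gets written back, and anything sent to a named graph does not.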

This is a prototype, I repeat, this is a prototype. No expectations yet please. Just feel exposed to scary stuff, get overly excited and then join us by contributing. Everything we’re doing is public in the ‘writeback’ branch.

ps. Will this be Maemo’s future metadata-write stuff? Hmm, I don’t know. Do you know? ;-)

9 thoughts on “Writeback, writing metadata back into your files”

  1. Ingenious. It’s about time someone tackles this issue.
    As this is in tracker, chances are it will be supported by a variety of platforms. Way to go!

  2. So if I understand correctly it is about synchronizing data a user added to Tracker for a file (via Nautilus for instance) directly within the corresponding file?

  3. My biggest qualm with metadata is that when it gets stored in a database detached from the files, moving files around and renaming them, unless I do it with a DB-approved application, or transferring them to another computer almost always loses it, unless I take care to locate and transfer the metadata database (which doesn’t work well when the destination computer has its own). I really do want metadata to follow the file around. :( I wish files in general had some standard method to store metadata that travelled with them, that they could exist in some file container that had [container header | metadata | actual file ] and that applications could deal with [actual file] or could deal with the container too. I’m sure that would baffle other OSes though :)

    So, thanks for working on writeback support. The better attached metadata is to a file, the less fragile it is and the more portable it will be. YAY!

    Now if I could only convince metadata systems to consider a hash of a file as well as its path to identify it.

  4. Looks very elegant. You must be avoiding a gazillion issues by having the tracker:writeback modifier in the update itself.

  5. @Anders: the SPARQL update doesn’t need the tracker:writeback, the tracker:writeback goes into the ontology. We distinguish between user and miner by doing a bit of pseudo named graph support. If you pass FROM or INTO with your SPARQL Update query, we don’t trigger writeback. If you don’t and the property’s tracker:writeback is set to true, we do.
