Tracker’s class signals, live search capabilities

Ivan Frade already blogged this, so that that means that I can keep this one shorter.

But yeah, we have just finished what we decided to call class-signals for Tracker. You are looking at software development taking place as we speak, because we still need to do a peer-review of the branch and then merge it to our experimental branch. This will probably happen in a few minutes.

The new feature means that if the ontology of a rdfs:Class contains a rdfs:Property tracker:notify that is set to true, then Tracker will offer a DBus object for that class. Whenever a subject of that class is added, updated or removed you will receive a signal about it on that DBus object.

Ok, ok, I know. You don’t want to click links and search for it yourself. Here are some “screenshots”:

nmo:FeedMessage a rdfs:Class ;
         tracker:notify true ;
         rdfs:subClassOf nmo:Message .

The DBus API is split per rdfs:Class. That’s because we don’t want that a lot of applications will be listening for each and every change to each and every subject. Now they can instead select the classes that they are interested in. That way they can avoid getting woken up constantly by our damn signal.

This feature is a change signal mechanism. We will eventually implement support for full live-queries.

Supporting full live-queries means that we’ll need to store some state about your session to test whether your SPARQL query requires an update from Tracker to get your client’s previous result list synchronized. It’s quite a bit more tricky, especially if you want to support most of SPARQL to go together with the live-query notifications, and especially if you want to be accurate.

Finally you don’t want the live-query capability to be a huge performance burden each time material becomes updated.

This is why we are now putting in place this feature. We think that for 90% of the applications the capabilities of the class-signals feature will be sufficient for their live update needs.

Biweekly Tracker development planning

We realize

At the Tracker team we realize that we should not only tell you guys about Tracker’s new features. We should also involve you in the planning and realization of software development at the Tracker project.

I have discussed this with our team and we agreed that I will make a summary at each beginning of our biweekly planning.

The reason is fairly simple: we want to involve the community and inspire them to become active contributors. But how can they contribute unless they know what the plans of the development team of Tracker for the next few days are? You are right, they can’t.

Which is a situation that we want to change. Therefor, here’s the summary of the plans for the next two weeks:

Ontologies

  • Making tickets for an audio ontology that we made during a previous sprint to upstream Nepomuk.
  • Craft an ontology for images that will later be proposed to Nepomuk. We will convert the extractors in our experimental branch to use the new image ontology.
  • Applications need to be able to create arbitrary key value pairs to existing resources. We plan to add a custom ontology on top of Nepomuk’s NAO ontology that will make this possible.

Fixing things in the 0.6.9x series

  • Changing unique value sorting to use collations and we’re planning to change the unique value queries for better performance.
  • Making sure tracker-indexer and tracker-extract support detecting unmount events, and marking resources as unavailable, at a very early stage. This helps cleanly unmounting a removable device without keeping the device busy as the unmount request takes place.
  • Supporting listening for a signal that tells us to pause the indexer. This will be helpful on devices that want us to be silent while they are performing a higher priority task.
  • Evaluating and testing concurrent access.

Our experimental branch

  • Start discussing merging our experimental branch to trunk together with Jamie.
  • We have updated the new .ontology files in our experimental branch to include more specific rdfs:range information.
  • Making it possible to observe changes about resources that belong to a class. For example observing changes in all nfo:Document classes. The granularity will be at the level of a resource. You’ll know whether a resource is updated, deleted or created. This task might take more than a week to complete. Ivan Frade is planning to write a blog item dedicated to this subject.
  • To fulfill the requirements of certain use-cases we’re planning to make it possible to delete a complete resource (or a set of) with one function call.
  • We might start working on adding support for application domain (custom) ontologies.
  • Porting recent improvements on support for detecting file renames to our experimental branch.
  • Evaluating isLogicalPartOf or alternative link as cascading rule for deletes.

When people are interested in joining development of Tracker, they can ping us at the channel #tracker on GimpNet. They can also send a E-mail to Tracker’s mailing list.

The ontology descriptions of Tracker, now in Turtle

Thanks to Jürg is the experimental branch of Tracker storing its ontology descriptions using the Turtle format.

What is an ontology anyway?

Wikipedia sums it up pretty well: In computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts.

What is Turtle?

The w3 specification explains it as a textual syntax for RDF that allows RDF graphs to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes.

Turtle is the format that we want to standardize on. We for example plan to use it for some of our interprocess communication needs, we are already using it for backup and restore support and we used it as base format for persisting user metadata on removable devices.

An example snippet from the ontology in Turtle:

nie:InformationElement a rdfs:Class ;
     rdfs:label "Information Element" ;
     rdfs:subClassOf rdfs:Resource .
nie:title a rdf:Property ;
     rdfs:label "Title" ;
     rdfs:comment "The title of the document" ;
     rdfs:subPropertyOf dc:title ;
     nrl:maxCardinality 1 ;
     rdfs:domain nie:InformationElement ;
     rdfs:range xsd:string ;
     tracker:fulltextIndexed true .
nfo:Document a rdfs:Class ;
      rdfs:label "Document" ;
      rdfs:comment "A generic document. A common superclass for all documents on the desktop." ;
      rdfs:subClassOf nie:InformationElement .

Okay …

It only made sense for us to destroy and eliminate inifile formats like the ontology descriptions of the current non-experimental Tracker. Let me explain:

We have plans to add support for adding domain specific custom ontologies. By that I mean that it’ll be possible for an application to install and remove an ontology, to get the application specific metadata out and to restore this data as part of reinstalling a custom ontology.

As I already pointed out during my Tracker presentation at FOSDEM doesn’t this mean that we encourage application developers not to care about the base ontology. In fact we strongly recommend application developers to stick to- instead of diverting much from Nepomuk.

Meanwhile experimental Tracker’s indexer has started to work and is storing things in our decomposed RDF storage that uses Nepomuk as schema and can be queried using SPARQL.

Vision

My vision for metadata storage is that files are just one kind of resource. Tracker’s indexer collects the metadata about mostly those file-based resources. However. Metadata is everywhere: in META tags of websites and RSS feeds, in both locally stored and streamable remotely located multimedia resources, in E-mails, meeting requests and calendar items, in your roster and contacts list, in daily events, in computer events and installed applications.

Computer events? I hear you thinking. Well, for example a “hardware” event like your location changing just before you took a picture. That location can be harvested as metadata about the picture. There are many more examples imaginable.

For some datas is metadata the only thing really being stored. For example aren’t contact-resources having much more data than today’s ontologies describe. For most of the others will metadata describe the resource. And often, more importantly, its relationship with other resources.

We want Tracker to be your framework for metadata on both desktops and mobile devices. This is why we want to use w3 standards like Turtle over our own formats. We just happen to see things quite big.

No more old XML RDFQuery, but SPARQL. No more inifiles, but Turtle. No more home brewed ontology, but Nepomuk.

Merge-to-trunk plans

We are planning to start merging the experimental stuff to trunk, and start calling it the 0.7.x series of Tracker. Let’s see how many cocktails we’ll need for Jamie to get drunk enough to start accepting our insane, but already working, ideas.

TrackerClue!

I was just skipping some memes and version control system flames debates when suddenly I bumped on an interesting blog post by Henri Bergius on how he sees integration of the GeoClue project with the desktop.

I’m more into mobile myself, considering a desktop a necessary evil that people don’t really enjoy using. But have to use, because there’s nothing better. Meanwhile is there a trend towards mobile uses. Music players, cameras, in-car entertainment, navigation assistance, movie players, setup boxes. And sooner or later ePaper devices to replace magazines, books and newspapers.

But whenever the desktop’s software gets integrated with a location framework, it wakes me up. That’s because I consider having access to meaningful location clues to be a creator for a large amount of very interesting use-cases for mobile. Use-cases which we might not be seeing yet, today. Because we humans are walking blinded into the future.

Which means that we at the Tracker project must and will welcome such integration. We too want to enable the app developers of tomorrow and today to convert their innovative ideas into elegant solutions. Location clues about events and resources will be very interesting meta information for those apps, indeed.

Such systems can already update the Nepomuk structured meta information that Tracker collects using the SPARQL INSERT, DELETE and UPDATE support that Jürg started working on since a week or so.

It’s actually finished… maybe not sufficiently tested. But isn’t crashing hard pure fun anyway? Gives you a reason to go code and fix it!

Although we are very hard at work to get the indexer working again wont our experimental branch index your documents just yet. We have been testing our query stuff by importing generated Turtle files to be honest.

Nonetheless I kindly invite people to completely break their Tracker install by trying out our experimental stuff. Read a bit about Nepomuk’s ontology and mentally glue that together with the query flexibility SPARQL enables, and you’ll pretty soon grasp how cool it will be.

And yeah, there’s still a lot of hard work to do. But that’s great and a lot of fun.

And you should grow a pointy hat, put on a beard, jolt drink cola, and fun the join!

Ok. That’s enough Tracker propaganda for a day. Let’s now check if Nepomuk’s current stuff is good for storing GeoClue’s info.

SPARQL, Nepomuk, StreamAnalyzer and Tracker

We at the Tracker team should in my opinion report more often in our blogs about our progress on things like Tracker’s SPARQL and Nepomuk support.

Let’s start with the awesome SPARQL support that Jürg has been working on. Just a few minutes ago when you made a SPARQL query that had a unknown predicate, Tracker returned an empty array over D-Bus.

dbus-send --print-reply --dest=org.freedesktop.Tracker
   --type=method_call /org/freedesktop/Tracker/Search
   org.freedesktop.Tracker.Search.SparqlQuery
   string:'SELECT ?title WHERE { ?s nie:ttle
      ?title FILTER regex(?title, ".*in.*") . }'

method return sender=:1.66 -> dest=:1.98 reply_serial=2
   array [
   ]

Leaving you in the unknown about your query being in error. Jürg fixed this and now you get something like this instead.

tracker-sparql --query="SELECT ?title WHERE { ?s nie:ttle
      ?title FILTER regex(?title, \".*in.*\") . }"
Could not query search, Unknown property `http://.../nie#ttle'

This way you can fix your query’s error and do something like this instead:

tracker-sparql --query="SELECT ?title WHERE { ?s nie:title
      ?title FILTER regex(?title, \".*in.*\") . }"

  The final metadata solution
  Tracker in gnome bugzilla

Today I migrated the code in Tracker that implements support for the metadata D-Bus API for E-mail to the Nepomuk Message Ontology. Meaning that Tracker will store the metadata it receives from E-mail clients like KMail and Evolution using the NMO ontology and that it’ll make this metadata available to the SPARQL query engine.

Great news that we got informed of this week is that a developer has started implementing the metadata D-Bus API for E-mail in Thunderbird. He left a pointer to his git repository on the wiki-page.

Meanwhile I have implemented the API in KMail. This patch is pending review. We are planning to add support for this in Modest soon too.

Next. We are migrating the indexers and extractors to Nepomuk. These tasks come with all sorts of extra work related to integrating with Nepomuk as ontology.

I have also implemented integration with Strigi’s truly awesome StreamAnalyzer. I have rarely seen such a beautifully designed piece of code that in my opinion outperforms whatever Tracker has at this moment for extracting metadata in several interesting ways.

I don’t know why we shouldn’t join Strigi on making StreamAnalyzer kick ass. I can find no reason why instead of trying to compete with it we shouldn’t integrate with it. I’m pushing our team to consider the integration option and so far they are enthusiastic about it.

StreamAnalyzer needs a migration from Xesam as ontology to Nepomuk. But Evgeny Egorochkin and Jos Vandenoever already told me that they have put this on their agenda. After that, with the integration that I did for Tracker, can StreamAnalyzer become the core analysis code that Tracker uses. Right now the plan is to let StreamAnalyzer be the first to run and then letting Tracker’s own extractors follow up.

Let’s make some more bridges with KDE projects. Why not!

E-mail metadata, “E-mail as a (desktop) service”

Not on the desktop but on mobiles I think the era of E-mail clients will soon be over. Just like the era of filemanagers will be over. A person who’s using a mobile or a phone doesn’t really want to start and stop applications. Those users don’t start and stop applications to receive phonecalls and text messages. Why would they want to start and stop E-mail clients?

Not only that. People also want their text messages, E-mails, history of calls, meetings, contacts and photos to be integrated.

When we search for a meeting, we want to find the photos we took at that meeting. We also want to find the contacts that were at that meeting. We want to find the invitation E-mail and all the replies to the invitation. And when we select a contact, we want to see a tree of E-mail discussions that we once had with the contact.

On a mobile we don’t want one big application that does all this. Instead we want all applications to integrate with all this information. And we want it to be very easy for application developers to integrate with this system. An application framework.

This will be the purpose of Tracker on mobiles.

This means that the concept of an E-mail client will eventually be moved to the background of the mobile device. E-mail must just be there, not started. Something must communicate with the IMAP server whenever needed. Meanwhile all applications on the mobile need to have easy access to the metadata of the E-mails. It must be easy for them to get a MIME part of an E-mail, perhaps as a InputStream?

The combination of E-mail metadata querying and handling of the E-mail’s MIME parts is what I refer to as “E-mail as a service”. Some people in the past tried to explain me that if they would just put JavaMail’s API over D-Bus that this would already solve “E-mail as a service”. I don’t think this is true. Camel, which had its API based on JavaMail, offers a truly weak query interface. It’s for example not possible to ask for MIME parts that are images that have a specific Exif tag set.

That’s of course because it’s not Camel’s purpose to do metadata indexing of the attachments. But that was yesterday. Yesterday was boring.

With modern IMAP servers that have the CONVERT capability it will be possible to ask for a converted MIME part of an E-mail. Converted to Exif plain-text data. Meaning that we don’t have to fetch 5MB of JPEG data just to read a few hundred bytes of Exif metadata.

Meanwhile normal IMAP servers already offer ENVELOPE and BODYSTRUCTURE which of course gives us a lot of metadata too.

To assist people who want to write a “E-mail as a service” D-Bus service today I have decided to write a document that explains some of the capabilities of modern IMAP.

I think the future of E-mail “infrastructure” lies in:

  1. Using an RDF store that can be accessed using SPARQL, like Tracker. This stores the ENVELOPEs and BODYSTRUCTUREs of your E-mails next to the attachment’s other metadata triples. The query language can then be used to query against metadata found by analyzers like Tracker’s own extractors and/or Strigi’s StreamAnalyzer as well as metadata coming from IMAP itself.

    People who saw my presentation at FOSDEM already know that we are planning to push Tracker in the direction of SPARQL + Nepomuk as ontology. Meanwhile we are in discussion with the Xesam and Nepomuk people to change the Nepomuk Message Ontology to be suitable for this. As a result Evgeny Egorochkin made this proposal.

  2. Having a small service for dealing with E-mail specific things. Like getting the contents of the MIME parts as streams. Requesting them to be downloaded if they aren’t cached locally yet. Requesting a CONVERTed version of them.

    There are some experiments happening that will implement this capability. It’s all still very early. If I ever start a Tinymail 2.0, I will probably make it focus primarily on this.

I don’t think the conventional E-mail client will survive for very long. Especially not on mobiles where “integration” is far more important for the end-user.

Every (mobile) application can soon become as capable of handling E-mail as what we today call “the E-mail client”. At least from the point of view of the user. In reality a desktop service will solve the hard stuff.

A generic E-mail metadata D-Bus API

A generic E-mail metadata API

You guys remember the Evolution Metadata API? No? I wrote about it a few days ago.

We now want Tracker to integrate with the KDE desktop too. I started implementing the D-Bus API in KMail. I have adapted the specification to become more generic and to split out specific pieces into a shared and a specific part. Like for example the ontology.

QT/C++

Yes, I’m doing some QT/C++ coding after three years of not having to touch a lot of C++. But I’ll manage. I mean, it can’t get worse than Python, PHP or Perl, right? I survived developing with those languages too and it’s not my first time that I had to touch C++ code. Also, I survive GObject without always having Vala available.

Well, QT/C++ is interesting. Firstly I refuse to call it normal C++, it’s not normal C++ at all. That’s not necessarily a bad thing about it, though. Secondly on top of the inhuman quagmire that C++ has become by itself, QT adds almost as much boilerplate and macros as GObject is doing on top of C

That’s not good.

In my opinion C++ is not a nice language at all. It’s a completely inconsistent quagmire in my opinion.

Freedesktop.Org

I’m not immediately proposing it as a Freedesktop.Org standard. My personal experience with F.D.O. is that unless you have a few implementations nothing much will happen with your proposal. At least not for several months. Maybe if I first finish the implementation for KMail while having the one for Evolution ready it’ll speeds things up?

Let’s first implement it for both E-mail clients.

Akonadi

This is a message for specifically Aaron Seigo, because I know he would otherwise be frantic about it.

I have been talking with the KMail developers and because KMail’s Akonadi integration will take too long to be finished we have agreed to start with this D-Bus API. We will most likely integrate Tracker with Akonadi as soon as it starts becoming consumable on people’s KDE desktops.

Birds will sing, the sun will shine. Don’t worry too much and especially don’t start punching the project teams who are actively but pragmatically trying to make things better.

We’ll get there but converting Evolution to use Akonadi is trying to take too large steps and is therefor not a useful proposal at this moment. While for Tracker, Akonadi needs to be ready *now*.

I propose that the Akonadi people develop a CamelProvider that talks with Akonadi, by the way. Then make an EPlugin that overrides the account management to always create such Akonadi accounts and develop working and well tested migration code for the data.

I think even if this is ever started as a project that it would be quite unlikely that this will be finished in time. So not very useful for Tracker right now.

The Evolution DBus metadata API

I just finished the Evolution DBus metadata API‘s implementation. Information about this work can be found on this wiki page.

It’s currently not shipped as part of Evolution. Instead it’s done as an EPlugin compiled and installed by Tracker if your Evolution’s development package is 2.25.5 or later. At this moment that’s the versions in Evolution‘s and Evolution Data Server‘s Subversion trunks.

This API enables application developers to get notified not just about new E-mails, but also about any state changes of any E-mail Evolution handles. For example whenever the state of an E-mail changes from Unseen to Seen. You are invited to check out this Vala example to learn how to use the DBus API yourself.

Not only does the API give you “I’ll call you” or “pushed” hints about these state changes, it also gives you a lot of meta information about each E-mail: subject, sent, from, to, cached MIME filename, cc, service’s uid, seen, junk status, answered status, flagged status, forwarded status, deleted status, size, tags, keywords and flags.

The API will also inform you about message arrivals and about message expunges. Which are not the same as flagged for deletion.

With this information you have enough to build a simple E-mail client outside of Evolution that uses just this API to get itself informed about Evolution’s E-mails. That’s of course not the purpose of it, but it illustrates the completeness.

The purpose is to let a metadata engine like Tracker and Beagle get themselves fully informed and aware of all the state and metadata of your E-mails in real time. No more scanning of your $HOME/.evolution/mail directory. Which is by the way a for Evolution internal cache format that will change and has changed in the past. Scanning it yourself is for that reason too unreliable to depend on. This EPlugin uses Evolution’s own APIs, and runs in Evolution’s processes, to get the same information out and then pass it to one of Tracker’s processes over IPC.

“IPC??”, I hear you thinking, “isn’t that slow??”. Well we measured it. If a quite slow IPC like DBus can throw all of the metadata of 10,000 E-mails over in less time than Evolution needs to start up, knowing that we throw it over asynchronously with the user interface’s thread of Evolution, they I don’t think you should be too worried about that. But just in case you would still be worried have I specified the protocol in such a way that only the delta of changes since the last time Tracker registered itself are sent. While Evolution and Tracker run, Evolution will communicate new info with Tracker: Evolution pushes stuff into Tracker instead of Tracker pulling things out of Evolution.

What are you waiting for? Start adapting your $HOME/.evolution/mail scanning softwares! Now!

What is happening nowadays?

Working on a metadata DBus API for E-mail clients. I have started a wiki page proposing the API for an implementation in Evolution.

Afterwards I started implementing it as a proof of concept for this E-mail client.

I plan to implement the same as a plugin in Thunderbird, Tinymail and Modest. Perhaps after reviewing the Evolution and GNOME specific bits and pieces of the proposal and making them more generic. That way, finally, will metadata engines like Beagle and Tracker have a sane way of accessing and getting notified of E-mail content.

Mikkel Kamstrup decided to wrap the API proposal up in a Xesam jacket which might end up becoming that ‘more generic’ API proposal. But let’s first have a proof of concept in Evolution that works with the stuff that we are working on at the Tracker project.

Yezs you can find bugz, diffz ‘n codez at Bug #565082 and Bug #565091. If you want to help out, just ping me and then I’ll quickly make branches of Tracker and Evolution’s data server so that we can work together on this.

There’s also a Vala client example which illustrates how to consume this service.

Tracker is by the way being worked on heavily. We’ve been making a lot of architectural changes to the indexer during the last few weeks.

Meanwhile has Jürg started working on adding a ‘decomposed’ RDF triple store. Making it possible to support any kind of ontology. Including the Nepomuk ontologies, which are at this moment the ontologies that we are aiming for.

Jürg also added a SparQL query language engine to it. Making it possible for you as a client developer to execute SparQL queries on the stored data. We’re not yet supporting everything of SparQL, because some things make relatively few sense for our purposes, but we have added a few SparQL extensions that do make sense. Like aggregation and GROUP-BY.

Here’s an example of a SparQL query that finds people stored using a Nepomuk ontology that have a specific phone number:

SELECT ?firstname ?lastname ?email WHERE {
      ?person nco:hasPhoneNumber <tel:+19071131826> ;
       nco:nameGiven ?firstname ;
       nco:nameFamily ?lastname ;
       nco:hasEmailAddress ?email
}

Here’s another example of a SparQL query that shows the ten most recent E-mails:

SELECT ?subject ?date WHERE {
    ?msg nmo:messageSubject ?subject ;
         nmo:receivedDate ?date
} ORDER BY DESC(?date) LIMIT 10

Or this one which lists all individual artists, count of albums for each artist and total playing time of all songs for the artist:

SELECT ?artistname COUNT(?album) AS count SUM(?length) AS len WHERE {
    ?song nid3:leadArtist ?artist ;
          nid3:length ?length ;
          nid3:albumTitle ?album .
    ?artist nco:fullname ?artistname
} GROUP BY ?artist

These are sample queries that already work, if you nag Jürg on how you can get some data into the tables. We’re of course working on adapting the indexer to populate the tables. Knowing Jürg this might already work flawlessly.

If you like things like “semantic desktops”, like having your desktop search cope with truly meaningful queries (the kind of queries that Federico was dreaming of in his keynote at Istanbul), then you should checkout the developments we are doing with Tracker. I warn you that a lot of this truly is ‘development’. It might not work at all, etc. But it’s cool. Really.

Let’s turn your desktops and mobiles into platforms that offer all kinds of services for your high level applications written in JavaScript or whatever language you fancy. Like configuration services, Thumbnailing services, E-mail metadata and notification services, Metadata query services. Meanwhile we’ll make you GObject-introspection so that it’s very easy to write a platform library yourself that you can directly invoke from those higher languages. As that project will make most language bindings almost automatic. And we’ll have Vala to make it easy for you to write services and other platform software yourself.

ps. The RDF triple store and SparQL stuff ain’t happening in Tracker’s trunk yet. That would disturb development of Tracker too much. We’ve been doing this in a git branch, use the branch “vstore”.

Good, that stack of links should keep you blog reading wolves silent for another few weeks.

Thumbnailer specification and prototype

Why do we need thumbnailing to be a service?

  • For user interface applications it makes relatively few sense to run the task of creating a thumbnail in the same context as the mainloop that draws the user interface. On the other hand if each desktop application starts creating either processes or worker threads that will be armed with thumbnailing code, then we will have a lot of threads and processes all running the same code;
  • Most applications link with a user interface toolkit that will happily deal with the vast majority of pixbuf shaped formats. That doesn’t mean that these toolkits will equally enjoy dealing with PDF, Office and video formats. There’s a lot of code involved here and we should try to avoid requiring everybody to load these complex pieces of code into their processes. I can give a few purely technical reasons like not heaving to map-in code that is not relevant for the application, reducing VmSize (although, admittedly, only things like VmRSS are really important). There are also a few political reasons, like patented formats. In the end I’ll just say it the way it is: it’s a bad architecture;
  • Application developers are really not very interested in developing LIFO queues and worker threads or processes that will handle the task of creating thumbnails;
  • Finally, application developers are asking for this (for example F-Spot). Creating thumbnails is not at all an exclusive task for the filemanager.
  • My proposal

    Based on those conclusions I decided to write a DBus specification. I also reimplemented Maemo’s Hildon Thumbnail to be conform this specification. This work has been merged with the TRUNK of the project and will be used on Maemo‘s Fremantle release.

    While rewriting Hildon Thumbnail I decided to make sure that the software compiles and runs on any normal desktop. This way the software can serve as a proof of concept and working prototype for the DBus specification. Special care was taken to make sure it feels as desktop neutral as possible.

    I opened a bug to officially request a freedesktop.org project for this specification. I hope this organization will offer a platform for further development of this DBus specification. Hildon Thumbnailer can serve as a prototype and will be adapted whenever the specification improves.

    Here’s a meme: org.freedesktop.Thumbnailer

    People who know me probably saw this blog item coming. Here it is!

    In Tracker we want to ahead of time create thumbnails for interesting files. Among the use cases is when the user has moved or copied photos from his camera into one of the photo folders. We want to start preparing thumbnails for those files early so that filemanagers and photo applications are fast when needed.

    The current infrastructure for this in Tracker is to launch a script for each file that is to be thumbnailed. If you find a lot such files (some people end up with a camera with 1,000ths of photos after a busy weekend), that would mean that we’d do this 1,000 times:

    fork();
    execv(tracker-thumbnailer);
    fork();
    execv(bash);
    fork();
    execv(convert);

    Luckily this is not activated by default in current Tracker. :-)

    I don’t have to explain most people who read this blog that this is a bad idea on a modest ARM device with a bit more than one hundred MB of RAM. A better idea would be to have a service that queues these requests and that solves the requests with specialized image libraries. Perhaps launching a separate binary for the MIME types that the service has no libraries for?

    At first we were planning to make tracker-thumbnailer listen on stdin in a loop. Then I figured: why not do this over DBus instead? Pretty soon after that was Ivan Frade concluding that if we’d do that, other applications on the device could be interested in consuming that service too. We decided that perhaps we should talk with the right people in the two large desktop communities about the idea of specifying a DBus specification for remotely managing the thumbnail cache as specified by the Thumbnail managing standard by Jens Finke and Olivier Sessink.

    I don’t know of a official procedure other than filing a bug on freedesktop.org, so at first I tried to get in touch with people like David Faure (KDE), Christian Kellner (Nautilus), Rob Taylor (DBus, Telepathy, Wizbit) and later also a few mass discussions on #kde-devel, #nautilus and #gnome-hackers.

    I started a discussion on xdg-list which made me conclude that such a DBus API would indeed make sense for a lot of people. Discussions with individuals on IRC added to that feeling. I started a draft of a first specification for a DBus API.

    Meanwhile I had already started adapting the hildon-thumbnail code to become more service-like. Right now that code has a DBus daemon that implements the draft DBus API and on top of that provides the possibility to have dynamically loadable plugins. The specification also allows registering thumbnailers per MIME type. For that reason I made it possible to run those dynamically loadable plugins both standalone and in-process of the generic thumbnailer.

    It has been my prototype for testing the DBus API specification that I was writing. People told me that if you want to make a specification that’ll get accepted, the best way is to write a prototype too. Meanwhile Rob Taylor had joined me on fine tuning the specification itself. With his DBus experience he helped me a lot in multiple areas. Thanks for caring, Rob!

    The current prototype does not yet make it possible to simply drop-in a thumbnailer binary to add support for a new MIME type. By making a standalone thumbnailer that for being a thumbnailer simply launches external thumbnailers you could of course add that possibility that a lot of current thumbnail-infrastructure has. Although as mentioned above I don’t think this is a good architecture (the fork() + execv() troubles), I plan to make such a standalone plug-in thumbnailer.

    I certainly hope that this specification will be approved by the community. I can help with making patches for Konqueror and Nautilus. We’ll most likely use this on the Maemo platform for thumbnailing ourselves.

    On reference counting

    I made a little bit of documentation on reference counting. It’s not yet really finished, but I’ve let two other developers review it now. I guess that means it’s somewhat ready.

    The reason I made it was because as I browsed and contributed to GNOME’s code, I noticed that a lot of developers seem to either ignore reference counting or they use it incorrectly all over their code.

    I even saw people removing their valid reference usage because they had a memory leak they wanted to solve. As if introducing a race condition is the right fix for a memory leak! Some people have rather strange ways of fixing bugs.

    What people who don’t want to care about it should do, and I agree with them, is to use Vala instead.(Or D, or Python, or C#, or Java, before I get hordes of language fans in my comments again. Oh! Or C++ with smartpointers too! – oeps, I almost forgot about the poor céé plus plus guys -)

    Anyway, I’m sure my guidelines are not correct according to some people, as there are probably a lot of opinions on reference counting. In general I do think that whenever you pass an instance to another context (another thread or a callback) that you simply must add a reference. If you do this consistently you’ll have far less problems with one context finalizing while another context is still using it.

    It’s a wiki page, I’m subscribed. You can just change the content if you disagree. Being subscribed I’ll notice your changes and I’ll review them that way.

    http://live.gnome.org/ReferenceCounting

    It’s not the first such item that I wrote down. Here are a few others:

    After reviewing this document José Dapena promised me he’s going to make a page about reference count debugging in gdb, like adding watches on the ref_count field of instances. To make sure he keeps to his promise I decided to put a note about that here. <g>

    Recursive locks in Vala and D-Bus service support for Vala interfaces

    I have been whining about features that I want in Vala to Jürg. To make up for all the time he lost listening to me I decided to fix two Vala bugs.

    The first bug I fixed was using a recursive mutex for lock statements. Code like this will work as expected now:

    public class LockMe : GLib.Object { }
    public class Executer : GLib.Object {
    	LockMe o { get; set; }
    	construct { o = new LockMe (); }
    	void Internal () {
    		lock (o) { }
    	}
    	public void Method () {
    		lock (o) { Internal (); }
    	}
    }
    public class App : GLib.Object {
    	static void main (string[] args) {
    		Executer e = new Executer ();
    		e.Method ();
    	}
    }

    Here’s a gdb session that most GLib programmers will recognize:

    Breakpoint 1, 0x08048a87 in executer_Method ()
    (gdb) break g_static_rec_mutex_lock
    Breakpoint 2 at 0xb7e4d0e6
    (gdb) cont
    Continuing.
    Breakpoint 2, 0xb7e4d0e6 in g_static_rec_mutex_lock ()
       from /usr/lib/libglib-2.0.so.0
    (gdb) bt
    #0  0xb7e4d0e6 in g_static_rec_mutex_lock () from /usr/lib/libglib-2.0.so.0
    #1  0x08048b04 in executer_Method ()
    #2  0x08049046 in app_main ()
    #3  0x0804908a in main ()
    (gdb) cont
    Continuing.
    Breakpoint 2, 0xb7e4d0e6 in g_static_rec_mutex_lock ()
       from /usr/lib/libglib-2.0.so.0
    (gdb) bt
    #0  0xb7e4d0e6 in g_static_rec_mutex_lock () from /usr/lib/libglib-2.0.so.0
    #1  0x08048a6e in executer_Internal ()
    #2  0x08048b0f in executer_Method ()
    #3  0x08049046 in app_main ()
    #4  0x0804908a in main ()
    (gdb) cont
    Continuing.
    Program exited normally.
    (gdb)

    The second bug is supporting interfaces for D-Bus services in Vala. It goes like this:

    using GLib;
    [DBus (name = "org.gnome.TestServer")]
    public interface TestServerAPI {
    	public abstract int64 ping (string msg);
    }
    public class TestServer : Object, TestServerAPI {
    	int64 counter;
    	public int64 ping (string msg) {
    		message (msg);
    		return counter++;
    	}
    }
    void main () {
    	MainLoop loop = new MainLoop (null, false);
    	try {
    		var conn = DBus.Bus.get (DBus.BusType.SESSION);
    		dynamic DBus.Object bus = conn.get_object (
    			"org.freedesktop.DBus", "/org/freedesktop/DBus",
    			"org.freedesktop.DBus");
    		uint request_name_result = bus.RequestName ("org.gnome.TestService", 0);
    		if (request_name_result == DBus.RequestNameReply.PRIMARY_OWNER) {
    			// start server
    			var server = new TestServer ();
    			conn.register_object ("/org/gnome/test", server);
    			loop.run ();
    		} else {  // client
    			dynamic DBus.Object test_server_object =
    				conn.get_object ("org.gnome.TestService",
    					"/org/gnome/test", "org.gnome.TestServer");
    			int64 pong = test_server_object.ping ("Hello from Vala");
    			message (pong.to_string ());
    		}
            } catch (Error foo) { }
    }

    Implementing your Vala interfaces in GObject/C

    In Vala you can define interfaces just like in C# and Java. Interfaces imply that you can have class types that implement one or more such interfaces. Vala does not force you to implement its interfaces in Vala. You can also implement them in good-old GObject C.

    Here’s a detailed example how you implement a type that implements two Vala interfaces in GObject/C:

    Switching to multiple threads, with a non-thread-safe resource

    Your application used to be single threaded and is consuming a resource that is not thread-safe. You’re splitting your application up into two or more threads. Both threads want to consume the non-thread-safe resource.

    In this GNOME-Live item I explain how to use GThreadPool for this.

    It’s a wiki so if you find any discrepancies in the sample and or text, just correct them. I’m subscribed so I’ll review it that way.

    The GNOME-Live item is done in a similar way to the item about using asynchronous DBus bindings and the AsyncWorker item.

    DBus using DBus.GLib.Async and dbus-binding-tool

    While I was gathering some info about a DBus related task that I’m doing at this moment, I wrote down whatever I found about DBus’s glib bindings in tutorial format.

    A few other people have done similar things in their blogs. This one explains how to use org.freedesktop.DBus.GLib.Async a little bit too.

    If you find any mistakes in the document, it’s a wiki page so please just correct them.

    Iterators and tree models! Shocking!

    Iterators

    Now that Murray has posted about the iterators idea I can no longer hide.

    Together with Jürg, the Vala man and Murray, who before becoming a pregnant guy was and still is coding on things like Glom and gtkmm, I’ve been writing up a document. This document explains iterators like the ones you’ll find in .NET and Java.

    This is related to the Iterators concept because the document about iterators could be part of a solution for this.

    I once had to write a custom GtkTreeModel (I think I still suffer from vertigo) and therefore I sent treeview Kris an analysis about this.

    There’s a rule that an interface should be a contract for solving exactly one problem.

    Use interfaces to say “What objects can do” or “What can be done to an object”. You can let a class implement multiple interfaces. There’s absolutely no need to make huge interfaces. An interface should define that your class “Does One Thing”. Another interface defines that your class “Does Another Thing, too”, and a last interface defines that your class “Also Does This”. My class Z can do D, E and F. Therefore: class Z implements D, E and F.

    Instead, the contract being GtkTreeModel requires you to solve five problems:

    • being iterable: you can go to the next node
    • being a container: you can add and remove things from it
    • being observable: it notifies a component about its changes
    • being a tree: sometimes items can parent other items
    • having columns: it’s something that describes columns

    Summarizing: interface TreeModel means BEING a D, E, F, G and H at the same time. Correct would be if you specified that TreeView requires a model that does D, does E and perhaps also does F. In case treeview is showing the things as a tree, it requires that the model also does E and does H.

    GtkTreeModel should have been five different interfaces.

    Shocking! But in my opinion, true.

    The consequence is that having to implement a GtkTreeModel has become a difficult task.

    “You have to implement solutions for all five problems anyway!”, you might think. That’s true. However! If there were five interfaces, other people would have made default solutions for four other interfaces. Meanwhile “you” focus on “your” specialization. For example Rhythmbox and Banshee are specialized in huge models that contain a specific set of data that must be searchable. Preferably search and sort are done by a database instead of a model-filter.

    Banshee and Rhythmbox developers:

    • … don’t have to care about the Columns problem
    • … don’t have to care at all about the Tree problem
    • … don’t have to care about the Container problem: they never need to add nor remove things from it because they set up their model instantly by passing a query to a data source. Maybe the feature “remove from playlist” requires solving the Container problem?
    • … unless they support inotify on the MP3 folder of the disk, they don’t even have to care about the observer problem

    Yet with GtkTreeModel you have put the burden on application developers to solve all five the problems in their model. No wonder Banshee developers decide to code their own TreeView in .NET!

    Web 2.0 !!!

    A few days ago I got this reply on one of my blog posts:

    Phil:

    Your post — and your work — miss the point entirely. Nobody cares how email works, they just want it to work.

    Gmail (and most other webmail applications) makes everything else obsolete. I can’t imagine why Evolution is even shipped with Gnome anymore.

    Web-based email clients are the standard.

    -Anon

    I just finished the E-mail client the guy wants. Here it is!

    using GLib;
    using Gtk;
    using WebKit;
    
    class Web20EmailClient : Window {
    	WebView view;
    	construct {
    		view = new WebView ();
    		view.open ("http://gmail.com");
    		view.set_size_request (640, 480);
    		add (view);
    	}
    	static void main (string[] args) {
    		Gtk.init (ref args);
    		var win = new Web20EmailClient ();
    		win.show_all ();
    		Gtk.main ();
    	}
    }