The Evolution DBus metadata API

I just finished the implementation of the Evolution DBus metadata API. Information about this work can be found on this wiki page.

It’s currently not shipped as part of Evolution. Instead it’s an EPlugin that Tracker compiles and installs if your Evolution development packages are 2.25.5 or later. At this moment those are the versions in Evolution’s and Evolution Data Server’s Subversion trunks.

This API enables application developers to get notified not just about new E-mails, but also about any state change of any E-mail Evolution handles, for example whenever an E-mail goes from Unseen to Seen. You are invited to check out this Vala example to learn how to use the DBus API yourself.

Not only does the API give you “I’ll call you” or “pushed” hints about these state changes, it also gives you a lot of meta information about each E-mail: subject, sent, from, to, cached MIME filename, cc, service’s uid, seen, junk status, answered status, flagged status, forwarded status, deleted status, size, tags, keywords and flags.
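
For the curious, here is a minimal Python sketch of what listening to those “pushed” hints over DBus could look like, using dbus-python. Note that the bus name, object path, interface and signal name below are placeholders for illustration only; the real names are defined in the specification on the wiki page and shown in the Vala example.

# Minimal sketch: listening for E-mail metadata change signals over DBus.
# The bus name, object path, interface and signal name are placeholders;
# the actual names are defined in the specification on the wiki.
import dbus
from dbus.mainloop.glib import DBusGMainLoop
from gi.repository import GLib

DBusGMainLoop(set_as_default=True)
bus = dbus.SessionBus()

def on_messages_changed(*args):
    # Called asynchronously whenever Evolution pushes a state change,
    # e.g. an E-mail going from Unseen to Seen.
    print("E-mail metadata changed:", args)

bus.add_signal_receiver(on_messages_changed,
                        signal_name="MessagesChanged",                   # placeholder
                        dbus_interface="org.gnome.evolution.Metadata",   # placeholder
                        path="/org/gnome/evolution/Metadata")            # placeholder

GLib.MainLoop().run()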

The API will also inform you about message arrivals and about message expunges, which are not the same thing as messages being flagged for deletion.

With this information you have enough to build a simple E-mail client outside of Evolution that uses just this API to get itself informed about Evolution’s E-mails. That’s of course not its purpose, but it illustrates how complete the API is.

The purpose is to let a metadata engine like Tracker or Beagle get itself fully informed and aware of all the state and metadata of your E-mails in real time. No more scanning of your $HOME/.evolution/mail directory, which is, by the way, an internal cache format of Evolution that has changed in the past and will change again; scanning it yourself is for that reason too unreliable to depend on. This EPlugin uses Evolution’s own APIs, and runs in Evolution’s processes, to get the same information out and then pass it to one of Tracker’s processes over IPC.

“IPC??”, I hear you thinking, “isn’t that slow??”. Well, we measured it. If a rather slow IPC like DBus can throw all of the metadata of 10,000 E-mails over in less time than Evolution needs to start up, and knowing that we throw it over asynchronously with respect to Evolution’s user interface thread, then I don’t think you should be too worried about that. But just in case you are still worried, I have specified the protocol in such a way that only the delta of changes since the last time Tracker registered itself is sent. While Evolution and Tracker run, Evolution will communicate new info to Tracker: Evolution pushes stuff into Tracker instead of Tracker pulling things out of Evolution.
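
To make that registration-and-delta idea a bit more concrete, here is a tiny, hedged sketch of what a consumer’s start-up could look like. The object path, interface and the Register method with its argument are placeholders, not the actual protocol; the wiki page has the real thing.

# Hedged sketch of the "register, then receive only the delta" idea.
# Every name below is a placeholder used purely for illustration.
import dbus

bus = dbus.SessionBus()
obj = bus.get_object("org.gnome.evolution", "/org/gnome/evolution/Metadata")   # placeholder
registrar = dbus.Interface(obj, "org.gnome.evolution.MetadataRegistrar")       # placeholder

# Tell Evolution what we have already seen; from this point on Evolution
# pushes only the changes that happened after that, instead of everything.
last_seen = 0   # e.g. a sequence number the consumer stored on its previous run
registrar.Register(last_seen)                                                   # placeholder method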

What are you waiting for? Start adapting your $HOME/.evolution/mail scanning software! Now!

What is happening nowadays?

Working on a metadata DBus API for E-mail clients. I have started a wiki page proposing the API for an implementation in Evolution.

Afterwards I started implementing it as a proof of concept for this E-mail client.

I plan to implement the same as a plugin in Thunderbird, Tinymail and Modest, perhaps after reviewing the Evolution- and GNOME-specific bits and pieces of the proposal and making them more generic. That way metadata engines like Beagle and Tracker will finally have a sane way of accessing and getting notified about E-mail content.

Mikkel Kamstrup decided to wrap the API proposal up in a Xesam jacket which might end up becoming that ‘more generic’ API proposal. But let’s first have a proof of concept in Evolution that works with the stuff that we are working on at the Tracker project.

Yezs you can find bugz, diffz ‘n codez at Bug #565082 and Bug #565091. If you want to help out, just ping me and then I’ll quickly make branches of Tracker and Evolution’s data server so that we can work together on this.

There’s also a Vala client example which illustrates how to consume this service.

Tracker is by the way being worked on heavily. We’ve been making a lot of architectural changes to the indexer during the last few weeks.

Meanwhile Jürg has started working on adding a ‘decomposed’ RDF triple store, making it possible to support any kind of ontology, including the Nepomuk ontologies, which are at this moment the ones we are aiming for.

Jürg also added a SparQL query language engine to it, making it possible for you as a client developer to execute SparQL queries on the stored data. We’re not yet supporting all of SparQL, because some things make relatively little sense for our purposes, but we have added a few SparQL extensions that do make sense, like aggregation and GROUP BY.

Here’s an example of a SparQL query that finds people stored using a Nepomuk ontology that have a specific phone number:

SELECT ?firstname ?lastname ?email WHERE {
    ?person nco:hasPhoneNumber <tel:+19071131826> ;
            nco:nameGiven ?firstname ;
            nco:nameFamily ?lastname ;
            nco:hasEmailAddress ?email
}

Here’s another example of a SparQL query that shows the ten most recent E-mails:

SELECT ?subject ?date WHERE {
    ?msg nmo:messageSubject ?subject ;
         nmo:receivedDate ?date
} ORDER BY DESC(?date) LIMIT 10

Or this one, which lists all individual artists, the number of albums for each artist and the total playing time of all songs by that artist:

SELECT ?artistname COUNT(?album) AS count SUM(?length) AS len WHERE {
    ?song nid3:leadArtist ?artist ;
          nid3:length ?length ;
          nid3:albumTitle ?album .
    ?artist nco:fullname ?artistname
} GROUP BY ?artist

These are sample queries that already work, if you nag Jürg about how you can get some data into the tables. We’re of course working on adapting the indexer to populate the tables. Knowing Jürg, this might already work flawlessly.
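
If you would rather poke at this from code than nag Jürg, a query like the ones above would eventually be submitted to Tracker over DBus. The sketch below is only an illustration: the bus name, object path, interface and method name are my assumptions about how such a query call might be exposed, not the actual entry points of the vstore branch.

# Hedged sketch: sending one of the above SparQL queries to Tracker over DBus.
# Bus name, object path, interface and method name are assumptions made for
# illustration; the real entry points live in the vstore branch.
import dbus

QUERY = """
SELECT ?subject ?date WHERE {
    ?msg nmo:messageSubject ?subject ;
         nmo:receivedDate ?date
} ORDER BY DESC(?date) LIMIT 10
"""

bus = dbus.SessionBus()
obj = bus.get_object("org.freedesktop.Tracker", "/org/freedesktop/Tracker/Search")   # assumption
search = dbus.Interface(obj, "org.freedesktop.Tracker.Search")                       # assumption
for row in search.SparqlQuery(QUERY):                                                # assumed method name
    print(list(row))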

If you like things like “semantic desktops”, like having your desktop search cope with truly meaningful queries (the kind of queries that Federico was dreaming of in his keynote at Istanbul), then you should check out the developments we are doing with Tracker. I warn you that a lot of this truly is ‘development’. It might not work at all, etc. But it’s cool. Really.

Let’s turn your desktops and mobiles into platforms that offer all kinds of services for your high-level applications written in JavaScript or whatever language you fancy: configuration services, thumbnailing services, E-mail metadata and notification services, metadata query services. Meanwhile we’ll give you GObject-introspection, so that it’s very easy to write a platform library yourself that you can directly invoke from those higher-level languages, as that project will make most language bindings almost automatic. And we’ll have Vala to make it easy for you to write services and other platform software yourself.

ps. The RDF triple store and SparQL stuff ain’t happening in Tracker’s trunk yet. That would disturb development of Tracker too much. We’ve been doing this in a git branch; use the branch “vstore”.

Good, that stack of links should keep you blog reading wolves silent for another few weeks.

Thumbnailer specification and prototype

Why do we need thumbnailing to be a service?

  • For user interface applications it makes relatively little sense to run the task of creating a thumbnail in the same context as the mainloop that draws the user interface. On the other hand, if each desktop application starts creating either processes or worker threads that will be armed with thumbnailing code, then we will have a lot of threads and processes all running the same code;
  • Most applications link with a user interface toolkit that will happily deal with the vast majority of pixbuf-shaped formats. That doesn’t mean that these toolkits will equally enjoy dealing with PDF, Office and video formats. There’s a lot of code involved here and we should try to avoid requiring everybody to load these complex pieces of code into their processes. I can give a few purely technical reasons, like not having to map in code that is not relevant for the application and reducing VmSize (although, admittedly, only things like VmRSS are really important). There are also a few political reasons, like patented formats. In the end I’ll just say it the way it is: it’s a bad architecture;
  • Application developers are really not very interested in developing LIFO queues and worker threads or processes that will handle the task of creating thumbnails;
  • Finally, application developers are asking for this (for example F-Spot). Creating thumbnails is not at all an exclusive task for the filemanager.
My proposal

Based on those conclusions I decided to write a DBus specification. I also reimplemented Maemo’s Hildon Thumbnail to conform to this specification. This work has been merged into the project’s trunk and will be used in Maemo’s Fremantle release.

While rewriting Hildon Thumbnail I decided to make sure that the software compiles and runs on any normal desktop. This way the software can serve as a proof of concept and working prototype for the DBus specification. Special care was taken to make sure it feels as desktop neutral as possible.

I opened a bug to officially request a freedesktop.org project for this specification. I hope this organization will offer a platform for further development of this DBus specification. Hildon Thumbnailer can serve as a prototype and will be adapted whenever the specification improves.
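
To make the proposal a bit more tangible, here is roughly what a client-side request against such a service could look like from Python. The object path and the Queue method with its signature are placeholders based on my reading of the draft; the specification itself is the authoritative source for the real interface.

# Sketch of an application handing thumbnail work to the service instead of
# doing the work itself. Names and the method signature are placeholders;
# see the draft specification for the real interface.
import dbus

bus = dbus.SessionBus()
obj = bus.get_object("org.freedesktop.Thumbnailer", "/org/freedesktop/Thumbnailer")   # placeholder
thumbnailer = dbus.Interface(obj, "org.freedesktop.Thumbnailer")

uris = ["file:///home/user/Photos/img_0001.jpg",
        "file:///home/user/Photos/img_0002.jpg"]
mime_types = ["image/jpeg", "image/jpeg"]

# Ask the service to queue the work; the application's mainloop is never
# blocked and no thumbnailing code gets mapped into the application's process.
thumbnailer.Queue(uris, mime_types)                                                    # placeholder method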

Here’s a meme: org.freedesktop.Thumbnailer

People who know me probably saw this blog item coming. Here it is!

In Tracker we want to create thumbnails for interesting files ahead of time. Among the use cases is the one where the user has moved or copied photos from his camera into one of the photo folders. We want to start preparing thumbnails for those files early so that filemanagers and photo applications are fast when needed.

The current infrastructure for this in Tracker is to launch a script for each file that is to be thumbnailed. If you find a lot of such files (some people end up with a camera holding thousands of photos after a busy weekend), that would mean that we’d do this 1,000 times:

    fork();
    execv(tracker-thumbnailer);
    fork();
    execv(bash);
    fork();
    execv(convert);

Luckily this is not activated by default in current Tracker. :-)

I don’t have to explain to most people who read this blog that this is a bad idea on a modest ARM device with a bit more than one hundred MB of RAM. A better idea would be to have a service that queues these requests and solves them with specialized image libraries. Perhaps launching a separate binary for the MIME types that the service has no libraries for?
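
For what it’s worth, the core of such a service is not much more than a queue, a worker and a per-MIME-type dispatch table, with an external binary only as a last resort. The toy sketch below illustrates that shape; it is not the hildon-thumbnail code, and the external binary name is made up.

# Toy sketch of the queuing idea: one worker thread, a per-MIME-type dispatch
# table, and an external binary only as a fallback. Not the hildon-thumbnail code.
import queue
import subprocess
import threading

requests = queue.Queue()

def thumbnail_image(uri, out_path):
    # In the real service this would use a specialized image library
    # (GdkPixbuf, for instance) in-process; elided in this sketch.
    pass

handlers = {
    "image/jpeg": thumbnail_image,
    "image/png": thumbnail_image,
}

def worker():
    while True:
        uri, mime, out_path = requests.get()
        handler = handlers.get(mime)
        if handler is not None:
            handler(uri, out_path)                # in-process, no extra processes
        else:
            # Unknown type: one external thumbnailer process as a fallback,
            # instead of fork() + execv() once per file in every application.
            subprocess.call(["external-thumbnailer", uri, out_path])   # made-up binary
        requests.task_done()

threading.Thread(target=worker, daemon=True).start()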

At first we were planning to make tracker-thumbnailer listen on stdin in a loop. Then I figured: why not do this over DBus instead? Pretty soon after that, Ivan Frade concluded that if we did that, other applications on the device might be interested in consuming that service too. We decided that perhaps we should talk with the right people in the two large desktop communities about the idea of writing a DBus specification for remotely managing the thumbnail cache as specified by the Thumbnail managing standard by Jens Finke and Olivier Sessink.
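
As a reminder of what that standard actually specifies: the thumbnail of a file lives under $HOME/.thumbnails, in a PNG named after the MD5 hash of the file’s canonical URI. Here is a minimal sketch of just the naming rule (the standard also defines required attributes and a “fail” directory, which are omitted here):

# Where the Thumbnail managing standard puts the 128x128 ("normal") thumbnail
# of a file: ~/.thumbnails/<size>/<md5 of the canonical URI>.png
import hashlib
import os

def thumbnail_path(uri, size="normal"):
    digest = hashlib.md5(uri.encode("utf-8")).hexdigest()
    return os.path.join(os.path.expanduser("~"), ".thumbnails", size, digest + ".png")

print(thumbnail_path("file:///home/user/Photos/img_0001.jpg"))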

I don’t know of an official procedure other than filing a bug on freedesktop.org, so at first I tried to get in touch with people like David Faure (KDE), Christian Kellner (Nautilus), Rob Taylor (DBus, Telepathy, Wizbit) and later also started a few mass discussions on #kde-devel, #nautilus and #gnome-hackers.

I started a discussion on xdg-list which made me conclude that such a DBus API would indeed make sense for a lot of people. Discussions with individuals on IRC added to that feeling. I started a draft of a first specification for a DBus API.

Meanwhile I had already started adapting the hildon-thumbnail code to become more service-like. Right now that code has a DBus daemon that implements the draft DBus API and, on top of that, provides the possibility to have dynamically loadable plugins. The specification also allows registering thumbnailers per MIME type. For that reason I made it possible to run those dynamically loadable plugins both standalone and in-process within the generic thumbnailer.

It has been my prototype for testing the DBus API specification that I was writing. People told me that if you want to make a specification that’ll get accepted, the best way is to write a prototype too. Meanwhile Rob Taylor had joined me on fine tuning the specification itself. With his DBus experience he helped me a lot in multiple areas. Thanks for caring, Rob!

The current prototype does not yet make it possible to simply drop in a thumbnailer binary to add support for a new MIME type. By making a standalone thumbnailer that itself simply launches external thumbnailers, you could of course add that possibility, which a lot of current thumbnail infrastructure has. Although, as mentioned above, I don’t think this is a good architecture (the fork() + execv() troubles), I plan to make such a standalone plug-in thumbnailer.

I certainly hope that this specification will be approved by the community. I can help with making patches for Konqueror and Nautilus. We’ll most likely use this on the Maemo platform for thumbnailing ourselves.