The original SPARQL regex support of Tracker is using a custom SQLite function. But of course back when we wrote it we didn’t yet think much about optimizing. As a result, we were using g_regex_match_simple which of course recompiles the regular expression each time.
Today Jürg and me found out about sqlite3_get_auxdata and sqlite3_set_auxdata which allows us to cache a compiled value for a specific custom SQLite function for the duration of the query.
Today we improved journal replaying from 1050s for my test of 25249 resources to 58s.
Journal replaying happens when your cache database gets corrupted. Also when you restore a backup: restore uses the same code the journal replaying uses, backup just makes a copy of your journal.
During the performance improvements we of course found other areas related to data entry. It looks like we’re entering a period of focus on performance, as we have a few interesting ideas for next week already. The ideas for next week will focus on performance of some SPARQL functions like regex.
I think the first pieces of the RSS- and the other web miners will start becoming available in this week’s unstable 0.7 release. Martyn is still reviewing the branches of the guys, but we’re very lucky with such good software developers as contributors. Very nice work Michele, Roberto and Adrien!
Nonetheless, I think the European community should do it just to strengthen Europe’s economy. I’m not satisfied by Europe’s economic strength: I want it to be undefeatable.
We must not let the IMF solve our problems. Europe might be a political dwarf, but we Europeans should show that we will solve our own problems. We’re an adult composition of cultures with vast amounts of experience. We know how to solve any imaginable problem. And let’s not, in our defeatism, pretend we don’t.
A EMF is a commitment to future member states: Europe often asks them fundamental changes; economic strength is what Europe offers in return. This needs to come at a highest price: Greece will have to fix their deficit problem. Even if their entire population goes on strike. Greece will be an example for countries like my own: Belgium has to fix a serious deficit problem, too.
An EMF comes at an equally high price, and that frightens me a bit: I don’t want the ECB to go as ballistic on money creation as the FED has been last two years. I want the EURO to be the strongest relevant currency mankind has ever created. No matter how insane the rest of the world thinks that ambition is: I believe that keeping the EURO’s M3 in check is a key to creating a wealthy society in Europe.
Politically I want European nations to negotiate more and more often. The European Union is a political dwarf only because finding agreement is hard. But in the long run will our solution be the most negotiated, most tested on this planet.
Together we can deal with anything. That doesn’t mean it’ll be easy; it has never been easy: just seventy years ago we were still killing each other. We’re all guilty of that one way or another. And before that it wasn’t any better. Today, not that many people still care: “it wasn’t me”, right? So stop being a bitch about it, then.
It’s time to let it be. It’s time to start a new European century that will be better. With respect for all European cultures, languages, nations, nationalities, values, borders and interests.
But also a European century with economic responsibilities for each member. It’s our strength: we figured out how to keep our population wealthy: let’s continue doing so in the future.
It was the dawn of the 1970s, at the height of worldwide student protests against the Vietnam War, and a librarian stationed at a U.S. Information Agency post abroad had received bad news: A student group was threatening to burn down her library.
But the librarian had friends among the group of student activists who made the threat. Her response on first glance might seem either naïve or foolhardy — or both: She invited the group to use the library facilities for some of their meetings.
But she also brought Americans living in the country there to listen to them — and so engineered a dialogue instead of a confrontation.
In doing so, she was capitalizing on her personal relationship with the handful of student leaders she knew well enough to trust — and for them to trust her. The tactic opened new channels of mutual understanding, and it strengthened her friendship with the student leaders. The library was never touched.
– Daniel Goleman, Working With Emotional Intelligence, Competencies of the stars. 1998
In Working with Emotional Intelligence, Daniel Goleman explains several practical methods to improve the social skills of people. Before I bought this book a year or two ago, I read Daniel’s first book Emotional Intelligence. This weekend I finally started reading Working With.
I recommend the section Some Misconceptions. Regretfully ain’t this section available for display in the flash preview widget. Instead of violating copyright laws by typing it down here, I’m recommending to just buy the book.
You can find audiobooks online. The section about misconceptions is at track three. Track five talks about two computer programmers, which is very illustrative for many of my blog’s readers (and possibly myself). I hope you wont illegally download using torrents. Instead, buy the material.
What distinguishes Daniel Goleman from old line proponents of positive thinking, however, is his grounding in psychology and neuroscience. Armed with a Ph.D in psychology from Harvard and a first-grade journalism background at the New York Times, Dr. Goleman has authored half a dozen books that explore the physical and chemical workings on the brain and their relationship with what we experience as everyday life.
– Peter Allen, director of Google university, introduction to Daniel Goleman. August 3, 2007
I hope readers of my blog will shun away from pseudo science when it comes to emotional and social intelligence, but instead read and learn from authors like Daniel Goleman. I also (still) recommend the books available at The Moral Brain by for example Dr. Jan Verplaetse.
psst. I have inside information that I might not be allowed to share that 1.2 is being prepared already, and will have bodystructure and envelope summary fetch. And it’ll fetch E-mail body content per requested MIME part, instead of always entire E-mails. Whoohoo!
You know about those guys that use your software against huge datasets like their entire filesystem, with thousands of files?
We do. His name is Tshepang Lekhonkhobe and we owe him a few beers for reporting to us many scalability issues.
Today we found and fixed such a scalability issue: the update query to reset the availability of file resources (this is for support for removable media) was causing at least a linear increase of VmRss usage per amount of file resources. For Tshepang’s situation that meant 600 MB of VmRss. Jürg reduced this to 30 MB of peak VmRss in the same use-case, and a performance improvement from minutes to a second or two, three. Without memory fragmentation as glibc is returning almost all of the VmRss back to the kernel.
Thursday is our usual release day. I invite all of the 0.7 pioneers to test us with your huge filesystems, just like Tshepang always does.
So long and thanks for all the testing, Tshepang! I’m glad we finally found it.
We would rather suffer the visible costs of a few bad decisions than incur the many invisible costs that come from decisions made too slowly – or not at all – because of a stifling bureaucracy.
– Letter by Warren E. Buffett to the shareholders of Berkshire, February 26, 2010
Perhaps we should put a damp rag like the one he mentions in his mouth next time he opens it?
Nigel Farage, you’re an disgrace to yourself. The European parliament is no place for personal attacks, and you aren’t fit to carry the title Member of the European Parliament. Please keep the honour to yourself and resign.
Every sensible person outside of the U.K. thinks you should. Even the Euro skeptics do. You’re an embarrassment for your country and its culture, so I hope for the people in the U.K. that they’ll kick you out of politics.
I fear you’re just playing the populist card, and that you’ll even get votes for this from other morons.
I don’t decide about Tracker‘s release. The team of course does.
But when you look at our roadmap you notice one remaining ‘big feature’. That’s coping with modest ontology changes.
Right now if we’d make even a small ontology change all of our users would have to recreate their databases. We also don’t support restoring a backup of your metadata over a modified ontology.
This is about to change. This week I started working in a branch on supporting class and property ontology additions.
I finished it today and it appears to be working. The patches obviously need a thorough review by the other team members, and then testing of course. I invite all the contributors and people who have been testing Tracker 0.7′s releases to tryout the branch. It only supports additions, so don’t try to change properties or classes, or remove them. You can only add new ones. You might have noticed the nao:deprecated property in the ontology files? That’s what we do with deleted properties.
Anyway
Meanwhile are Martyn and Carlos working on a bugfix in the miner about duplicate entries for file resources and on a timeout for the extractor so that extraction of large or complicated documents doesn’t block the entire filesystem miner.
This (super) cool .NET developer and good friend came to me at the FOSDEM bar to tell me he was confused about why during the Tracker presentation I was asking people to replace F-Spot and Banshee.
I hope I didn’t say it like that, I would never intent to say that. But I’ll review the video of the presentation as soon as Rob publishes it.
Anyway, to ensure everybody understood correctly what I did wanted to say (whether or not I did, is another question):
The call was to inspire people to reimplement or to provide different implementations of F-Spot’s and Banshee’s data backends, so that they would use an RDF store like tracker-store instead of each app its own metadata database.
I think I also mentioned Rhythmbox in the same sentence because the last thing I would want is to turn this into a .NET vs. anti-.NET debate. It just happens to be that the best GNOME softwares for photo and music management are written in .NET (and that has a good reason).
People who know me also know that I think those anti-.NET people are disruptive ignorable people. I also actively and willingly ignore them (and they should know this). I’m actually a big fan of the Mono platform.
I’ll try to ensure that I don’t create this confusion during presentations anymore.
Since 0.2 we (Michele and me) have just dropped dependency from rss-glib due some limitation found, and created our own Glib-oriented feeds handling library, libgrss, starting from the code of Liferea and adding nice stuffs such as a PubSub subscriber implementation. At the moment it is shipped with tracker-miner-rss itself, in the future may be splitted so to easy usage by other developers.
Next will come integration with libchamplain to describe geographic points found in geo-rss enabled feeds, integration with libedataserver to better handle “person” rappresentation (suggestions for a better PIM-like shared library with useful objects?), and perhaps a first full-featured feed reader using Tracker as backend.
Enjoy :-)
Roberto is doing a demo on FSter at FOSDEM during our presentation. My role in the presentation will be light this year. I decided to give most of the talk away to Rob Taylor and Roberto. I will probably demo Debarshi Ray‘s Solang and if time permits his work on the Nautilus integration. Regretfully Debarshi can’t come and so he asked me to do the demo.
For the last few weeks has Debarshi Ray contributed to Tracker’s Nautilus plugin and worked on Solang, a photo manager that will start using Tracker’s SPARQL capability to get a language to query for metadata about the photos and the photos themselves.
Debarshi explains it all very well himself on his own blog.
We’ll probably do a lightening demo during our Tracker presentation at FOSDEM about how Solang did this integration. We’re also planning to demo the code of a few other applications that are working on integrating with Tracker’s store.
Somebody should port Solang to the next version of Maemo!
In the third segment of The Real News‘ interview with Dr. Brzezinski, Paul Jay asks him about Israel’s threat to bomb Iranian Nuclear facilities and the American strategy towards Iran.
Brzezinski talks about how this might force the U.S. out of the region in the short term, how it would affect the price of oil, how the U.S. would be militarily involved and how the U.S. would be alone in this. And what the fundamental consequences for Israel would be.
You can find all three parts of the interview and their transcripts here:
It’s Sunday so I skim Facebook a bit. I came across Lefty’s link to a 100 quotes every geek should know blog. Artwork like humor often represents a philosophy. I think this first quote on that blog is a very good meme, also for foundation boards:
Strange women lying in ponds distributing swords is no basis for a system of government. Supreme executive power derives from a mandate from the masses, not from some farcical aquatic ceremony. — Dennis the Peasant, Monty Python and the Holy Grail
I’ve been watching The Real News for some time now. It claims to be “the real news” but the reality is that it’s fairly left-wing pro-unions most of the times. Most of their documentaries and interviews are very interesting, though. Nor do they make it difficult to filter out their own bias. It’s quite unique in the U.S.’s completely broken media to have something like The Real News.
This week they are interviewing Brzezinski. People who know Brzezinski, know that that’s a huge interview for them. Watch part one of the interview. Knowing The Real News, part two will probably be released in a week.
This style of subqueries will also work (you can do this one without a subquery too, but it’s just an example of course):
SELECT ?name COUNT(?msg)
WHERE {
?from a nco:Contact ;
nco:hasEmailAddress ?name . {
SELECT ?from
WHERE {
?msg a nmo:Email ;
nmo:from ?from .
}
}
} GROUP BY ?from
The same query in QtTracker will look like this (I have not tested this, let me know if it’s wrong Iridian):
What you find in this branch already supports it. You can find early support for subqueries in QtTracker in this branch.
To quickly put some stuff about Emails into your RDF store, read this page (copypaste the turtle examples in a file and use the tracker-import tool). You can also enable our Evolution Tracker plugin, of course.
Coming to you in a few days is what Jürg has been working on for last week.
Yeah, you guess it right by looking at the query below: subqueries!
This example shows you the amount of E-mails each contact has ever sent to you:
SELECT ?address
(SELECT COUNT(?msg) AS ?msgcnt WHERE { ?msg nmo:from ?from })
WHERE {
?from a nco:Contact ;
nco:hasEmailAddress ?address .
}
The usual warnings apply here: I’m way early with this announcement. It’s somewhat implemented but insanely experimental. The SPARQL spec has something for this in a draft wiki page. Due to lack of error reporting and detection it’s easy to make stuff crash or to get it to generate wrong native SQL queries.
But then again, you guys are developers. You like that!
Why are we doing this? Ah, some team at an undisclosed company was worried about performance and D-Bus overhead: They had to do a lot of small queries after doing a parent query. You know, a bunch of aggregate functions for counts, showing the last message of somebody, stuff like that.
I should probably not mention this feature yet. It’s too experimental. But so exciting!
Tracker has a limited capability to write metadata back into the data resource. In case of a file that means writing it back into the file. For example writing some of the metadata the user sets using a SPARQL Update back into an MP3 file as ID3 tags.
Which ones do we support already?
Right now the write back capability is under development and only supports a bunch of fields for a few XMP formats (JPEG, PNG and TIFF) and the Title of MP3 files. In near future we will start supporting increasingly more fields.
Documentation?
For people who want to write support for their properties and file formats, read the documentation.
Looks like I found myself a book that I need to read someday:
But it could be that we, who are earth-bound creatures and have begun to act as though we were dwellers of the universe, will forever be unable to understand, that is, to think and speak about the things which nevertheless we are able to do. In this case, it would be as though our brain, which constitutes the physical, material condition of our thoughts, were unable to follow what we do, so that from now on we would indeed need artificial machines to do our thinking and speaking.