IPC performance, the report

The Tracker team will be doing a codecamp this month. Among the subjects we will address is the IPC overhead of tracker-store, our RDF query service.

We plan to investigate whether a direct connection with our SQLite database is possible for clients. Jürg did some work on this. It turns out that, because SQLite is not MVCC, we need to override some of SQLite’s VFS functions and perhaps even implement a custom page cache ourselves.

Another track that we are investigating involves using a custom UNIX domain socket and sending the data over in such a way that at either side the marshalling is cheap.

For that idea I asked Adrien Bustany, a computer science student who’s doing an internship at Codeminded, to develop three tests: a test that uses D-Bus the way tracker-store does (by using the DBusMessage API directly), a test that uses a custom protocol and technique, as close to ideal as possible, to get the data over a UNIX domain socket, and a simple program that does the exact same query but connects to SQLite by itself.

Here’s the report:

Exposing a SQLite database remotely: comparison of various IPC methods

By Adrien Bustany
Computer Sciences student
National Superior School of Informatics and Applied Mathematics of Grenoble (ENSIMAG)

This study aims at comparing the overhead of an IPC layer when accessing a SQLite database. The two IPC methods included in this comparison are DBus, a generic message passing system, and a custom IPC method using UNIX sockets. As a reference, we also include in the results the performance of a client directly accessing the SQLite database, without involving any IPC layer.

Comparison methodology

In this section, we detail what the client and server are supposed to do during the test, regardless of the IPC method used.

The server has to:

  1. Open the SQLite database and listen to the client requests
  2. Prepare a query at the client’s request
  3. Send the resulting rows at the client’s request

Queries are only “SELECT” queries; no modification is performed on the database. This restriction is not enforced on the server side, though.

The client has to:

  1. Connect to the server
  2. Prepare a “SELECT” query
  3. Fetch all the results
  4. Copy the results in memory (not just fetch and forget them), so that memory pages are really used

Test dataset

For testing, we use a SQLite database containing only one table. This table has 31 columns, the first one is the identifier and the 30 others are columns of type TEXT. The table is filled with 300 000 rows, with randomly generated strings of 20 ASCII lowercase characters.
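As an illustration, a dataset of this shape could be generated along the following lines. This is only a sketch using Python's sqlite3 module: the table name `data`, the column names `col0` to `col29` and the fixed random seed are assumptions made for the example, not details taken from the study.

```python
import random
import sqlite3
import string

COLUMNS = 30      # TEXT columns, in addition to the identifier
STRING_LEN = 20   # length of each random string


def random_text(rng):
    """Return STRING_LEN random lowercase ASCII characters."""
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(STRING_LEN))


def create_dataset(path, rows, seed=0):
    """Create a one-table database shaped like the test dataset."""
    rng = random.Random(seed)
    names = ["col%d" % i for i in range(COLUMNS)]  # hypothetical column names
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE data (id INTEGER PRIMARY KEY, %s)"
        % ", ".join(name + " TEXT" for name in names)
    )
    placeholders = ", ".join("?" for _ in range(COLUMNS + 1))
    conn.executemany(
        "INSERT INTO data VALUES (%s)" % placeholders,
        ([i] + [random_text(rng) for _ in range(COLUMNS)] for i in range(rows)),
    )
    conn.commit()
    return conn


# The study used 300 000 rows; a smaller number keeps the example quick.
conn = create_dataset(":memory:", rows=1000)
```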

Implementation details

In this section, we explain how the server and client for both IPC methods were implemented.

Custom IPC (UNIX socket based)

In this case, we use a standard UNIX socket to communicate between the client and the server. The socket protocol is a binary protocol, and is detailed below. It has been designed to minimize CPU usage (there is no marshalling/demarshalling on strings, nor intensive computation to decode the message). It is fast over a local socket, but not suitable for other types of sockets, like TCP sockets.

Message types

There are two types of operations, corresponding to the two operations of the test: prepare a query, and fetch results.

Message format

All numbers are encoded in little endian form.

Prepare

Client sends:

Size      Contents
4 bytes   Prepare opcode (0x50)
4 bytes   Size of the query (without trailing \0)
variable  Query, in ASCII

Server answers:

Size      Contents
4 bytes   Return code of the sqlite3_prepare_v2 call
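The Prepare exchange can be sketched with Python's struct module. The helper names below are hypothetical, and treating the 4-byte fields as unsigned 32-bit integers is an assumption; the text only specifies their size and little-endian byte order.

```python
import struct

PREPARE_OPCODE = 0x50


def encode_prepare(query):
    """Build the client's Prepare message: opcode, query size, query bytes."""
    payload = query.encode("ascii")  # the trailing \0 is not sent
    return struct.pack("<II", PREPARE_OPCODE, len(payload)) + payload


def decode_prepare(message):
    """Parse a Prepare message on the server side and return the query."""
    opcode, size = struct.unpack_from("<II", message, 0)
    if opcode != PREPARE_OPCODE:
        raise ValueError("not a Prepare message")
    return message[8:8 + size].decode("ascii")


def encode_prepare_reply(return_code):
    """Build the server's answer: the sqlite3_prepare_v2 return code."""
    return struct.pack("<I", return_code)


message = encode_prepare("SELECT * FROM data")
```

Because both fields have a fixed size and the query is sent as raw bytes, neither side needs to scan or escape the string to decode the message.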

Fetch

Client sends:

Size      Contents
4 bytes   Fetch opcode (0x46)

The server sends rows grouped in fixed-size buffers. Each buffer contains a variable number of rows. Each row is complete. If some padding is needed (when a row doesn’t fit in a buffer, but there is still space left in the buffer), the server adds an “End of Page” marker. The “End of Page” marker is the byte 0xFF. Rows that are larger than the buffer size are not supported.
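The grouping of rows into buffers might look like the sketch below. The function name is hypothetical. Filling the space after the marker with zero bytes, and also closing the last, partially filled buffer with a marker, are assumptions; the text only specifies the 0xFF marker itself.

```python
END_OF_PAGE = b"\xff"


def pack_rows(rows, buffer_size):
    """Group already-encoded rows into fixed-size buffers without splitting a row."""
    buffers = []
    current = bytearray()

    def flush():
        # Space is left in the buffer: mark where the rows end.
        if len(current) < buffer_size:
            current.extend(END_OF_PAGE)
        # Zero fill after the marker is an assumption of this sketch.
        buffers.append(bytes(current.ljust(buffer_size, b"\x00")))
        current.clear()

    for row in rows:
        if len(row) > buffer_size:
            raise ValueError("rows larger than the buffer are not supported")
        if len(current) + len(row) > buffer_size:
            flush()
        current.extend(row)
    if current:
        flush()
    return buffers


buffers = pack_rows([b"a" * 6, b"b" * 6, b"c" * 3], buffer_size=10)
```

In this toy run the second row does not fit after the first, so the first buffer is closed early with the marker, while the second and third rows share the next buffer.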

Each row in a buffer has the following format:

Size      Contents
4 bytes   SQLite return code. This is generally SQLITE_ROW (there is a row to read), or SQLITE_DONE (there are no more rows to read). When the return code is not SQLITE_ROW, the rest of the message must be ignored.
4 bytes   Number of columns in the row
4 bytes   Index of trailing \0 for first column (index is 0 after the “number of columns” integer, that is, index is equal to 0 8 bytes after the message begins)
4 bytes   Index of trailing \0 for second column
...
4 bytes   Index of trailing \0 for last column
variable  Row data. All columns are concatenated together, and separated by \0

For the sake of clarity, we describe here an example row

100 4 1 7 13 19 1\0aaaaa\0bbbbb\0ccccc\0

The first number, 100, is the return code, in this case SQLITE_ROW. This row has 4 columns. The 4 following numbers are the offsets of the \0 terminating each column in the row data. Finally comes the row data.
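The example row can be reproduced with a short encoder and decoder. This is a sketch, not the code used in the study: the function names are hypothetical, and the offsets are taken relative to the start of the row data, which is the reading that matches the numbers 1, 7, 13 and 19 in the example.

```python
import struct

SQLITE_ROW = 100


def encode_row(columns, return_code=SQLITE_ROW):
    """Encode one row: return code, column count, \\0 offsets, then row data."""
    data = b"".join(col.encode("ascii") + b"\x00" for col in columns)
    offsets = []
    position = -1
    for col in columns:
        position += len(col) + 1  # index of this column's trailing \0
        offsets.append(position)
    header = struct.pack("<II", return_code, len(columns))
    return header + struct.pack("<%dI" % len(offsets), *offsets) + data


def decode_row(message):
    """Decode one row; the columns are only read when the code is SQLITE_ROW."""
    return_code, count = struct.unpack_from("<II", message, 0)
    if return_code != SQLITE_ROW:
        return return_code, []
    offsets = struct.unpack_from("<%dI" % count, message, 8)
    data_start = 8 + 4 * count
    columns = []
    start = 0
    for end in offsets:
        columns.append(message[data_start + start:data_start + end].decode("ascii"))
        start = end + 1
    return return_code, columns


encoded = encode_row(["1", "aaaaa", "bbbbb", "ccccc"])
decoded = decode_row(encoded)
```

Since the offsets are stored up front, the client can locate every column without searching the row data for \0 bytes.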

Memory usage

We try to minimize the calls to malloc and memcpy in the client and server. As we know the size of a buffer, we allocate the memory only once, and then use memcpy to write the results to it.
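The allocate-once idea can be illustrated on the receiving side as follows. This is a Python sketch rather than the C code of the study; `receive_buffers` is a hypothetical helper, and the tiny 16-byte buffers in the demonstration are only there to keep the example small.

```python
import socket

BUFFER_SIZE = 64 * 1024  # the buffer size the study found to be fastest


def receive_buffers(sock, count, buffer_size=BUFFER_SIZE):
    """Yield `count` fixed-size buffers, reusing one preallocated bytearray."""
    buffer = bytearray(buffer_size)  # allocated once, reused for every buffer
    view = memoryview(buffer)
    for _ in range(count):
        received = 0
        while received < buffer_size:
            n = sock.recv_into(view[received:], buffer_size - received)
            if n == 0:
                raise ConnectionError("peer closed the socket")
            received += n
        yield view  # contents are only valid until the next iteration


server, client = socket.socketpair()  # a connected pair of local sockets
server.sendall(b"A" * 16 + b"B" * 16)
received = [bytes(view) for view in receive_buffers(client, 2, buffer_size=16)]
server.close()
client.close()
```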

DBus

The DBus server exposes two methods, Prepare and Fetch.

Prepare

The Prepare method accepts a query string as a parameter, and returns nothing. If the query preparation fails, an error message is returned.

Fetch

Ideally, we should be able to send all the rows in one batch. DBus, however, puts a limitation on the message size. In our case, the complete data to pass over the IPC is around 220 MB, which is more than the maximum size allowed by DBus (moreover, DBus marshals data, which increases the message size a little). We are therefore obliged to split the result set.

The Fetch method accepts an integer parameter, which is the number of rows to fetch, and returns an array of rows, where each row is itself an array of columns. Note that the server can return fewer rows than asked. When there are no more rows to return, an empty array is returned.
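The calling pattern this implies for the client can be sketched without any DBus code at all. The `QueryServer` class below is a hypothetical in-process stand-in for the two exposed methods; only the loop in `fetch_all` is the point of the example.

```python
import sqlite3


class QueryServer:
    """In-process stand-in for the server's Prepare and Fetch methods."""

    def __init__(self, conn):
        self.conn = conn
        self.cursor = None

    def prepare(self, query):
        self.cursor = self.conn.execute(query)

    def fetch(self, max_rows):
        """Return at most max_rows rows; an empty list means no more rows."""
        return [list(row) for row in self.cursor.fetchmany(max_rows)]


def fetch_all(server, query, batch_size):
    """Call Fetch repeatedly until an empty array signals the end."""
    server.prepare(query)
    results = []
    while True:
        batch = server.fetch(batch_size)
        if not batch:
            break
        results.extend(batch)
    return results


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO data VALUES (?, ?)",
                 [(i, "row%d" % i) for i in range(10)])
rows = fetch_all(QueryServer(conn), "SELECT * FROM data ORDER BY id", batch_size=4)
```

The client never relies on a batch being full, so a server that returns fewer rows than asked is handled by the same loop.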

Results

All tests are run against the dataset described above, on a warm disk cache (the database is accessed several times before every run, to be sure the entire database is in disk cache). We use SQLite 3.6.22, on a 64 bit Linux system (kernel 2.6.33.3). All tests are run 5 times, and we use the average of the 5 intermediate results as the final number.
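The averaging step can be expressed as a small helper. This is a generic sketch, not the harness from the study; `average_time_ms` is a hypothetical name and `run` stands for any callable that performs one complete fetch.

```python
import time


def average_time_ms(run, repetitions=5):
    """Run the callable `repetitions` times and return the mean time in ms."""
    timings = []
    for _ in range(repetitions):
        start = time.perf_counter()
        run()
        timings.append((time.perf_counter() - start) * 1000.0)
    return sum(timings) / len(timings)


calls = []
average = average_time_ms(lambda: calls.append(1))
```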

For the custom IPC, we test with various buffer sizes varying from 1 to 256 kilobytes. For DBus, we fetch 75000 rows with every Fetch call, which is close to the maximum we can fetch with each call (see the paragraph on DBus message size limitation).

The first tests were to determine the optimal buffer size for the UNIX socket based IPC. The following graph describes the time needed to fetch all rows, depending on the buffer size:

The graph shows that the IPC is fastest with 64 kB buffers. These results depend on the type of system used, and might have to be tuned for different platforms. On Linux, a memory page is (generally) 4096 bytes; as a consequence, buffers smaller than 4 kB will use a full memory page when sent over the socket and waste memory bandwidth. After determining the best buffer size for socket IPC, we run tests for speed and memory usage, using a buffer size of 64 kB for the UNIX socket based method.

Speed

We measure the time it takes for various methods to fetch a result set. Unsurprisingly, the time needed to fetch the results grows linearly with the number of rows to fetch.

IPC method            Best time
None (direct access)  2910 ms
UNIX socket           3470 ms
DBus                  12300 ms

Memory usage

Memory usage varies greatly (actually, so much that we had to use a log scale) between IPC methods. DBus memory usage is explained by the fact that we fetch 75 000 rows at a time, and that it has to allocate the whole message before sending it, while the socket IPC uses 64 kB buffers.

Conclusions

The results clearly show that in such a specialized case, designing a custom IPC system can greatly reduce the IPC overhead. The overhead of a UNIX socket based IPC is around 19%, while the overhead of DBus is 322%. However, it is important to take into account the fact that DBus is a much more flexible system, offering far more features than our socket protocol. Comparing DBus and our custom UNIX socket based IPC is like comparing an axe with a swiss knife: it’s much harder to cut the tree with the swiss knife, but it also includes a tin can opener, a ball pen and a compass (nowadays some of them even include USB keys).
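The overhead figures follow from the best times in the speed table, taking direct access as the baseline. The short computation below is only a worked check of that arithmetic; it gives roughly 19.2% for the UNIX socket and 322.7% for DBus, in line with the quoted figures up to rounding.

```python
BEST_TIMES_MS = {"direct": 2910, "unix_socket": 3470, "dbus": 12300}


def overhead_percent(method):
    """Extra time relative to direct SQLite access, as a percentage."""
    baseline = BEST_TIMES_MS["direct"]
    return (BEST_TIMES_MS[method] - baseline) / baseline * 100.0


socket_overhead = overhead_percent("unix_socket")
dbus_overhead = overhead_percent("dbus")
```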

The real conclusion of this study is: if you have to pass a lot of data between two programs and don’t need a lot of flexibility, then DBus is not the right answer, and was never intended to be.

The source code used to obtain these results, as well as the numbers and graphs used in this document, can be checked out from the following git repository: git://git.mymadcat.com/ipc-performance . Please check the various README files to see how to reproduce them and/or how to tune the parameters.

More introduction to RDF and SPARQL

Introduction

I plan to give an introduction to features like COUNT, FILTER REGEX and GROUP BY which are supported by Tracker’s SPARQL engine. We support more such features but I have to start the introduction somewhere. And overloading people with introductions to all features won’t help me much with explaining things.

Since my last introduction to RDF and SPARQL I have added a few relationships and actors to the game.

Morrel, Max and Sasha are dogs, Sheeba and Query are cats, Picca is still a parrot, and Fred and John are contacts. Fred claims that John is his friend. I changed the ontology to allow friendships between the animals too: Morrel claims that Max is a friend. Sasha claims that Morrel and Max are her friends. Sheeba claims Query is her friend. John bought Query. Fred, inspired by John, decided to also get some pets: Morrel, Sasha and Sheeba.

Ontology

Let’s put this story in Turtle:

<test:Picca> a test:Parrot, test:Pet ;
	test:name "Picca" .

<test:Max> a test:Dog, test:Pet ;
	test:name "Max" .

<test:Morrel> a test:Dog, test:Pet ;
	test:name "Morrel" ;
	test:hasFriend <test:Max> .

<test:Sasha> a test:Dog, test:Pet ;
	test:name "Sasha" ;
	test:hasFriend <test:Morrel> ;
	test:hasFriend <test:Max> .

<test:Sheeba> a test:Cat, test:Pet ;
	test:name "Sheeba" ;
	test:hasFriend <test:Query> .

<test:Query> a test:Cat, test:Pet ;
	test:name "Query" .

<test:John> a test:Contact ;
	test:owns <test:Max> ;
	test:owns <test:Picca> ;
	test:owns <test:Query> ;
	test:name "John" .

<test:Fred> a test:Contact ;
	test:hasFriend <test:John> ;
	test:name "Fred" ;
	test:owns <test:Morrel> ;
	test:owns <test:Sasha> ;
	test:owns <test:Sheeba> .

Querytime!

Let’s first start with all friend relationships:

SELECT ?subject ?friend
WHERE { ?subject test:hasFriend ?friend }

  test:Morrel, test:Max
  test:Sasha, test:Morrel
  test:Sasha, test:Max
  test:Sheeba, test:Query
  test:Fred, test:John

Just counting these is pretty simple. In SPARQL all selectable fields must have a variable name, so we add the “as c” here.

SELECT COUNT (?friend) AS c
WHERE { ?subject test:hasFriend ?friend }

  5

We counted friend relationships, of course. Let’s say we want to count how many friends each subject has. This is a more interesting query than the previous one.

SELECT ?subject COUNT (?friend) AS c
WHERE { ?subject test:hasFriend ?friend }
GROUP BY ?subject

  test:Fred, 1
  test:Morrel, 1
  test:Sasha, 2
  test:Sheeba, 1
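For readers who think more easily in terms of ordinary code, the two counting queries correspond to the following plain Python over the friend relationships listed earlier. This only mirrors the semantics of COUNT and GROUP BY; it says nothing about how Tracker evaluates the query.

```python
from collections import Counter

# The (subject, friend) pairs returned by the first query.
FRIENDSHIPS = [
    ("test:Morrel", "test:Max"),
    ("test:Sasha", "test:Morrel"),
    ("test:Sasha", "test:Max"),
    ("test:Sheeba", "test:Query"),
    ("test:Fred", "test:John"),
]

# SELECT COUNT (?friend) AS c
total = len(FRIENDSHIPS)

# SELECT ?subject COUNT (?friend) AS c ... GROUP BY ?subject
per_subject = Counter(subject for subject, _friend in FRIENDSHIPS)
```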

Actually, we’re only interested in the human friends:

SELECT ?subject COUNT (?friend) AS c
WHERE { ?subject test:hasFriend ?friend .
        ?friend a test:Contact
} GROUP BY ?subject

  test:Fred, 1

No no, we are only interested in friends that are either cats or dogs:

SELECT ?subject COUNT (?friend) AS c
WHERE { ?subject test:hasFriend ?friend .
       ?friend a ?type .
       FILTER ( ?type = test:Dog || ?type = test:Cat)
} GROUP BY ?subject

  test:Morrel, 1
  test:Sasha, 2
  test:Sheeba, 1

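Now we are only interested in subjects whose friends are either a cat or a dog, and whose own name starts with an ‘S’. Note that the query below matches the name of the subject, not the name of the friend.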

SELECT ?subject COUNT (?friend) as c
WHERE { ?subject test:hasFriend ?friend ;
                 test:name ?n .
       ?friend a ?type .
       FILTER ( ?type = test:Dog || ?type = test:Cat) .
       FILTER REGEX (?n, '^S', 'i')
} GROUP BY ?subject

  test:Sasha, 2
  test:Sheeba, 1

Conclusions

Should we stop talking about ontologies and start talking about searchboxes and user interfaces instead? Although I certainly agree more UI-stuff is needed, I’m not sure yet. RDF and SPARQL are also about relationships and roles. Not just about matching stuff. Whenever we explain the new Tracker to people, most are stuck with ‘matching’ in their mind. They don’t think about a lot of other use-cases.

Such a search is just one use-case starting point: the user enters a random search string and gives no other indication of what he needs. Many more situations can be starting points: when I select a contact in a user interface designed to show an archive of messages that he once sent to me, the search becomes much narrower and much more helpful.

As soon as you have RDF and SPARQL, and with Tracker you do, an application developer can start taking into account relationships between resources: The relationship between a contact in Instant Messaging and the attachments in an E-mail that he as a person has sent to you. Why not combine it with friendship relationships synced from online services?

With a populated store you can make the relationship between a friend who joined you on a trip, and photos of a friend of your friend who suggested the holiday location.

With GeoClue integration we could link his photos up with actual location markers. You’d find these photos that came from the friend of your friend, and we could immediately feed the location markers to the GPS software on your phone.

I really hope application developers have more imagination than just global searchboxes.

And this is just a use-case that is technically already possible with today’s high-end phones.

Introduction to RDF and SPARQL

Let’s start with a relatively simple graph. The graph shows the relationships between John, Fred, Max and Picca. John and Fred are humans who we’ll refer to as contacts. Max and Picca are pets. Max is a dog and Picca is a parrot. Both Picca and Max are owned by John. Fred claims that John is his friend.

If we want to represent this story semantically we first need to make a dictionary that describes pets, contacts, dogs and parrots. The dictionary would also describe possible relationships like ownership of a pet and the friendship between two contacts. Don’t forget, making something semantic means that you want to give meaning to the things that interest you.

Giving meaning is exactly what we’ll start with. We will write the schema for making this story possible. We will call this an ontology.

We describe our ontology using the Turtle format. In Turtle you can have prefixes. The prefix test: for example is the same as using <http://www.tracker-project.org/ontologies/test#>.

In Turtle you describe statements by giving a subject, a predicate and then an object. The subject is what you are talking about. The predicate is the aspect of the subject that you are talking about. And finally the object is the value. This value can be a resource or a literal.

When you write a . (a dot) in Turtle it means that you end describing the subject. When you write a ; (semicolon) it means that you continue with the same subject, but will start describing a new predicate. When you write a , (comma) it means that you even continue with the same predicate. The same rules apply in the WHERE section of a SPARQL query. But first things first: the ontology.
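As a brief aside, the effect of the semicolon and comma abbreviations can be made explicit by expanding a statement into its individual triples. The sketch below does this for a statement of the form `<test:Picca> a test:Parrot, test:Pet ; test:name "Picca" .`; the `expand` helper is hypothetical, and the keyword a is written out as rdf:type, which is what it stands for in Turtle.

```python
RDF_TYPE = "rdf:type"  # what the Turtle keyword "a" stands for


def expand(subject, predicate_objects):
    """Expand one abbreviated Turtle statement into explicit triples.

    predicate_objects maps each predicate (separated by ';' in Turtle)
    to the list of its objects (separated by ',').
    """
    return [
        (subject, predicate, obj)
        for predicate, objects in predicate_objects.items()
        for obj in objects
    ]


triples = expand(
    "test:Picca",
    {RDF_TYPE: ["test:Parrot", "test:Pet"], "test:name": ['"Picca"']},
)
```

One abbreviated statement thus stands for three full subject, predicate, object statements, each of which would otherwise end with its own dot.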

Note that the “test” ontology is not officially registered at tracker-project.org. It serves merely as an example.

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix tracker: <http://www.tracker-project.org/ontologies/tracker#> .
@prefix test: <http://www.tracker-project.org/ontologies/test#> .

test: a tracker:Namespace ;
	tracker:prefix "test" .

test:Entity a rdfs:Class .

test:Contact a rdfs:Class ;
	rdfs:subClassOf test:Entity .

test:Pet a rdfs:Class ;
	rdfs:subClassOf test:Entity .

test:Dog a rdfs:Class ;
	rdfs:subClassOf test:Entity .

test:Parrot a rdfs:Class ;
	rdfs:subClassOf test:Entity .

test:name a rdf:Property ;
	rdfs:domain test:Entity ;
	rdfs:range xsd:string .

test:owns a rdf:Property ;
	rdfs:domain test:Contact ;
	rdfs:range test:Pet .

test:hasFriend a rdf:Property ;
	rdfs:domain test:Contact ;
	rdfs:range test:Contact .

Now that we have meaning, we will introduce the actors: Picca, Max, John and Fred. First put the ontology file from above in the share/tracker/ontologies directory and run tracker-processes -r before restarting tracker-store in master. Then copy the @prefix lines of the ontology file, store them together with the statements below as /tmp/import.ttl, and run tracker-import /tmp/import.ttl; it should import just fine. After that you are ready to execute the queries below with the tracker-sparql -q ‘$query’ command.

Note that tracker-processes -r destroys all your RDF data in Tracker. We don’t yet support adding custom ontologies at runtime, so for doing this test you have to start everything from scratch.

<test:Picca> a test:Parrot, test:Pet ;
	test:name "Picca" .

<test:Max> a test:Dog, test:Pet ;
	test:name "Max" .

<test:John> a test:Contact ;
	test:owns <test:Max> ;
	test:owns <test:Picca> ;
	test:name "John" .

<test:Fred> a test:Contact ;
	test:hasFriend <test:John> ;
	test:name "Fred" .

Let’s do some simple SPARQL queries. You can execute these queries this way:

tracker-sparql -q "SELECT ?subject WHERE { ?subject a test:Parrot }"

In this query we ask for the subject of each entity that is a parrot. The query will yield test:Picca because Picca is the only parrot in our situation.

  test:Picca

Usually we aren’t interested in the subject, but in a real property of the parrot. We can ask for such a property this way:

SELECT ?subject ?name WHERE { ?subject a test:Parrot ; test:name ?name}
  test:Picca, Picca

Another simple example, give me all the contacts:

SELECT ?subject WHERE { ?subject a test:Contact }
  test:John
  test:Fred

Just the contacts doesn’t illustrate much. Give me all contacts that have a friend. And display the contact and the friend’s names:

SELECT ?name ?friend
WHERE { ?subject test:hasFriend ?f ;
                 test:name ?name .
        ?f test:name ?friend }
  Fred, John

Let’s ask for all the pets that are owned:

SELECT ?subject WHERE { ?unknown test:owns ?subject }
  test:Max
  test:Picca

Oh, not the subject. The names. How did we do that again? Right:

SELECT ?name
WHERE { ?unknown test:owns ?subject .
        ?subject test:name ?name }
  Max
  Picca

This will of course yield the same results in our situation:

SELECT ?name
WHERE { <test:John> test:owns ?subject .
        ?subject test:name ?name }
  Max
  Picca

But this won’t: Fred doesn’t own any pets. Only John owns pets.

SELECT ?name
WHERE { <test:Fred> test:owns ?subject .
        ?subject test:name ?name }

Let’s print the owner’s and the pet’s names:

SELECT ?owner ?name
 WHERE { ?unknown test:owns ?subject ;
                  test:name ?owner .
         ?subject test:name ?name }
  John, Max
  John, Picca

Still with me? Let’s now conclude with requesting the names of the contacts who are a friend of the person who owns Picca:

SELECT ?name
WHERE { ?subject test:owns <test:Picca> .
        ?unknown test:hasFriend ?subject ;
                 test:name ?name }
  Fred
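To see why this last query yields Fred, it can help to evaluate the three triple patterns by hand. The sketch below is a deliberately naive matcher over the example data, written only to illustrate how shared variables join the patterns; it is not how Tracker executes SPARQL, and the `match` function is hypothetical.

```python
TRIPLES = [
    ("test:John", "test:owns", "test:Max"),
    ("test:John", "test:owns", "test:Picca"),
    ("test:John", "test:name", "John"),
    ("test:Fred", "test:hasFriend", "test:John"),
    ("test:Fred", "test:name", "Fred"),
]


def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # A variable must keep the value it was bound to earlier.
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, candidate)


query = [
    ("?subject", "test:owns", "test:Picca"),
    ("?unknown", "test:hasFriend", "?subject"),
    ("?unknown", "test:name", "?name"),
]
names = [b["?name"] for b in match(query, TRIPLES)]
```

The first pattern binds ?subject to test:John, the second then binds ?unknown to test:Fred, and the third looks up the name of that contact.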

Invitation for Jürg and Rob: How about you guys writing an introduction to OPTIONAL, SUM, COUNT, GROUP-BY and FILTER, etc in SPARQL? :-) The more advanced stuff.

The ontology descriptions of Tracker, now in Turtle

Thanks to Jürg, the experimental branch of Tracker now stores its ontology descriptions using the Turtle format.

What is an ontology anyway?

Wikipedia sums it up pretty well: In computer science and information science, an ontology is a formal representation of a set of concepts within a domain and the relationships between those concepts.

What is Turtle?

The w3 specification explains it as a textual syntax for RDF that allows RDF graphs to be completely written in a compact and natural text form, with abbreviations for common usage patterns and datatypes.

Turtle is the format that we want to standardize on. For example, we plan to use it for some of our interprocess communication needs, we are already using it for backup and restore support, and we used it as the base format for persisting user metadata on removable devices.

An example snippet from the ontology in Turtle:

nie:InformationElement a rdfs:Class ;
     rdfs:label "Information Element" ;
     rdfs:subClassOf rdfs:Resource .
nie:title a rdf:Property ;
     rdfs:label "Title" ;
     rdfs:comment "The title of the document" ;
     rdfs:subPropertyOf dc:title ;
     nrl:maxCardinality 1 ;
     rdfs:domain nie:InformationElement ;
     rdfs:range xsd:string ;
     tracker:fulltextIndexed true .
nfo:Document a rdfs:Class ;
      rdfs:label "Document" ;
      rdfs:comment "A generic document. A common superclass for all documents on the desktop." ;
      rdfs:subClassOf nie:InformationElement .

Okay …

It only made sense for us to destroy and eliminate inifile formats like the ontology descriptions of the current non-experimental Tracker. Let me explain:

We have plans to add support for adding domain specific custom ontologies. By that I mean that it’ll be possible for an application to install and remove an ontology, to get the application specific metadata out and to restore this data as part of reinstalling a custom ontology.

As I already pointed out during my Tracker presentation at FOSDEM, this doesn’t mean that we encourage application developers not to care about the base ontology. In fact we strongly recommend that application developers stick to Nepomuk instead of diverging much from it.

Meanwhile experimental Tracker’s indexer has started to work and is storing things in our decomposed RDF storage that uses Nepomuk as schema and can be queried using SPARQL.

Vision

My vision for metadata storage is that files are just one kind of resource. Tracker’s indexer mostly collects the metadata about those file-based resources. However, metadata is everywhere: in META tags of websites and RSS feeds, in both locally stored and streamable remotely located multimedia resources, in E-mails, meeting requests and calendar items, in your roster and contacts list, in daily events, in computer events and installed applications.

Computer events? I hear you thinking. Well, for example a “hardware” event like your location changing just before you took a picture. That location can be harvested as metadata about the picture. There are many more examples imaginable.

For some data, metadata is the only thing really being stored. For example, contact resources don’t hold much more data than today’s ontologies describe. For most of the others, metadata will describe the resource and often, more importantly, its relationship with other resources.

We want Tracker to be your framework for metadata on both desktops and mobile devices. This is why we want to use w3 standards like Turtle over our own formats. We just happen to see things quite big.

No more old XML RDFQuery, but SPARQL. No more inifiles, but Turtle. No more home brewed ontology, but Nepomuk.

Merge-to-trunk plans

We are planning to start merging the experimental stuff to trunk, and start calling it the 0.7.x series of Tracker. Let’s see how many cocktails we’ll need for Jamie to get drunk enough to start accepting our insane, but already working, ideas.

Utilitarianism

Introduction

In a discussion some concluded that technology X is ‘more tied to GNOME’ than technology Y because ‘more [GNOME] people are helped by X’ due to dependencies for Y. Dependencies that might be unacceptable for some people.

This smells like utilitarianism and therefore it’s subject to criticism.

Utilitarianism is probably best described by Jeremy Bentham as:

Ethics at large may be defined, the art of directing men’s actions to the production of the greatest possible quantity of happiness.

— Bentham, Introduction to the Principles of Morals and Legislation

A situational example that, in my opinion, falsifies this:

You are standing near the handle of a railroad switch. Six people are attached to the rails. Five of them at one side of the switch, one at the other side of the switch. Currently the handle is set in such a way that five people will be killed. A train is coming. There’s no time to get help.

  • Is it immoral to use the handle and kill one person but save five others?
  • Is it immoral not to use the handle and let five people get killed?

The utilitarianist chooses the first option, right? He must direct his actions to the production of the greatest possible quantity of happiness.

Body of the discussion

Now imagine that you have to throw a person on the rails to save the lives of five others. The person would instantly get killed but the five others would be saved by you sacrificing one other.

A true utilitarianist would pick the first option in both exercises; he would use the handle and he would throw a person on the rails. In both cases he believes his total value of produced happiness is (+5) + (-1) = (+4), and he believes that in both situations picking the second option means his total value of produced happiness is (-5) + (+1) = (-4). The person who picks the second option is therefore considered immoral by a true utilitarianist.

For most people that’s not what they meant the first time. Apparently ethics don’t allow you to always say (+5) + (-1) = (+4) about happiness. I’ll explain.

The essence of the discussion

Psychologically, fewer people will believe that throwing a person on the rails is morally the right thing to do. When we can depersonalize the act, we make it easier for our brains to handle such a decision. Ethically and morally the situation is the same. People feel filthy when they need to physically touch a person in a way that’ll get him killed. A handle makes it easier to kill him.

Let’s get back to the GNOME technology discussion … If you consider pure utilitarianism as most ethical, then you should immediately stop developing for GNOME and start working at Microsoft: writing good Windows software at Microsoft would produce a greater possible quantity of happiness.

Please also consider reading criticism and defence of utilitarianism at wikipedia. Wikipedia is not necessarily a good source, but do click on some links on the page and you’ll find some reliable information.

Some scientists claim that we have a moral instinct, which is apparently programmed by our genes into our brains. I too believe that genetics probably explain why we have a moral system.

The developer of X built his case as follows: My technology only promotes happiness. The technology doesn’t promote unhappiness.

It was a good attempt but there are multiple fallacies in his defense.

Firstly, in a similar way technology Y doesn’t promote unhappiness either. If this is assumed about X, then neither promotes unhappiness.

Secondly, how does the developer of X know that his technology promotes no unhappiness at all? Y also promotes some unhappiness and I don’t have to claim that it doesn’t. That’s a silly assumption.

Thirdly, let’s learn by example: downplaying the amount of unhappiness happens to be the exact same thing regimes having control over their media also did whenever they executed military action. The act of downplaying the amount of unhappiness should create a reason for the spectator to question it.

Finally, my opinion is that the very act of claiming that ‘X is more tied to GNOME’, will create unhappiness among the supporters of Y. Making the railroad example applicable anyway.

My conclusion and the reason for writing this

‘More’ and ‘less’ happiness doesn’t mean a lot if both are incommensurable. Valuations like “more tied to GNOME” and “less tied to GNOME” aren’t meaningful to me. That’s because I’m not a utilitarianist. I even believe that pure utilitarianism is dangerous for our species.

To conclude, I think we should prevent the GNOME philosophy from being damaged by too much utilitarianism.

Mindstorm … s

You buy a bunch of Lego Mindstorms bricks and you start building a robot to remotely control your mobile devices.

Well, that’s the official explanation.

The actual explanation is that this is what happens when you are 26 years of age, your girlfriend tells you you are almost 30 and that when you are 30 it’s the end of your youth (although, people of that age usually tell me this ain’t true), you are a nerd of the type software developer (and quite addicted to this too), you have your own business and therefore your accountant asks to make some expenses (like .. buying a Mindstorms robot! No?).

I acknowledge it’s probably just an early midlife crisis. Boys want to make things, fiddle with stuff, put things together. Whereas girls, girls just wanna have fun. I’m totally guilty of being a boy. I know. (although, I’m sure a lot of girls enjoy making things too — before I get killed by a group of feminists –).

Now that the model itself is finished, I clearly see what I am becoming: an old lonely dude who plays with trains, electricity stuff and mostly breaks things just to put them back together. I’ll probably die getting electrocuted while trying to take apart a by that time old holographic 3D gesture recognizing display, as I’m trying to figure out whether some evil corporation is spying on its customers by using such electronic devices.

But, isn’t that cute? No? I mean, Tinne, seriously, now I must be ‘like’ a younger dude, no? I have been playing with toys for kids aged 11 to 16 (that’s what the Lego box’s age indicator says, so it must be true). Anyway, the only way that it can get worse now, is if I’ll start writing software for this Lego model. I’ll have a camera view on my screen where I can mouse-over so that the robot will follow my mouse pointer. With a library like GStreamer I can let that camera image go efficiently over a distance. Sending some commands over a socket ain’t very hard.

About the bot itself: it has three axes. One (the X one) uses normal wheels, two others (Y and Z) are built on top of the chassis. All axes are controlled by Mindstorms motors. The Mindstorms computer thing is integrated in the model, and there’s a touch sensor on one of the axes (the Z one). I don’t yet have this software; that’s the next thing I’ll (try to) finish. I’ve spent ~ 450 euros on this thing (the normal Mindstorms package didn’t have enough bricks, but the programmable thing, the sensors and the motors are ~ 300 euros).

But hey, 450 euros for something that you could give to a little fellow as soon as you are done playing with it? That’s not much for multi functional and multi age toys! I mean, if I get bored of this thing, I can make another robot with it. If you have a son (or a technical minded daughter), you can let him (or her) play with the Lego bricks while watching his (her) brains grow! You can’t convince me that today’s computer games are better for training a kid’s brain than Lego.

After the kid is finished building the bot, you can make the software for it. Hah! Perfect father – son (or daughter) relationship. You actually help him make his toys, and you enjoy doing that! And … he’ll get interested in software development, join one of the many free software communities, he’ll find a job in IT as programmer, etc etc.

Lego rocks!