October 2006 – How easy it is to make people believe a lie, and how hard it is to undo that work again

A more typical situation

Extremes are interesting for development and knowing whether you are doing things wrong. The average user, however, isn’t going to fetch 30,000 headers from an IMAP service on his cellphone or mobile device. Yet that is not an excuse for not supporting the extreme cases. My aim is to support 50,000 messages on the Nokia 770 (but sorting such a folder will take a few seconds).

I therefore decided to do a memory analysis of a more typical situation: downloading a folder of 1,000 headers, going offline, opening the folder, closing the folder and reopening the folder.

The red bar was the INBOX folder (almost the same size as the test folder). The blue was downloading the test folder. The two green bars were opening it, closing it, opening it.

What do we learn from this analysis? We learn that there are still a lot of normal mallocs causing ~200k heap-admin. I bet hunting them down and replacing them with gslice would reduce that by ~100k. Most normal mallocs are in camel-mime-utils.c in case you want to make yourself guilty of killing them. Just regex-search for “sizeof\s*(\*”. One problem is that the early Camel authors often freed and allocated these outside of the camel-mime-utils.c file. Replacing one with gslice therefore often means searching all of Camel for possible free calls on that pointer.
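If you would rather script that search than type it into an editor, here is a minimal sketch in Python. The sample C lines are made up for illustration, and note that in Python's regex syntax the opening parenthesis has to be escaped.

```python
import re

# Escaped form of the pattern from the text: "sizeof", optional whitespace, "(", "*".
PATTERN = re.compile(r"sizeof\s*\(\*")


def find_malloc_candidates(source: str) -> list[int]:
    """Return the 1-based line numbers that contain the sizeof (*ptr) idiom."""
    return [number
            for number, line in enumerate(source.splitlines(), start=1)
            if PATTERN.search(line)]


# Hypothetical C fragment, only here to exercise the pattern.
SAMPLE = """\
struct _header *h = g_malloc(sizeof(*h));
int n = sizeof(int);
node = malloc (sizeof (*node));
"""

print(find_malloc_candidates(SAMPLE))
```

Each hit is only a candidate: you still have to find where that pointer gets freed before you can switch it to gslice.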

We learn that realloc accounts for ~100k, which is mostly caused by one GPtrArray (the one of the CamelFolderSummary instance of the active folder). Not much I can do about it now.

We learn that downloading (now) takes an almost equal amount of memory at its peak when compared to loading it. That’s of course because I now periodically synchronize the on-disk mmap while downloading happens in iterations. You can see the four or five iterations that happened on the graph too.

We learn that loading a typical folder consumes something like 200k in total. That is why ~200k heap-admin matters. Reducing this would mean that loading two such folders would also be possible on a device that has exactly enough memory to load one typical folder. This means allowing the user to run one more game, one more other application. Having to install one less memory chip. Making the device lighter. Making it consume less battery.

I would like more non-me testing. Everybody is invited to test tinymail by for example compiling and running its demoui:

svn co https://svn.tinymail.org/svn/tinymail
wget http://pvanhoof.be/files/create_tinymail_test_account.sh
sh create_tinymail_test_account.sh
cd tinymail/trunk
./autogen.sh --prefix=/opt/tinymail --enable-gnome=no
make && sudo make install
/opt/tinymail/bin/tinymail # and enter unittest as password

There’s indeed no more Camel to compile. It’s embedded in the build, don’t worry about it anymore. Please run it with valgrind or your other fancy memory analysis tool. Feel free to find weak spots. Things that are too slow. Situations that consume too much memory. Report them on the mailing list.

T-Dose

On the 2nd or 3rd of December I will be giving a talk about tinymail at T-Dose. T-Dose is a new conference event in the Netherlands that aims to bring the FOSDEM conference formula to that country.

I don’t know the schedule myself yet, so I don’t know (yet) which day nor at what hour I will do the talk.

Music by and for coders

Some cool music from one of those Nokia people (Karoliina, to be more specific) working on Maemo

Lots of “Dream” titled songs. I’ll try coding to them tomorrow. The ones that I have listened to so far are actually very good.

Instead of commercial electro and techno, I’ll use those for my next video-demo. I’m hoping to demo Modest (or another demoui) showing 55,000 message headers on a Nokia 770 and a PocketPC running GPE … soon. We’ll see. It’s not a promise, just an aim, something I should do more often, etc.

I just need to get myself a device that will run GPE, like a PocketPC. Is somebody near Belgium into letting me install tinymail on his device :-)? I will need ~50MB of flash disk space, GPE installed and ~15MB of RAM on it.

Less, less!!

You remember this one? It’s Evolution’s current Camel downloading 30,000 headers from IMAP:

This one was a first hack that I did that drastically reduced memory consumption. You remember it?

Ok, so this one is what tinymail will do today (all graphs show the same folder, the same amount of headers being downloaded, indeed):

That’s indeed four times less memory than the original Camel as shipped by Evolution. What you don’t see in the graph is that it also uses far less bandwidth, that the implementation is a lot less complicated and that, mostly because it transfers less data, it is faster (You don’t believe me about the complexity? Compare with the original here).

Memory consumption, speed and bandwidth consumption are extremely important for mobile & embedded appliances (both GPRS traffic and memory are expensive, being online often consumes battery, the CPU of the devices is often slower). Mobile & embedded appliances are the (current) focus of the tinymail framework.

You don’t have to believe me. Go ahead and try it yourself with what is current in tinymail’s repository. There’s a test account available: mail.tinymail.org, u:tinymailunittest, p:unittest. If people are now going to abuse this account, the account won’t be available anymore (I do measure traffic and directory size frequently, indeed).

Initial download of a large IMAP folder on tinymail

Camel (a library being used by Evolution and, in a heavily hacked version, by tinymail) does something like this when you download 30,000 message headers from an IMAP service’s folder for the first time:

The Camel in tinymail does something like this (after this patch, as I have to hack a few other providers to support this hack first):

Anyway, this hack together with the mmap one (which has already been applied to tinymail’s camel) makes it possible to use very large folders (more than 20,000 headers) on devices with ~10M RAM. Like maybe the cellphones of tomorrow?

How? Simple: after every 1,000 headers I dump the headers to the summary file (the one that will be mmapped), store some status and continue fetching more headers from the IMAP server.

This also improves canceling. Because it now dumps after every 1,000 headers, you can cancel the operation without losing the headers that were already received and dumped in those batches of 1,000 (unlike what happens if you cancel such a download in Evolution).
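To make the idea concrete, here is a minimal sketch in Python. It is not the Camel code: the function names, the newline-separated file format and the should_cancel callback are all assumptions made for illustration. It only shows the shape of the technique, which is to keep at most one batch in memory, write it out, and let a cancel keep everything written so far.

```python
import os
import tempfile

BATCH_SIZE = 1000  # the text dumps after every 1,000 headers


def _flush(summary, batch):
    """Write one batch to disk, empty it, and return how many headers were written."""
    summary.writelines(header + b"\n" for header in batch)
    summary.flush()
    os.fsync(summary.fileno())
    written = len(batch)
    batch.clear()
    return written


def download_headers(headers, summary_path, should_cancel=lambda count: False,
                     batch_size=BATCH_SIZE):
    """Append headers to summary_path in batches; return how many were persisted.

    `headers` is any iterable of bytes (a stand-in for the IMAP fetch) and
    `should_cancel` is a hypothetical callback that gets the number of headers
    received so far.
    """
    persisted = 0
    batch = []
    with open(summary_path, "ab") as summary:
        for header in headers:
            if should_cancel(persisted + len(batch)):
                break  # only the unfinished batch is lost
            batch.append(header)
            if len(batch) == batch_size:
                persisted += _flush(summary, batch)
        else:
            persisted += _flush(summary, batch)  # final, partial batch
    return persisted


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "summary")
    fake_fetch = (b"header-%d" % i for i in range(2500))
    kept = download_headers(fake_fetch, path, should_cancel=lambda n: n >= 2300)
    print(kept, "headers survived the cancel")
```

Peak memory in this sketch is bounded by the batch size rather than by the folder size, which is the whole point.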

Doing it with less

Another significant milestone has been reached for tinymail. I finally achieved getting the non-mmap memory consumption for the folder summaries lower than the amount of data being mapped using mmap.

These are averages, but since all messages were spam, probably meaningful. You can read the source code of this test, then compile and run it to reproduce the results yourself.

Loading 50,000 headers used to consume something like 30M. More typical numbers were the 10M for 15,000 headers. The idea was that not many people want to view 50,000 headers on a mobile device. Boy, I can come up with lame excuses for abusing your memory, don’t you think?

Nevertheless, I improved the situation. 50,453 headers now consume 15.9M, 30,264 headers now consume 10M and 15,000 headers now consume 4.7M. At around 3,000 headers tinymail goes below the 1M barrier. Most people use their mobile device to view fewer than 5,000 headers. That makes tinymail a cheap E-mail client infrastructure when it comes to memory consumption.
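For those who prefer a per-header figure, the numbers above can be divided out. This small Python sketch does just that; the only assumption is that “M” is read as MiB (1024 × 1024 bytes).

```python
MIB = 1024 * 1024  # assumption: the "M" in the text is read as MiB

# headers -> megabytes, as quoted in the text
BEFORE = {50_000: 30.0, 15_000: 10.0}
AFTER = {50_453: 15.9, 30_264: 10.0, 15_000: 4.7}


def bytes_per_header(headers, megabytes):
    """Average memory cost of one summary item, in bytes."""
    return megabytes * MIB / headers


def report(measurements):
    """Map each header count to its rounded per-header cost."""
    return {count: round(bytes_per_header(count, mb))
            for count, mb in measurements.items()}


print("before:", report(BEFORE))
print("after: ", report(AFTER))
```

That comes out at roughly 330 to 350 bytes per header now, against roughly 630 to 700 before. At about 330 bytes each, 3,000 headers need a bit less than 1M, which matches the barrier mentioned above.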

How did I do this? Well, I don’t yet support threaded sorting of messages. Therefore I removed that from the Camel that I’m using. This, for example, has cut 7% of the allocated memory usage. I also removed a few other pointers in the struct that is allocated per such summary item. Each pointer is of course 4 bytes on my test system. When dealing with 50,000 instances, it indeed becomes more significant.
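The pointer arithmetic is simple enough to check by hand, but here it is as a tiny Python sketch. The 4-byte pointer size comes from the text; the function name is just for illustration.

```python
POINTER_SIZE = 4  # bytes per pointer on the 32-bit test system from the text


def bytes_saved(removed_pointers, items, pointer_size=POINTER_SIZE):
    """Memory saved by dropping pointers from a struct that exists once per item."""
    return removed_pointers * items * pointer_size


# One pointer removed from the per-message struct, for a 50,000 message folder.
print(bytes_saved(1, 50_000), "bytes")
```

So every single pointer removed from the summary item struct is worth about 200,000 bytes on a 50,000 header folder.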

Matthew Barnes especially, and I, have also been upgrading some of the e-d-s infrastructure being used by Camel to modern glib functionality: GSlice instead of EMemChunk (which, if done correctly, seems to reduce heap admin a little bit), GMutex instead of EMutex, GThread/GAsyncQueue in EMsgPort, and killing EThread. I also removed something called “tracing” and this debug-only “magic” checking from the CamelObject type (the magic stuff added 4 bytes to each CamelObject instance) and made many other micro improvements. Most of the memory savings, however, happened by removing a few pointers from the CamelMessageInfoBase structure (the summary item structure in Camel). That structure is indeed a major contributor to why your Evolution is consuming a lot of your RAM.

Note that tinymail also uses an mmap. The size of this mmap is, at this moment, near the size of the allocated memory consumption. Does this mean that you need to double its memory usage in your mind? Not exactly. On Linux, an mmap uses on-demand paging, which is implemented in the kernel. Especially on devices with little remaining memory, a tinymail-based E-mail client won’t consume a lot: the kernel will probably page out most of the mapping until you start using the application again.
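Here is a minimal Python sketch of what reading from such a mapping looks like. The fixed 64-byte record layout is purely hypothetical (it is not Camel’s summary format) and is only there so that one record can be located by offset. Mapping the file does not read it; the kernel is expected to fault pages in only when a slice is actually touched.

```python
import mmap
import os
import tempfile

RECORD_SIZE = 64  # hypothetical fixed-size record, not Camel's real summary format


def write_summary(path, subjects):
    """Write each (ASCII) subject as one zero-padded, fixed-size record."""
    with open(path, "wb") as f:
        for subject in subjects:
            f.write(subject.encode("ascii")[:RECORD_SIZE].ljust(RECORD_SIZE, b"\0"))


def open_summary(path):
    """Map the whole file read-only; the mapping stays valid after the file is closed."""
    with open(path, "rb") as f:
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)


def read_subject(mapping, index):
    """Touch only the bytes of one record."""
    start = index * RECORD_SIZE
    return mapping[start:start + RECORD_SIZE].rstrip(b"\0").decode("ascii")


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "summary.mmap")
    write_summary(path, ["subject %d" % i for i in range(1000)])
    summary = open_summary(path)
    print(read_subject(summary, 999))
```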

I created a second memory analysis report for tinymail. You are hereby invited to reproduce the test and results. Feel free to add more memory reports and measurements yourself if you are interested in doing that.

Saving attachments with tinymail

I moved “the saving of mime parts” out of the tinymail framework. Examples of mime parts are the attachments of an E-mail.

I did this because the user interface for “saving them” will be application specific: your E-mail client might want to use a popup menu for telling the user “this is how you save this attachment”. But this other E-mail client might use the menu invoked by the top-left button on your cellphone. You know, the one above your “call” button.

If tinymail would implement a TnyMsgView by using a popup menu for this, it wouldn’t be possible for (for example) a vendor like Nokia to use that component on cellphones. Unless of course they would hack a gtk-type like the menu. Else they would be forced to fully implement a message view. While that, because everything in tinymail for that reason is an interface, is perfectly possible, it’s not as cool as allowing the developer to inherit mine.

To prefer things like strategy and decorator (delegation) over inheritance is one of those design pattern principles. Imagine I have just received a spreadsheet file in my E-mail box. Not having a viewer for that, my phone obviously can’t view that file. Ahhh .. maybe some phones have a viewer?! But I bet those phones are more expensive. Tinymail wants to be usable on both types of phones. Tinymail is for all phones, for all mobile devices, for all embedded appliances and soon for all web applications that expose E-mail functionality. If not, that’s a design bug which you can report.

Say there would be no good reason to save it on the phone itself. Maybe the phone will someday support a virtual filesystem? I doubt that, and if it will it might not be (at all) like our gnome-vfs. In my opinion, gnome-vfs definitely isn’t the right tool for that.

I made “the saving of a mime part” a strategy that the application developer can implement himself. I have a default type for saving, implemented using gtk+. It can be inherited of course. The interface of it can, however, also be implemented. For example an implementation that does nothing or one that saves it on “myspace” instead of on the phone. You know, by using the secret GPRS service of this “myspace” vendor.
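A minimal sketch of those two alternative strategies, in Python. None of these class names come from tinymail’s Python bindings; they are hypothetical and only illustrate the pattern. The “remote” strategy just records what it would upload.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class MimePart:
    """Hypothetical stand-in for a mime part: a name plus its payload."""
    filename: str
    data: bytes


class MimePartSaveStrategy(ABC):
    """The interface the application developer implements."""

    @abstractmethod
    def save(self, part: MimePart) -> None:
        ...


class NullSaveStrategy(MimePartSaveStrategy):
    """For devices where saving makes no sense: do nothing."""

    def save(self, part: MimePart) -> None:
        pass


class RemoteSaveStrategy(MimePartSaveStrategy):
    """Stand-in for saving somewhere off the device; records uploads in memory."""

    def __init__(self):
        self.uploaded = {}

    def save(self, part: MimePart) -> None:
        self.uploaded[part.filename] = part.data


class MsgView:
    """The view delegates saving; it never knows how or where a part is saved."""

    def __init__(self, strategy: MimePartSaveStrategy):
        self._strategy = strategy

    def set_save_strategy(self, strategy: MimePartSaveStrategy) -> None:
        self._strategy = strategy

    def perform_save(self, part: MimePart) -> None:
        self._strategy.save(part)
```

Swapping the strategy at runtime changes what “save” means without touching the view, which is the reason for preferring delegation here.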

It’s not my business how they, the phone- and this “myspace” vendor, will agree on a feature like this. This is exactly why tinymail is LGPL, not GPL. They are encouraged and allowed to keep their stuff closed. They are equally encouraged and will get my assistance to open it. But one thing is a certainty: that it’s not my business nor a decision of some opensource community to make for them.

Soon tinymail will have language bindings for programming environments like D, Java and .NET. Programming with tinymail while using these environments will look like this. Feel free to CamelCase and fix coding style yourself (it’s just an example). The links point to the tinymail framework API pieces being used. I did that just in case somebody still believes that tinymail isn’t documented.

DotNetters: #define extends :\n #define implements ,\n #define super base

public class MyOwnMsgView extends TnyGtk.MsgView implements Tny.MimePartSaver
{
        private Button save_button;
        private Tny.MimePartSaveStrategy saver;

        public MyOwnMsgView () : base () {
                this.save_button = new Button ();
                // Subscribe to the click event (C# events are attached with +=)
                this.save_button.Clicked += new EventHandler
                        (on_save_button_clicked);
        }

        public void on_save_button_clicked (object s, ...) {
                this.perform_save (super.get_msg ());
        }

        // Lazily falls back to the default gtk+ save strategy
        public Tny.MimePartSaveStrategy get_save_strategy () {
                if (this.saver == null)
                        this.saver = new TnyGtk.MimePartSaveStrategy ();
                return this.saver;
        }

        public void set_save_strategy (Tny.MimePartSaveStrategy s) {
                this.saver = s;
        }

        // Delegates the actual saving to whatever strategy is set
        public void perform_save (Tny.MimePart part) {
                this.get_save_strategy ().save (part);
        }
}

ps. Showing and explicitly linking to, for example, its documentation is indeed my personal technique for both forcing myself to do it, and proving that I did it. Unlike most libraries and projects, tinymail is documented. Its documentation, flexibility, design and testability are the top priority of the project. And … If I say that I’ll do it, I do it. Pe ri od.

Handling mime parts, integration with for example Dates

A month ago I had this funny idea that the tinymail 1.0 API was not going to change significantly.

Strange that I have these ideas.

Anyway. I changed something significant. I changed the entire concept of viewing mime parts like attachments, pgp signatures, e-mail bodies and other mime parts.

There have been two to-do items on the tinymail development pages. One about integration with Dates and one about having PGP support using for example Seahorse.

Both Dates and Seahorse are of course optional and will probably have to be replaced with something else on your specific device. Don’t worry, I know that. Maybe you indeed have in-house calendaring software installed on the mobile devices? Maybe you simply don’t support it? Maybe you want to upload meetings to a service on the Internet instead of registering them with the calendaring tool on the device? I don’t know. Nor will tinymail know. But you will make it know about that.

I do know that there are a lot of people, a lot of devices and a lot of possible situations to support. I also know that we have this one certainty in the IT industry: the simple fact that everything is going to change sooner or later.

So the idea is to let you implement your own TnyMimePartView types. Possibly by inheriting mine but definitely not necessarily. You can, as with all tinymail types, implement just the interface and not care about whatever I cooked for you. Maybe you want to implement it using ASP.NET by letting it render a web page?

Tomorrow some stupid company x invents a new mime part that it uses for … I don’t know, calendaring or whatever. Adding support for this new mime part is simply going to be as difficult as implementing a new TnyMimePartView and registering it with the TnyMsgView.
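To give an idea of what “implementing and registering” could look like, here is a minimal Python sketch. It is not tinymail’s API: the class and method names, and the choice to key the registry on the content type string, are assumptions made purely for illustration.

```python
class MeetingRequestView:
    """Hypothetical mime part view for calendar data."""

    def render(self, payload: bytes) -> str:
        return "Store this meeting request? (%d bytes)" % len(payload)


class PlainTextView:
    """Hypothetical mime part view for text bodies."""

    def render(self, payload: bytes) -> str:
        return payload.decode("utf-8")


class MsgView:
    """Keeps a registry of mime part views, keyed on content type."""

    def __init__(self):
        self._views = {}

    def register_mime_part_view(self, content_type: str, view) -> None:
        self._views[content_type.lower()] = view

    def show(self, content_type: str, payload: bytes):
        """Render with the registered view, or return None for unknown types."""
        view = self._views.get(content_type.lower())
        if view is None:
            return None
        return view.render(payload)


msg_view = MsgView()
msg_view.register_mime_part_view("text/plain", PlainTextView())
# Supporting a new mime part is one new class plus one registration:
msg_view.register_mime_part_view("text/calendar", MeetingRequestView())
print(msg_view.show("text/calendar", b"BEGIN:VCALENDAR"))
```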

I will soon create a TnyMimePartView implementation that will ask you whether or not you want to store an attached meeting request in Dates and another one that will verify a PGP key using, probably, Seahorse. They will, as usual, serve both as examples and as the implementations that will probably also be used by Modest on the Nokia 770.

One step closer to having an integrated PIM on your cute Nokia 770.

Oh btw. Bart reassured me he is still working on the .NET bindings for tinymail. Like all of us, he’s busy with his daytime job and building his house and stuff like that. I guess we will have to exercise some patience. The Python bindings are of course still functional.

VMWare saves my consultant’s-ass

I’m so lucky the people at VMWare created a great virtualisation platform that integrates well with the (Windows) desktop. Being a consultant, for my current/new project, I have to use a laptop provided by the company/customer for whom I will do the project. I’m allowed to install any software on it. But I’m not allowed to replace the Operating System (Windows XP), as they have some strange agreement with Microsoft. Something like: your staff ‘has’ to use Microsoft software to do their jobs, so all company laptops ‘must’ be installed with Windows XP. Hrmm, strange Microsoft sales people with their strange monopoly strategies. Do they really think they will win popularity contests this way? Do they really believe that this way, software developers will consider Microsoft to be a serious company?

So I will install the “software” VMWare player and I will add the software’s configuration data “an Ubuntu Dapper image” to that VMWare “software”. The dudes at IT, at my customer, agreed with that setup. I will be using NAT so that they will not have to care about the Linux OS asking for an IP address nor will they see a new MAC address in their super cool Windows firewalls.

And since the integration with that Windows desktop works smoothly, it will not block me a lot from working. The task is upgrading some custom software that was created for Solaris. Can you imagine developing software for Solaris using Windows tools like notepad? At least now I will have GNU gcc and stuff like vim and gmake, heck I’ll even have Anjuta available in that Ubuntu VMWare image. That’s a lot better than having to install a bunch of crazy Windows tools or having to develop on an aged Solaris 2.6, on which installing a modern GNOME is more like a three week adventure than something you do in a few hours.

Thank you VMWare team. Great product. The product ‘really’ makes me much more productive. Yet you guys don’t push a lot of marketing bullshit about this (like the VS.NET sales guys at Microsoft did and are still doing).

Statistics, popularity, future and change

You have lies, damned lies and you have statistics like daily website usage statistics. I sometimes try to use these statistics to measure whether or not people are actually interested in this tinymail thingy.

In the graph you often see peaks. Especially in the blue and green colored bars, which represent things like the hits and the number of requested files on the web server. These peaks, not surprisingly, happen each time after I blog about tinymail.

More interesting are the visits (yellow and orange). Obviously the number of visits peaks a little bit on the same day the blue and green bars peak. The good news is that the number of visits stays relatively high a few days after the blog, whereas the number of hits doesn’t. You can see this the 6th day of September and the 4th day. Same happened the 29th day of September.

Interesting is also that the number of sites (unique IP addresses) is not very different from the number of visits. Does this mean that companies do a lot of visiting, whereas people do a lot of hits? It’s a vague statistic indeed.

Maybe it’s just me interpreting the results the way I want them to be (the typical psychological mistake most people make when looking at statistical data). I believe it’s important to keep this psychology in mind. It’s the first full month that I measured this; I will of course measure it more often and try to identify both true trends and my psyche-mistakes.

I’ve been surprised by the interest in the project. I hope this will soon translate to project members, contributors and contributions. Chances are high there are going to be very interesting announcements in future. Chances are equally high there won’t be such announcements. A lot depends on how many competent people will invest time in the project. A lot depends, they told me, on my own availability and willingness to partly invest my own career in the project. If more people join the project, pressure on me will be less but interest in the project will be more.

A lot of people told me that the future is web E-mail. For desktop, yes probably. Or maybe? I’m not even certain. For all mobile devices and embedded appliances, I’m less certain about that. I will also steer tinymail to be a framework for web E-mail application development. Not everybody will be using G-, Hot- and Xmail if E-mail will or would always be accessed using HTML. A lot of companies might be interested in having their own E-mail appliance in their own company rack. Maybe E-mail services on devices like wifi routers? A lot of companies hate having to deal with setting up both E-mail servers and clients like Outlook.

A lot of today’s HTML E-mail services and websites suck on most, if not all, mobile devices. A trend or an unavoidable fact for mobiles? Is everything to become a website? Or is that just silly Web 2.0 marketing activity?

Too many people, including me on occasion, think black/white: they like their GMail thing and now think it’s simply impossible that there are other possibilities for E-mail, other than their own focus-idea about it.

I say, let us get some software built on top of tinymail. A lot of people are waiting for Modest. Modest will happen. But I’m also interested in other users and cases. The API is going to be like this. That is a certainty. Yes, things are still going to change. But not as drastically as in the last few months. It’s more or less becoming what I wanted it to be. Change is good, it means that you can request it when building your software with it.

But tinymail is designed with this change in mind. It’s flexible and adaptive. This means that change doesn’t necessarily mean API change. It means extending. Adding. Keeping it adaptive and flexible is what required the massive refactoring of the last months. Next time this happens, the major version number will flip and a new API directory and branch will be made. My plan is to guard API and ABI per major version.

