Tinymail newsflash

Today tinymail not only uses less than 4 MB of RAM for viewing large e-mail folders, it’s also becoming blazing fast and it’s getting the minimum features a small e-mail client needs. The big missing feature for being alpha-release-ready is creating e-mails (and sending them using a camel transport provider).

After some attempts to remove the allocation peak, I shaved a megabyte off it by unreferencing the camelfolder instances more quickly. Peaks are undesirable because they still force the hardware builder to add memory to the device. It’s good that memory doesn’t stay allocated, but simply not needing the memory while still being fast enough is of course better. I’m never satisfied! Now tinymail is actually faster, yet its folder-load allocation peak is lower. Win-win.

Some guys on IRC want me to build an ncurses version of tinymail, using libtinymail. They are, of course, invited to start. I’d certainly join. The libtinymail and libtinymail-camel libraries don’t depend on any GUI thing.

API beautification

Performance tweaking

  • Header-list model: Caching the total length of a folder replaces a g_list_length on the header glist.
  • Folder type: Caching the priv pointer of the folder type in the loop over each message header, makes the slower gobject get_private stuff happen less often.
  • Folder and message header types: Not duplicating the uid of the messages, but keeping the gptrarray until the folder must finalize itself, letting the message header instances have their uid point to the string instances in the gptrarray.
  • Message header type: Removed the priv pointer of the header proxy type. Made the entire type opaque (forward typedef). This makes the slower gobject get_private stuff not happen for this type anymore.
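The first and third tweaks above can be illustrated with a small sketch in plain C. This is not tinymail's code: `HeaderList`, `HeaderNode` and the function names are hypothetical, and a hand-rolled linked list stands in for the GList. It only shows the idea of keeping a length counter next to the list instead of walking it, and of letting headers borrow uid strings that somebody else owns instead of duplicating them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node type, standing in for a GList of message headers. */
typedef struct HeaderNode {
    const char *uid;            /* borrowed: the folder keeps ownership of the string */
    struct HeaderNode *next;
} HeaderNode;

typedef struct {
    HeaderNode *first;
    size_t cached_length;       /* updated on every insert, so no walk is needed */
} HeaderList;

/* O(n): what a g_list_length-style call has to do each time. */
size_t header_list_walk_length(const HeaderList *list)
{
    size_t n = 0;
    for (const HeaderNode *node = list->first; node != NULL; node = node->next)
        n++;
    return n;
}

/* O(1): just return the cached counter. */
size_t header_list_length(const HeaderList *list)
{
    return list->cached_length;
}

/* Prepend a header whose uid points at a string owned elsewhere (no copy).
 * Returns 0 on success, -1 if the node could not be allocated. */
int header_list_prepend(HeaderList *list, const char *uid)
{
    assert(list != NULL && uid != NULL);
    HeaderNode *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->uid = uid;
    node->next = list->first;
    list->first = node;
    list->cached_length++;
    return 0;
}

/* Free the nodes only; the uid strings belong to their owner. */
void header_list_clear(HeaderList *list)
{
    HeaderNode *node = list->first;
    while (node != NULL) {
        HeaderNode *next = node->next;
        free(node);
        node = next;
    }
    list->first = NULL;
    list->cached_length = 0;
}
```

The price of borrowing the uid strings is a lifetime rule: whoever owns them (the gptrarray, in the description above) has to outlive every header that points into it.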

New features

  • Completed the mime-part type. It now has properties like filename, mime-type and others.
  • Added an attachment viewer in the message window, added an attachment list model type that will set the mime-type icon correctly. Since that uses a tiny gnome api I had to let libtinymailui depend on libgnomeui-2 (but this can very easily be changed of course)

Disksummary branch, vfolders

Hey fejj, that poor-man’s version (the proxy technique) is one of the very few possibilities for reducing memory usage when using camel outside of Evolution. As you know, the (current) focus of tinymail is to run on mobile devices. Implementing a “poor man’s” solution is IMHO better than depending on a forked and modified camel.

On IRC I did confess (to you, personally, on #gnome-hackers) that vfolders would, indeed, be more difficult to implement in a mail client like tinymail because of the technique. But not impossible (the technique might be useless because, indeed, it needs a reference to the real camelfolder instance for each folder that is used in such a vfolder — but only for such folders).

But believe me that probably a lot of people don’t care about those vfolders.

For the mobile-device case: nobody will ask for vfolders on a mobile device if such a feature means paying eight times the price of a typical mobile device because viewing their mail suddenly needs hundreds of megabytes of RAM.

Perhaps it should be made possible to turn off the vfolder feature in Evolution until Novell decides to finish the disksummary branch? People who aren’t using it are wasting serious amounts of memory while using Evolution. By the way, I very recently contacted Nat about this disksummary branch. My opinion is that it should be finished. And I’ve already offered my help (in my free time, I can’t spend my daytime job-hours on it).

But if Novell doesn’t start working on it: I’m not going to do it on my own. So I have to implement what you call “poor-man’s” solutions. It’s the only solution that works, at this moment. If it’s a “poor-man’s” solution, I guess Novell needs to start working on the disksummary branch. Right? (I know some guys of the Evolution team are planning to do that. I’m very interested in that, by the way).

Nevertheless, my proof of concept was that a mail client “can” show large folders without using a lot of RAM using camel. Perhaps I shouldn’t have compared it with Evolution as Evolution does, indeed, have some interesting features that do require memory. But my opinion is that 60 megabytes of RAM is just too much. Way too much. VFolders aren’t a good excuse IMHO.

I’m not the typical asshole that only whines about Evolution. I care about the product and I’m offering my help. But it looks like people are afraid of finishing that disksummary branch.

Tinymail does in 4mb what evolution does with 60mb

I told some people that I “will” prove that an e-mail client can load huge folders using very little memory. So I will prove it now.

Today tinymail can show the folders and subfolders of multiple accounts, it can show the message headers in a summary view and it can show the content of an E-mail picked from that view.

I did a session where I launched both evolution and tinymail from a cold start using the massif tool of valgrind. Both Evolution and Tinymail showed all the folders. Both evolution and tinymail gave a view of the message headers in the folder. Both evolution and tinymail viewed the last e-mail that appeared in the list. Using both evolution and tinymail I scrolled to the last message header and back to the first.

I do understand comparing tinymail with evolution might sound unfair. I do think a lot of the memory allocated by evolution is not just the non-trivial user interface stuff. In fact, by turning off my proxy design pattern tricks, I get surprisingly similar memory usage results with tinymail.

Evolution used 60 megabytes of ram. Tinymail used 3 – 6 megabytes.

Both evolution and tinymail used camel. But that doesn’t mean that camel doesn’t need improvements. It certainly does.

At stage one the folders are being loaded. At stage two the (visible) message headers are loaded. And at stage three I was scrolling the message header summary view’s scrollbar down and up a bit. The three stages were repeated on both tinymail and evolution.

Note that I’m uncertain about the three marked points on the evolution graph.

I am, however, certain about these three marked points (I did multiple valgrind measurements and disabled some steps to watch the memory status at known points in time).

I kindly invite people to reproduce these measurements. You can get a checkout of tinymail here. Use “valgrind --tool=massif tinymail” (or evolution). Read this piece of source code to know how to create an account.

(I think) The biggest reason why evolution uses a lot of memory is because of the CamelFolder instances that get delivered by camel_store_get_folder. I created a treemodel that uses proxy instances for this CamelFolder type. I made sure the real CamelFolder instance gets freed when it’s not used anymore.

For 50 folders that reduced the total amount of memory by ten megabytes. You can see that allocation peak in the valgrind massif report of tinymail. You can’t see it as clearly in the evolution report because, well, it simply never gets freed. If you don’t load the folders in tinymail recursively (for example only the INBOX folder), the peak goes away. This proves that it’s indeed the allocation of all the CamelFolder instances (in fact I narrowed that down to the very call to camel_store_get_folder, as I wanted to make sure it wasn’t my own code causing these allocations). I think a lot of evolution’s memory goes into those CamelFolder instances.

So I was wrong when I said that it’s the camel_folder_get_uids function. As a solution, evolution could create a proxy class for CamelFolder and attempt to camel_object_unref the real folder instance as soon as possible. For a sample, take a look at how tinymail does it.
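To make the proxy idea concrete, here is a minimal sketch in plain C, without camel or GObject. Everything in it is an assumption for illustration: `RealFolder` is a made-up stand-in for the heavy CamelFolder, and `folder_proxy_acquire` / `folder_proxy_release` are hypothetical names, not tinymail or camel API. The proxy is a tiny struct that only loads the heavy object when asked and drops it again as soon as the caller is done.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a heavy CamelFolder-like object. */
typedef struct {
    char *summary;                 /* pretend this is the expensive part */
} RealFolder;

static int live_real_folders = 0;  /* how many real folders are loaded right now */

static RealFolder *real_folder_load(void)
{
    RealFolder *real = malloc(sizeof *real);
    if (real == NULL)
        return NULL;
    real->summary = malloc(64 * 1024);
    if (real->summary == NULL) {
        free(real);
        return NULL;
    }
    live_real_folders++;
    return real;
}

static void real_folder_free(RealFolder *real)
{
    free(real->summary);
    free(real);
    live_real_folders--;
}

/* The proxy: cheap to keep around for every folder in the tree. */
typedef struct {
    const char *name;
    RealFolder *real;              /* NULL until somebody needs the real thing */
} FolderProxy;

/* Lazily load the real folder; returns NULL if loading failed. */
RealFolder *folder_proxy_acquire(FolderProxy *proxy)
{
    assert(proxy != NULL);
    if (proxy->real == NULL)
        proxy->real = real_folder_load();
    return proxy->real;
}

/* Drop the real folder as soon as it is no longer needed. */
void folder_proxy_release(FolderProxy *proxy)
{
    assert(proxy != NULL);
    if (proxy->real != NULL) {
        real_folder_free(proxy->real);
        proxy->real = NULL;
    }
}

int folder_proxy_live_count(void)
{
    return live_real_folders;
}
```

With 50 proxies in a tree model, only the folder that is actually being shown needs its real instance alive, so the number of live heavy objects stays at zero or one instead of 50.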

Look, screencasting

I created this tinymail screencast demo using Byzanz.

I tried showing how tinymail actively tries to deallocate message headers as soon as possible. Regretfully, the disksummary branch of !z’s camel isn’t finished.

That branch would also make it possible to keep the list of message IDs on disk. At this moment, that list has to be loaded in memory. If you have a mail folder with 10,000 e-mails, that’s 10,000 strings in your memory. Which isn’t a good idea on a mobile device.

All I can say is: !z, we need you! I hope Ajay Parthasarathi Susarla and Shreyas Srinivasan are going to work on this branch a little bit. Get it stable etcetera. I’m very likely going to join the fun of camel hacking sooner or later. This branch would, of course, greatly reduce the memory usage of Evolution for people using large mail folders.

Read more about the design decisions of tinymail here. A class diagram that is by definition always outdated is available here and the code here.

Why I designed tinymail using interfaces

Tinymail is going very well. I’m making progress really fast.

I wrote some explanation about tinymail. For example about why I picked this development model and why I’m so keen on using interfaces a lot. Because it’s rather long, I put it on a page instead of in this blog entry.

Have fun.

Oh Dear Lord

This has quite amazing amounts of stop energy.

Please make it stop!

If people want to develop crazy things and explore new ideas, please let them. If you don’t agree, please shut up or prove the guy wrong using code.

I don’t care whether or not it’s crack. The very fact that somebody is exploring alternatives and putting his time in it, makes it already interesting for at least one person. If he wants others to enjoy his work, it’s his right to do so. It doesn’t mean that somebody has to agree. And Joe, it certainly doesn’t mean that somebody has to insult him.

In fact, you and Ross are in my opinion scaring away potential developers who’d like to join our free software community.

Class diagram for tinymail and more brainless sick ideas coming from me

I just added a quick class diagram that describes the layout of libtinymail, libtinymail-camel and libtinymailui.

Not yet related, but might become related, is this tryout that I’ve been working on a few hours ago. I created for example a GIterIface which is an interface for an iterator. I implemented, as an example, a completely useless glistiter (and a demo of course). Actually it’s not useless as the iterator pattern abstracts iterating over list-types. So the iterator makes it possible to iterate over multiple types of iterable list-types. I’m planning to, for example, also create a ghashtableiter. Using the exact same interface you could then iterate both a ghashtable and a glist while it wouldn’t matter for the consuming component what the list-type originally was. This might sound totally useless if you haven’t used the iterator pattern before. Most Java and .NET programmers can probably tell you why this is important. It often makes iterating slower yadiyada, but for a lot of beginner programmers it’s nice as they will be forced to correctly use the list-type (the iterator implementation for a specific list-type is typically implemented by an experienced developer).
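A rough idea of what such an iterator interface buys you, sketched in plain C with function pointers rather than GTypeInterface. This is not the GIterIface code from the tryout: `IterIface`, `Iter`, `list_iter_new` and `array_iter_new` are hypothetical names, and an array stands in for the second container type. The point is that `iter_sum_ints` never learns which container it is walking.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Iter Iter;

/* The "interface": every iterator implementation fills in these two calls. */
typedef struct {
    int   (*has_next)(const Iter *iter);
    void *(*next)(Iter *iter);     /* returns the current item and advances */
} IterIface;

struct Iter {
    const IterIface *iface;
    void *cursor;                  /* meaning depends on the implementation */
    void *end;                     /* only used by the array implementation */
};

/* Implementation 1: singly linked list. */
typedef struct Node {
    void *data;
    struct Node *next;
} Node;

static int list_has_next(const Iter *iter)
{
    return iter->cursor != NULL;
}

static void *list_next(Iter *iter)
{
    Node *node = iter->cursor;
    iter->cursor = node->next;
    return node->data;
}

static const IterIface list_iface = { list_has_next, list_next };

Iter list_iter_new(Node *first)
{
    Iter iter = { &list_iface, first, NULL };
    return iter;
}

/* Implementation 2: plain array of item pointers. */
static int array_has_next(const Iter *iter)
{
    return iter->cursor != iter->end;
}

static void *array_next(Iter *iter)
{
    void **slot = iter->cursor;
    iter->cursor = slot + 1;
    return *slot;
}

static const IterIface array_iface = { array_has_next, array_next };

Iter array_iter_new(void **items, size_t count)
{
    Iter iter = { &array_iface, items, items + count };
    return iter;
}

/* A consumer that only knows the interface; items are assumed to be ints. */
long iter_sum_ints(Iter *iter)
{
    long total = 0;
    assert(iter != NULL);
    while (iter->iface->has_next(iter))
        total += *(const int *) iter->iface->next(iter);
    return total;
}
```

The extra indirect call per item is the "slower yadiyada" part; what you get in return is a consumer that does not have to change when the container does.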

There’s a problem with the current comparable type in gtk+: the gtktreeiter. GtkTreeIter is a type by itself, it’s not implemented as an implementation of an interface that can have a different implementation. This makes it more difficult to create gtktreemodel implementations with a specific iterator for a specific list-type. Or it makes it more difficult to reuse that iterator implementation for other parts of your project.

This could be, in my humble opinion, one of the reasons why it might be complex and/or frightening to implement a custom tree model. Something that a lot of developers should do instead of using the standard gtktreestore and gtkliststore.

Other things I’m trying out: a GBindableIface, a GModelIface and a GListModelIface. All these things need a lot more brain-time. It’s something I’m cooking. I know I’m being stupid by already showing this. I know I will very likely get flamed about this. That’s okay, I’m getting used to that anyway (xdg-list). Danw is right.

Anyway. The idea of this concept is to bind data to components. For example make gtktextbuffer implement gbindableiface and then g_bindable_iface_bind (buffer, datasource). Make it possible to define a gbindablebehaviour and assign it to the gbindableiface. Something like g_bindable_iface_set_bind_behaviour (buffer, gda_datasource_to_string_bindbehaviour_new()) and datasource = gda_datasource_new(). Anyway. For this to work on a gtktreemodel, I need an iterator which I can instruct how to function when calling its next method. With the current gtktreeiter I can’t do that.

You can try out this demo which implements a datasource that reads from a file on the filesystem. Just a silly example.

Also make sure you check out this thingy. It makes it possible to synchronize gobject properties.

As you can see, I’m sickening glib and gobject with sick evil interfaces. And I love doing that! I guess I should be burned and severely punished. You can do that, visit FOSDEM.

Tinymail now shows e-mail headers per folder

Tinymail now shows both the folders and the headers for each folder you select. Gaaah, I know you guys don’t believe me if I don’t post a stupid cute screenshot. So here you go:

I immediately picked my largest folder: my spam account INBOX folder. This is already using the custom treemodel, so it loaded extremely fast. But that is normal, as I’m only requesting the visible camel-message-info instances. I haven’t yet fully tweaked it. This is just the very first proof of concept. I will prove that you can, using camel and some very simple design patterns, write an e-mail client that has only the visible items in memory. And that a device like the Nokia 770 can, that way, easily show an extremely large e-mail folder. Once I’m really sure only the visible ones are requested, I’m planning to experiment with compressing the items in the base-directory which camel uses to on-disk-cache headers. Once you have full control over which items are requested, such tricks become useful. As it will need to decompress header information that has just become visible, scrolling the treeview will of course become a little bit slower (I can also do it in a thread, using something like asyncworker).

If I can reduce the amount of disk-space, a 256 MB flash disk might hold all my spam e-mail headers of the last three years. Or perhaps there’s a better database-engine solution for this? Perhaps I should check out the status of that disksummary branch. I sure have plenty of ideas to make camel a good candidate for mobile devices.

Tinymail finally displays my folders using camel

I finally master camel. Today, just a few seconds ago, tinymail for the first time displayed the folders of my own IMAP account. It’s even doing it recursively correct (it can do folders in folders in folders in f..).

Tinymail uses the camel library which was created for Evolution. It uses camel in combination with mainly the proxy design pattern (although I must admit it’s not yet using the proxy class itself, its design is fully ready to use such a proxy class. I first need to create a factory for the real subject which is used in the proxy classes for lazily getting the real subject instances).

There’s still a lot of tweaking needed. Anyway, if you’re into crazy GObject coding, using an even more crazy library like camel, if you’d like to help me create an e-mail client targeted at mobile devices: tell me. Not that I’m promising something that might work sooner or later. As usual, as it’s a free-time project, I’m not promising anything :-p.

Its documentation isn’t very complete, so I do hope to some day speak to the camel authors at some conference and/or on IRC. Fejj told me I’m the first to use camel outside of Evolution. If true, I hope libtinymail-camel will also be a guide for using camel as a decoupled library.

Oh and, sure it scales to a lot of IMAP folders. My “spam” account is subscribed to a lot of mailing lists. Look for a screenshot of tinymail doing that account here. In fact, it should even scale to a lot of e-mail accounts, as I’m also going to build proxy classes for the “account”-type. The treeview model is an account-list model. Its current implementation is a very simple inherited gtktreestore (I wanted something that works a.s.a.p., this might change in future).

SVN mime-types, ooo.o version control and tinymail

Subversion svn:mime-type property on files

This one-line UNIX command will echo command-lines that’ll fix the “svn:mime-type” property of files in a Subversion repository (perform it on a fresh untouched checkout, you might have to replace quotes as some browsers make mistakes or as perhaps WP converted it incorrectly; I don’t know):

find . -name '.svn' -prune -o -type f -exec echo 'svn propset svn:mime-type `gnomevfs-info {} | grep MIME | cut -d : -f 2 | cut -c 2-` {}' \;

I don’t recommend doing this without the echo as perhaps not all your files should get a non-default mime-type in subversion. If you are brave, go for it. Don’t forget to svn commit once the find-script ran.

If the svn:mime-type property is correct, the Subversion browser-mode will automatically display the file in the browser-window if the mime-type can be viewed by a browser, or will ask for a viewer and show a download-box if the mime-type can’t be viewed. For example if you have an xhtml or a dia file or an odt in a subversion repository (the links should all suggest the correct mime-type to your browser).

Thanks to Tommi for pointing me to that problem, which triggered me into creating that one-line script. Perhaps this is another reason why gnome should some day switch to a modern source control system? (I, personally, definitely prefer subversion over CVS)

OpenOffice.org2 document version control?

Another question related to the odt file format: Does somebody know of a version control system for odt files? Are the brave OpenOffice.Org2 developers working on this?

Such a system would allow a team to work on the same file simultaneously, would show differences between versions, etcetera. So not just a document-version tracking system for binary files. I really want the differences and merging possibilities. Simple stuff. The fact that the odt format uses XML should make it possible. You could create a merge-tool that is intelligent about not causing XML mistakes.

Right now, it’s impractical to decompress the file, check it in. Check it out, compress the files, work on it, check it in. That doesn’t work as a workflow. It just doesn’t. Perhaps a subversion client plugin or openoffice.org subversion integration to automate this?

Tinymail

After looking at it more carefully, I corrected some GObject standards in tinymail. For example the finalize, class_init and the iface_init methods of each type.

Functional description and updated class diagram for deconf

People might start wondering how deconf is doing these days. Deconf is a funny project of mine. Some weeks I’m very actively busy with it, other weeks I’m not. That way I distill the best ideas from the cruft some people suggest and even more from the cruft I come up with myself.

I’ve been distilling ideas for a few months now. Sometimes, I believe I have a good idea of what such a system needs and what it doesn’t need. I decided to create two class diagrams and write some sort of functional analysis that describes the different components I have in mind: the complete complex class diagram, the simplified class diagram and the functional description of the components.

These links point to the real crack. Not other people’s proposals. The code in that repository is proof of concept code. It’s probably not going to be like that.

Some conclusions, more on codegen and GObject

Some conclusions about the design-patterns-in-gobject-at-fosdem post

Some people responded very positively about doing the presentation, saying they are very interested. So chances are very high (it’s more or less certain) that I’ll do the presentation.

It looks like I’m not the only one who’s been trying to use design patterns in combination with GObject: one person told me he’s planning to document the places in the gtk+ library code where specific design patterns were used (e.g. gtktreemodel, gdkdisplaymanager). Another person corrected some little glitch in one of the headers of my code and replaced the private data handling of the proxy sample with the standard GObject way of doing that. And another person tells me he’s interested in helping with the XSLT templates for generating GObject classes and interfaces using codegen.

That last one is the main reason for this conclusion-blog. Perhaps others are also interested in helping with that? So I’ll put some pointers:

Note that Codegen is LGPL and that I’m highly interested in any type of contribution or add-on. I already gave one person a commit-account on the subversion repository. Note that codegen itself is heavily based on the strategy design pattern. This makes extending it, without having to touch the core of it, trivial. At this moment I’m not focused on codegen. However, things like “which free software project I’m focusing on a.t.m.” change frequently. So it’s perfectly possible that suddenly a huge commit happens that adds support for some insanely cool feature :-p.

Design patterns in GObject at FOSDEM

Somebody of the FOSDEM team asked me to do a talk about programming techniques a.k.a. design patterns using GObject.

I have to decide whether or not I’ll do it tomorrow. So I thought, well .. okay. Let’s prepare some samples.

Because this blog entry would otherwise be too long, I created a page that explains the proxy and strategy design patterns and contains the links to the samples.

Note that yes, I did read Head First Design Patterns. And that yes, I based the strategy sample on the chapter in that book.

Also note that I’m planning to create an XSLT for generating GObject interfaces and classes using codegen. If you want to help me with that, contact me.

There’s probably not a single good reason to do this type of development using GObject rather than a higher-level environment like C# or Java. My talk will not illustrate why you should use a programming language like C with glib-object.h rather than C# or Java. It’s not going to be the point. I guess there’s no real reason. In fact, any discussion about which programming language is better is stupid. The talk will be about what is possible. I’m not going to play the defender of the GObject religion.

FOSDEM

My employer, well actually mainly Kris, would like me to do talks at conferences. The problem is, however, that I don’t feel like I have a lot to talk about. I can of course jump on stage and talk about whatever I know a little bit about. Which is what most conference speakers do, I guess. Another problem is that FOSDEM will happen sooner or later: Kris will likely again ask me whether or not I submitted a paper.

So the question. What should I talk about? By now I think people have had enough custom treemodel stuff from me. I could talk a little bit about how to get started as an evolution contributor and how you could earn some bounty money by doing that. But then again, it looks like these days nobody likes the evolution code anyway. Tinymail is unfinished and not yet worth a conference talk. Gnome-schedule is way too simple. If I’d do a talk about deconf, people would throw large objects at me, would start yelling and would set the building on fire (just take a look at the xdg-list if you want to know what a heated subject like that can do). And finally codegen can’t yet generate GObjects (which can change, of course). And again, some people would throw large objects at me if I’d do a talk about a .NET subject. Especially Kris, the guy who wants me to do talks, would.

So basically, I have nothing to talk about :p. Lucky me!

Anyway. I will be at FOSDEM. That’s for sure. This year, a developer room for Gnomies is arranged (and else, we’ll just steal/take/pick a room. Right?)! Brussels is easy to reach by plane, car, train, etc. And they say we Belgians have great chocolate (and the Belgian chocolate is, of course, much cheaper in Belgium), waffles and good beer. So let’s all be there :p! Jeff has set up this wiki page. You can fill in the attendees page if you’re coming.

Tinymail & Camel

Wow, it looks like I’m taking that TinyMail thingy seriously. It’s the second evening in a row that I’m working on it. That’s a good sign!

Probably because I’m learning (learned and now enjoying) how to use the GTypeInterface stuff of GObject and Camel with it.

I’ll guide my blog readers through the idea. Note that all subversion URLs might change as the subversion repository is just a temporary one.

The library libtinymail is a small abstract library that defines all types as interfaces and adds some proxy classes. It defines types like “account”, “message”, “body”, “header”, “folder”, “attachment” and the simple relations between these types. It also contains a few proxy classes which might get moved to the implementation library. This depends on how I’ll design my factory. If done fully correctly, I won’t need to depend the proxy classes on the implementation: think abstract factory technique. But I’ll see how far I’ll get.

I’m planning to use the proxy technique a lot, like how I used it in the treeview demo of last week. The experiment will be whether or not camel can cope with a concept that utilises the proxy technique. So far I haven’t found anything in the camel API that tells me that it won’t work. I’m hoping to speak to the authors of camel about this sooner or later. NotZed, if you catch me on IRC or wherever: ping me? :-p

The library libtinymail-camel implements these types using Camel. It’s extremely unfinished, but the implementation for the type “account” shows what I mean. I’m not yet testing with the disksummary branch, but if the camel API doesn’t change a lot, the transition to that shouldn’t be difficult, right? My plan is to eventually only depend on that disksummary branch.

So far I haven’t found any reason why camel would be a lot less resource friendly compared to other imap4/pop3/smtp libraries. So for now I’m convinced that camel is a good candidate for mobile devices. And it comes with additional support for many stores and transports. It would, perhaps, be better if it weren’t bundled with e-d-s. I don’t really understand why the evolution team didn’t simply make camel a separate package and let e-d-s depend on it, rather than putting camel in an e-d-s subdirectory and cutting and pasting its build environment into the build environment of e-d-s. Harish, if you catch me on IRC .. please explain :-p?!

Note that the application itself will at this moment not do much useful for a normal user. It’s still too early. Developers, however, might be interested.

TinyMail – trying to create an e-mail client for mobile devices

I started a new project called TinyMail, named after my girlfriend (Tinne). The plan is to create an e-mail client designed for mobile devices like the Nokia 770. The chances of ever finishing a usable tool are small to zero, but … I guess it’s fun. Which is why I started nevertheless.

What I did so far is creating interfaces (a lot like what I did with that TreeView demo of last week). At this moment, I’m basically preparing all the objects and interfaces.

If you’re interested, please contact me.

You can follow the developments and latest code in this temporary subversion repository. Sure, whatever-forge is cooler and whateverer, but heck, it’s just a temporary location anyway. And the chances of somebody joining early in the game are extremely small.

The idea is to make use of the proxy programming technique, use an existing library like camel in the implementations of the real subjects and attempt to be as correct as possible in terms of design pattern usage. So that basically means that probably every type will have an interface, and that I’m going to try using programming principles and techniques that are crazy for C programmers. I know this is more or less nuts if you use gobject (check out the gtktreemodel and gtktreestore code if you want to know how to create and work with interfaces in gobject). So be it.

Here are some pointers to interesting information:

The libtinymail and libtinymailui components will sit in between, for example, a library like camel and the user interface. They will take care of the high level caching (making sure that only visible things are abusing the memory, i.e. the proxy technique). A lot like the models that can be used with ETable.

Note that no, a compiled version of whatever is in the repository a.t.m. will not do anything useful at this moment. It’s way too early for that. I might finish it in a few hundred years. You can, of course, join and help me.

Treeview conclusions

It looks like you guys are interested in loading three million rows in a GtkTreeView. Yesterday 512 (new) visitors visited our company subversion service to checkout the demo :-p. I think +- 100 people checked it out completely (the less obvious files, like Makefile.am’s, got viewed +- 90 times). That tops codegen‘s first day. Well, you can now put comments on my blog. So if you have questions about it: go ahead.

Note that I updated some files in the repository. Now it’s using a more correct way of implementing interfaces in C. The usage of the proxy pattern, thus implementing an interface, is actually the point of the demo. Sure, the wow-cool thing about it is that you can load millions of rows in a view. But the point is the proxy pattern. So it should be correct, if the intention is to ‘show’ how to do it. Right? Note that the proxy technique can be used in all sorts of model view controller situations. Not just for a treeview or datagrid.

In that demo, I should also do more with the proxy classes to show that you can treat them as if they are real subjects. That’s because they fulfill the contract (the interface) of the subject (a message header, in the case of the demo). Yet these proxy instances consume only 20 bytes. By the way, my favorite programming technique is strategy (I promised somebody not to use the word design pattern anymore. He felt design pattern is a buzzword). I’m likely going to do/show something with strategy sooner or later :-p. When browsing free software code, I often see “less good” design decisions. Performance tweaking is very good but it won’t help if the application developers aren’t going to use their brains before designing the application. The era of typical VB6 development should be over. A lot of ‘managers’ should stop whining about KISS and first learn what it really means. KISS, like the KISS most people think KISS is, sucks. It doesn’t scale and it’s impossible to integrate unit testing and modern programming methods with it. The other KISS is a different story. Read Head First Design Patterns and learn all about it. There, I’ve said it.

As a consultant I guess I got tired of manager-type guys who don’t have a clue telling me to do everything KISS. Perhaps those guys should read about the Peter Principle?! Sorry for this opinionated entry. I know I shouldn’t.

Ouch, I made a big mistake:

$ svn diff -r 14
Index: src/msg-header-proxy.c
=============================================================
--- src/msg-header-proxy.c      (revision 14)
+++ src/msg-header-proxy.c      (working copy)
@@ -76,7 +76,7 @@
 MsgHeaderProxy**
 msg_header_proxy_new_alot (gint amount)
 {
-       MsgHeaderProxy **proxies = (MsgHeaderProxy **) g_new (MsgHeaderProxy, amount);
+       MsgHeaderProxy **proxies = (MsgHeaderProxy **) g_new (MsgHeaderProxy*, amount);
        gint i=0;

        for (i=0; i < amount; i++)
$

Wow .. that would have been a total waste of memory AND a huge leak! Note that doing one huge allocation instead of one allocation of a pointer-index-table followed by many g_slice_alloc calls might improve the cpu usage a little bit (fewer expensive allocator calls). So perhaps it can be made even a little bit faster than the current result. Try it, send me a diff.
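For anybody who wants to try that, here is a small sketch of both points in plain C, using malloc-family calls instead of g_new and g_slice_alloc. The `MsgHeaderProxy` layout below is made up for the example (the real struct is not shown here), and `proxies_new_contiguous` is a hypothetical name. The first two functions show how much the wrong `sizeof` over-allocates; the third allocates all proxies as one contiguous array next to the pointer-index-table, so there are two allocations in total instead of one per proxy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical layout; the fields are assumptions for illustration only. */
typedef struct {
    void *real;          /* the real subject, loaded lazily */
    const char *uid;
    int id;
} MsgHeaderProxy;

/* Bytes needed for the pointer-index-table (the fixed version of the diff). */
size_t proxy_table_bytes(size_t amount)
{
    return amount * sizeof(MsgHeaderProxy *);
}

/* Bytes the buggy version asked for: room for whole structs, not pointers. */
size_t proxy_table_bytes_buggy(size_t amount)
{
    return amount * sizeof(MsgHeaderProxy);
}

/* One table plus one contiguous block holding all proxies.
 * Returns NULL if either allocation fails. */
MsgHeaderProxy **proxies_new_contiguous(size_t amount)
{
    assert(amount > 0);
    MsgHeaderProxy **table = calloc(amount, sizeof *table);
    MsgHeaderProxy *block = calloc(amount, sizeof *block);
    if (table == NULL || block == NULL) {
        free(table);
        free(block);
        return NULL;
    }
    for (size_t i = 0; i < amount; i++) {
        table[i] = &block[i];
        table[i]->id = (int) i;
    }
    return table;
}

/* table[0] is the start of the contiguous block, so two frees are enough. */
void proxies_free_contiguous(MsgHeaderProxy **table)
{
    if (table == NULL)
        return;
    free(table[0]);
    free(table);
}
```

The trade-off is that individual proxies can no longer be freed one by one; they all live and die together with the block.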

Migrated to wordpress

Some people might have noticed on the blog aggregators that yesterday I switched from DotClear to WordPress. I guess the reason some blog aggregators repeat old blog entries is that they compare against their cache, and after a migration things like the unique IDs are often different. So no, there’s nothing wrong with my blog :-p!

I also installed a RewriteRule in such a way that most old blog URLs will resolve to the new WordPress URL. I can’t make all of them work automatically because DotClear and WordPress use a different algorithm for forming the title part of the URL (called the post slug). But most work. You can tell me if a specific “old” URL isn’t working. WordPress allows me to set the “post slug” manually. I can easily set it to the old slug or in such a way that my redirector stuff resolves it correctly.

Oh, and somebody tell the DotClear developers to get themselves a real anti-spam feature. Perhaps a captcha or something like that? For my case it’s too late: goodbye DotClear.

I’m probably going to regret it, but after installing bad-behaviour on both my wiki and my blog, and Akismet on the blog as well, I re-enabled editing the wiki and posting comments on the blog. As I weed through thousands of moderated spam messages in my old blog’s database, I’ll try to recover the relevant comments and restore them in the WordPress database of my new WordPress blog.

Some code that might help you migrate DotClear to WordPress:

Three million rows in a GtkTreeView

Edit: the repository has since disappeared. You can find a Subversion dump of it here:

Three million rows (the size per cell doesn’t matter a lot) in a treeview, and loading the treeview in four seconds. Is that doable? Sure! You think the treeview will become very slow? Nope, it works as fast as any other (smaller) treeview. The number of visible rows is what would slow it down. Since most screens can’t show more than 500 rows, and since showing more would be useless from a usability point of view, it’s fast.

I committed my performance tweaks to the demo repository. It includes using g_slice for allocating the real subject and replacing the GSList in the custom model with an implementation that uses a pointer position.

So I don’t have to depend on a slow linked list anymore. Instead I simply allocate a large block of proxy instances (three million proxy instances of 20 bytes each in a contiguous allocation) and inject that as index in my custom treemodel.
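Why the pointer position beats the linked list can be sketched in a few lines of plain C. This is an illustration only, not the demo's code: `Proxy`, `Node`, `ProxyIndex` and the function names are all hypothetical. Reaching row n in a list means walking n links, while in a contiguous block it is one pointer addition, and stepping to the next row is just `current + 1`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; the demo's own types may look different. */
typedef struct { int id; void *real; } Proxy;
typedef struct Node { Proxy *item; struct Node *next; } Node;
typedef struct { Proxy *items; size_t length; } ProxyIndex;

/* Linked list: reaching row n means following n links, so O(n). */
Proxy *list_nth(Node *head, size_t n)
{
    for (size_t i = 0; head != NULL && i < n; i++)
        head = head->next;
    return head != NULL ? head->item : NULL;
}

/* Contiguous block: row n is a single pointer addition, so O(1). */
Proxy *index_nth(const ProxyIndex *index, size_t n)
{
    assert(index != NULL);
    return n < index->length ? &index->items[n] : NULL;
}

/* Stepping to the next row only needs the current pointer position. */
Proxy *index_next(const ProxyIndex *index, Proxy *current)
{
    assert(index != NULL && current != NULL);
    size_t pos = (size_t) (current - index->items);
    return pos + 1 < index->length ? current + 1 : NULL;
}
```

A custom tree model could keep such a pointer (or the row number) in its iterator, which is roughly what "a pointer position" amounts to.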

Since those are proxy instances, they’ll each check whether their this->real property isn’t NULL when they are needed. When a row becomes visible, that instance is needed for the from, id and to properties. When it becomes invisible, it’s no longer needed and should therefore be freed. However, the unref_node thingy of gtktreeview doesn’t work perfectly, so when scrolling a little bit, around 200 instances are kept around for no reason. I’m going to try fixing that behaviour in gtktreeview soon.
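The proxy behaviour described here can be sketched as follows. Only the type names `MsgHeader` and `MsgHeaderProxy` come from the demo. The struct fields, `msg_header_load`, and the accessor and uncache function names are assumptions made up for this illustration, and `malloc`/`free` stand in for whatever allocator the demo uses.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Field layout is assumed; the demo's real structs may differ. */
typedef struct { char from[32]; char to[32]; } MsgHeader;
typedef struct { int id; MsgHeader *real; } MsgHeaderProxy;

/* Hypothetical loader: pretends to fetch the real subject for `id`. */
static MsgHeader *msg_header_load(int id)
{
    MsgHeader *header = malloc(sizeof *header);
    if (header == NULL)
        return NULL;
    snprintf(header->from, sizeof header->from, "sender %d", id);
    snprintf(header->to, sizeof header->to, "recipient %d", id);
    return header;
}

/* The proxy fulfills the subject's contract, but only creates the
 * real subject the first time one of its properties is needed. */
const char *msg_header_proxy_get_from(MsgHeaderProxy *self)
{
    assert(self != NULL);
    if (self->real == NULL)
        self->real = msg_header_load(self->id);
    return self->real != NULL ? self->real->from : NULL;
}

/* Called when the row is no longer visible: drop the real subject,
 * keep the small proxy so it can be reloaded later. */
void msg_header_proxy_uncache(MsgHeaderProxy *self)
{
    assert(self != NULL);
    free(self->real);
    self->real = NULL;
}
```

In the demo the uncache step would be what the model's unref_node implementation triggers. Here it is just a function you call by hand.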

Most of the time is spent in the loop that prepares the proxy instances (msg_header_proxy_new_alot). Bringing the (visible) items to the treeview doesn’t take a lot of time, as the GtkTreeView is smart enough not to load everything in case fixed-row-height mode is on.

You don’t have to believe me, you can check out the code here. Compile it (autotools) and try.

Anyway, I’m convinced GtkTreeView by itself isn’t slow. But that doesn’t mean that the way you use it can’t make it slow. I hope others will enjoy the demo as a starting point for getting their way of using the gtktreeview optimized. For most use-cases, the use of a GSList or GList is a better technique. A linked list makes it easier to add new items to your model. Inserting and removing items would be a lot more difficult if you use the technique I used in the demo. That technique, however, is fast because you can allocate everything as one large block and exercise your high-school pointer knowledge on it.

Nevertheless, I swear the unref_node stuff isn’t working correctly! :-p. Or I misunderstood its purpose.

Nooo! It’s that GtkTreeView proxy guy again!

And I’m not finished with the GtkTreeView. No I’m not. Moehaha!

I created this full sample that shows what I meant with custom treemodels and only allocating the model items behind visible rows. It includes an autotools environment and a more or less good way of creating classes in C (except that I didn’t yet use GObject for MsgHeader nor MsgHeaderProxy, feel free to send me a diff).
This is a Subversion repository. Use “svn checkout” in front of the URL after installing Subversion.

It uses the “unref_node” method of the GtkTreeModelIface interface, which gets triggered by gtk_tree_model_unref_node, to free the real subjects that aren’t visible in the view.

I noticed that it (it being the GtkTreeView stuff) sometimes “misses” rows that become invisible (if you scroll very fast). So I fear this unref_node method will need fixes and/or, if you use it, you’ll also need to create a background thread/procedure that checks for leftovers. This sample shows how you can walk the entire treemodel and do things with only the invisible ones. I know it’s ugly. IMHO the full demo isn’t, but regrettably the unref_node stuff doesn’t work perfectly.
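Such a leftover check could look roughly like the sketch below. It is plain C and not the sample's code: `Proxy`, `sweep_invisible` and the idea of passing the visible range as two row numbers are assumptions for illustration. In a GTK application those two numbers might come from something like gtk_tree_view_get_visible_range, but that part is left out here.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the proxy type. */
typedef struct { int id; void *real; } Proxy;

/* Walk the whole index and free every real subject whose row lies
 * outside [first_visible, last_visible]. Returns how many leftovers
 * were freed, which is handy for checking how much unref_node missed. */
size_t sweep_invisible(Proxy *items, size_t length,
                       size_t first_visible, size_t last_visible)
{
    assert(items != NULL || length == 0);

    size_t freed = 0;
    for (size_t row = 0; row < length; row++) {
        int visible = row >= first_visible && row <= last_visible;
        if (!visible && items[row].real != NULL) {
            free(items[row].real);
            items[row].real = NULL;
            freed++;
        }
    }
    return freed;
}
```

Walking three million proxies is a linear scan, so you would want to run this from an idle or timeout handler now and then rather than on every scroll event.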

Feel free to request SVN accounts and/or send diffs if you want to experiment.