It’s time for some monthly bashing-others about memory wasting. Moehaha! But hey, Camel isn’t done that badly. This is a friendly one. Some people in the past had this strange idea that I was questioning the competence of those who’ve built Camel. That’s absolutely not true. There’s a reason why I use it myself, don’t forget that (and some day, people will bash the fuck out of tinymail, right? Good! It keeps us going).
Some developers enjoy the fact that glib makes their lives a lot easier, it seems. And I agree with them! You should leverage the fact that glib is a well-tested library. Who wants to implement his own hashtable or doubly-linked list for each project he starts? I honestly don’t.
However. In CamelFolderSummary, the number one piece of code that consumes the most memory in E-mail clients built on top of Camel (that includes Evolution and tinymail with its camel-lite), we surprisingly see a GHashTable being used in parallel with a GPtrArray! Both hold the exact same instances?!
Somebody like me … questions that. Because both in memory and in performance this (in this case) is a lose-lose situation: adding it to the hashtable takes time, and searching a GPtrArray is (in this case) not going to take a lot longer (read below for more information on the “in this case” situation).
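To make that concrete, the situation is roughly this (a simplified sketch with a made-up MessageInfoSketch stand-in, not the real CamelFolderSummary declaration; the s->messages and s->messages_uid names are the ones you’ll actually find in the code):

#include <glib.h>

/* Hypothetical stand-in for CamelMessageInfo, just enough for the sketch. */
typedef struct {
    char *uid;
    /* ... the rest of the per-message summary fields ... */
} MessageInfoSketch;

/* Simplified shape of the duplication in CamelFolderSummary. */
typedef struct {
    GPtrArray  *messages;      /* ordered array of MessageInfoSketch*       */
    GHashTable *messages_uid;  /* uid -> MessageInfoSketch*, same instances */
} FolderSummarySketch;

static void
summary_add_sketch (FolderSummarySketch *s, MessageInfoSketch *info)
{
    /* Every item is appended to the array *and* inserted into the
     * hashtable, so each summary entry also pays for a GHashNode. */
    g_ptr_array_add (s->messages, info);
    g_hash_table_insert (s->messages_uid, info->uid, info);
}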
Consider the memory overhead caused by each item you add to the GHashTable:
struct _GHashNode {
    gpointer key;
    gpointer value;
    GHashNode *next;
    guint key_hash;
};
That’s 4 + 4 + 4 + 4 bytes on a typical x86 architecture, right? 16 bytes per item (or am I calculating this wrong? There’s no waste caused by memory alignment here, I think, and it uses g_slice_new so there’s also not a lot of heap admin). Multiply that by 10,000 headers and that’s roughly 156KB of memory. Nothing, you say? Okay, I agree (well, except that for a mobile application this is a significant amount of memory).
However, consider that each and every summary item that you can see in your Evolution consumes this. With the help of some mailing list subscriptions, I use Evolution to manage up to 1,000,000 messages that way: 15,625KB or 15MB of GHashNode instances (gah, I must be miscalculating because that’s much more than what I expected).
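If you want to check my arithmetic, here’s a tiny standalone sketch (it just copies the GHashNode layout quoted above into its own struct for sizing; the exact numbers of course depend on your glib version and architecture):

#include <glib.h>
#include <stdio.h>

/* Local copy of the GHashNode layout quoted above -- not glib's real
 * (private) declaration, just the same four fields so sizeof() works. */
typedef struct _SketchHashNode SketchHashNode;
struct _SketchHashNode {
    gpointer key;
    gpointer value;
    SketchHashNode *next;
    guint key_hash;
};

int
main (void)
{
    gsize per_node = sizeof (SketchHashNode); /* 16 bytes on 32-bit x86 */

    printf ("per node:          %u bytes\n", (guint) per_node);
    printf ("10,000 headers:    ~%u KB\n", (guint) (per_node * 10000 / 1024));   /* ~156 KB    */
    printf ("1,000,000 headers: ~%u KB\n", (guint) (per_node * 1000000 / 1024)); /* ~15,625 KB */

    return 0;
}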
Because this:
static CamelMessageInfo *
find_message_info_with_uid (CamelFolderSummary *s, const char *uid)
{
    CamelMessageInfo *retval = NULL;
    guint i = 0, len = strlen (uid);

    for (i = 0; i < s->messages->len; i++) {
        CamelMessageInfo *info = s->messages->pdata[i];
        if (info && !strncmp (info->uid, uid, len)) {
            retval = info;
            break;
        }
    }

    return retval;
}
Is more difficult than this:
g_hash_table_lookup(s->messages_uid, uid)
I don’t whine without a patch, right? right!
Update: and on the tinymail front, I removed the need to call this function often (update: which is an important pre-condition for applying patches like this). This made loading large folders with the framework a lot faster. You can svn diff -r 1451:1452 to check what I changed for that.
Update on CPU consumption (because some people have questions about that): The hashtable lookup would indeed be faster, but the hashtable lookup can be (and usually is) avoided in Camel. I’m not sure whether it’s also avoided (and avoidable) in Evolution code (that’s a lot of code to check), but all occurrences where the hashtable lookup is needed are avoidable and are therefore avoided by tinymail. In other words: looking up by uid is not often needed. But you are right that a hashtable lookup is faster than that find_message_info_with_uid implementation.
Remember that the hash is used to reduce CPU consumption.
Lookup in a hash is O(1).
The patch you wrote made it O(n),
n being the number of messages.
@hoa: Yes, I’m rather surprised to see Philip do this, at least without addressing the performance impact. Surely he understands the difference between O(1) and O(n), and it may indeed be justifiable to drop the hash table for iteration, but that Philip didn’t discuss the performance tradeoffs is a bit, well, dubious.
Philip, can you comment on the performance impact to camel by iterating instead of the O(1) lookup gained by the hash table?
Oh, yes.
The thing is that the function is not used very often, and its use can be avoided most of the time. Also in Evolution’s case.
Also, iterating over the array may end up accessing a lot more memory pages than calculating the hash and iterating over the corresponding bucket. So in a system that can swap to disk, it may result in a net in-core memory loss.
Hmm, that’s a good point indeed. Is there a way to test this?
Not that I know of, but I aired an idea here.
On second thought, you could try to graph RSS use, but it’ll probably be too noisy and dependent on when and what the kernel decides to swap out.
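Something like this would do for the sampling part (a rough Linux-only sketch that reads VmRSS from /proc/self/status; call it in a loop and plot the values, keeping the noise caveat in mind):

#include <stdio.h>
#include <string.h>

/* Read the resident set size (VmRSS, in KB) of the current process
 * from /proc/self/status.  Returns -1 on failure. */
static long
read_vmrss_kb (void)
{
    FILE *f = fopen ("/proc/self/status", "r");
    char line[256];
    long kb = -1;

    if (!f)
        return -1;

    while (fgets (line, sizeof (line), f)) {
        if (!strncmp (line, "VmRSS:", 6)) {
            sscanf (line + 6, "%ld", &kb);
            break;
        }
    }

    fclose (f);
    return kb;
}

int
main (void)
{
    printf ("RSS: %ld KB\n", read_vmrss_kb ());
    return 0;
}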
Well, the good thing is that I reduced the need for performing the lookup by uid.
I guess it’s getting a little bit too specific (and therefore becoming a bad example for blogging, hehe, well, too late now).
But I should indeed test whether what I’m doing is indeed the best way.
Because sometimes quite a lot of lookups are still needed at a specific moment in time: especially while downloading new summary information. That’s a background operation that can take some time, as it’s mostly waiting for read() from the socket anyhow. Time is not really the problem, but cache thrashing might indeed be (that strncmp is going to be much worse than what GHashTable does, comparing the stored key hashes first).
If making that take a little bit more time reduces memory consumption during normal usage, I’m fine with it.
I might implement a trick that turns using the hashtable on and off … because I know exactly when a lot of lookups will be needed and when that will stop being needed.
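Something like this is what I have in mind (just a sketch with hypothetical helper names, not existing Camel API; it assumes the CamelMessageInfo and uid layout from the code above):

#include <glib.h>

/* Sketch of the on/off trick: build a temporary uid -> info index only
 * for the duration of a burst of lookups (e.g. while downloading new
 * summary information), then throw it away so the steady state stays
 * array-only.  build_uid_index() is a hypothetical helper. */
static GHashTable *
build_uid_index (GPtrArray *messages)
{
    GHashTable *index = g_hash_table_new (g_str_hash, g_str_equal);
    guint i;

    for (i = 0; i < messages->len; i++) {
        CamelMessageInfo *info = messages->pdata[i];
        if (info)
            g_hash_table_insert (index, (gpointer) info->uid, info);
    }

    return index;
}

/*
 * GHashTable *index = build_uid_index (s->messages);
 * ... many g_hash_table_lookup (index, uid) calls while the burst lasts ...
 * g_hash_table_destroy (index);   -- back to the array-only representation
 */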
A common trick to avoid str[n]cmp overhead is to compare the first bytes and only then call str[n]cmp. I think there is a bug in your code: when uid="90" and info->uid="90x", strncmp will return 0 since you will be comparing only the first 2 bytes. So the following code will be better:
if (info && info->uid[0] == uid[0] && !strcmp (info->uid, uid)) {
aivarsk: That’s actually true, both about the bug and about comparing the first character :). Thanks for notifying me.
But note that comparing only the first byte might not have much impact on how pages are mapped in and out. If all uids were close to each other and fit in a few pages, it would indeed be a good idea.
The problem, which is probably also what Hans Peter means, or at least what his comment made me think of, is that the data the ->uid pointers refer to is scattered around the mmap()ed file.
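For the record, with that fix folded in the lookup loop would look like this (a sketch of the corrected version, not what I actually committed):

static CamelMessageInfo *
find_message_info_with_uid (CamelFolderSummary *s, const char *uid)
{
    CamelMessageInfo *retval = NULL;
    guint i;

    for (i = 0; i < s->messages->len; i++) {
        CamelMessageInfo *info = s->messages->pdata[i];

        /* Cheap first-byte check before the full comparison, and strcmp
         * instead of strncmp so that uid "90" no longer matches "90x". */
        if (info && info->uid[0] == uid[0] && !strcmp (info->uid, uid)) {
            retval = info;
            break;
        }
    }

    return retval;
}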
sizeof(hash node) = 4*ptr, sizeof (GList) = 3*ptr, which is almost the same (ignoring the array for the hashtable). Maybe instead of removing the hashtable you could remove the list and iterate over the hashtable when needed?
At that location it’s a GPtrArray, not a GList. However indeed, later (in tinymail) the same instances are put in a GList (but that’s not in camel anymore).