Opened 3 months ago

Last modified 3 months ago

#19073 new enhancement

[packagefs] Investigate alternative data structures for the by-name lookup of Nodes in Directory

Reported by: waddlesplash Owned by: bonefish
Priority: low Milestone: Unscheduled
Component: File Systems/packagefs Version: R1/beta5
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

At present, Nodes (including all UnpackingNodes) are stored in Directories in both a hash table (for fast by-name lookup) and a linked list (for sequential iteration.) However, a hash table might not be the most optimal way to perform lookup, and other data structures could combine the functions of the hash table and the linked list.

In https://review.haiku-os.org/c/haiku/+/8242 I experimented with using an AVL tree, but this wound up being slower, so clearly that's not the answer here.

Change History (4)

comment:1 by waddlesplash, 3 months ago

Copying PulkoMandy's comment from there in full:

Both of these seem to be the wrong algorithm for the job.

For lookup by string, a Trie (https://en.wikipedia.org/wiki/Trie) could be a better choice if there are enough nodes for it to be useful.

If there are not that many nodes and they are in fixed numbers (or at least, they change rarely, for example only when activating or deactivating a package), a binary search in a table sorted alphabetically may work.

Likewise if the keys change rarely, you could build a "perfect" hash table once you know all the keys: http://stevehanov.ca/blog/?id=119 You'd have to rebuild the hashtable everytime a package is activated or deactivated, but you could probably cache it (or parts of it) on disk with the activated-package file, so you don't have to recompute it at every boot. There are various algorithms to compute one such hash, depending if you expect to have only a few entries (example: keywords in a programming language) or a lot of entries (exemple: string search in a large database). HEre, if I understand correctly, the lookup is per directory, so I'd expect around a 1000 entries maximum? So, somewhere in between the two.

The size of the hash table in tour measurements is about 17 bytes per file, I see one half of that is a pointer in the Node class to handle collisions (that will be 8 bytes on a 64bit system), where are the other 9 bytes coming from? With a perfect hash function, there are no collisions, and so this pointer could be removed.

Am I missing something here? I'm not sure if my assumption that these structure change only on package activation/deactivation or some other sufficiently "rare" events is right. If that's the case, it is certainly something we can take to our advantage to pre-compute things at that time.

Looking at the code, another thing that could possibly save space is removing the reference count in Node, and instead using some garbage collection system to handle deleting nodes. This follows the same idea: I assume that nodes in packagefs are rarely deleted (only when a package is deactivated?), and so it would be OK to optimize the usage, at the cost of slightly slower package uninstall (the gc may need to iterate over a few data structures to see what is or isn't referenced). I don't know how much code references these nodes, if there is a lot of places, that could make this difficult to get right.

comment:2 by waddlesplash, 3 months ago

My replies:

where are the other 9 bytes coming from?

The hash table has to allocate an array of pointers for the table itself, of course. I think it allocates rounding up to the nearest power of 2, so it's a little larger than one element per entry in the general case.

With a perfect hash function, there are no collisions, and so this pointer could be removed.

Unless I'm mistaken, the hash table doesn't resize immediately when new items are inserted, but only after some threshold is crossed (or at least that's what would seem to make the most sense for an open hash table.) Before the threshold is crossed, we will have some items with different hashes but which are stored in the same bucket. So we still need the pointer regardless.

Another possible data structure to use here is an IterableSplayTree. That might wind up being faster in the end due to the splaying status. However it has the major disadvantage that we would need a write lock to do lookups into it, so that may be worse in the end.

comment:3 by waddlesplash, 3 months ago

For future reference, here's the timing instrumentation code:

@@ -105,11 +102,21 @@ Directory::RemoveChild(Node* node)
 
 
 Node*
 Directory::FindChild(const StringKey& name)
 {
-	return fChildTable.Lookup(name);
+	static bigtime_t total = 0;
+	static int32 lookups = 0;
+
+	bigtime_t time = system_time();
+	Node* node = fChildren.Find(name);
+	time = system_time() - time;
+	atomic_add64(&total, time);
+	if ((atomic_add(&lookups, 1) % 1000) == 0)
+		dprintf("FindChild lookups time: %" B_PRIdBIGTIME "\n", atomic_get64(&total));
+
+	return node;
 }
 
 
 void
 Directory::AddDirectoryIterator(DirectoryIterator* iterator)

comment:4 by waddlesplash, 3 months ago

There is an implementation of "compact string tries" that looks interesting here: https://github.com/kampersanda/poplar-trie

It may be useful for the string pool also.

Note: See TracTickets for help on using tickets.