Opened 2 months ago

Last modified 2 months ago

#19254 new enhancement

[BFS] Take advantage of block-cache prefetching

Reported by: waddlesplash Owned by: axeld
Priority: normal Milestone: Unscheduled
Component: File Systems/BFS Version: R1/beta5
Keywords: Cc:
Blocked By: Blocking:
Platform: All

Description

In hrev58332, an API was added to the block_cache to prefetch blocks. BFS should take advantage of this API, especially for directory iteration, to speed up the first read of each directory and similar accesses. I think this could go a long way toward improving Git performance with cold disk caches.

Change History (2)

comment:1 by waddlesplash, 2 months ago

Type: bug → enhancement

comment:2 by waddlesplash, 2 months ago

I came up with this:

diff --git a/src/add-ons/kernel/file_systems/bfs/BPlusTree.cpp b/src/add-ons/kernel/file_systems/bfs/BPlusTree.cpp
index f836d4a7d7..f49e477a70 100644
--- a/src/add-ons/kernel/file_systems/bfs/BPlusTree.cpp
+++ b/src/add-ons/kernel/file_systems/bfs/BPlusTree.cpp
@@ -747,7 +747,33 @@ BPlusTree::SetTo(Inode* stream)
 		|| (stream->Mode() & S_ALLOW_DUPS) != 0;
 
 	cached.SetTo(fHeader.RootNode());
-	RETURN_ERROR(fStatus = cached.Node() ? B_OK : B_BAD_DATA);
+	fStatus = ((cached.Node() != NULL) ? B_OK : B_BAD_DATA);
+
+#if !_BOOT_MODE
+	if (fStatus == B_OK) {
+		// prefetch some child node blocks, if possible
+		const bplustree_node* node = cached.Node();
+		Unaligned<off_t>* values = node->Values();
+		for (int32 i = 0; i < node->NumKeys(); i++) {
+			const off_t value = BFS_ENDIAN_TO_HOST_INT64(values[i]);
+			if (bplustree_node::LinkType(value) != 0)
+				continue;
+
+			block_run run;
+			off_t fileOffset;
+			if (fStream->FindBlockRun(value, run, fileOffset) == B_OK) {
+				dprintf("PREFETCH! %" B_PRIdOFF ":%" B_PRIdOFF ", count %" B_PRIdOFF " (keys count %d)\n",
+					off_t(run.AllocationGroup()), off_t(run.Start()), off_t(run.Length()), int(node->NumKeys()));
+				Volume* volume = fStream->GetVolume();
+				size_t numBlocks = run.Length();
+				block_cache_prefetch(volume->BlockCache(), volume->ToBlock(run), &numBlocks);
+			}
+			break;
+		}
+	}
+#endif
+
+	RETURN_ERROR(fStatus);
 }

However, the added dprintf() doesn't fire nearly as often as I'd expect. Running "git status" with a cold disk cache in the Haiku repository, I see fewer than a hundred of these log lines, even though there are of course thousands of directories. I'm not sure what I am doing wrong here, or whether I'm putting this in the wrong place...
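One property of the loop in the patch can be checked in isolation: the `break` sits outside the `if`, so each `SetTo()` call makes at most one prefetch attempt, and makes none if `FindBlockRun()` fails for the first non-link value. Below is a minimal, standalone sketch of just that control flow. `MockNode`, `IsLink()`, and `FindRun()` are hypothetical stand-ins invented for illustration. They are not Haiku APIs, and the "in range" rule used by `FindRun()` is an assumption about when a lookup would succeed. The second function only shows how the count changes without the early `break`; whether that is the right fix is not established here.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a B+tree node: only its list of values.
struct MockNode {
	std::vector<int64_t> values;
};

// Hypothetical stand-in for the LinkType() check in the patch:
// in this mock, negative values are treated as links and skipped.
static bool IsLink(int64_t value)
{
	return value < 0;
}

// Hypothetical stand-in for FindBlockRun(): ASSUMES the lookup succeeds
// only when the value is an offset inside the stream.
static bool FindRun(int64_t value, int64_t streamSize)
{
	assert(streamSize >= 0);
	return value >= 0 && value < streamSize;
}

// Same control flow as the patch: stop after the first non-link value,
// whether or not the lookup succeeded.
static int CountPrefetchesAsPatched(const MockNode& node, int64_t streamSize)
{
	int prefetches = 0;
	for (int64_t value : node.values) {
		if (IsLink(value))
			continue;
		if (FindRun(value, streamSize))
			prefetches++;
		break;
	}
	return prefetches;
}

// Variant without the early break: every non-link value is tried.
static int CountPrefetchesAllValues(const MockNode& node, int64_t streamSize)
{
	int prefetches = 0;
	for (int64_t value : node.values) {
		if (IsLink(value))
			continue;
		if (FindRun(value, streamSize))
			prefetches++;
	}
	return prefetches;
}
```

In this mock, a node whose first non-link value falls outside the stream yields zero prefetches from the patched loop, even when later values would have succeeded.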
