Opened 8 months ago

Last modified 5 months ago

#18567 new task

Tracker does not show folder size

Reported by: thaflo
Owned by: nobody
Priority: normal
Milestone: Unscheduled
Component: Applications/Tracker
Version: R1/beta4
Keywords:
Cc:
Blocked By:
Blocking:
Platform: All

Description

In some cases, Tracker shows icon size. See picture. It would be more than obvious, and make sense, if the size were displayed for every folder.

Attachments (1)

screenshot1.png (33.4 KB) - added by thaflo 8 months ago.
Tracker folder size

Change History (18)

by thaflo, 8 months ago

Attachment: screenshot1.png added

Tracker folder size

comment:1 by waddlesplash, 8 months ago

Keywords: tracker folder size removed

I'm confused about what you mean here. What do you mean by "icon size"?

comment:2 by waddlesplash, 8 months ago

"config" is reported as being 4KB in this image because it's a packagefs mountpoint, and the packagefs root is reported as being that size.

comment:3 by pulkomandy, 8 months ago

The size of a directory (including all contained files) would be useful, but it is also not easy to compute (it would need to stat all files in the directory and its subdirectories). That's why it's not done, and you can use DiskUsage instead.
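To make the cost concrete, here is a minimal, portable sketch of the kind of walk described above. It is not Tracker's or DiskUsage's actual code; the function name `directory_size` is hypothetical, and it uses Python's standard library rather than the Haiku Storage Kit. The point is that every entry below the folder has to be visited and stat'ed once.

```python
import os

def directory_size(path):
    """Return (total_bytes, entries_visited) for everything below path.

    Hypothetical helper for illustration only. Symlinks are not followed,
    so a link counts as the link itself, not as its target.
    """
    total = 0
    visited = 0
    stack = [path]
    while stack:
        current = stack.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                visited += 1
                if entry.is_dir(follow_symlinks=False):
                    # Subdirectories are queued and walked later.
                    stack.append(entry.path)
                else:
                    # One stat per file: this is the expensive part.
                    total += entry.stat(follow_symlinks=False).st_size
    return total, visited
```

The second return value shows how the work grows with the number of entries, not with the number of bytes.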

I think we should remove the size report for the packagefs mountpoint? It seems a bit confusing and not very useful.

comment:4 by humdinger, 8 months ago

I think we should remove the size report for the packagefs mountpoint? It seems a bit confusing and not very useful.

+1

A quick way to get the size of a single folder is to use "Get info" (ALT+I).

comment:5 by pulkomandy, 8 months ago

There is already #10291 for not showing the size for packagefs mountpoints.

comment:6 by thaflo, 8 months ago

Yes, it's shown in Get Info, and I assume it's calculated instantly when the window is opened. One question is, if it slows down the tracker significantly, if the size is calculated on opening the tracker window. It could be an optional Tracker function, like image preview, so that users with older hardware, or users who prefer speed, can turn it off. Or it could be stored as a folder attribute that is updated on every change inside the folder. Just some thoughts.

comment:7 by humdinger, 8 months ago

One question is, if it slows down the tracker significantly, if the size is calculated on opening the tracker window.

I don't think it's a good idea to have Tracker churn through all (sub)folders to update the size of all folders every time you open a Tracker window. Think of dozens of folders each with dozens of subfolders with a hundred files each. Maybe even on a spinning disk... or even worse creating traffic over the net for remote folders...
I think keeping attributes up to date won't work well either, as it's hard to keep track of files being changed in Terminal or over the net etc. Or at all, in case of foreign filesystems.

Close ticket?

comment:8 by waddlesplash, 7 months ago

I think we should remove the size report for the packagefs mountpoint? It seems a bit confusing and not very useful.

This is from the blocks reported by packagefs_read_fs_info. Most filesystems report a real value, but packagefs just reports 1 block. Changing this to 0 would make it display as "0 bytes".

We could set this to -1, and then make that a "magic value" indicating "unknown" (right now, setting it to -1 makes Tracker display "-4096 bytes", which isn't an improvement.)
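A minimal sketch of the "magic value" idea follows. It is illustrative only: the names `UNKNOWN_BLOCKS`, `naive_size_label` and `size_label` are hypothetical, the 4096-byte block size is inferred from the "-4096 bytes" output mentioned above, and the "unknown" label is just a placeholder for whatever Tracker would display.

```python
UNKNOWN_BLOCKS = -1  # hypothetical sentinel meaning "size not known"

def naive_size_label(block_count, block_size=4096):
    """Multiply blindly, which is how a sentinel turns into a negative size."""
    return f"{block_count * block_size} bytes"

def size_label(block_count, block_size=4096):
    """Check for the sentinel before doing any arithmetic."""
    if block_count == UNKNOWN_BLOCKS:
        return "unknown"  # placeholder text, an assumption of this sketch
    return f"{block_count * block_size} bytes"
```

With these definitions, `naive_size_label(-1)` yields "-4096 bytes", while `size_label(-1)` yields "unknown" and `size_label(1)` still yields "4096 bytes".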

comment:9 by jscipione, 5 months ago

This is a Mac thing. The Classic Mac OS that BeOS was copying, and even the current version of macOS, do not display size for folders, only files. This is because it needs to recurse over all subdirectories to get the size, which is potentially expensive. You have to open the Get Info window and then it will calculate the size.

I suppose that we could do better with our multi-threaded OS but so far nobody has done this work.

comment:10 by nephele, 5 months ago

This is something that would be quite inexpensive if we'd design a filesystem with that in mind, but recalculating and caching this would be another option.

I suppose if writes are atomic one could update size attributes recursively but that sounds prone to race conditions in this case. (maybe with locks this could work?)

comment:11 by jscipione, 5 months ago

Replying to nephele:

This is something that would be quite inexpensive if we'd design a filesystem with that in mind, but recalculating and caching this would be another option.

I suppose if writes are atomic one could update size attributes recursively but that sounds prone to race conditions in this case. (maybe with locks this could work?)

We already do the work of calculating the directory size in the Get info panel so whatever locking needs to happen has already been worked out. To complete this feature you'd have to iterate over each folder in a directory, calculate the size, and fill the result into the Size attribute. You'd want to do this on a separate thread (or 2 or 3). Thumbnail generation provides an example for how this kind of thing can be done in Tracker.

This would cause spinning disks to spin up and increase network traffic whenever you open a folder so there's a tradeoff here. Load times are already a problem and increasing the I/O especially on a slow USB stick is only going to exacerbate the problem. Of course all this is true for thumbnails too (minus the recursive part).
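The approach outlined above (iterate over each folder, compute its size on worker threads, fill in the result) could be sketched roughly as follows. This is a portable illustration, not Tracker code and not how thumbnail generation is implemented there; `tree_size`, `fill_folder_sizes` and the `on_result` callback are hypothetical names.

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

def tree_size(path):
    """Sum the sizes of all files below path without following symlinks."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.lstat(os.path.join(root, name)).st_size
            except OSError:
                pass  # entry vanished or is unreadable; skip it
    return total

def fill_folder_sizes(directory, on_result, workers=2):
    """Compute the size of each immediate subfolder on worker threads.

    on_result(name, size) is called once per subfolder, in completion
    order, from the thread that called this function.
    """
    with os.scandir(directory) as entries:
        folders = [(e.name, e.path) for e in entries
                   if e.is_dir(follow_symlinks=False)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(tree_size, path): name
                   for name, path in folders}
        for future in as_completed(futures):
            on_result(futures[future], future.result())
```

Note that this sketch waits for all workers before returning; a file manager would instead return immediately and update the Size column as results arrive. The I/O tradeoff described above applies unchanged, since threads only hide the latency and do not reduce the disk or network traffic.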

comment:12 by nephele, 5 months ago

so whatever locking needs to happen has already been worked out.

I don't think it has. I don't expect Get info to report an accurate size while two processes are writing or deleting in that tree; those are the concurrency issues I am talking about. This would essentially be a file system feature.

For packagefs we can just do this ahead of time, and then "just" merge the sizes on overlay mounting (assuming there are not too many intersecting branches between packages, that should not be too bad), though apparent size would be better as an attribute there.

We can do this for local drives too, if we want. We should never do this for attached foreign filesystems or networked filesystems, and I don't think this code belongs in Tracker. The thumbnail generation is already a bit too much to be in Tracker.

Anyhow, this would make much more sense and be cleaner as a filesystem feature instead of building it on top of one. If we do build it on top of any, we should only do it for BFS.

comment:13 by nephele, 5 months ago

When writing a file, this can be updated (as the primary way, anyhow). That makes writes somewhat slower, since you have to update the folders in the tree when done, but leaves read performance basically untouched unless there is missing info.
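The write-time update described above can be sketched with a toy in-memory tree. This is an assumption-laden illustration, not a BFS design: `Node` and `write_file` are hypothetical names, and the sketch is single-threaded, so it ignores the locking and race-condition questions raised earlier in this ticket.

```python
class Node:
    """A file or folder in a toy tree; a folder's size is a cached total."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.size = 0

def write_file(node, new_size):
    """Set a file's size and propagate the difference to every ancestor."""
    delta = new_size - node.size
    node.size = new_size
    ancestor = node.parent
    while ancestor is not None:
        # Extra work per write: one update for each level of nesting.
        ancestor.size += delta
        ancestor = ancestor.parent
```

Reading a folder's size is then a single field lookup, while each write costs one extra update per ancestor folder.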

comment:14 by pulkomandy, 5 months ago

We already do the work of calculating the directory size in the Get info panel so whatever locking needs to happen has already been worked out. To complete this feature you'd have to iterate over each folder in a directory, calculate the size, and fill the result into the Size attribute. You'd want to do this on a separate thread (or 2 or 3). Thumbnail generation provides an example for how this kind of thing can be done in Tracker.

You are talking about two different things.

To clarify: the existing code in Tracker works "after the fact", it enumerates all files in a directory and counts their size. DiskUsage works similarly.

Nephele was suggesting to instead make this a built-in feature of the filesystem, so the information would be immediately available.

I don't like the idea of Tracker automatically computing folder sizes every time I open a directory, because that potentially requires a lot of disk access, which in turn means the disk is not available for other, more important IO tasks. And this information is very volatile: unlike thumbnails, there isn't much performance win from having a cache, because the total size of a directory changes whenever something happens inside it. So the cache would always be invalid, or you would have to do a lot of read access anyway just to check whether it is still valid.

So, I think doing this without built-in support from the filesystem is not a great idea. Since this is a costly operation, restricting it to a place where it is explicitly triggered by the user (running DiskUsage or "Get Info" on a directory) seems the right way to handle it.

Now, about supporting this in filesystems: this will not be easy at all on existing filesystems. Maybe it can be done. There are also quite a few edge cases to consider: things like hardlinks (you don't want to count the same file twice if there are two hardlinks to it), or things like zfs or btrfs snapshots, can make it a bit unclear what the total disk usage of a directory actually means. Do we accept that and sometimes show slightly invalid information? Or do we make something that's even more complex and uses even more IO and computing resources to give the exact answer?
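To illustrate the hardlink case mentioned above, here is a small portable sketch that counts each underlying file only once by remembering its device and inode numbers. The function name `directory_size_unique` is hypothetical, and this is one possible policy rather than the definition of "directory size"; it does nothing about snapshots.

```python
import os

def directory_size_unique(path):
    """Sum file sizes below path, counting hardlinked files only once."""
    seen = set()  # (device, inode) pairs already counted
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            st = os.lstat(os.path.join(root, name))
            key = (st.st_dev, st.st_ino)
            if key in seen:
                continue  # another name for a file we already counted
            seen.add(key)
            total += st.st_size
    return total
```

The `seen` set grows with the number of files, which is an example of the extra bookkeeping an exact answer requires.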

comment:15 by jscipione, 5 months ago

Let's add this idea as a future file system feature for BFS2. It makes sense that the file system should be able to keep track of size of directories. I was thinking of how to implement said feature in Tracker for existing file systems. It makes sense to do the work in Tracker when you open the directory because that can be done lazily like how thumbnails work. However, if the feature was built into the file system that sure would make the job a lot easier. Also yes, we could potentially precompute sizes for packagefs mounts to display in Tracker.

comment:16 by korli, 5 months ago

The suggestion doesn't make sense IMO: mount points could exist in one folder, this is just confusing. Example /boot/home/config/non-packaged in /boot/home/config.

comment:17 by waddlesplash, 5 months ago

Tracker showing sizes for packagefs folders was fixed in hrev57433.
