1

The file system defines the block as its basic allocation unit. Both the upper and lower limits on the block size are dictated by the operating system. For example, the Linux kernel requires the file system block size to be a power of two and no greater than the virtual memory page size.

What's the motivation behind limiting the file system block size to the virtual memory page size? How might these two seemingly unrelated concepts be connected? Does this somehow refer to a mapping mechanism?
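
For reference, here is a minimal sketch (assuming Linux with glibc; "/" is just an example mount point) that prints the two values in question:

    /* Minimal sketch (Linux/glibc assumed): print the virtual memory page
     * size and the block size reported for a mounted file system. */
    #include <stdio.h>
    #include <unistd.h>
    #include <sys/statvfs.h>

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);   /* virtual memory page size */

        struct statvfs vfs;
        if (statvfs("/", &vfs) != 0) {       /* "/" is just an example path */
            perror("statvfs");
            return 1;
        }

        printf("page size:     %ld bytes\n", page);
        printf("fs block size: %lu bytes\n", (unsigned long)vfs.f_bsize);
        return 0;
    }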

mejjete
  • 11
  • 3

3 Answers

0

The motivation is just simplicity.

Just imagine the bookkeeping that would be required by the operating system if virtual page sizes were calculated in fractions of file blocks...

harrymc
  • 480,290
  • Each 4k page in block cache has masks for which 512-byte blocks are valid or dirty. It's not that hard. – stark Feb 01 '23 at 13:46
  • @stark: Not that hard, but why do it when speed is critical. – harrymc Feb 01 '23 at 13:52
  • There is just one virtual page size in an operating system, regardless of the file systems in use, whose cluster sizes may vary from one file system to another. The size of a virtual page is typically fixed and its choice depends on the hardware support of the processor(s). – r2d3 Feb 01 '23 at 13:54
  • @r2d3: I haven't seen a virtual page size other than 4K in many years, at least on Intel. – harrymc Feb 01 '23 at 13:55
  • @r2d3 most modern architectures support multiple page sizes, at least with hugepages – phuclv Feb 01 '23 at 14:06
0

I always looked at it like this:

If we consider older and more limited file systems, the block/cluster size had largely to do with the maximum number of clusters possible/allowed by the file system.

For example, a 12-bit FAT entry puts an upper limit on the number of clusters that can be addressed: 111111111111 is the largest 12-bit binary number, whose decimal equivalent is 4095 and whose hexadecimal equivalent is FFF.

A 16-bit value increases the number of clusters that can be addressed.
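
As a rough illustration (ignoring the handful of reserved FAT entry values, so the real limits are slightly lower), the entry width bounds the cluster count, and together with the cluster size it bounds the addressable volume size:

    /* Rough illustration: how the FAT entry width and the cluster size
     * together bound the addressable volume size (reserved entry values
     * are ignored for simplicity). */
    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        const unsigned entry_bits[] = { 12, 16 };
        const uint64_t cluster_size = 4096;      /* example cluster size */

        for (unsigned i = 0; i < 2; i++) {
            uint64_t max_clusters = (1ULL << entry_bits[i]) - 1; /* 4095, 65535 */
            printf("%u-bit entries: %llu clusters x %llu bytes = ~%llu MiB max\n",
                   entry_bits[i],
                   (unsigned long long)max_clusters,
                   (unsigned long long)cluster_size,
                   (unsigned long long)(max_clusters * cluster_size / (1024 * 1024)));
        }
        return 0;
    }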

So one way of addressing a disk size limitation that results from the number of clusters we can address is to increase the number of bits used to address a cluster/block.

Another way, however, is to increase the cluster or block size.

Increasing the number of clusters we can address increases overhead. On the other hand, increasing the cluster/block size increases waste: store a 1 KB file inside a 4 KB cluster and we waste 3 KB; or store a 13 KB file in 4 KB clusters and we again waste 3 KB, as we need to allocate 4 clusters to the file.
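
The wasted (slack) space is simply the file's allocation rounded up to the next cluster boundary minus the file size; a minimal sketch reproducing the two examples above:

    /* Minimal sketch: slack (wasted) space when a file is stored in
     * fixed-size clusters, using the 1 KB and 13 KB examples above. */
    #include <stdio.h>
    #include <stdint.h>

    static uint64_t slack(uint64_t file_size, uint64_t cluster_size)
    {
        /* Round the allocation up to a whole number of clusters. */
        uint64_t clusters = (file_size + cluster_size - 1) / cluster_size;
        return clusters * cluster_size - file_size;
    }

    int main(void)
    {
        printf("1 KB file in 4 KB clusters wastes %llu bytes\n",
               (unsigned long long)slack(1024, 4096));       /* 3072 = 3 KB */
        printf("13 KB file in 4 KB clusters wastes %llu bytes\n",
               (unsigned long long)slack(13 * 1024, 4096));  /* 3072 = 3 KB */
        return 0;
    }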

So it's a trade-off: smaller clusters mean more addressing overhead, while larger clusters mean more potentially wasted space. For example, if we know in advance that the file system will largely have to deal with large files, we can opt for a large cluster size and gain the advantage of reduced overhead.

Pages 'act' as a 'middle man' between the OS and storage, but unlike block/cluster sizes, page sizes cannot be set by something like file system formatting; they are fixed. Efficiency requires common ground between the page size and the block/cluster size, and since the page size is the fixed value, it is the page size that constrains the file system block size.

0

The efficient use of cache memory is an argument to prohibit the use of clusters that are smaller than a virtual page.

r2d3
  • 3,554
  • 1
    the limit on Linux is the upper one, not the lower. With the kernel driver you can mount file systems with a block size smaller than or equal to the virtual page size, but not larger – phuclv Feb 01 '23 at 14:08
  • Thank you for letting me know. "but not less than the virtual memory page size." I did not question the statement in the initial posting! – r2d3 Feb 01 '23 at 15:51
  • @phuclv: you are right. I just edited my question. The man page for mke2fs says: "the kernel is able to mount only file systems with block-size smaller or equal to the system page size - 4k on x86 systems, up to 64k on ppc64 or aarch64 depending on kernel configuration" – mejjete Feb 01 '23 at 16:33