March 6, 2012

Internals of ext4 File System

A specialized version of a B-tree is implemented:
Htree indexes (a specialized version of a B-tree) are turned on by default in ext4 as of Linux kernel 2.6.23. Htree is also available in ext3 when the dir_index feature is enabled. Ext4 also lifts ext3's limit of 32,000 subdirectories per directory to 64,000 and beyond, although subdirectory (link) count problems can occur past 64,000.

e2fsck checks the ext2/ext3/ext4 family of file systems. On ext3 and ext4 file systems that use a journal, if the system was shut down uncleanly but no errors were recorded, replaying the committed transactions in the journal is normally enough to mark the file system as clean. For journaled file systems, e2fsck therefore normally replays the journal and exits, unless the superblock indicates that further checking is required.
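The decision described above can be sketched roughly in Python. This is a simplified model, not e2fsck's actual logic: it only looks at the state field and the "needs recovery" feature flag in the superblock, and it ignores other triggers such as mount counts and check intervals. The field offsets and flag values follow the documented ext4 superblock layout; the function names and the make_image helper are hypothetical and exist only for illustration.

```python
import struct

SUPERBLOCK_OFFSET = 1024    # the primary superblock starts 1024 bytes into the device
EXT_MAGIC = 0xEF53          # s_magic, 16-bit little-endian at superblock offset 0x38
STATE_VALID = 0x0001        # s_state bit: cleanly unmounted
STATE_ERROR = 0x0002        # s_state bit: errors detected
INCOMPAT_RECOVER = 0x0004   # s_feature_incompat bit: journal needs recovery


def read_flags(image):
    """Return (s_state, s_feature_incompat) from a raw file system image."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    magic, state = struct.unpack_from("<HH", sb, 0x38)
    if magic != EXT_MAGIC:
        raise ValueError("not an ext2/ext3/ext4 superblock")
    (incompat,) = struct.unpack_from("<I", sb, 0x60)
    return state, incompat


def plan_check(image, force=False):
    """Simplified model of the steps a checker would take (an assumption)."""
    state, incompat = read_flags(image)
    steps = []
    if incompat & INCOMPAT_RECOVER:
        steps.append("replay journal")
    if force or (state & STATE_ERROR) or not (state & STATE_VALID):
        steps.append("full check")
    return steps


def make_image(state, incompat):
    """Hypothetical helper: build a tiny fake image for experimenting."""
    image = bytearray(2048)
    struct.pack_into("<HH", image, SUPERBLOCK_OFFSET + 0x38, EXT_MAGIC, state)
    struct.pack_into("<I", image, SUPERBLOCK_OFFSET + 0x60, incompat)
    return bytes(image)


# Unclean shutdown with no recorded errors: only the journal is replayed.
print(plan_check(make_image(STATE_VALID, INCOMPAT_RECOVER)))
# Errors recorded in the superblock: a full check follows the replay.
print(plan_check(make_image(STATE_VALID | STATE_ERROR, INCOMPAT_RECOVER)))
```

On a real system the same fields can be inspected with dumpe2fs -h, which prints the file system state and feature list.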

Multiblock allocator:
When a file is being appended to, ext3 calls the block allocator once for each block individually; with multiple concurrent writers, files can easily become fragmented on disk. With delayed allocation, however, ext4 buffers up a larger amount of data and then allocates a group of blocks in a batch. This means that the allocator has more information about what’s being written and can make better choices for allocating files contiguously on disk. The multiblock allocator is used when delayed allocation is enabled or when files are opened in O_DIRECT mode. It does not affect the on-disk format.
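The effect of batching can be illustrated with a toy model in Python. This is not how the kernel allocators work internally; it assumes an empty disk where free blocks are handed out in increasing order, and writers that take turns strictly. All function names are hypothetical.

```python
def allocate_per_block(writers, blocks_each):
    """One allocator call per block, with writers interleaved (ext3-like)."""
    next_free = 0
    layout = {w: [] for w in writers}
    for _ in range(blocks_each):
        for w in writers:
            layout[w].append(next_free)   # each call hands out the next free block
            next_free += 1
    return layout


def allocate_batched(writers, blocks_each):
    """One allocator call per writer for the whole buffered range (ext4-like)."""
    next_free = 0
    layout = {}
    for w in writers:
        layout[w] = list(range(next_free, next_free + blocks_each))
        next_free += blocks_each
    return layout


def count_fragments(blocks):
    """Count the contiguous runs in a list of block numbers."""
    if not blocks:
        return 0
    return 1 + sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)


writers = ["a", "b", "c"]
for name, allocate in [("per-block", allocate_per_block),
                       ("batched", allocate_batched)]:
    layout = allocate(writers, 8)
    print(name, {w: count_fragments(blocks) for w, blocks in layout.items()})
```

In this model, three writers appending eight blocks each end up with eight fragments per file under per-block allocation and a single fragment per file under batched allocation.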

Date-created timestamps:
Ext4 also adds support for date-created timestamps. But, as Theodore Ts’o points out, while it is easy to add an extra creation-date field in the inode (thus technically enabling support for date-created timestamps in ext4), it is more difficult to modify or add the necessary system calls, like stat() (which would probably require a new version), and the various libraries that depend on them (like glibc). These changes would require coordination of many projects. So, even if ext4 developers implement initial support for creation-date timestamps, this feature will not be available to user programs for now.

An extent is a range of contiguous physical blocks; mapping files with extents improves large-file performance and reduces fragmentation.
A single extent in ext4 can map up to 128 MiB of contiguous space with a 4 KiB block size. Up to 4 extents can be stored in the inode itself. When a file has more than 4 extents, the rest are indexed in a tree. Ext4 uses 48 bits to address a block, which lets it support very large file systems and files.
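The figures above follow from simple arithmetic, worked through here in Python. The extent length limit of 2^15 blocks is inferred from the 128 MiB figure at a 4 KiB block size. The last value is the total space that 48-bit block numbers can address, which is not necessarily the largest size a single file can reach.

```python
KIB = 1024
MIB = 1024 * KIB
EIB = 1024 ** 6

BLOCK_SIZE = 4 * KIB            # 4 KiB blocks, as in the text
MAX_EXTENT_BLOCKS = 2 ** 15     # 32,768 blocks per extent
EXTENTS_IN_INODE = 4            # extents stored directly in the inode
BLOCK_ADDRESS_BITS = 48         # width of a physical block number

# Largest range one extent can map: 32,768 blocks * 4 KiB = 128 MiB.
MAX_EXTENT_BYTES = MAX_EXTENT_BLOCKS * BLOCK_SIZE

# Most data reachable through the in-inode extents alone: 4 * 128 MiB = 512 MiB.
MAX_INLINE_BYTES = EXTENTS_IN_INODE * MAX_EXTENT_BYTES

# Space addressable with 48-bit block numbers: 2^48 blocks * 4 KiB = 1 EiB.
ADDRESSABLE_BYTES = (2 ** BLOCK_ADDRESS_BITS) * BLOCK_SIZE

print(MAX_EXTENT_BYTES // MIB, "MiB per extent")
print(MAX_INLINE_BYTES // MIB, "MiB via in-inode extents")
print(ADDRESSABLE_BYTES // EIB, "EiB addressable")
```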

Last updated: March 19, 2014