January 30, 2012

DAFT (disk geometry-aware file system traversal)

DAFT (disk geometry-aware file system traversal) is an optimization for modern file systems: it fetches files into memory in an order that follows the disk geometry. It has been implemented and tested with commercial AV scanners and data backup agents, where it was observed to reduce elapsed time by a factor of 5 to 15, whether the file system is fragmented or not.

DAFT provides a new API, BulkFileRead, for applications that access files in bulk. Its FileList can contain directories, whose files are fetched recursively. DAFT first loads the metadata of all the files into memory and then accesses the files according to its analysis of that metadata. Files are handled according to their file system, e.g. NTFS versus other types. When the number of files is large, DAFT divides them into small subsets and processes one subset at a time, so that the work fits in the available memory. It then sorts the files according to their respective disk locations.

The DAFT idea can be applied and ported to arbitrary file systems, since the file metadata is what makes it possible to sort and arrange the files. Rather than reading file by file, a traversal can, like DAFT, ignore file and file-fragment boundaries, which allows large disk reads in most cases. DAFT sustains throughput by reading a physical range into buffer memory irrespective of the target files: any other files that fall within that range are buffered in memory as well, and this caching prevents extra physical disk accesses. The problem of metadata being updated during a traversal is addressed by using a VSS snapshot, so DAFT can rely on the metadata it has loaded.

DAFT is not reported as a suitable tool for non-fragmented disks. File systems are designed to access large files efficiently and can routinely read and write large files at a throughput very close to the raw disk throughput. The Fast File System (FFS) [10], for example, places a file’s inode and its data blocks in the same cylinder group.
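
The subset-and-sort step described above can be sketched as follows. This is only an illustration, not DAFT's actual code: the `FileMeta` record, its `start_cluster` field, and the `plan_batches` function are hypothetical names, and the sketch assumes the metadata already gives each file's first on-disk location and size.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass(frozen=True)
class FileMeta:
    """Hypothetical per-file metadata record (not part of DAFT's API)."""
    path: str
    start_cluster: int  # assumed: first on-disk cluster of the file
    size_bytes: int


def plan_batches(files: Iterable[FileMeta],
                 memory_budget: int) -> Iterator[List[FileMeta]]:
    """Split files into subsets that fit the memory budget, then
    sort each subset by on-disk location before it is read."""
    batch: List[FileMeta] = []
    used = 0
    for meta in files:
        # Close the current subset when the next file would overflow it.
        if batch and used + meta.size_bytes > memory_budget:
            yield sorted(batch, key=lambda m: m.start_cluster)
            batch, used = [], 0
        batch.append(meta)
        used += meta.size_bytes
    if batch:
        yield sorted(batch, key=lambda m: m.start_cluster)


files = [
    FileMeta("a", 900, 40),
    FileMeta("b", 100, 40),
    FileMeta("c", 500, 40),
    FileMeta("d", 50, 40),
]
order = [[m.path for m in batch] for batch in plan_batches(files, 100)]
print(order)  # [['b', 'a'], ['d', 'c']]
```

Sorting inside each subset, rather than across the whole list, keeps the memory bound while still letting the disk head move in one direction within a subset.
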
DAFT optimizes file access according to the disk geometry. It is highly efficient: its prototype shows that it can consistently improve the end-to-end elapsed time required to bring a large set of small files into memory, guided by metadata that stays valid thanks to the VSS snapshot.
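
The idea of ignoring file and fragment boundaries in favour of large disk reads can be illustrated with a small sketch. It is not taken from the DAFT prototype: the `coalesce_reads` function and its `max_gap` and `max_read` parameters are assumptions made for the example, and extents are modelled simply as `(start, length)` pairs in arbitrary units.

```python
from typing import List, Tuple

Extent = Tuple[int, int]  # (start, length); hypothetical representation


def coalesce_reads(extents: List[Extent],
                   max_gap: int,
                   max_read: int) -> List[Extent]:
    """Merge nearby extents into larger physical reads.

    Two extents are merged when the gap between them is at most
    max_gap and the merged read stays within max_read. Data in the
    gap is read too, whichever file it belongs to.
    """
    reads: List[Extent] = []
    for start, length in sorted(extents):
        end = start + length
        if reads:
            prev_start, prev_len = reads[-1]
            prev_end = prev_start + prev_len
            merged_len = max(prev_end, end) - prev_start
            if start - prev_end <= max_gap and merged_len <= max_read:
                reads[-1] = (prev_start, merged_len)
                continue
        reads.append((start, length))
    return reads


plan = coalesce_reads([(0, 4), (6, 4), (100, 8), (10, 2)],
                      max_gap=2, max_read=64)
print(plan)  # [(0, 12), (100, 8)]
```

Here three small extents near the start of the disk become a single read of 12 units, while the distant extent at 100 is left as its own read.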

Last updated: March 24, 2014