I was reading a chapter on persistence from a book called OSTEP (a classic OS book by Prof. Remzi). It has all kinds of crazy things that I don’t hear about every day: storage devices, drivers, pages, virtualization of disk, and many other things.
The most interesting thing I found was the File System. The early stages were pretty basic - they didn’t even have a hierarchical structure (directories inside directories) - but things evolved pretty fast from there.
I was very intrigued, so I started building one from scratch. (Well, not exactly from scratch though, as it would have required an external HDD and a lot of low-level code).
There are 3 major pieces:
I wanted to keep things simple and not trying to build fully-fledged File System. I restricted myself to these commands:
I also planned to benchmark things wherever I could.
As you know, disk operations are usually happen in blocks - you read or write one block at a time. The usual block size is 4KB, but I have kept it at 8 bytes here. The disk size I have chosen is 4KB (4096⁄8 = 512 blocks).
So, the name of the game is persistence. Your file system should work just fine after reboots and crashes. You need to have some structure in place that doesn’t change with restarts (unlike RAM, which loses its data as soon as the power is lost).
Sounds tricky to implement? Yes, it is. It is so tricky that I actually gave up on this and moved to simpler design.
First of all, I fixed the block size of 1 byte (it makes life so much easier)
**** ***
!)So, we have 4096 blocks and we can store 1365 files, each of 1 byte size.
In this attempt, I got following results (numbers are in microseconds):
Performance of rm
is pretty bad here. There are multiple reasons
In this iteration, I have revamped the disk structure to keep filetype
and filename
together. Data blocks will now be filled from right to left, and inodes will be filled from left to right.
This is how it will look (similar to heap-stack kind of design)
-- START OF DISK--
#number_of_inodes block
#inode_#1 file_type
#inode_#1 file_name
#inode_#2 file_type
#inode_#2 file_name
.
.
.
#inode_#2 data
#inode_#1 data
-- END OF DISK--
Here are the result of this iteration:
It looks like impl #2 is better than #3. It is simple and it has file_block_seek
which keep tracks of blocks filled till now, thus making it more efficient.
Although, metrics for delete_1000_files
test are pretty good given we have implemented the tombstone method.
Benchmark with file_block_seek
improvement (you can see that creation is much faster now)
There are huge limitations with current approaches like
In the next phase, I’ll work on these and see where it takes me.
Write me up at me@shivangnagaria.com for anything :)