I am reading this paper from Facebook (Beaver). The paper says:

We initially stored thousands of files in each directory of an NFS volume which led to an excessive number of disk operations to read even a single image. Because of how the NAS appliances manage directory metadata, placing thousands of files in a directory was extremely inefficient as the directory’s blockmap was too large to be cached effectively by the appliance. Consequently it was common to incur more than 10 disk operations to retrieve a single image. After reducing directory sizes to hundreds of images per directory, the resulting system would still generally incur 3 disk operations to fetch an image: one to read the directory metadata into memory, a second to load the inode into memory, and a third to read the file contents.

I have the following questions:

  1. What is the meaning of "directory's blockmap"? Does it refer to the file containing the mapping between filenames and inode numbers?
  2. In the end, the paper says 3 I/Os are needed to read a file. In my opinion, there should be 4 I/Os: the first to load the directory inode (the inode contains metadata), the second to read the directory entries (which gives us the file's inode number), the third to load the file inode into memory, and the fourth to read the file contents. Where am I wrong here?
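
To make the two counts concrete, here is a minimal sketch of how I am counting. It assumes a simplified model in which every object not already in memory costs exactly one disk operation; the step names and the `count_disk_ops` helper are hypothetical and only illustrate my reasoning, not how the NAS appliance actually works.

```python
# Hypothetical model: each object that is not already in memory
# costs exactly one disk operation to fetch.
STEPS = ["directory inode", "directory entries", "file inode", "file contents"]

def count_disk_ops(cached=()):
    """Return the number of disk operations needed to read one file.

    `cached` lists the steps whose data is assumed to be in memory already.
    """
    return sum(1 for step in STEPS if step not in cached)

# Nothing cached: this is my count.
cold = count_disk_ops()

# Assumption: the directory inode happens to be in memory already.
warm_dir_inode = count_disk_ops(cached={"directory inode"})

print(cold, warm_dir_inode)
```

In this model my count corresponds to a completely cold cache, and the count drops by one if the directory inode is assumed to be in memory. Whether that is what the paper assumes is part of what I am asking.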

Similar questions have been asked before here and here, but I could not find an appropriate answer.
