File systems
Basic
A filesystem is the methods and data structures that an operating system uses to keep track of files on a disk or partition; that is, the way the files are organized on the disk.
- File content stored in blocks on storage device
- Files are organized into directories (folders)
Detail about file
File descriptor
A file descriptor is a number that uniquely identifies an open file in a computer’s operating system. It describes a data resource, and how that resource may be accessed.
简单的说,根据Linux一切皆文件的概念来看,当进程打开或者创建文件的时候,内核会向进程返回一个数字,这个数字就是文件描述符,所有执行I/O操作的系统调用都通过文件描述符来进行。
Hard link and symbolic link/soft link
A hard link is essentially a synced carbon copy of a file that refers directly to the inode of a file. Symbolic links on the other hand refer directly to the file which refers to the inode, a shortcut.
1 |
|
File permission mode
3种模式 r:read w:write x:execute
3种身份: user group others
3*3 = 27 种访问权限
rw-r–r– => 110 (owner permission) 100 (group) 100 (others)
可以使用chomod
来进行权限的修改
Inode
The inode (index node) is a data structure in a Unix-style file system that describes a file-system object such as a file or a directory. 更多inode细节
- Stores metadata/attributes about the file ( use
stat file
to check the metadata ) - Also stores locations of blocks holding the content of the file
metadata
1 |
|
locations of data
An inode has:
- A number of direct pointers,each points to a data block
example: a inode has 8 pointers, each pointer points to a 4K block, then this inode is enogh for 8*4K = 32KB size of file - Also has a slot for indirect pointer : a pointer points to a data block storing direct pointers
example: 1 block’s size: 4KB, pointer size: 4 bytes, then a block can hold 1024 pointers => Now file can have (8 + 1024) blocks , enough for 4MB size of file
How to store larger files? - Multi-level index
- Pointers may be organized into multiple levels (double,triple,…n times)
1 direct pointer => 1 block (4KB)
1 indirect pointer => 2^10 direct pointer = 2^10*4KB = 4MB
1 double indirect pointer => 2^10 indirect pointers => 2^20 direct pointers => 2^20*4KB = 4GB
Organization of blocks
Assumption:(便于理解,做些假设)
- Disk consists of a list of blocks and they are array-based.(other forms:Tree-based, e.g., SGI XFS -Blocks are organized into variable-length extents)
- a disk with 64 blocks
- 4KB/block =>文件系统的最小单元
- 512B/sector =>存储系统的最小单元
- so there are 2^12/2^9 = 2^3 = 8 sectors/block and capacity of disk = 64 * 4KB = 256KB
Structure about these blocks:
- Data region (56 blocks (#8-63))
- Inode table (5 blocks #3 – #7) 元数据+文件内容地址
- assume 256 bytes/inode 5 blocks, 4KB/block
- => 80 inodes total (4KB/256B * 5)
- => File system can store at most 80 files
- Bitmaps (#1, #2) a vector of bits, 0 for free (inode/block), 1 for in-use
- Inode bitmap (imap):keep track of which inodes in the inode table are available (这里4KB一个block 1byte=8bit,所以一个block的话最多以存储80K个inode的状态)
- Data bitmap (dmap):keep track of which blocks in data region are available
- Superblock(#0): Track where i/d blocks and inode table are; Indicate type of FS & inumber of its root dir; Will be read first when file system is mounted
Inumber
Each inode is identified by a number: Low-level number of file name
Can figure out location of inode from inumber 通过inumber可以寻址到其代表的inode所在的sector
Location: ⌊(inodeStartAddress + inumber ∗ inode size)/sector size⌋
example: inumber = 32
Address:12K+32*256=20K
Sector #: 20K/512 = 40
总结inumber,inode,data一句话来说就是inumber => inode => data
Detail about directory
Basic
- Directory itself stored as a file
- For each file in the directory, it stores:
- name, inumber, record length, string length
- If file is deleted (using rm command) or a name is unlinked (using unlink command),then inumber in its directory entry set to 0 (reserved for empty entry)
- File is finally deleted when its last (hard) link is removed
- File is finally deleted when its last (hard) link is removed
Storing a directory
- Also as a file with its own inode + data block
- inode:
- file type: directory (instead of regular file)
- pointer to block(s) in data region storing directory entries
Operations on file
- Create: open(), write()
- Read: open(),read(), lseek()
- Update: write(), lseek()
- Delete: unlink()
Note: 这些函数都是操作系统自带提供的系统调用函数
Read
- User interface via GUI or touch command in Linux
- Implementation, e.g., via a C program with a system call: open()
- open() returns a file descriptor. Reserved fds: stdin 0, stdout, 1, stderr 2
- After getting the file descriptor of file, start read()
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26fd = open("/foo/bar", O_RDONLY)
- Need to locate inode of the file "/foo/bar"
- Assume inumber of root, say 2, is known (e.g., when the file system is mounted)
step1: read inode and content of / (2 reads)
- Look for "foo" in / -> foo's inumber
step2: read inode and content of /foo (2 reads)
- Look for "bar" in /foo -> bar's inumber
step3: read inode of /foo/bar (1 read)
– Permission check + allocate file descriptor
open-file table per process(维护这些进程,系统有一个 open-file table)
read(fd, buffer, size)
– Note fd is maintained in per-process open-file table
– Table translates fd -> inumber of file
step1: consult bar's inode to locate a block
step2: read the block
step3: update inode with newest file access time
step4: update open-file table with new offset
step5: repeat above steps until done(with reading data of given size)
I/O cost for open(): 5 reads
I/O cost for reading a block: 2 reads + 1 write
Write
1 |
|
本博客所有文章除特别声明外,均采用 CC BY-SA 4.0 协议 ,转载请注明出处!