- Files can be structured in any of several ways. Three common possibilities are depicted in Fig. 10.6.
Figure 10.6: Three kinds of files. (a) Byte sequence. (b) Record sequence. (c) Tree.
- Stream of Bytes. The file in Fig. 10.6a is an unstructured sequence of bytes. In effect, the operating system does not know or care what is in the file. All it sees are bytes. Both UNIX and Windows use this approach.
- Records. The first step up in structure is shown in Fig. 10.6b. A file is a sequence of fixed-length records, each with some internal structure.
- Tree of Records. The third kind of file structure is shown in Fig. 10.6c. In this organization, a file consists of a tree of records, not necessarily all the same length, each containing a key field in a fixed position in the record.
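To make the difference between a byte stream and a record sequence concrete, here is a minimal sketch of fixed-length records layered on top of a plain byte stream. The record layout (a 4-byte integer id followed by a 12-byte name) and the helper names `write_records` and `read_record` are assumptions chosen for illustration; they are not taken from any particular system. Because every record has the same length, record number n starts at byte offset n times the record size.

```python
import io
import struct

# Hypothetical record layout (an assumption for this sketch):
# a 4-byte little-endian integer id followed by a 12-byte name field.
RECORD = struct.Struct("<i12s")


def write_records(stream, records):
    """Append (id, name) pairs to the stream as fixed-length records."""
    for rec_id, name in records:
        # struct pads the name with zero bytes up to 12 bytes.
        stream.write(RECORD.pack(rec_id, name.encode("ascii")))


def read_record(stream, n):
    """Return record number n (0-based) by seeking to n * record size."""
    stream.seek(n * RECORD.size)
    rec_id, raw_name = RECORD.unpack(stream.read(RECORD.size))
    return rec_id, raw_name.rstrip(b"\x00").decode("ascii")


# An in-memory byte stream stands in for a file here.
buf = io.BytesIO()
write_records(buf, [(1, "alpha"), (2, "beta"), (3, "gamma")])
second = read_record(buf, 1)
print(second)
```

In this sketch the underlying storage is still just bytes, as in the byte-stream model; the record structure exists only in the program that reads and writes it.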
- Internally, locating an offset within a file can be complicated for the OS, because the logical offset must be translated into a position within a physical disk block.
- Disk systems typically have a well-defined block size determined by the size of a sector. All disk I/O is performed in units of one block (physical record), and all blocks are the same size.
- It is unlikely that the physical record size will exactly match the length of the desired logical record. Packing a number of logical records into physical blocks is a common solution to this problem.
- For example, UNIX defines all files to be simply streams of bytes. Each byte is individually addressable by its offset from the beginning (or end) of the file. In this case, the logical record size is 1 byte. The file system automatically packs and unpacks bytes into physical disk blocks (say, 512 bytes per block) as necessary.
- The file may therefore be considered a sequence of blocks, and all the basic I/O functions operate in terms of blocks.
- Because disk space is always allocated in blocks, some portion of the last block of each file is generally wasted. If each block were 512 bytes, for example, then a file of 1,949 bytes would be allocated four blocks (2,048 bytes); the last 99 bytes would be wasted.
- The waste incurred to keep everything in units of blocks (instead of bytes) is internal fragmentation. All file systems suffer from internal fragmentation; the larger the block size, the greater the internal fragmentation.
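The block arithmetic described above can be sketched in a few lines. This is an illustrative calculation only, assuming the 512-byte block size used in the example; the function names `locate`, `blocks_needed`, and `wasted_bytes` are hypothetical. Note that `locate` yields a logical block number within the file; finding the corresponding physical block depends on the allocation method, which is not modeled here.

```python
BLOCK_SIZE = 512  # bytes per block, the example figure used in the text


def locate(offset, block_size=BLOCK_SIZE):
    """Map a byte offset in a file to (logical block number, offset in block)."""
    return divmod(offset, block_size)


def blocks_needed(file_size, block_size=BLOCK_SIZE):
    """Number of whole blocks allocated for file_size bytes (ceiling division)."""
    return -(-file_size // block_size)


def wasted_bytes(file_size, block_size=BLOCK_SIZE):
    """Unused bytes in the last allocated block (internal fragmentation)."""
    return blocks_needed(file_size, block_size) * block_size - file_size


print(locate(1000))         # (1, 488): byte 1000 is byte 488 of block 1
print(blocks_needed(1949))  # 4
print(wasted_bytes(1949))   # 99
```

The last two results reproduce the 1,949-byte example: four blocks (2,048 bytes) are allocated and 99 bytes are unused. Calling `wasted_bytes` with a larger `block_size` shows how the waste for the same file grows with the block size.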
Cem Ozdogan
2011-02-14