Database Management Systems - Chapter 2. Disk Storage and Basic File Structures - Võ Thị Ngọc Châu
2.1. Disk Storage
2.2. File Operations
2.3. Unordered Files
2.4. Ordered Files
2.5. Hash Files
2.6. Other File Structures
2.7. Today’s Storage Technologies
2.8. Physical Storage in Today’s DBMSs
2.2. File Operations
2.3. Unordered Files
2.4. Ordered Files
2.5. Hash Files
2.6. Other File Structures
2.7. Today’s Storage Technologies
2.8. Physical Storage in Today’s DBMSs
Bạn đang xem 20 trang mẫu của tài liệu "Database Management Systems - Chapter 2. Disk Storage and Basic File Structures - Võ Thị Ngọc Châu", để tải tài liệu gốc về máy hãy click vào nút Download ở trên.
File đính kèm:
- database_management_systems_chapter_2_disk_storage_and_basic.pdf
Nội dung text: Database Management Systems - Chapter 2. Disk Storage and Basic File Structures - Võ Thị Ngọc Châu
- 2.1. Disk Storage Magnetic disks Transfer of data between main memory and disk takes place in units of disk blocks. A disk is a random access addressable device. The hardware address of a block = a combination of a cylinder number, track number (surface number within the cylinder on which the track is located), and block number (within the track) For a read command, the disk block is copied into the buffer; whereas for a write command, the contents of the buffer are copied into the disk block. 16
- 2.1. Disk Storage Magnetic disks A disk controller, typically embedded in the disk drive, controls the disk drive and interfaces it to the computer system. The controller accepts high-level I/O commands and takes appropriate action to position the arm and causes the read/write action to take place. Locating data on disk is a major bottleneck in database applications. Minimizing the number of block transfers is needed to locate and transfer the required data from disk to main memory. 18
- 2.1. Disk Storage Disk parameters Rotational delay: waiting time for the beginning of the required block to rotate into position under the read/write head once the read/write head is at the correct track rd = (1/2)*(1/p) min = (60*1,000)*(1/2)*(1/p) msec = 30,000/p msec 20
- 2.1. Disk Storage Disk parameters Rewrite time: time for one disk revolution. This is useful in cases when we read a block from the disk into a main memory buffer, update the buffer, and then write the buffer back to the same disk block on which it was stored. In many cases, the time required to update the buffer in main memory is less than the time required for one disk revolution. If we know that the buffer is ready for rewriting, the system can keep the disk heads on the same track, and during the next disk revolution the updated buffer is rewritten back to the disk block. Trw = 2*rd msec = 60,000/p msec 22
- 2.1. Disk Storage Disk parameters Bulk transfer rate: the rate of transferring useful bytes in the data blocks btr = (B/(B+G))*tr bytes/msec 24
- 2.1. Disk Storage Making data access more efficient on disk Buffering of data Proper organization of data on disk Reading data ahead of request Proper scheduling of I/O requests Use of log disks to temporarily hold writes Use of flash memory for recovery purposes 26
- 2.1. Disk Storage Making data access more efficient on disk Double buffering: a technique for more efficiency, using two buffers Double buffering: use of two buffers, A and B, for reading from disk Figure 16.4, pp. 557, [1] 28
- 2.1. Disk storage Making data access more efficient on disk Buffering of data To enable its operation, the buffer manager keeps two types of information on hand about each block (page) . pin-count: the number of used times . dirty-bit: 1 if updated; otherwise, 0 Buffer replacement strategies Least recently used (LRU) Clock policy: a round-robin variant of the LRU policy First-in-first-out (FIFO) Most recently used (MRU) 30
- 2.2. File Operations Fixed-length records The same record type, the same size Variable-length records The same type, variable-length field(s) Special separator characters (such as ? or % or $)— which do not appear in any field value—to terminate variable-length fields The same type, repeating field(s) The same type, optional field(s) Different record types with different sizes 32
- 2.2. File Operations (a) A fixed-length record with six fields and size of 71 bytes. (b) A record with two variable-length fields and three fixed-length fields. (c) A variable-field record with three types of separator characters. Figure 16.5, pp. 562, [1] 34
- 2.2. File Operations Types of record organization. (a) Unspanned. (b) Spanned. Figure 16.6, pp. 563, [1] 36
- 2.2. File Operations Placing file blocks on disk In contiguous allocation, the file blocks are allocated to consecutive disk blocks. Double buffering In linked allocation, each file block contains a pointer to the next file block. A combination of the two allocates clusters (file segments or extents) of consecutive disk blocks, and the clusters are linked. In indexed allocation, one or more index blocks contain pointers to the actual file blocks. 38
- 2.2. File Operations A file organization: the organization of the data of a file into records, blocks, and access structures The way that records and blocks are placed on the storage medium and interlinked The goal of a good file organization is to avoid linear search or full scan of the file and to locate the block that contains a desired record with a minimal number of block transfers. An access method provides a group of operations that can be applied to a file. Static files vs. Dynamic files How frequently is a file updated? 40
- 2.3. Unordered Files Employee’s data ID Name Salary Department Experiences Deletion_marker 12 Peter 2000 D1 E5 0 35 Jane 1000 D3 E2 0 9 Mary 2500 D2 E1 0 10 Smith 1500 D1 E3 0 24 John 1800 D5 E0 0 15 Ann 2000 D2 E2 0 8 Daisy 1900 D4 E3 0 1 Alice 2100 D3 E5 1 7 Brown 2500 D3 E5 0 53 White 2300 D2 E5 0 28 Mike 1600 D1 E1 1 3 Beth 2000 D5 E0 0 36 Lily 2300 D4 E2 0 42
- 2.3. Unordered Files Employee’s data ID Name Salary Department Experiences Deletion_marker 12 Peter 2000 D1 E5 0 35 Jane 1000 D3 E2 0 9 Mary 2500 D2 E1 0 10 Smith 1500 D1 E3 0 24 John 1800 D5 E0 0 15 Ann 2000 D2 E2 0 8 Daisy 1900 D4 E3 0 1 Alice 2100 D3 E5 1 7 Brown 2500 D3 E5 0 53 White 2300 D2 E5 0 28 Mike 1600 D1 E1 1 3 Beth 2000 D5 E0 0 36 Lily 2300 D4 E2 0 ? New employee 13 David 3000 D2 E5 44
- 2.3. Unordered Files Employee’s data ID Name Salary Department Experiences Deletion_marker 12 Peter 2000 D1 E5 0 35 Jane 1000 D3 E2 0 9 Mary 2500 D2 E1 0 10 Smith 1500 D1 E3 0 24 John 1800 D5 E0 0 15 Ann 2000 D2 E2 0 8 Daisy 1900 D4 E3 0 1 Alice 2100 D3 E5 1 7 Brown 2500 D3 E5 0 53 White 2300 D2 E5 0 28 Mike 1600 D1 E1 1 3 Beth 2000 D5 E0 0 36 Lily 2300 D4 E2 0 ? ? Remove details of Ann? 15 Ann 2000 D2 E2 1 46
- 2.3. Unordered Files Unordered files = Heap files = Pile files Inserting a new record is very efficient. The last disk block of the file is copied into a buffer, the new record is added, and the block is then rewritten back to disk. To delete a record, a program must first find its block, copy the block into a buffer, delete the record from the buffer, and finally rewrite the block back to the disk.=> reorganization For deletion, an extra byte or bit, a deletion marker, is stored with each record. A record is deleted by setting the deletion marker to a certain value. => reorganization For modifying a fixed-length record, a program must first find its block, copy the block into a buffer, modify the record from the buffer, and finally rewrite the block back to the disk. Modifying a variable-length record may require deleting the old record and inserting a modified record because the modified record may not fit in its old space on disk. 48
- 2.4. Ordered Files Employee’s data ID Name Salary Department Experiences Deletion_marker 2 Peter 2000 D1 E5 0 5 Jane 1000 D3 E2 0 9 Mary 2500 D2 E1 0 10 Smith 1500 D1 E3 0 14 John 1800 D5 E0 0 15 Ann 2000 D2 E2 0 17 Daisy 1900 D4 E3 0 19 Alice 2100 D3 E5 1 21 Brown 2500 D3 E5 0 23 White 2300 D2 E5 0 28 Mike 1600 D1 E1 1 30 Beth 2000 D5 E0 0 36 Lily 2300 D4 E2 0 Ordered by column ID => ordering field 50
- 2.4. Ordered Files Ordered files = Sorted files = Sequential files Search with a search criterion involving the conditions >, <, ≥, and ≤ on the ordering field is efficient using binary search. At least log2(b) block accesses Search with a search criterion on other non- ordering fields or other search criteria is done with a linear search for random access. At least b/2 block accesses 52
- 2.4. Ordered Files Ordered files = Sorted files = Sequential files Inserting and deleting records are expensive because the records must remain physically ordered. One frequently used insertion method The actual ordered file (called main or master file) for binary search on ordering field values A temporary unordered file (called overflow or transaction file) for inserting new records at the end File reorganization for periodically sorting and merging the overflow file with the master file For record deletion, deletion markers and periodic reorganization are used. 54
- 2.4. Ordered Files Average Access Times for a File of b Blocks under Basic File Organizations Table 16.3, pp. 572, [1] 56
- 2.5. Hash Files The collision problem: a variation of chaining in which a pointer is maintained in each bucket to a linked list of overflow records for the bucket Record pointer = a block address + a relative record position within the block A choice of a good hash function To distribute the records uniformly over the address space to minimize collisions, thus making it possible to locate a record with a given key in a single access To achieve the buckets fully, thus not leaving many unused locations: 70%-90% full 58
- 2.5. Hash Files Search with the equality condition on the hash field is efficient by using the hash function. At least one block access !!! Search with other search conditions on the hash field or with any search conditions on other non-hash fields are not efficient by using a linear search. At least b/2 block accesses Insertion can be done efficiently by using the hash function. Collision must be handled. 60
- 2.5. Hash Files Modifying a specific record’s field value depends on two factors: the search condition to locate that record and the field to be modified. If the search condition is an equality comparison on the hash field, we can locate the record efficiently by using the hash function; otherwise, we must do a linear search. A non-hash field can be modified by changing the record and rewriting it in the same bucket. Modifying the hash field may require the record to be moved to another bucket. delete the old record and then, insert the modified record 62
- 2.5. Hash Files Extendible hashing The access structure, Directory, is built on the binary representation of the hash function result, which is a string of bits, called the hash value of a record. Directory contains the pointers to the buckets. Records are distributed among buckets based on the values of the leading bits in their hash values. The number of the leading bits used in record distribution is the global depth, d, and dynamically determines the maximum number 2d of buckets. Two block accesses for equality search 64
- 2.5. Hash Files K: hash field h(K): hash function d: global depth d': local depth a = h(v) for K=v a: a value in a binary representation, showing the index of the bucket where a record whose K field value is v is going to be stored NO overflow area! 66
- 2.5. Hash Files Extendible hashing - Example Record K h(K) = K mod 32 h(K)B Each bucket can store r1 2657 1 00001 two records. r2 3760 16 10000 r3 4692 20 10100 d’ = local depth r4 4871 7 00111 d = global depth r5 5659 27 11011 r6 1821 29 11101 r7 1074 18 10010 r8 2123 11 01011 r1 (00001) r9 1620 20 10100 r2 (10000) d’ =0 r10 2428 28 11100 r11 3943 7 00111 d = 0 r12 4750 14 01110 r13 6975 31 11111 r14 4983 23 10111 r15 9208 24 11000 68
- 2.5. Hash Files Extendible hashing - Example r1 (00001) r4 (00111) d’ =1 0 1 d = 1 r2 (10000) r3 (10100) d’ =1 Insert r5 (11011) Overflow split the bucket double the directory 70
- 2.5. Hash Files Dynamic hashing The storage of records in buckets for dynamic hashing is similar to extendible hashing. The major difference is in the organization of the directory. Extendible hashing uses the global depth (high-order d bits) for the flat directory and then combines adjacent collapsible buckets into a bucket of local depth d-2. Dynamic hashing maintains a tree-structured directory with two types of nodes: . Internal nodes that have two pointers: the left pointer corresponding to the 0 bit (in the hashed address) and the right pointer corresponding to the 1 bit. . Leaf nodes hold a pointer to the bucket with records. 72
- 2.5. Hash Files Dynamic hashing - Example Record K h(K) = K mod 32 h(K)B Each bucket can store r1 2657 1 00001 two records. r2 3760 16 10000 r3 4692 20 10100 r4 4871 7 00111 r5 5659 27 11011 How to store these r6 1821 29 11101 records with this r7 1074 18 10010 dynamic hashing r8 2123 11 01011 r9 1620 20 10100 scheme? r10 2428 28 11100 r11 3943 7 00111 r12 4750 14 01110 r13 6975 31 11111 r14 4983 23 10111 r15 9208 24 11000 74
- 2.5. Hash Files Linear hashing i n = 0: a hash function hi(K) = K mod 2 M is applied to distribute the records over all the buckets. If any bucket is overflowed, the bucket 0 is split into 0 and M; and its content and the overflow area’s content are re- i+1 distributed by the hash function hi+1(K) = K mod 2 M. After that, n is increased by 1. n > 0: the buckets with hi(K) ≥ n are examined with hi(K) while the others with hi+1(K). If any bucket is overflowed, the bucket n is split into n and M + n; and its content and the overflow area’s content are i+1 re-distributed by the hash function hi+1(K) = K mod 2 M. After that, n is also increased by 1. i+1 If n=M, n is set to 0 and hi+1(K) = K mod 2 M is i+1 used as hi(K) and M is updated to be 2 M. 76
- 2.5. Hash Files Linear hashing n=0 0 1 2 3 r2 : r3 r1: r6 r7 : r4 : r5 Insert r8 (2123 mod 4 = 3) Overflow at bucket 3 r8 is inserted in the overflow chain of bucket 3. Split bucket 0 into bucket 0 and bucket 4 h1(K) = K mod 2*M 78
- 2.5. Hash Files Linear hashing n=1 0 1 2 3 4 r2 : r1: r6 r7 : r4 : r5 r3 : Insert r9 (1620 mod 4 = 0 < n=1) (1620 mod 8 = 4) r8 : 80
- 2.5. Hash Files Linear hashing n=2 0 1 2 3 4 5 r2 : r1: r7 : r4 : r5 r3 : r9 r6 : r8 : r10 : Overflow at bucket 4 r10 is inserted in the overflow chain of bucket 4. Split bucket 1 into bucket 1 and bucket 5 h1(K) = K mod 2*M 82
- 2.5. Hash Files Linear hashing n=3 0 1 2 3 4 5 6 r2 : r1: r7 : r4 : r5 r3 : r9 r6 : : Insert r12 (4750 mod 4 = 2 < n=3) (4750 mod 8 = 6) r8 : r11 r10 : 84
- 2.5. Hash Files Linear hashing n=0 0 1 2 3 4 5 6 7 r2 : r1: r7 : r5 : r8 r3 : r9 r6 : r12 : r4 : r13 r11 : r10 : How about insertion of r14 and r15? 86
- 2.7. Today’s Storage Technologies Data Storage Media Density in storage media Source: Potomac Institute for Policy Studies, The Future of DNA Data Storage, Report, 2018. 88
- 2.7. Today’s Storage Technologies Flash-based Solid-State Disk Drives (SSDs) SSD is a block external storage device that saves data after turning off the power. SSD is different from HDD in the following points: SSD has no mechanical components, thus has the same I/O speed for any block. The average exchange time with an arbitrary HDD block is about 10 ms for both reading and writing, while reading time with an arbitrary SSD block is about 20 microseconds and writing time is about 200 microseconds. SSDs are more expensive than HDDs but getting cheaper. Source: S. Kuznetsov, New Storage Devices and the Future of Database Management, 90 Baltic J. Modern Computing, Jan 2018, doi: 10.22364/bjmc.2018.6.1.01.
- 2.7. Today’s Storage Technologies 92 Source: Potomac Institute for Policy Studies, The Future of DNA Data Storage, Report, 2018.
- 2.7. Today’s Storage Technologies Direct-attached storage (DAS) JBOD RAID Network-attached storage (NAS) Storage area network (SAN) Cloud storage and virtualization 94
- 2.7. Today’s Storage Technologies Direct-attached storage (DAS) JBOD (just-a-bunch-of-disks) the disks are presented to a computer as if they were directly attached internal disks used with the Storage Spaces technology to combine all of the physical disks into a storage pool, and then one or more virtual disks are created Source: MS SQL Server, Storage Technologies Overview, 2016. 96
- 2.7. RAID - Redundant Arrays of Inexpensive/Independent Disks Striping of data across multiple disks. (a) Bit-level striping across four disks. (b) Block-level striping across four disks. Figure 16.13, pp. 585, [1] 98
- 2.7. RAID - Redundant Arrays of Inexpensive/Independent Disks Wikipedia.com 100
- 2.7. RAID - Redundant Arrays of Inexpensive/Independent Disks Other RAIDs RAID 01 RAID 10 RAID 1.5 How many disks can be failed? RAID 0 RAID 1 RAID 3 RAID 5 102
- 2.7. RAID - Redundant Arrays of Inexpensive/Independent Disks Software RAID Hardware-assisted Software RAID Hardware RAID (controller: RAID-on-chip or add-in card) Source: adaptec®, Hardware RAID vs. Software RAID: Which Implementation is Best for 104 my Application? White paper, 2006.
- 2.7. Today’s Storage Technologies Storage area network (SAN) A storage area network is a dedicated network that allows sharing storage. A SAN does not provide file abstraction, only block-level operations. A SAN uses serial attached SCSI, Fibre Channel, or iSCSI protocol over a Fibre Channel or Ethernet network. A SAN can be managed by using Simple Network Management Protocol (SNMP) or a proprietary management protocol. SNMP is based on TCP/IP, and it offers basic alert management. This allows a node in SAN to alert the management system of failures. Source: MS SQL Server, Storage Technologies Overview, 2016. 106
- 2.8. Physical Storage in Today’s DBMSs Oracle 19c Schema Object Storage Storage structure = segment: data segment, index segment Tablespace: a database storage unit that contains objects from different schemas. One tablespace is associated with one or many physical data files. . One segment cannot span multiple tablespaces. . One data segment for one table spans two data files, which are both part of the same tablespace. 108
- 2.8. Physical Storage in Today’s DBMSs Tablespace Segment Extent Oracle Data Block An extent is a set of logically contiguous data blocks Storage in Oracle 19c allocated for storing a specific type of information 110
- 2.8. Physical Storage in Today’s DBMSs An index-organized table in Oracle 19c 112
- Summary Static files vs. Dynamic files File operations File organization vs. Access method Each primary file organization Placing records in blocks and placing blocks on disks Goal: minimize the number of block transfers Further readings Parallelizing disk access using RAID technology Modern storage architectures: storage area networks (SAN), network-attached storage (NAS), 114
- Check for understandings 2.1. Describe the memory hierarchy for data storage. 2.2. Distinguish between persistent data and transient data. 2.3. Describe disk parameters when magnetic disks are used for storing large amounts of data. 2.4. Describe double buffering. What kind of time can be saved with this technique? 2.5. Describe the read/write commands with magnetic disks. 116
- Check for understandings 2.11. Distinguish between static files and dynamic files. 2.12. Compare unordered files, ordered files, and hash files. 2.13. Which operations are more efficient for each file organization: unordered, ordered, hash? Why? 2.14. Distinguish between static hashing and dynamic hashing. 2.15. Compare three dynamic hashing techniques: Extendible Hashing, Dynamic Hashing, and Linear Hashing. 118