Exercise Database Management Systems (CO3021) - Chapter 2: Disk Storage and Basic File Structures

Question 2.1. Describe the memory hierarchy for data storage.
Question 2.2. Distinguish between persistent data and transient data.
Question 2.3. Describe disk parameters when magnetic disks are used for storing large amounts of data.
Question 2.4. Describe double buffering. What kind of time can be saved with this technique?
Question 2.5. Describe the read/write commands with magnetic disks.
Question 2.6. Distinguish between fixed-length records and variable-length records.
Question 2.7. Distinguish between spanned records and unspanned records.
Question 2.8. What is the blocking factor of a file? How is it computed?
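
As a reminder for Question 2.8, here is a minimal sketch of the usual definition (B = block size, R = fixed record size, r = number of records; the numbers in the example are illustrative only, not taken from any exercise below):

```python
from math import ceil

def blocking_factor(B: int, R: int) -> int:
    """Records that fit whole in one block (fixed-length, unspanned): bfr = floor(B / R)."""
    return B // R

def file_blocks(r: int, B: int, R: int) -> int:
    """Blocks needed to store r records, unspanned: b = ceil(r / bfr)."""
    return ceil(r / blocking_factor(B, R))

# Illustrative values only:
assert blocking_factor(512, 100) == 5       # 5 records per 512-byte block
assert file_blocks(30_000, 512, 100) == 6_000
```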

Question 2.16 (continued).
e. Suppose that the disk drive rotates the disk pack at a speed of 2,400 rpm (revolutions per minute). What are the transfer rate (tr) in bytes/msec and the block transfer time (btt) in msec? What is the average rotational delay (rd) in msec? What is the bulk transfer rate?
f. Suppose that the average seek time is 30 msec. How much time does it take (on average), in msec, to locate and transfer a single block, given its block address?
g. Calculate the average time it would take to transfer 20 random blocks, and compare this with the time it would take to transfer 20 consecutive blocks using double buffering to save seek time and rotational delay.

Question 2.17. A file has r = 20,000 STUDENT records of fixed length. Each record has the following fields: Name (30 bytes), Ssn (9 bytes), Address (40 bytes), Phone (10 bytes), Birth_date (8 bytes), Sex (1 byte), Major_dept_code (4 bytes), Minor_dept_code (4 bytes), Class_code (4 bytes, integer), and Degree_program (3 bytes). An additional byte is used as a deletion marker. The file is stored on the disk whose parameters are given in Question 2.16.
a. Calculate the record size R in bytes.
b. Calculate the blocking factor bfr and the number of file blocks b, assuming an unspanned organization.
c. Calculate the average time it takes to find a record by doing a linear search on the file if (i) the file blocks are stored contiguously and double buffering is used; (ii) the file blocks are not stored contiguously.
d. Assuming that the file is ordered by Ssn, calculate the time it takes to search for a record given its Ssn value by doing a binary search.

Question 2.18. Suppose that only 80% of the STUDENT records from Question 2.17 have a value for Phone, 85% for Major_dept_code, 15% for Minor_dept_code, and 90% for Degree_program, and suppose that we use a variable-length record file. Each record has a 1-byte field type for each field included in the record, plus the 1-byte deletion marker and a 1-byte end-of-record marker. Suppose that we use a spanned record organization, where each block has a 5-byte pointer to the next block (this space is not used for record storage).
a. Calculate the average record length R in bytes.
b. Calculate the number of blocks needed for the file.

Question 2.19. Suppose that a disk unit has the following parameters: seek time s = 20 msec; rotational delay rd = 10 msec; block transfer time btt = 1 msec; block size B = 2,400 bytes; interblock gap size G = 600 bytes. An EMPLOYEE file has the following fields: Ssn, 9 bytes; Last_name, 20 bytes; First_name, 20 bytes; Middle_init, 1 byte; Birth_date, 10 bytes; Address, 35 bytes; Phone, 12 bytes; Supervisor_ssn, 9 bytes; Department, 4 bytes; Job_code, 4 bytes; deletion marker, 1 byte. The EMPLOYEE file has r = 2
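
For self-checking, here is a minimal Python sketch of the standard formulas behind Questions 2.16(e-g), 2.17, and 2.18. Only the rotation speed, the seek time, and the record layouts come from the text above; the block size B = 512 bytes and the block transfer time btt = 1 msec are assumptions made purely for illustration, since Question 2.16's full parameter list is not reproduced in this excerpt.

```python
from math import ceil, log2

rpm = 2_400
rev_time = 60_000 / rpm      # one revolution takes 25 msec
rd = rev_time / 2            # average rotational delay = 12.5 msec
s = 30.0                     # average seek time in msec (Question 2.16f)
B = 512                      # block size in bytes -- ASSUMPTION
btt = 1.0                    # block transfer time in msec -- ASSUMPTION

# 2.16e: tr = track_size / rev_time and the bulk transfer rate
# btr = (B / (B + G)) * tr also need the track and interblock gap
# sizes, which this excerpt does not reproduce.

# 2.16f: average time to locate and transfer one block by address
t_block = s + rd + btt

# 2.16g: 20 random blocks vs. 20 consecutive blocks; double buffering
# pays the seek and rotational delay only once for the consecutive run
t_20_random = 20 * (s + rd + btt)
t_20_consecutive = s + rd + 20 * btt

# 2.17a-b: fixed-length record size, blocking factor, file blocks
R = 30 + 9 + 40 + 10 + 8 + 1 + 4 + 4 + 4 + 3 + 1   # fields + deletion marker = 114
r = 20_000
bfr = B // R                 # unspanned blocking factor
b = ceil(r / bfr)            # number of file blocks

# 2.17c: a linear search reads b/2 blocks on average
t_linear_contiguous = s + rd + (b / 2) * btt        # contiguous + double buffering
t_linear_scattered = (b / 2) * (s + rd + btt)       # non-contiguous blocks

# 2.17d: binary search on the file ordered by Ssn
t_binary = ceil(log2(b)) * (s + rd + btt)

# 2.18a: average variable-length record; each optional field is weighted
# by the fraction of records that store it, and every stored field
# carries a 1-byte field type
required = 30 + 9 + 40 + 8 + 1 + 4                  # Name, Ssn, Address, Birth_date, Sex, Class_code
optional = 0.80 * 10 + 0.85 * 4 + 0.15 * 4 + 0.90 * 3
field_types = 6 + (0.80 + 0.85 + 0.15 + 0.90)       # one byte per stored field
R_avg = required + optional + field_types + 1 + 1   # + deletion and end-of-record markers

# 2.18b: spanned organization; 5 bytes per block hold the next-block pointer
b_spanned = ceil(r * R_avg / (B - 5))

print(f"rd={rd}  t_block={t_block}  bfr={bfr}  b={b}  R_avg={R_avg:.1f}  b_spanned={b_spanned}")
```

Under the assumed B, this gives R = 114 bytes, bfr = 4, and b = 5,000 blocks; with the real block size from Question 2.16 the same formulas apply unchanged.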
Question 2.24. Suppose we have a sequential (ordered) file of 100,000 records, where each record is 240 bytes. Assume that B = 2,400 bytes, s = 16 msec, rd = 8.3 msec, and btt = 0.8 msec. Suppose we want to make X independent random record reads from the file. We could make X random block reads, or we could perform one exhaustive read of the entire file looking for those X records. The question is to decide when it would be more efficient to perform one exhaustive read of the entire file rather than X individual random reads. That is, for what value of X is an exhaustive read of the file more efficient than X random reads? Develop this as a function of X.

Question 2.25. Suppose that a static hash file initially has 600 buckets in the primary area and that records are inserted that create an overflow area of 600 buckets. If we reorganize the hash file, we can assume that most of the overflow is eliminated. If the cost of reorganizing the file is the cost of the bucket transfers (reading and writing all of the buckets) and the only periodic file operation is the fetch operation, how many times would we have to perform a (successful) fetch to make the reorganization cost-effective? That is, when are the reorganization cost plus the subsequent search cost less than the search cost before reorganization? Support your answer. Assume s = 16 msec, rd = 8.3 msec, and btt = 1 msec.
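
For Question 2.24, here is a hedged sketch of one common cost model (all parameter values come from the question itself): each random read pays a full seek, rotational delay, and block transfer, while the exhaustive read pays the seek and rotational delay once and then streams every block of the file, assuming the blocks are contiguous and double buffering is used. A different model for the sequential pass would shift the break-even point.

```python
from math import ceil

B, record_size, r = 2_400, 240, 100_000
s, rd, btt = 16.0, 8.3, 0.8

bfr = B // record_size           # 10 records per block
b = ceil(r / bfr)                # 10,000 file blocks

def t_random(X: int) -> float:
    """Cost of X independent random block reads, in msec."""
    return X * (s + rd + btt)

# One sequential pass over the contiguous file with double buffering:
# seek + rotational delay once, then b block transfers.
t_exhaustive = s + rd + b * btt  # = 8,024.3 msec

# Break-even: smallest X for which the random reads cost more
X_break_even = ceil(t_exhaustive / (s + rd + btt))
print(b, t_exhaustive, X_break_even)   # -> 10000, 8024.3, 320
```

Under this model, X random reads cost 25.1 X msec against a fixed 8,024.3 msec for the full scan, so the exhaustive read becomes cheaper once X reaches 320. Question 2.25 can be attacked the same way: express the reorganization cost (reading and writing all buckets) and the per-fetch cost before and after reorganization in terms of s, rd, and btt, then solve for the number of fetches at which the two totals cross.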