Database Management Systems - Chapter 5. Introduction to Transaction Processing - Võ Thị Ngọc Châu
5.1. Introduction to Transaction Processing
5.2. Transaction and System Concepts
5.3. Desirable Properties of Transactions
5.4. Characterizing Schedules based on
Recoverability
5.5. Characterizing Schedules based on
Serializability
5.6. Transaction Support in SQL
5.2. Transaction and System Concepts
5.3. Desirable Properties of Transactions
5.4. Characterizing Schedules based on
Recoverability
5.5. Characterizing Schedules based on
Serializability
5.6. Transaction Support in SQL
Bạn đang xem 20 trang mẫu của tài liệu "Database Management Systems - Chapter 5. Introduction to Transaction Processing - Võ Thị Ngọc Châu", để tải tài liệu gốc về máy hãy click vào nút Download ở trên.
File đính kèm:
- database_management_systems_chapter_5_introduction_to_transa.pdf
Nội dung text: Database Management Systems - Chapter 5. Introduction to Transaction Processing - Võ Thị Ngọc Châu
- 5.1. Introduction to Transaction Processing The phantom problem A transaction T1 may read a set of rows from a table, perhaps based on some condition specified in the SQL WHERE-clause. Now suppose that a transaction T2 inserts a new row that also satisfies the WHERE-clause condition used in T1, into the table used by T1. If T1 is repeated, then T1 will see a phantom, a row that previously did not exist. 18
- 5.1. Introduction to Transaction Processing Several possible reasons for a transaction to fail in the middle of execution: 1. A computer failure (system crash) 2. A transaction or system error 3. Local errors or exception conditions detected by the transaction 4. Concurrency control enforcement 5. Disk failure 6. Physical problems and catastrophes Recovery and backup 20
- 5.1. Introduction to Transaction Processing Why recovery is needed: 3. Local errors or exception conditions detected by the transaction: - certain conditions necessitate cancellation of the transaction. For example, data for the transaction may not be found. A condition, such as insufficient account balance in a banking database, may cause a transaction, such as a fund withdrawal from that account, to be canceled. - a programmed abort in the transaction causes it to fail. 5. Concurrency control enforcement: The concurrency control method may decide to abort the transaction, to be restarted later, because it violates serializability or because several transactions are in a state of deadlock. 22
- 5.2. Transaction and System Concepts A transaction is an atomic (logical) unit of work that is either completed in its entirety or not done at all. Transaction states: Active state Partially committed state Committed state Failed state Terminated State 24
- 5.2. Transaction and System Concepts Transaction states Active state: a transaction goes into an active state immediately after it starts execution, where it can issue READ and WRITE operations. Partially committed: when the transaction ends, some recovery protocols need to ensure that a system failure will not result in an inability to record the changes of the transaction permanently (usually by recording changes in the system log). Committed: once this check is successful, the transaction is said to have reached its commit point. Once a transaction is committed, it has concluded its execution successfully and all its changes must be recorded permanently in the database. Failed: if one of the checks fails or if the transaction is aborted during its active state. The transaction may then have to be rolled back to undo the effect of its WRITE operations on the database. Failed or aborted transactions may be restarted later - either automatically or after being resubmitted by the user-as brand new transactions. Terminated: the transaction leaving the system. The transaction information that is maintained in system tables while the transaction has been running is removed when the transaction terminates. 26
- 5.2. Transaction and System Concepts For recovery purposes, the system needs to keep track of when the transaction starts, terminates, and commits or aborts. commit_transaction: This signals a successful end of the transaction so that any changes (updates) executed by the transaction can be safely committed to the database and will not be undone. rollback (or abort): This signals that the transaction has ended unsuccessfully, so that any changes or effects that the transaction may have applied to the database must be undone. 28
- 5.2. Transaction and System Concepts The System Log Log or Journal : The log keeps track of all transaction operations that affect the values of database items. This information may be needed to permit recovery from transaction failures. The log is kept on disk, so it is not affected by any type of failure except for disk or catastrophic failure. In addition, the log is periodically backed up to archival storage (tape) to guard against such catastrophic failures. T in the following discussion refers to a unique transaction-id that is generated automatically by the system and is used to identify each transaction. 30
- 5.2. Transaction and System Concepts The System Log Protocols for recovery that avoid cascading rollbacks do not require that READ operations be written to the system log, whereas other protocols require these entries for recovery. Cascading rollback is the case such that the rolling-back of transaction T1 makes transaction T2 rollback and so on. Strict protocols require simpler WRITE entries that do not include new_value. Strict implies no cascading rollbacks and further no committed values used in any read/write operation. 32
- 5.2. Transaction and System Concepts The System Log Given the following executions, write the content of the system log. Supposed that there are originally X = 10 and Y = 5 in the database. T1 T2 Cascading rollback: read_item(X); T1 aborts and rollbacks. X := X+2; Then, T2 aborts and rollbacks. write_item(X); read_item(X); read_item(Y); Y := Y+2; write_item(Y); abort; X := X-1; write_item(X); abort; 34
- 5.2. Transaction and System Concepts The System Log Given the following executions, write the content of the system log. Supposed that there are originally X = 10 and Y = 5 in the database. T1 T2 //strict: read & write committed read_item(X); data items read_item(X); //no record for read operations X := X+2; //no new value in write records write_item(X); [start_transaction, T1] read_item(Y); [start_transaction, T2] Y := Y+2; [write_item, T1, X, 10] write_item(Y); [write_item, T1, Y, 5] commit; [commit, T1] X := X-1; [write_item, T2, X, 10] write_item(X); [commit, T2] commit; Log content 36
- 5.2. Transaction and System Concepts Commit Point of a Transaction Definition: A transaction T reaches its commit point when all its operations that access the database have been executed successfully and the effect of all the transaction operations on the database has been recorded in the log. Beyond the commit point, the transaction is said to be committed, and its effect is assumed to be permanently recorded in the database. The transaction then writes an entry [commit,T] into the log. Roll Back of transactions: Needed for transactions that have a [start_transaction,T] entry into the log but no commit entry [commit,T] into the log. 38
- 5.3. Desirable Properties of Transactions Transactions possess the ACID properties enforced by the concurrency control and recovery methods. Atomicity A transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all. Consistency preservation A transaction is consistency preserving if its complete execution take(s) the database from one consistent state to another. Isolation A transaction should appear as though it is being executed in isolation from other transactions. That is, the execution of a transaction should not be interfered with by any other transactions executing concurrently. Durability (or permanency) The changes applied to the database by a committed transaction must persist in the database. These changes must not be lost because of any failure. 40
- 5.4. Characterizing Schedules based on Recoverability Constraint on schedules Consider a schedule (or history) S of n transactions T1, T2, , Tn For each transaction Ti that participates in S, the operations of Ti in S must appear in the same order in which they occur in Ti. However, operations from other transactions Tj can be interleaved with the operations of Ti in S. 42
- 5.4. Characterizing Schedules based on Recoverability write (w) operation of Schedule Sa transaction T T1 T2 1 read_item(X); Shorthand notation: X:=X-N; b = begin_transaction, read_item(X); r = read_item, X:=X+M; w = write_item, write_item(X); e = end_transaction, read_item(Y); write_item(X); c = commit, TIME Y := Y+N; a = abort. write_item(Y); 44
- 5.4. Characterizing Schedules based on Recoverability Recoverable schedules Recoverable schedule Cascadeless schedule (to avoid cascading rollback) Strict schedule Nonrecoverable schedules Nonrecoverable schedule: once a transaction T is committed, T is asked to be rolled-back. Sc is not recoverable. Why? 46
- 5.4. Characterizing Schedules based on Recoverability Recoverable schedules Recoverable schedule Cascadeless schedule (to avoid cascading rollback): if every transaction in the schedule reads only items that were written by committed transactions. Se is recoverable; but not cascadeless. Why? Sf: r1(X); w1(X); r1(Y); w1(Y); c1; r2(X); w2(X); c2; Sf is cascadeless. Why? Strict schedule Nonrecoverable schedules 48
- 5.4. Characterizing Schedules based on Recoverability Recoverable schedules Recoverable schedule: once a transaction T is committed, it should never be necessary to roll back T. No transaction T in a schedule commits until all transactions T’ that have written an item that T reads have committed. Cascadeless schedule (to avoid cascading rollback): if every transaction in the schedule reads only items that were written by committed transactions. Strict schedule: transactions can neither read nor write an item X until the last transaction that wrote X has committed (or aborted). Nonrecoverable schedules Nonrecoverable schedule: once a transaction T is committed, T is asked to be rolled-back. 50
- 5.4. Characterizing Schedules based on Recoverability Recoverable Nonrecoverable schedules schedules Cascadeless schedules Strict schedules Relationships between schedules based on recoverability All strict schedules are cascadeless. All cascadeless schedules are recoverable. 52
- 5.5. Characterizing Schedules based on Serializability Figure 20.5. (a) Serial schedule A: T1 followed by T2. (b) Serial schedule B: T2 followed by T1. [1], pp. 764 Originally in DB: X = 5; Y = 10 Constants: N = 1; M = 2 Schedule A: X = 6; Y = 11 Schedule B: X = 6; Y = 11 54
- 5.5. Characterizing Schedules based on Serializability The concept of serializability of schedules is used to identify which schedules are correct when transaction executions have interleaving of their operations in the schedules. Serial schedules The operations of each transaction are executed consecutively, without any interleaved operations from the other transaction. Nonserial schedules The operations of each transaction are executed in an interleaved fashion with the operations of other transactions. 56
- 5.5. Characterizing Schedules based on Serializability Nonserial schedules Some nonserial schedules give the correct expected result. We would like to determine which of the nonserial schedules always give a correct result and which may give erroneous results. The concept used to characterize schedules in this manner is that of serializability of a schedule. Serializable schedules 58
- 5.5. Characterizing Schedules based on Serializability Equivalence of schedules Result equivalence Conflict equivalence View equivalence 60
- 5.5. Characterizing Schedules based on Serializability Equivalence of schedules Conflict equivalence Case 1: Sa: w1(X) r2(X) Sb: w1(X) r2(X) Sa: w1(X) r2(X) Sb’: r2(X) w1(X) Case 2: Sa: r1(X) w2(X) Sb: r1(X) w2(X) Sa: r1(X) w2(X) Sb’: w2(X) r1(X) Case 3: Sa: w1(X) w2(X) Sb: w1(X) w2(X) Sa: w1(X) w2(X) Sb’: w2(X) w1(X) 62
- 5.5. Characterizing Schedules based on Serializability Equivalence of schedules Conflict equivalence Conflict serializable schedule S Test for conflict serializability of a schedule . The algorithm looks at only the read_item and write_item operations in a schedule to construct a precedence graph (or serialization graph), which is a directed graph G (N, E) that consists of a set of nodes N = {T1, T2, , Tn} and a set of directed edges E = {e1, e2, , em}. . There is one node in the graph for each transaction Ti in the schedule. . Each edge ei in the graph is of the form (Tj Tk), 1 j n, 1 k n, where Tj is the starting node of ei and Tk is the ending node of ei. Such an edge is created if one of the operations in Tj appears in the schedule before some conflicting operation in Tk. 64
- Sa: r1(X); w1(X); r1(Y); w1(Y); r2(X); w2(X) serial Sb: r2(X); w2(X); r1(X); w1(X); r1(Y); w1(Y) serial Sc: r1(X); r2(X); w1(X); r1(Y); w2(X); w1(Y) not serializable 66 Sd: r1(X); w1(X); r2(X); w2(X); r1(Y); w1(Y) conflict serializable
- 5.5. Characterizing Schedules based on Serializability Check conflict-serializability of each following schedule. Determine their equivalent serial schedules. Your turn! S1: w1(Y); w2(Y); w2(X); w1(X); w3(X); S2: r2(A); r1(B); w2(A); r3(A); w1(B); w3(A); r2(B); w2(B); S3: r2(A); r1(B); w2(A); r2(B); r3(A); w1(B); w3(A); w2(B); S4: r1(A); r2(A); r3(B); w1(A); r2(C); r2(B); w2(B); w1(C); S5: r1(A); w1(B); r2(B); w2(C); r3(C); w3(A); S6: w3(A); r1(A); w1(B); r2(B); w2(C); r3(C); S7: r1(A); r2(A); w1(B); w2(B); r1(B); r2(B); w2(C); w1(D); S8: r1(A); r2(A); r1(B); r2(B); r3(A); r4(B); w1(A); w2(B); [3], pp. 890-897 70
- 5.5. Characterizing Schedules based on Serializability Equivalence of schedules Result equivalence Two schedules are called result equivalent if they produce the same final state of the database. . Two different schedules may accidentally produce the same final state. . Result equivalence alone cannot be used to define equivalence of schedules. The safest and most general approach to defining schedule equivalence is not to make any assumption about the types of operations included in the transactions. For two schedules to be equivalent, the operations applied to each data item affected by the schedules should be applied to that item in both schedules in the same order. Conflict equivalence View equivalence 72
- 5.5. Characterizing transaction schedules based on serializability Equivalence of schedules Result equivalence Conflict equivalence View equivalence The idea behind view equivalence is that, as long as each read operation of a transaction reads the result of the same write operation in both schedules, the write operations of each transaction must produce the same results. The read operations are hence said to see the same view in both schedules. Condition 3 ensures that the final write operation on each data item is the same in both schedules, so the database state should be the same at the end of both schedules. View serializable: a schedule S is said to be view serializable if it is view equivalent to a serial schedule. 74
- 5.6. Transaction Support in SQL Characteristics specified by a SET TRANSACTION statement in SQL: Access mode: READ ONLY or READ WRITE. The default is READ WRITE unless the isolation level of READ UNCOMMITTED is specified, in which case READ ONLY is assumed. Diagnostic size n, specifies an integer value n, indicating the number of conditions that can be held simultaneously in the diagnostic area. (Supply user feedback information) 76
- 5.6. Transaction Support in SQL Potential problem with lower isolation levels: Dirty Read: Reading a value that was written by a transaction which failed. Nonrepeatable Read: Allowing another transaction to write a new value between multiple reads of one transaction. A transaction T1 may read a given value from a table. If another transaction T2 later updates that value and T1 reads that value again, T1 will see a different value. Consider that T1 reads the employee salary for Smith. Next, T2 updates the salary for Smith. If T1 reads Smith's salary again, then it will see a different value for Smith's salary. 78
- 5.6. Transaction Support in SQL Sample SQL transaction: EXEC SQL whenever sqlerror go to UNDO; EXEC SQL SET TRANSACTION READ WRITE DIAGNOSTICS SIZE 5 ISOLATION LEVEL SERIALIZABLE; EXEC SQL INSERT INTO EMPLOYEE (FNAME, LNAME, SSN, DNO, SALARY) VALUES ('Robert','Smith','991004321',2,35000); EXEC SQL UPDATE EMPLOYEE SET SALARY = SALARY * 1.1 WHERE DNO = 2; EXEC SQL COMMIT; GOTO THE_END; UNDO: EXEC SQL ROLLBACK; THE_END: 80
- 5.6. Transaction Support in DBMSs with SQL A transaction in Oracle a logical, atomic unit of work with one or more SQL statements Oracle Database assigns every transaction a unique identifier called a transaction ID. All Oracle transactions obey the basic properties of a database transaction, ACID. Statement-level atomicity: a SQL statement is an atomic unit of work and either completely succeeds or completely fails. A single SQL statement executes successfully if the database parses and runs it without error as an atomic unit, as when all rows are changed in a multirow update. If a SQL statement causes an error during execution, then it is not successful and so all effects of the statement are rolled back. This operation is a statement-level rollback. 82 Source: Oracle 19c
- 5.6. Transaction Support in DBMSs with SQL Transaction control statements in Oracle The COMMIT statement ends the current transaction and makes all changes performed in the transaction permanent. COMMIT also erases all savepoints in the transaction and releases transaction locks. The ROLLBACK statement reverses the work done in the current transaction; it causes all data changes since the last COMMIT or ROLLBACK to be discarded. The ROLLBACK TO SAVEPOINT statement undoes the changes since the last savepoint but does not end the entire transaction. The SAVEPOINT statement identifies a point in a transaction to which you can later roll back. 84 Source: Oracle 19c
- Summary Single-user vs. multiuser systems Problems with uncontrolled execution of concurrency in multiuser systems: Lost update, temporary read, incorrect summary, non-repeatable read, phantom Failures & recovery issues & the system log Transaction: a logical unit of database processing (read, write) ACID properties Commit point 86
- Chapter 5: Introduction to Transaction Processing Concepts and Theory 88
- Check for Understandings 5.7. What is a schedule? Give an example. 5.8. Given the following transactions: T1: r1(X); r1(Z); w1(X); c1; T2: r2(Z); r2(Y); w2(Z); w2(Y); c2; T3: r3(X); r3(Y); w3(Y); c3; What are valid schedules? Write their log contents. S1: r1(X); r1(Z); r3(X); r3(Y); w1(X); w3(Y); c1; c3; S2: r2(Z); r3(X); r2(Y); r3(Y); w2(Z); w3(Y); w2(Y); a3; c2; S3: r2(Z); r1(X); r1(Z); w2(Z); w1(X); r2(Y); w2(Y); c2; c1; S4: r1(X); r3(Y); r1(Z); w3(Y); w1(X); c1; r3(X); c3; 90
- Check for Understandings 5.11. What are conflict-serializable schedules? 5.12. What is the conflict-serializable characteristic of each following schedule? Draw their precedence graphs. Determine their equivalent serial schedules. S12: r1(X); r3(X); w1(X); r2(X); w3(X); S13: r1(X); r3(X); w3(X); w1(X); r2(X); S14: r3(X); r2(X); w3(X); r1(X); w1(X); S15: r3(X); r2(X); r1(X); w3(X); w1(X); S16: r1(X); r2(Z); r1(Z); r3(X); r3(Y); w1(X); w3(Y); r2(Y); w2(Z); w2(Y); S17: r1(X); r2(Z); r3(X); r1(Z); r2(Y); r3(Y); w1(X); w2(Z); w3(Y); w2(Y); S18: r2(X); r1(X); r3(Y); w3(Y); r1(Y); w2(X); w1(X); w1(Y); S19: r1(Z); r1(X): r2(Y); w2(Y); r3(X); r2(Z); w1(Z); w1(X); r2(X); w3(X); w2(X); S20: r1(X); w1(X); r2(Y); r2(X); w2(Y); r1(Y); w2(X); r3(Y); w1(Y); r3(X); w3(X); w2(Y); w3(Y); 92