14:03:58 Hello. Now we're going to be discussing 14:04:04 ways to set what we're learning about databases in context with the rest of the IT world.
14:04:16 Now, in software engineering, 14:04:23 at least if you took it from me, you learned about things like how important it is 14:04:30 to make the system available, reliable, safe, secure, resilient. 14:04:41 This is important stuff. In this next series I am going to be discussing how these things specifically 14:04:53 interplay with the database world. 14:04:59 Besides software engineering, 14:05:05 it also ties in with networking, security, operating systems, 14:05:17 system administration, and probably any other course you might ever take could tie in with it.
14:05:27 But I'm going to 14:05:38 call this section something like "database needs", and we may have to break it down into parts, of course. 14:05:54 Well. 14:06:09 Anyway. One of the major concepts 14:06:26 deals with transaction processing. That should not be a surprise. That is, you want to do things, 14:06:37 but the transactions 14:06:43 are units. Sometimes they have multiple parts.
14:06:51 Now. 14:06:54 Other ideas we'll talk about, by the way, include recovery, security. You know, I'd better put those in. 14:07:10 Recovery, security, encryption, 14:07:17 storage, 14:07:22 backup. This isn't about SQL, but it is about databases. Anyway. 14:07:32 Hopefully I'll remember that I left those down there.
14:07:41 Now, when you do transactions, 14:07:47 I mean, 14:08:03 you want your transactions to be logical and indivisible. That is, if you're adding something to a database, you don't want to end up adding half of the data and getting interrupted. 14:08:17 And given how complicated some database operations can get, that can be really intense. 14:08:34 There's also a problem, because most databases are not single-user databases. 14:08:51 And how you deal with transactions, as in operating systems,
you know, is very different when you're doing multiple users. 14:09:09 Yeah, and in fact, 14:09:15 multiprogramming, also concurrent programming. 14:09:27 With concurrency, if you have processes going on, even if they aren't technically concurrent, 14:09:35 they can overlap. 14:09:46 And they can be on single or multiple CPUs. All of these things have an effect on how you do databases. 14:09:56 I hereinafter incorporate by reference anything you learned in operating systems, and some of the things you should have learned but didn't.
14:10:05 When you're doing transactions, 14:10:10 you want, of course, to be able to insert, delete, 14:10:19 modify, and retrieve. Those are some of the different kinds of things you might want to do. 14:10:31 And of course, if you're dealing with data, it makes a big difference whether you're doing a read 14:10:44 or whether you're doing a read and write.
14:10:51 Now, when you're dealing with transactions, a lot of 14:10:59 programming languages and a lot of database systems have some way of marking the beginning and ending of a transaction, and some actually use 14:11:13 "begin" and "end". But transactions don't always read 14:11:18 from the same section of disk. 14:11:40 And your operation may get interrupted. As discussed for one of the labs, say you're doing airline reservations. 14:11:42 Just because you've looked up an airline reservation doesn't mean you've actually put yourself in the seat. 14:11:49 So in your database situation you might have a buffer for information in process. 14:12:02 And of course, like we discussed in operating systems, there are 14:12:07 different algorithms for that: first in, first out; least recently used; etc.
14:12:15 You have concurrency. If you took operating systems 14:12:22 from me, you probably remember I did an example something like this: if two people are doing x = x + 1 in parallel, that's actually three machine-language operations: load x, increment x, store x.
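The x = x + 1 example can be simulated deterministically. This is only a sketch, not how a real CPU or database schedules work: the `run` function, the user names "A" and "B", and the two schedules are made up for illustration. Each user has a private register, and the three steps (load, increment, store) are executed in whatever order the schedule lists them.

```python
def run(schedule):
    """Execute (user, step) pairs against one shared x; return the final x."""
    shared = {"x": 0}
    registers = {}  # each user's private copy of x
    for user, step in schedule:
        if step == "load":
            registers[user] = shared["x"]
        elif step == "increment":
            registers[user] += 1
        elif step == "store":
            shared["x"] = registers[user]
    return shared["x"]


STEPS = ["load", "increment", "store"]

# One user finishes all three steps before the other starts.
serial = [("A", s) for s in STEPS] + [("B", s) for s in STEPS]

# Both users load the old value before either one stores.
interleaved = [
    ("A", "load"), ("B", "load"),
    ("A", "increment"), ("B", "increment"),
    ("A", "store"), ("B", "store"),
]

serial_result = run(serial)
interleaved_result = run(interleaved)
print(serial_result, interleaved_result)
```

In the serial schedule both increments survive. In the interleaved schedule both users load the same starting value, so one increment overwrites the other and an update is lost.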
14:12:42 And so if you have a multi-user database, and everybody is trying to grab an airline seat, for example, 14:12:50 you really have to watch for that, so you might need the things we have for indivisible hardware operations, as discussed in operating systems with semaphores.
14:13:04 So. 14:13:13 Here are some of the problems you want to look for, and these operating systems techniques apply. 14:13:26 You want to make sure you do not have a chance of losing an update that you made. 14:13:38 You don't want to read something 14:13:43 incorrectly, you know, if there's a transmission error. And if you change something and someone else tries to read the same thing, you want to make sure they get a consistent version of the data. 14:13:57 Information changes as you're using a database. 14:14:05 So between when you start calculating a query and when you finish calculating the query, sometimes the answer has changed. 14:14:19 So you can get an incorrect summary. 14:14:24 And if you read information from the database and someone else modifies it, 14:14:36 you could have an unrepeatable read. That is, it doesn't give you the information you expected 14:14:46 if you go back to reread it and the information has changed. 14:14:55 Did I mention that databases get complicated when you really get into them?
14:15:04 I would like to say that databases never fail. 14:15:08 If I said that, I would be lying to you. You probably know that. 14:15:17 You could have the computer you're on fail. 14:15:24 You could have the transaction you're trying to do fail. It could fail because you asked it to do something impossible, or it could fail because something's not available because someone else is using the same material at the same time. 14:15:39 Lots of possibilities. 14:15:43 You could have a system error. I wish I had a nickel for every time I tried to do something on the computer and it said "Unexpected error. 14:15:53 Try again later."
14:15:58 You could have a local error on your terminal. 14:16:04 You could have an exception condition. Divide by zero is my all-time favorite exception condition, but there are all kinds of 14:16:13 errors that can occur, some of which are very legitimate errors. It's an error if the data you're looking for isn't available; it's an error if you have 14:16:25 not enough space.
14:16:37 Concurrency enforcement: you've got to protect your data from the other users. That's generally an operating system thing, but sometimes the database management system 14:16:50 takes over some of the operations of an operating system.
14:16:57 We'll discuss 14:17:00 disks a little later in the series. If you remember RAID, the redundant array of inexpensive, or independent, disks, depending on who you're talking to, 14:17:11 there are ways to have more reliability in a system, to have less disk failure.
14:17:23 You could actually have a catastrophe: your server is hit by a hurricane. 14:17:33 I have had computers hit by tornadoes. There are a lot of reasons why a lot of companies like to have off-site backups 14:17:44 and cloud backup.
14:17:48 So how are we going to deal with this kind of stuff? 14:17:52 One common approach 14:17:56 in databases is to have, basically, 14:18:02 a system log. That is, you track 14:18:11 everything that happens. Before you make a change, you write down in the system log, "I am about to make this change." 14:18:19 And a system log 14:18:27 is sequential and append-only. That is, you're not updating the data; you're just recording what changes you are making. 14:18:38 Even in today's modern world, system logs and other archival backups may still be done 14:18:47 with magnetic tape. One of the reasons: 14:18:56 you want your system log to be non-volatile. If you lose power to the system completely, 14:19:03 you want the information to still be there, and that's the case 14:19:09 for magnetic tape. Now, why do you want to do this?
Well, sometimes 14:19:19 you need to roll back your database to an earlier point. 14:19:25 Worst-case scenario, if you start at an earlier save point and move forward from there, you can recreate the database 14:19:34 for as long as you want. But if you have a record of the changes you made, and that implies that you have a record of what the data was before you made the changes, of course, 14:19:50 it's easier to do a rollback. 14:19:56 A lot of times in a database, something goes wrong and you've got to roll back what you did 14:20:05 and then restart. If your transaction cannot finish, 14:20:12 that's what you usually do.
The system log 14:20:20 is also useful for audits. And if you think about how much money is involved in the systems that databases are used for, I think you will realize that the occasional audit is important. 14:20:35 In fact, there is a career called EDP auditing. That is a very useful, 14:20:44 useful career to think about.
Now, as you're doing these transactions, at some point 14:20:57 there's what's called a commit point. That is where what you have done, your transaction, is done and no longer 14:21:11 tentative. 14:21:14 And at least in larger database systems, there's some way to mark that: 14:21:35 you have a begin transaction and an end transaction. And when you've reached the end transaction and it looks like everything is good, 14:21:44 then you commit. 14:21:50 But in the meantime, 14:21:56 sometimes you have to back up, 14:22:00 for whatever reason.
14:22:10 Transactions should generally be atomic, that is, indivisible. 14:22:21 But you need to be able to do recovery. 14:22:32 You also should do consistency preservation. Stop and think about it. If you have a database of information, 14:22:44 let's say the registration database, 14:22:48 and you delete a student, 14:22:52 it isn't enough to just delete the student record, because if you just delete the student record, there are still all of the registration records
14:23:03 that refer to that student, particularly if they drop in the middle of the semester. So you sometimes need to say, okay, if I'm going to withdraw, 14:23:16 remove this object from the database, I have to go out and see everything that this object relates to, and ask whether I need to change or delete 14:23:27 those relations.
14:23:35 The transaction should be, at least in principle, 14:23:40 done in isolation. 14:23:47 And the transactions should be durable. That is, if you do a transaction and it actually commits, 14:23:54 you shouldn't lose that transaction later. There are different levels of isolation. 14:24:03 I won't bother to type that in, but 14:24:09 there's one level where you never overwrite 14:24:15 a dirty read; at another level you're guaranteed that you have never lost an update; at another, no lost updates and no dirty reads; 14:24:25 and at another, reads are repeatable. You know, there are different levels of isolation you can have. 14:24:31 But that keeps the transactions from interfering with each other.
14:24:46 Yeah. 14:24:52 Recovery, or recoverability, 14:24:59 is another really important concept. 14:25:06 In order to have recovery, 14:25:12 you kind of need to have 14:25:25 a total ordering of all transactions. 14:25:32 I mean, it is possible to do a partial ordering. But if you're 14:25:40 keeping a database and you don't know which transaction came first, how do you know which one was done? 14:25:48 But if you do know what order they came in and you're recovering, you simply recover them in order. 14:25:55 So a total ordering of all transactions really does help. But that doesn't mean there aren't conflicts.
14:26:16 You can have different transactions that 14:26:22 access the same data, 14:26:27 and if at least one is a write, 14:26:32 that's a conflict right there. 14:26:46 If you change the ordering of transactions that are in conflict, you can actually change the result.
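The conflict rule just stated (different transactions, same data, at least one write) fits in one line of code. The sketch below is illustrative only: the `Op` record, the `conflicts` function, and the `book_last_seat` helper are hypothetical names, and the seat example assumes a flight with exactly one seat left.

```python
from typing import NamedTuple


class Op(NamedTuple):
    txn: str   # transaction name, e.g. "T1"
    kind: str  # "r" for read, "w" for write
    item: str  # data item touched


def conflicts(a: Op, b: Op) -> bool:
    """Different transactions, same item, and at least one write."""
    return a.txn != b.txn and a.item == b.item and "w" in (a.kind, b.kind)


def book_last_seat(order):
    """Run booking requests in the given order against one remaining seat."""
    seats_left = 1
    winners = []
    for passenger in order:
        if seats_left > 0:
            seats_left -= 1
            winners.append(passenger)
    return winners


read_read = conflicts(Op("T1", "r", "seat"), Op("T2", "r", "seat"))
read_write = conflicts(Op("T1", "r", "seat"), Op("T2", "w", "seat"))
first_order = book_last_seat(["alice", "bob"])
second_order = book_last_seat(["bob", "alice"])
print(read_read, read_write, first_order, second_order)
```

Two reads of the same item never conflict, so their order does not matter. The two bookings both write the seat count, so they conflict, and swapping their order changes who ends up with the seat.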
Changing the order kind of ties into my old x = x + 1 example from operating systems. 14:27:03 But if you change the order in airline reservations, it can make a difference whether you get a seat or you don't get a seat. 14:27:12 And you would be really annoyed if you think you booked your seat, and then the airline reorders things, brings the database up from backup, and suddenly you don't 14:27:23 have a seat.
14:27:28 Okay, so 14:27:34 you can have a complete schedule. Ideally, 14:27:46 if every transaction ends with either 14:27:54 a commit or an abort, 14:28:01 that helps you understand the schedule: either it finished or it didn't finish. 14:28:11 It isn't left hanging up in the air. 14:28:26 Generally speaking, the order in your complete schedule should be the same order in which the operations actually appeared. 14:28:39 If any two conflict, 14:28:43 one must appear before the other. That way you know 14:28:57 which one actually came first.
14:29:05 Now, 14:29:08 even if you do a partial ordering, the partial ordering should have an order for any pair that conflicts. 14:29:18 You can have a partial ordering because, for pairs that don't conflict, it doesn't matter which one came first. A complete schedule 14:29:31 has no 14:29:35 active operations at the end of the schedule. 14:29:45 If there are still things going on, then you don't have a complete schedule.
14:29:54 And you will sometimes 14:30:00 hear the idea of a 14:30:05 committed projection, 14:30:18 which is 14:30:24 only the operations 14:30:34 that belong to committed transactions, 14:30:40 not aborted or still-active ones. Let me try to make that a little intuitive. In my bank account, 14:30:52 if I write a check and it's still on my desk, it doesn't show up. 14:30:57 If it's been deposited 14:31:00 to pay an account, it doesn't show up at the bank yet. 14:31:07 But every day when I log on to my bank account, 14:31:13 the balance it gives me is yesterday's balance, and then it lists pending transactions. Well, depending on how the bank does it,
14:31:27 some places do have current transactions and then pending transactions. The current transactions are the ones that have gone through, but they're today's, so they haven't updated the official 14:31:40 balance at the bank, and a pending transaction is something that is on the way. So I'll say 14:31:59 the committed projection would have current transactions that are completed, but it would not have pending transactions that are not completed.
14:32:09 A classic example of a transaction is the gas station. You've probably all noticed those little signs saying that if you stick your credit card into the gas pump, 14:32:26 the first thing it does is grab a hundred bucks off of your account, temporarily pending 14:32:35 until the transaction finishes, is transmitted to the bank, and is posted. 14:32:45 That can show as pending. That would not be part of a complete schedule. 14:32:58 So if you want to be looking at a complete schedule, 14:33:06 you can't be looking at things in process.
14:33:14 And there are some theoretical discussions: you can talk about conflict serializability and talk about nodes and graphs. 14:33:24 I'm not going to go into details on that. But there are a lot of different issues. 14:33:35 And if you go into the database world in depth, you'll probably want to study databases 14:33:42 a little more, but 14:33:46 you definitely want to 14:33:51 have 14:33:57 awareness of the difficulty. And a lot of the difficulties are more difficult because we have concurrency.
14:34:09 Let's see. 14:34:12 Now, continuing on in the recovery area. 14:34:31 You can always have disasters. I think I mentioned disasters a minute ago. 14:34:39 Often, 14:34:43 a method of dealing with disasters is to have an off-site backup along with your transaction log. 14:34:52 With the log, which is just a 14:34:58 record of the transactions, again, you've got to schedule them properly.
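The ideas of a complete schedule and a committed projection can be sketched in a few lines. Everything here is an assumption made for illustration: a schedule is modeled as a list of (transaction, action, item) tuples, and the function names and the sample transactions T1, T2, T3 are invented. T2 plays the role of the pending gas station hold.

```python
def committed_projection(schedule):
    """Keep only the steps that belong to transactions that committed."""
    committed = {txn for txn, action, _ in schedule if action == "commit"}
    return [step for step in schedule if step[0] in committed]


def is_complete(schedule):
    """True when every transaction in the schedule has committed or aborted."""
    started = {txn for txn, _, _ in schedule}
    finished = {txn for txn, action, _ in schedule
                if action in ("commit", "abort")}
    return started == finished


schedule = [
    ("T1", "write", "balance"),   # posted payment
    ("T2", "write", "gas_hold"),  # still pending, never finishes here
    ("T1", "commit", None),
    ("T3", "read", "balance"),
    ("T3", "abort", None),        # gave up
]

projection = committed_projection(schedule)
print(projection)
print(is_complete(schedule), is_complete(projection))
```

The full schedule is not complete because T2 is still active. The projection keeps only T1's steps, drops both the pending T2 and the aborted T3, and is itself complete.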
Now you can also have 14:35:10 minor disruptions. If you have a minor disruption, often you return to the most 14:35:19 recent consistent state. 14:35:24 By the way, in the operating system world, if you have a problem with your operating system, you roll back 14:35:31 to the previous installation. That's kind of philosophically parallel. You return to the most recent consistent state. 14:35:42 Sometimes you use the backup 14:35:46 to do that. You may need to do some 14:35:52 undoing in order 14:36:01 to deal with it.
And 14:36:07 sometimes, because of the desire to deal with minor disruptions smoothly, 14:36:23 before you update something, you put it in the log, 14:36:32 and then after you commit, 14:36:36 you put it in the database. You never put it in the database until after you're sure, 14:36:43 ideally. 14:36:51 If you do that, you never 14:36:55 need an undo. That's 14:37:04 Oops. 14:37:11 So that would be kind of 14:37:20 a deferred update kind of system. But also, sometimes, 14:37:34 and sometimes it's because you almost never need to correct things, you do an immediate update: 14:37:49 go ahead and put it in the database. But 14:37:53 if you have to do a rollback, you must be 14:38:00 able to do an undo.
14:38:06 And so there are a number of times 14:38:12 when people will take an image of the whole database, or at least of the relevant tables, so that they have that information in case they need to do 14:38:27 undos. Now, we'll 14:38:33 talk more about this in storage, but 14:38:40 it's a very common approach 14:38:43 to save disk blocks in a cache, 14:38:55 which is traditionally the operating system's job, but a database management system must do it. And when you're caching them, you use techniques like a dirty bit to indicate that you've changed a block, 14:39:16 and a pinned bit 14:39:22 to indicate something is currently in use, like in an input or output transaction,
so that it just can't be swapped out, 14:39:34 no way, no how.
14:39:40 Ideally, you can have in-place updating, so that you store the record in the same spot it came from. 14:39:51 Now, if rollback is required, if you need to do an undo, the before image might need to be somewhere. 14:40:02 Maybe it should be in a log.
14:40:18 Shadowing is a term some people use: 14:40:30 keep an extra copy of any record 14:40:43 that's not been committed. That takes a lot of space, 14:40:53 so it's not always 14:40:58 ideal.
14:41:12 It's more commonly done to do write-ahead logging. That is, you write your log to the disk before you make the change, so that if something goes wrong you'll at least know what you were trying to do. 14:41:24 It is very frustrating to have 14:41:29 a system where, well, I know I got interrupted from doing it, but I don't know what I was doing, so, too bad. 14:41:38 It's bad enough when your word processor crashes and you lose what you just typed. You don't want that to happen in the database.
14:41:47 Another concept, and some of these are more terms: 14:42:01 steal versus no-steal. In steal, you can write to disk before 14:42:16 the change is committed, 14:42:21 and then another process can steal that buffer. 14:42:27 And no-steal 14:42:30 is where you can't write before the change is committed, so that means the memory is not available to anyone else until then. And there's a decision to make on each database. 14:42:44 If I am looking at your database record for your registration and you're looking at it in your room, 14:42:56 and I change it but haven't committed the change yet, 14:43:02 and you look at it, do you want to see the old value or the new value? 14:43:11 Or do you want to be told that it's not available right now? 14:43:19 And when the transaction commits, then you force all the writes to the ultimate disk.
Now, some of you have had experience with 14:43:34 collaborative software writing, like on GitHub.
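The write-ahead logging and before-image ideas described above can be sketched with a toy in-memory store. This is a simplification under stated assumptions, not how any particular database management system does it: `TinyDB` and its methods are hypothetical, the log lives in a Python list rather than on non-volatile storage, a before image of `None` is taken to mean the key did not exist, and only one transaction touches a given key at a time.

```python
class TinyDB:
    """Toy store using immediate update plus an append-only log."""

    def __init__(self):
        self.data = {}
        self.log = []  # records are only ever appended

    def write(self, txn, key, value):
        before = self.data.get(key)  # before image, None if key was absent
        # Write-ahead: record the intended change before applying it.
        self.log.append(("write", txn, key, before, value))
        self.data[key] = value

    def commit(self, txn):
        self.log.append(("commit", txn))

    def rollback(self, txn):
        # Undo this transaction's writes, newest first, using before images.
        for record in reversed(list(self.log)):
            if record[0] == "write" and record[1] == txn:
                _, _, key, before, _ = record
                if before is None:
                    self.data.pop(key, None)
                else:
                    self.data[key] = before
        self.log.append(("abort", txn))


db = TinyDB()
db.write("T1", "seat_12A", "alice")
db.commit("T1")
db.write("T2", "seat_12A", "bob")    # overwrites alice, not yet committed
db.write("T2", "seat_12B", "bob")    # brand-new key
db.rollback("T2")
print(db.data)
```

Because the before image was logged ahead of each change, rolling back T2 restores alice's seat and removes the key T2 created. A deferred-update design would instead hold T2's writes back until commit and would not need the undo step.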
And some of the issues that you may have experienced on GitHub are parallel to the issues in databases.
14:43:53 Be aware that if you have to undo, 14:44:05 rollback can be cascading. If you roll back one thing, there may be other transactions that, based on that, went ahead, in which case 14:44:17 you have to roll back the second one, and it may have had things depending on it too. But 14:44:24 if you have a sufficient log, you can recover. Unless you try hard to keep it cascadeless, 14:44:34 it is really difficult. Most, 14:44:40 most databases 14:44:43 avoid 14:44:48 cascading rollback 14:44:53 if at all possible.
14:44:59 Now, just reading does not change the database. So if you read the information, you're all set up. 14:45:10 And once you commit, if someone else comes in after that, well, they'll just see your 14:45:17 data.
14:45:24 Okay. 14:45:32 Okay, now. 14:45:38 That's 14:45:42 just really a little bit of a start here. I'm going to take a break, and then next we'll start talking about some of the security and encryption issues, and then we'll come back to storage and backup 14:45:59 later. I think I'll probably put the outline still in the same file, though. 14:46:10 And.