This Saturday is the fishing opener here in Wisconsin. The DNR designates some time in spring when the only fishing allowed on most inland waters is panfishing. This rule was created to protect the fish, since most species spawn in the spring. I can’t wait to get out there on Saturday and try to catch some opening day fish! This is my first opener owning my own boat, so I can certainly say I’ve never been looking forward to a fishing opener this much. Hopefully my next post will have a picture or two of a beautiful 2015 muskie.
SQL Server blogger Steve Stedman recently started a weekly database corruption challenge. The challenges each involve the simulation of some sort of corruption or database failure, and we are tasked with recovering without losing any data. I missed the first two weeks, but decided to spend a bit of time last Saturday morning working on the third challenge. Here is the scenario Steve set up:
Here is the timeline as things occurred
At 2:53pm a full backup was performed. After that full backup, users continued to do work and the database continued to grow.
At 2:54pm a transaction log backup was performed. Followed by more changes to the database.
At 3:01pm another transaction log backup was performed. Followed by more changes to the database.
At 3:12 another transaction log backup was performed. Followed by more changes to the database.
At 3:19 a catastrophic failure occurred causing the operating system running SQL Server to reboot. Let’s blame it on a power surge, or maybe it was just a DBA who accidently bumped the power cord. I like the power surge idea, so that I don’t get blamed for kicking the power cord.
After the server rebooted, several other databases on this SQL Server recovered just fine, however the CorruptionChallenge3 database would not come on line. After examining the SQL Server data directory, it was discovered that the single .mdf file for this database was gone, missing, just not there. The only thing that remained on this SQL Server was the .ldf file for this database.
Steve included five files needed to recover all the data – the last full backup, the three transaction log backups that were taken between the last full backup and the disaster event, and the log file.
After thinking about the problem briefly, I formulated my recovery plan. I would restore the full backup and each of the transaction log backups. This would bring the database current through the time of the last backup. Then I would detach the database and swap out the log file with the one provided by Steve, and take another log backup. Then I could use this new log backup to bring the database online with all transactions accounted for. The first part of my plan went according to plan:
USE [master] RESTORE DATABASE [CorruptionChallenge3] FROM DISK = N'\\server\Gugg\CorruptionChallenge3_Full.bak' WITH FILE = 1, MOVE N'CorruptionChallenge3' TO N'X:\SQLData\CorruptionChallenge3.mdf', MOVE N'CorruptionChallenge3_log' TO N'Y:\SQLLogs\CorruptionChallenge3_log.LDF', NORECOVERY, NOUNLOAD, STATS = 5 RESTORE LOG [CorruptionChallenge3] FROM DISK = N'\\server\Gugg\TransLog_CorruptionChallenge30.trn' WITH FILE = 1, NORECOVERY, NOUNLOAD, STATS = 5 RESTORE LOG [CorruptionChallenge3] FROM DISK = N'\\server\Gugg\TransLog_CorruptionChallenge31.trn' WITH FILE = 1, NORECOVERY, NOUNLOAD, STATS = 5 RESTORE LOG [CorruptionChallenge3] FROM DISK = N'\\server\Gugg\TransLog_CorruptionChallenge32.trn' WITH FILE = 1, RECOVERY, NOUNLOAD, STATS = 5 EXEC MASTER.dbo.sp_detach_db @dbname = N'CorruptionChallenge3' GO
However, when I tried to re-attach the database after replacing the log file, I got the following message:
One or more files do not match the primary file of the database. If you are attempting to attach a database, retry the operation with the correct files. If this is an existing database, the file may be corrupted and should be restored from a backup.
Log file ‘R:\SQLLogs\CorruptionChallenge3_log.LDF’ does not match the primary file. It may be from a different database or the log may have been rebuilt previously.
At this point I remembered some Oracle training I’d done regarding Log Sequence Numbers (LSNs). When connecting a database, the LSN in the log has to match the LSN in the data file. At this point I couldn’t think of what else to do. I could not think of any way to get the data out of the log file without taking a backup of it, and I couldn’t take a backup without first attaching it. I thought about it for another 15 minutes and then turned my attention to other tasks I had to get done that Saturday.
When Steve revealed the answer on Tuesday, I was disappointed by how close I came, but I missed an important detail. In order to take a tail-log backup without a valid data file, you need to set a “dummy” version of the database offline, delete the data file, replace the log file, and the when you turn it back online you’ll get an error message but you will still be able to take a tail-log backup. (See this post by Paul Randal for details.) So I should have done this:
USE [master] GO CREATE DATABASE [CorruptionChallenge3] ON ( NAME = 'CorruptionChallenge3', FILENAME = N'S:\SQLData\CorruptionChallenge3.mdf' ), ( NAME = 'CorruptionChallenge3_log', FILENAME = N'R:\SQLLogs\CorruptionChallenge3_log.LDF' ) GO ALTER DATABASE CorruptionChallenge3 SET OFFLINE; --Delete the mdf file and replace the ldf file ALTER DATABASE CorruptionChallenge3 SET ONLINE
BACKUP LOG CorruptionChallenge3 TO DISK = N'\\server\Gugg\Final_Log_Backup.trn' WITH INIT, NO_TRUNCATE; GO
The key here is to use the NO_TRUNCATE option. With this I am able to take a log backup, and now have one full backup and four log backups, enough to restore the database with no data loss!
I have a busy weekend coming up since it’s the fishing opener on Saturday, but I hope to be able to work on Corruption Challenge #4, coming up on Friday.