Wednesday, September 21, 2011

Does A Log Backup Failure Break The Chain?

We recently ran into a situation where the file share we were using to store full database backups along with transaction log backups ran out of space.  When the transaction log tried to backup we received the following error:

SQL Server Alert System: 'Severity 016' occurred on <Server>

DESCRIPTION:    BackupIoRequest::ReportIoError: write failure on backup device '<filepath>\<filename>.trn'. Operating system error 112 (There is not enough space on the disk.).

Our first thought was that since the transaction log backup failed, our backup chain was broken. So once we cleared space on the file share, we ran a full backup on each of the affected databases.

After some research regarding transaction log LSN's and the Backupset table in the MSDB database, I quickly relearned that the chain had never been broken. Since the Tlog backup failed to complete successfully, no record was written in the Backupset table and the inactive portion of the log was never truncated. You can verify this by looking at the First_LSN and the Last_LSN column in the MSDB..BackupSet table using the following query.

SELECT  backup_start_date
  FROM  msdb..backupset
WHERE   database_name = '<DBName>'   
       and type = 'L'  
ORDER BY backup_start_date

In order for the chain to be unbroken, the First_LSN must equal the previous Tlog backup's Last_LSN. In this scenario, log backups are taken once an hour and failed starting at 11 PM until space was cleared on the file share and the next Tlog backup ran at 6 AM. Once space was cleared, the transaction log backup could finally complete and the inactive portion of the log could be truncated.  We can verify this by looking at the Last_LSN on the 10 PM Tlog backup and correlating that to the First_LSN on the next day's 6 AM Tlog backup.

So even though the space error occurred and the Tlog backup failed, the transactions in the log were preserved until such time that a successful transaction log backup could be completed.

Monday, September 19, 2011

Monitor Backup and Restore Progress Using DMV's

When using SQL Agent jobs or a third party tool to control database backups on a scheduled basis, you may lose the ability to monitor the progress of the backup job. 

For instance, if we executed the following BACKUP command in a T-SQL window, we have the ability to monitor its progress in the messages window.

NAME = N'<Name>',
STATS = 10

But when jobs are executed via the SQL Agent, that information is abstracted away from us since it is happening in the background. Luckily, there are a couple Dynamic Management Views (DMV) that can tell us the BACKUP commands progress as well as the exact command issued.

By executing the following query, we can get an idea of how far along our backup, restore or rollback operation has progressed. Unfortunately, this will not work for queries.

 SELECT er.percent_complete
                ,db_name(er.database_id) AS [DB]
 FROM sys.dm_exec_requests er
         CROSS APPLY sys.dm_exec_sql_text(er.sql_handle) st
 WHERE percent_complete <> 0

Tuesday, September 13, 2011

Help! My Query Is Running Slow! How Execution Plans Can Save The Day.

Have you ever said this or has a developer ever come running to you in a panic saying what happened? How many times have you heard: “We just promoted this small report change and now it’s taking twice as long to run”? Probably more times than you would like to admit. When this happens, where do you start looking for issues? The answer can likely be found in Execution Plans.
Execution Plans can be very useful in debugging why a query is running slowly and can give insight into what SQL Server is doing behind the scenes to return the data.  Depending on the complexity of the query, Execution Plans will show where the data is being pulled from, how much data is being pulled, what manipulations are being performed, and what the cost is for each of those operations. There are many more things that a complex Execution Plan can, and will, show; but for simple, one table queries like the ones below, the Execution Plan is pretty straight forward.

To start, we’ll create a small employee table like so:

        EmployeeID INT IDENTITY(1, 1)
        ,FirstName VARCHAR(50)
       ,MiddleInitial CHAR(1)
       ,LastName VARCHAR(50)
       ,Address1 VARCHAR(50)
       ,Address2 VARCHAR(50)
       ,City VARCHAR(50)
       ,[State] CHAR(2)
       ,Zip CHAR(10)
And add a small nonclustered index like so:
Next, I populated the table with about 500,000 employees, each with unique values. Now, entering in 500,000 employee records isn't really feasible, so I used Red Gate’s SQL Data Generator. After the data has been loaded, we can start looking at Execution Plans.
This exercise assumes that we will be using SQL Server Management Studio (SSMS) to generate and review Execution Plans and that all the statistics on the table are up to date. There are several ways to extract and view Execution Plans, but we will stick with the simpliest which is viewing the actual Execution Plan in SSMS after the query executes. This can be acomplished by keying CTRL+M in the query window. Now, after the query executes, the actual Execution Plan will be shown beside the messages tab.
Now that we have SSMS setup to show the Execution Plan, we can run our query that was running well prior to the modification.
SELECT  FirstName
FROM    Employees
WHERE   LastName = 'Whitehead'
        AND FirstName = 'Aisha'
This query returned 3 rows and generated a nice Index Seek on the nonclustered index that we defined on the FirstName and LastName column.

Now, let’s say that a developer gets a request to return FirstName, LastNane, and MiddleInitial. Using the following query, we can return that information.
SELECT  FirstName
FROM    Employees
WHERE   LastName = 'Whitehead'
        AND FirstName = 'Aisha'

The problem is now, what once was a sub-second query is now taking much longer. Why? All the developer did is was add 1 column to the query. The answer is in the Execution Plan.

What we see now is two added steps, a Key Lookup and Nested Loop Join. In this instance, the Key Lookup is the operator that is causing the issue and you can see this by looking at the actual cost listed under the operator, which is 52% of the entire query. This is because the nonclustered index that we created at the beginning of the project has most, but not all, of the columns the modified query is looking for. In order the satisfy the request, the query processor must go outside the index and back to the clustered index in order to find MiddleInitial and return it as a part of the query. As you can see, this is an expensive operation and one that could be averted if only the nonclustered index included the MiddleInitial column.

For this exercise, we will add the MiddleInitial to the index as a part of the composite index rather than adding it as an Included Column in the index.

IF  EXISTS (SELECT * FROM sys.indexes WHERE object_id = OBJECT_ID(N'[dbo].[Employees]') AND name = N'NCL_Emp_FName_LName')
DROP INDEX [NCL_Emp_FName_LName] ON [dbo].[Employees]

CREATE NONCLUSTERED INDEX [NCL_Emp_FName_LName] ON [dbo].[Employees]
      [FirstName] ASC,
      [LastName] ASC,
      [MiddleInitial] ASC

Now when we go back and run our modified query, we will see the following Execution Plan.


And our query is back to running like it was before… fast.