Sunday, March 28, 2010

Code Camp 2010 decompression

Today while reviewing my notes from Code Camp Orlando 2010, I kept thinking about how most of the presenters kept encouraging IT professionals to blog and tweet. Andy Warren (sqlandy.com) has spoken to me many times in the past about the power of blogging and social networking but never took either all that seriously. In one of the last sessions for the day, presented by Andy, he had asked the room our reasons for not blogging. The only reason I could come up with for why I did not blog is a lack of content to share. I have posted a few items here and there that I felt were useful and informative, or things I felt like sharing so others would have a reference later on, but I don't encounter those situations on a daily basis. I am torn between posting useful content (quality), over a quantity of posts.

How do you, or the authors of blogs you read, provide useful blog content between those "blow your mind" posts that Paul Randal puts out oh so frequently?

Friday, February 12, 2010

Gaps in identites

Holes in the values of an identity column can cause data discrepencies in come cases as i have found in the past. Currently I am working on an issue with a 20,000 record table, with a current identity value above 180,000. To resolve this issue, I first need to know where the gaps in my data are coming from. The script below will display all of the gaps in my data to allow me to pinpoint when and where the problem may exist. I thought I would share this with the world.

WITH Gaps (ID,RecordNumber)
AS ( SELECT IdentityColumn AS ID,
ROW_NUMBER() OVER(ORDER BY IdentityColumn) AS RecordNumber
FROM dbo.Table )

SELECT g2.ID, g2.ID - g1.ID Gap_Afterwards
FROM Gaps g1
JOIN Gaps g2
ON g2.RecordNumber - 1 = g1.RecordNumber
WHERE g2.ID - g1.ID > 1
ORDER BY g2.ID - g1.ID

Monday, February 1, 2010

Today I ran into an issue with my automated index maintenance. It was trying to rebuild an index that I had recently removed. This prompted me to add a job to update the list of indexes that I have i my database. This job would do three tasks. The first would be to add any new indexes that I had created. It will look at the indexes i am currently maintaining (by referencing dbo.IndexMaintenance) and compare that to a list of indexes from a set of the currently existing indexes in my database. It will add a record into dbo.IndexMaintenance for each new index. After this, It will disable the maintenance for any removed indexes, as well as re-enable the maintenance on any indexes that are currently present, yet have been previously disabled. Please check it out and comment with any suggestions or thoughts.


CREATE PROC dbo.Job_UpdateIndexMaintenance
AS

-- same script we used to populate the IndexMaintenance table to begin with.
-- We will use this to snapshot what we should be maintaining now
SELECT DISTINCT schema_name(st.schema_id) SchemaName, -- schema name
object_name(si.object_id) [TableName], --Table Name
si.name IndexName, -- Index Name
0 OptionStatisticsNorecompute, -- Statistics Norecompute
1 OptionAllowRowLocks, -- Allow Row Locks
1 OptionAllowPageLocks, -- Allow Page Locks
0 OptionSortInTempDB, -- Sort In TempDB
CASE sum(CASE WHEN sc.system_type_id = 35 THEN 1 -- text
WHEN sc.system_type_id = 99 THEN 1 -- ntext
WHEN sc.system_type_id = 34 THEN 1 -- image
WHEN sc.system_type_id = 241 THEN 1 -- xml
WHEN sc.system_type_id = 165 and sc.max_length = -1 THEN 1 -- varbinary
WHEN sc.system_type_id = 167 and sc.max_length = -1 THEN 1 -- varchar
WHEN sc.system_type_id = 231 and sc.max_length = -1 THEN 1 -- nvarchar
ELSE 0
END) WHEN 0 THEN 1 else 0 end OptionOnlineRebuild, -- Online Rebuilds
1 KeepMaintained, -- Keep Maintained
CASE si.fill_factor WHEN 0 THEN 100 ELSE si.fill_factor END Fill_Factor-- Fill Factor
into #IndexMaintenance
FROM sys.indexes si
JOIN sys.tables st
ON st.object_id = si.object_id
JOIN sys.columns sc
ON si.object_id = sc.object_id
WHERE si.type_desc <> 'HEAP' -- we do not want heap data
GROUP BY st.schema_id, si.name,si.object_id,si.fill_factor

GO
-- Locate new indexes and insert them
INSERT INTO [IndexMaintenance].[dbo].[IndexMaintenance]
([SchemaName]
,[TableName]
,[IndexName]
,OptionStatisticsNorecompute
,[OptionAllowRowLocks]
,[OptionAllowPageLocks]
,OptionSortInTempdb
,[OptionOnlineRebuild]
,[KeepMaintained]
,[Fill_Factor])
SELECT im.SchemaName
,im.TableName
,im.IndexName
,im.OptionStatisticsNorecompute
,im.OptionAllowRowLocks
,im.OptionAllowPageLocks
,im.OptionSortInTempdb
,im.OptionOnlineRebuild
,im.KeepMaintained
,im.Fill_Factor
FROM #IndexMaintenance im
WHERE NOT EXISTS (SELECT 1
FROM dbo.IndexMaintenance im2
WHERE im2.SchemaName= im.SchemaName
AND im2.TableName = im.TableName
AND im2.IndexName = im.IndexName)

--Disable Old Indexes
UPDATE im
SET KeepMaintained = 0
FROM dbo.IndexMaintenance im
WHERE NOT EXISTS (SELECT 1
FROM #IndexMaintenance im2
WHERE im2.SchemaName= im.SchemaName
AND im2.TableName = im.TableName
AND im2.IndexName = im.IndexName)
AND im.KeepMaintained = 1

--Enable Old Indexes
UPDATE im
SET KeepMaintained = 1
FROM dbo.IndexMaintenance im
WHERE EXISTS (SELECT 1
FROM #IndexMaintenance im2
WHERE im2.SchemaName= im.SchemaName
AND im2.TableName = im.TableName
AND im2.IndexName = im.IndexName)
AND im.KeepMaintained = 0

Thursday, January 14, 2010

DCM and Transaction Rollbacks

Over the past week, I performed a large update on a 16 million row table within a test database. This update did nothing more than flipped a bit on the table from 0 to 1 but due to its size was an intensive process. Being that I was testing, I had this wrapped in a transaction and rolled it back. Thinking nothing was affected I went on with my work for the day.

In the morning I came in to work to find what was generally a 100mb differential backup file exceeding 14gb. This caught me completely off guard since i know i did not make any significant data changes, so I began to really dig. I knew the only thing that i had done that could have affected 14gb of my 33gb test database was a massive operation on its largest table. "But I rolled that transaction back, nothing changed!" I thought to myself.

As you may or may not know, there are Differential Changed Map (DCM) pages that contain bits corresponding to each extent in the database and whether or not they have changed since the last full backup. When data is inserted or updated on a page, the DCM bit for the extent that the page exists on is flipped from 0 to 1 and only reverted to 0 upon a full backup.

Knowing this, I decided to try a few tests.

I created a database with a single table, and populated that table with some data by using the following script

CREATE DATABASE DiffBackupTest
GO
USE DiffBackupTest
GO

CREATE TABLE dbo.TabA (ID INT IDENTITY(1,1),
ColA UNIQUEIDENTIFIER,
Row ROWVERSION)
GO

INSERT dbo.TabA (ColA) VALUES (NEWID())
GO 100000

Now I have a database with some data in it, so to reset the DCM, I preform a full backup.

BACKUP DATABASE [DiffBackupTest]
TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Backup\DiffBackupTest.bak'

Processed 648 pages for database 'DiffBackupTest', file 'DiffBackupTest' on file 10.
Processed 2 pages for database 'DiffBackupTest', file 'DiffBackupTest_log' on file 10.
BACKUP DATABASE successfully processed 650 pages in 0.392 seconds (12.939 MB/sec).

This is followed by a differential backup.

BACKUP DATABASE [DiffBackupTest]
TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Backup\DiffBackupTest.bak' WITH DIFFERENTIAL

Processed 56 pages for database 'DiffBackupTest', file 'DiffBackupTest' on file 11.
Processed 1 pages for database 'DiffBackupTest', file 'DiffBackupTest_log' on file 11.
BACKUP DATABASE WITH DIFFERENTIAL successfully processed 49 pages in 0.026 seconds (14.704 MB/sec).

Note that 56 pages were backed up on the differential backups.

Now since we know that 56 pages, or 7 extents are backed up in this differential backup I am preforming an update within a transaction that I will then roll back.

SET NOCOUNT ON
SELECT TOP 10 ID,row FROM dbo.TabA ORDER BY ID DESC
BEGIN TRAN
UPDATE dbo.TabA
SET ColA = NEWID()
ROLLBACK TRAN
SELECT TOP 10 ID,row FROM dbo.TabA ORDER BY ID DESC

After the rollback, the data still matches what was inserted before the backups since we see the rowversion has not changed. At this time, we will preform another differential backup and look at the number of pages backed up.

BACKUP DATABASE [DiffBackupTest]
TO DISK = N'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\Backup\DiffBackupTest.bak' WITH DIFFERENTIAL

This time we have 552 pages, or 69 extents, backed up even though no row changes were committed.

This shows that even if changes are rolled back, SQL Server does not rollback the change to the DCM pages. This caught me off guard on our test database, and has made me more cautious when doing large data changes in databases with differential backups scheduled. From here on out, I will be sure to preform full backups after any significant data changes, I sure have learned my lesson.

Tuesday, January 12, 2010

Automated Index Maintenance

Click here to download my code samples from the Jan. 12 OPass meeting.

Far too often the IT department has a server administrator, a software developer, or even a member of management playing the role of, what Paul Randal would call, the accidental DBA. This role often has little to no database administration experience. They might know how to create a backup maintenance plan through a wizard, create logins and databases to allow the rest of the team to function. While in this role, one of the most frequently forgotten aspects of database administration is that of proper indexing and maintenance.

How can an administrator properly maintain existing indexes without existing knowledge of the indexing strategies in place and application usage? Using system views and DMVs, one can eaisly inventory current indexes and implement a system to intellegently maintain them.

Why not use maintenance plans?
SQL Server maintenance plans can be created to manage indexes but it has many limitations. The largest limitation is the lack of logging. When using a maintenance plan, there is no way to monitor the fragmentation before maintenance, and the time each index took to process. Another limitation is found when you wish to reorganize or rebuild only those indexes that excede a fragmentation threshold. These maintenance plans can be quick to set up but lack critical settings when dealing with high transaction count database systems.

How to automate the maintenance of indexes
In the system described here, we will set up automated index maintenance on the database and all of its indexes. Instead of running the query ALTER INDEX IDX_TableA_Index1 ON dbo.TableA REORGANIZE for each index in the database we will perform this based on data in a table. We create tables by running the following scripts.

1_dbo.IndexLog.sql
1_dbo.IndexMaintenance.sql
1_dbo.IndexStatus.sql

After creating the tables, Run the simple data population script located in 3_PopulateIndexStatus.sql.

At this point, we have three tables set up titled dbo.IndexLog which holds logging information about the index rebuild and reorganization procedures, dbo.IndexMaintenance which contains a list of all indexes to be maintained and their options, and dbo.IndexStatus which is used as a lookup used by the table dbo.InsertLog.

The core of the data used in the process lies within the dbo.IndexMatintnace table. This table contains columns that will be used by future jobs. In the table below you will find the purpose of each column.

SchemaName - The schema the table is contained in
TableName - The table that index belongs to
IndexName - The index to be maintained
OptionStatisticsNorecompute - Sets the STATISTICS_NORECOMPUTE flag to ON during an index rebuild
OptionAllowRowLocks - Sets the ALLOW_ROW_LOCKS flag to ON during an index rebuild
OptionAllowPageLocks - Sets the ALLOW_PAGE_LOCKS flag to ON during an index rebuild
OptionSortInTempdbSets the SORT_IN_TEMPDB flag to ON during an index rebuild
OptionOnlineRebuild - Sets the ONLINE flag to ON during an index rebuild
KeepMaintained - Determines whether or not the automated process should include this index.
Fill_Factor - Sets the FILLFACTOR flag to the specified value during an index rebuild

Next step is to populate the dbo.IndexMatintnace table with all of the indexes to be maintained. This is a simple task by use of the system object views sys.indexes, sys.columns, and sys.tables, which are all, provided by SQL Server. The query located in the script file 3_PopulateIndexMaintenance.sql will insert all data required by the automated index maintenance process into this table. Please note that the options for Statistics Norecompute, Allow Row Locks, Allow Page Locks, and Sort in TempDB are all defaulted whereas the option for Online Index rebuild is determined by the presence of LOB data on the table the index is placed on.

Now, with the dbo.IndexMaintnance and the dbo.IndexStatus tables both populated, it is time to create the jobs to rebuild and reorganize all indexes within the database. To do this, we will need to add stored procedures located in 2_dbo.job_RebuildAllIndexes.sql and 2_dbo.job_ReorganizeIndexes.

As you can see, the stored procedure dbo.job_RebuildAllIndexes has parameters that allow you to run the proc in such a way to perform only online rebuilds, offline rebuilds, or both. This allows you to schedule jobs to do online rebuilds during times not included in a maintenance window. The proc dbo.job_ReorganizeIndexes will perform a reorganize on all indexes that are marked to be kept maintained and meet the thresholds defined in the proc parameters.

At this point you have created and populated the tables dbo.IndexLog, dbo.IndexMaintenance, and dbo.IndexStatus as well as the user stored procedures dbo.job_ReorganizeIndexes and dbo.job_RebuildAllIndexes and have been created. At this point all we have left to do is create jobs to call these stored procedures on a schedule.

At this time everything is setup and ready to run. By invoking this reorganize job, it will run a reorganize on all table based on the fragmentation and pagecount threshold.

Try this out on some of your development systems and let me know your thoughts. This is still very much a work in progress so feel free to share your experiences with the rest of the community.