Thursday, June 28, 2012

SQL 2012 (Denali) - Paged Query Results

Anyone who has been developing web based or even windows based systems must have gone through the pain of fetching few records at a time from the database.

Suppose your grid view displays 20 records per page. Your option was to either fetch all records on every grid page and every time user changed the page, make another trip and fetch all records again (or save everything in view state / session) - not an ideal solution. Your other option was to get a list of all primary / identity keys and then pass a set of keys to the query to only fetch the records that pertain to those keys. For example, get keys 1 - 20 on page 1, 21 - 40 on page 2 and so on. But what if someone deleted, say record #22 from the database? Another implementation I have seen is where code passes the page number and number of records that each page displays and the stored procedure applies some complex logic to figure out what records to send back. Using ROW_NUMBER() and CTE was a better alternative, but it was still not very intuitive and easy to use.

SQL 2012 introduced Ad-Hoc paging, allowing you to only fetch the appropriate number of records. Two clauses "OFFSET" and "FETCH" make this happen. Fetch clause fetches only the next X number of records you want to retrieve and OFFSET tells the query to start from the record # Y.

For example, suppose I have 1000 users in my table and my grid only shows 10 users at a time. I can pass two parameters to my query and it will return me correct 10 users every time.
CREATE PROCEDURE getPagedRecords

@Page INT = 1,
@RecordsPerPage INT=10

AS
BEGIN
    DECLARE @Offset INT
    SET @OffSet = Page * @RecordsPerPage

    SELECT 
        UserID,
        FirstName,
        LastName,
        EmailAddress 
    FROM Users 
    ORDER BY UserID
    OFFSET @Offset ROWS
    FETCH NEXT @RecordsPerPage ROWS ONLY;
END

You can even use expressions in OFFSET and FETCH Clause such as @Offset - 1 and @RecordsPerPage   - 1.

You don't have to have paging in grid view to make use of this. Often time you may need X number of records from the middle of the result set, where this could be very helpful.

For Fetch to return expected set of records every time, you obviously have to make sure other routines aren't deleting / inserting records in the middle which may cause an unexpected behavior. For example, let's say your OFFSET clause tells the query to start at record # 11, but what if someone deleted a record say at #5 after you retrieved first 10 records? Record number 11 is now what would have been record number 12, thus your query will not return you one record (original record at #11 but now at # 10).

Thank you.

Monday, June 25, 2012

SQL Server 2012 (Denali) Sequence Object

In addition to several enhancements in SQL 2012 - (some of which we will cover in subsequent posts), one of the key feature introduced is a sequence object.

Think of sequence object as an identity field on steroids. An identity field allows you to generate an auto number incremented by whatever increment value you desire (default is one) every time a record is inserted in a table. But, identity field is specific to a table, just like any other field defined in a given table. Additionally, if you delete the previous record, the identity value associated with the deleted record is lost forever until you do something to reseed it. Once the identity field hits the maximum limit (depends on the data type of this field), you've got a problem and you must reseed the table to start over or do something else.

Sequence object can help you mitigate this problem somewhat. First, let's briefly review this object. The Sequence object is created independently of a table and can be used across multiple tables.

For example - I can create a sequence object called OrderNumber, have it start with 1000 and increment by 1.
Create SEQUENCE [dbo].[OrderNumber]
as int
START WITH 1000
INCREMENT BY 1;


Now this sequence object is available to be used in multiple tables.

Let's imagine I have a table called OrderHeader and OrderDetails in my database.
CREATE TABLE [dbo].[OrderHeader](
    [OrderNumber] [int] NOT NULL,
    [OrderDate] [smalldatetime] NOT NULL,
    [StatusID] [int] NOT NULL
) ON [PRIMARY]

CREATE TABLE [dbo].[OrderDetails](
    [OrderNumber] [int] NOT NULL,
    [ItemID] [int] NOT NULL,
    [Quantity] [int] NOT NULL,
    [Price] [money] NOT NULL
) ON [PRIMARY]

Instead of inserting record in OrderHeader and then using SCOPE_IDENTITY() or some other method to get the OrderNumber before inserting the record in OrderDetails table, I can use the sequence "OrderNumber" I created above and insert the same number in both tables.
DECLARE @OrderNumber INT
SELECT @OrderNumber = NEXT VALUE FOR dbo.OrderNumber
INSERT INTO Orders(OrderNumber,OrderDate,StatusID)
VALUE (@OrderNumber,GetDate(),1)

INSERT INTO OrderDetails(OrderNumber,ItemID,Quantity,Price)
VALUE (@OrderNumber,100,10,'2.25')

NEXT VALUE FOR gives you the next available sequence number. Every time you call it, the object will return the next available number, so if you want to use the same number in multiple tables, you have to be careful to call it only once.


Another nice feature of the sequence object is that you can reset it. As in my example above, if you are deleting previous records, then you know your previous order numbers are becoming available. After you have reached fairly large number, you can start over. You obviously have to ensure you don't reset sequence and then try to use the new sequence number which may already exist in your table.

There are two ways to reset the sequence (dropping and recreating is a third way). One, by defining a maximum value for the sequence when you created it and another by altering an already created sequence.
Create SEQUENCE [dbo].[OrderNumber]
as INT
START WITH 1000
INCREMENT BY 1
MINVALUE  1000
MAXVALUE  1000000
CYCLE


The OrderNumber sequence in above example will recycle once it hits 1000000.

ALTER SEQUENCE OrderNumber
 RESTART WITH 1000 ;


As you can see, Sequence object is quite flexible and if used judiciously can serve as a nice tool. Sys objects provide additional methods to get the range of values or to find out the current sequence value etc.

Thank you.

Friday, June 22, 2012

SQL Server - Checking Where an Object is used

As databases grow in size and complexity, often there are dead objects that are left behind. By dead, I mean these objects are no longer being used, but everyone is afraid to drop them for fear of breaking something. SQL Server provides a simple way to determine where an object is being used so you can decide whether it is needed or not. SQL Server will tell you whether your object is being used in a function or stored procedure or a view etc. it obviously can't tell you whether another application is referencing such an object. So, before you delete an object if the query below doesn't return anything, further research may be necessary.

The script below will accept the object name and list all other objects that are using/referencing this object.


    --Declare a variable or alternatively you could create 
    --a stored procedure and pass this as a parameter.
    DECLARE @objectName nvarchar(4000)
    --name of your object.
   SET @ObjectName = 'MyObject'
   SELECT
        S.NAME,
        C.Text

    FROM SysObjects S
    INNER JOIN SysComments C
        ON S.ID = C.ID
    WHERE 
        C.Text LIKE '%' + @ObjectName + '%'

For example, if I run the above script in Northwind database to check where "Categories" table is being used, I get the following result.


Happy Coding!

Thank you.

Saturday, June 16, 2012

SQL Server Pivot Function

In previous post we discussed exporting data from SQL Server into an excel file and how you can pivot the data etc. Pivoting is basically a way for you to convert rows into columns. Starting with 2005, SQL Server introduced PIVOT function which allows you to retrieve columns in tabular format. PIVOT function doesn't work on certain data types such as XML.

In the past, suppose you wanted to group employees by the city they reside in. You can use Group By clause to get the count of the number of employees by city, but such a query will result in each city as one row and you will have to do something extra to convert them into columns. The query below will return the number of employees that reside in three cities with the count for each city appearing in one row.

USE AdventureWorks
GO

SELECT City,COUNT(*) AS [Count] FROM HumanResources.vEmployee 
WHERE City IN ('Monroe','Redmond','Seattle')
Group By City

The result will look like this

CityCount
Monroe14
Redmond21
Seattle44

Suppose you want to pivot this data so that each city appears as a column. You can modify the above query and use PIVOT function.

Use AdventureWorks
GO
SELECT [Monroe],[Redmond],[Seattle]
FROM
(
    SELECT E.City FROM HumanResources.VEmployee E
) C
PIVOT
( 
 COUNT (City)
 FOR City
 IN ([Monroe],[Redmond],[Seattle])
 ) AS PivotTable;

Now the result will look like this
MonroeRedmondSeattle
142144

You may be wondering about missing WHERE clause. Basically FOR City IN () under PIVOT function takes care of filtering your data based on your criteria. This query works but the nested query still returns all rows. You can add a WHERE clause in your nested query to only return rows that satisfy your criteria. Above query can be rewritten as follows...
Use AdventureWorks
GO
SELECT [Monroe],[Redmond],[Seattle]
FROM
(
    SELECT E.City FROM HumanResources.VEmployee E
    WHERE City IN ('Monroe','Redmond','Seattle')
) C
PIVOT
( 
 COUNT (City)
 FOR City
 IN ([Monroe],[Redmond],[Seattle])
 ) AS PivotTable;

PIVOT is a useful function but not very intuitive, as you can see above, it takes a minute to make sense of it. Question for you guys - have you used PIVOT function before and if so, how did you use it?

Thank you.

Thursday, June 14, 2012

Exporting data from SQL Server to Excel

Excel is a very powerful tool and it can come in handy if you just want to build a quick report from your SQL Server database. In this post, we will discuss ways to export data and create a quick employee report.

1. Fire up your Excel and browse to data tab, click on From Other Sources and select From SQL Server.


2. Enter the SQL Server name. If your SQL Server's instance is other than default, you will also need to enter the instance name.


You will have to do this only once. After successfully connecting to your database, you can save the connection string and in the future just click on Existing Connections from the data tab to reuse the connection.

3. After entering the information, click on Next and select the database you want to export data from. In this example, we are going to use AdventureWorks. Select a table or a view to retrieve data from and click Next.

4. This is your chance to save your connection string for any future usage. Rename the connection string to a more appropriate name and click on Finish.

5. You are now ready to export data. You can either export table as it is saved in the database or create a pivot report or a pivot report with chart. In the example, I chose the second option.


6. Properties tab allows you to set certain parameters. Generally defaults work OK for most situations.


7. When you click on OK, Excel lists all the fields that are available in your selected table or view and you can select the ones you want to show up on report. As you select a column, you will notice that the value is added in the spreadsheet. The order you want the data to appear in the report is the order you should select the columns. For example, if you want Last Name to appear before First, select Last Name first and then select First Name column.

8. Close the Fields List and here is your nice little report.

Thank you.

Monday, June 11, 2012

Compressing your data


If deleting data is not an option, SQL Server 2008 and 2012 provide another option to compress individual tables by compressing rows or pages.

Before you compress a particular table, it may be prudent to know which table will provide you the biggest gain. In my previous post I talked about running a script to determine the tables with largest amount of data which can be used to determine which tables to tackle first.

Let's assume you have determined the table(s) to compress, fire up your management studio and then right click on the table > Storage > Manage Compression


Click on Compression and select Compression Type - Row Level Data Compression or Page Level Data Compression.

After you select compression type, you can click on Calculate which will show you how much you will gain.
  • Row Level Data Compression: This compression turns the fixed length data types into variable length data types and frees up empty space. As you can image, a table with more columns of char/nchar data type will yield more space. Row level compression can also ignore zero and null values resulting in additional space savings. 
  • Page Level Data Compression: When you choose this compression type, it starts with row level data compression and then adds additional compression by performing two extra layer of compression...
    • Prefix Compression
      • SQL Server analyzes the repeated pattern at the beginning of the values in each column and substitutes them with an abbreviated reference which is stored in the compression information structure. This type of compression only works on one column.
    • Dictionary Compression
      • It works similarly to prefix compression except that it searches for repeated patterns anywhere in a page and replaces them with an abbreviated reference and stores the reference information in the compression information structure.

How much savings you will gain by the page level compression depends on the patterns of the data. Repeatable patterns will result in more savings, whereas random patterns will result in less savings.

As you can imagine, this will result in performance degradation, but how much and whether it is acceptable will depend on your environment.

Thank you.




Thursday, June 7, 2012

Determining SQL Table Size

You may come across a situation where you need to shrink your database by selectively deleting data from some tables.

It sure would be nice to know which tables are the biggest data hog and see if you can delete records to free up some space and then shrink the database. There is indeed a nice little script that you can use to determine the table size and act accordingly.

CREATE TABLE #tmpTable
(name SYSNAME, 
rows CHAR(11), 
reserved VARCHAR(18),  
data VARCHAR(18), 
index_size VARCHAR(18), 
unused VARCHAR(18)
)  
EXEC sp_msforeachtable 'INSERT INTO #tmpTable EXEC sp_spaceused ''?''' 
SELECT * FROM #tmpTable ORDER BY data desc 
DROP TABLE #tmpTable 

If you are running SQL 2008 (or SQL 2012 enterprise edition, then you can also compress individual tables. We will discuss table compression in next post.

Thank you

Tuesday, June 5, 2012

Checking Foreign Key References

You may come across a situation where you need to check which tables are connected to a particular via a foreign key referential constraint. You may need this information when you are trying to delete one or more records or truncate a table.

Below is a script that you can use to check all the tables that are related to the table in question. In this example I am checking for the table "Employees"

SELECT f.name AS ForeignKey,
OBJECT_NAME(f.parent_object_id) AS TableName,
COL_NAME(fc.parent_object_id,
fc.parent_column_id) AS ColumnName,
OBJECT_NAME (f.referenced_object_id) AS RefTableName,
COL_NAME(fc.referenced_object_id,
fc.referenced_column_id) AS RefColumnName
FROM sys.foreign_keys AS f
INNER JOIN sys.foreign_key_columns AS fc
ON f.OBJECT_ID = fc.constraint_object_id

WHERE OBJECT_NAME (f.referenced_object_id)='Employees'

Below is a result showing the three tables it is connected with.

ForeignKey TableName ColumnName RefTableName RefColumnName
FK_Emp_Addresses EmpAddresses EmployeeID Employees EmployeeID
FK_Emp_Phones EmpPhones EmployeeID Employees EmployeeID
FK_Emp_Attributes EmpAttributes EmployeeID Employees EmployeeID

Thank you.

Monday, June 4, 2012

Change Database Compatibility Level

Recently I came across a database environment that had SQL Server 2005 (don't ask me why), but the databases (there were about 50 databases) were running at SQL Server 2000 compatibility level. After ensuring that nothing will break if I changed a database's compatibility level to 2005, I decided to go ahead and run a script to change all databases' compatibility level to 2005.

Below is a script that you can run in your environment.
DECLARE  @tmpTable TABLE (Name varchar(50),compLevel INT,StateDesc varchar(50))

INSERT INTO @tmpTable(Name,CompLevel,StateDesc)
SELECT Name,Compatibility_level,State_Desc FROM sys.databases 

DECLARE @DBName varchar(50)
DECLARE curDBName CURSOR FOR
    SELECT Name FROM @tmpTable WHERE StateDesc='ONLINE'
OPEN curDBName
FETCH NEXT FROM curDBName INTO @DBName
WHILE @@FETCH_STATUS=0
    BEGIN
        DECLARE @compLevel INT
        SELECT @compLevel=compLevel FROM @tmpTable WHERE Name=@DBName
        If @compLevel=80
            BEGIN
            Exec sp_dbcmptlevel @DBName,90
            END
        FETCH NEXT FROM curDBName INTO @DBName
    END
CLOSE curDBName
DEALLOCATE curDBName

You can change the version to move from 2000 to 2008 or 2005 to 2008, but before you do that, make sure that your database will not break. One way to check would be to create database scripts by logging into Management Studio > Right Click on Database > Tasks > Generate Scripts. Make sure you select all objects i.e. tables, views, stored procedures, functions etc. and then save scripts. Run these scripts to create an empty database in the SQL environment you are upgrading your database to. If there are no errors, there is a good chance that your production database will work fine.

Thank you.



Saturday, June 2, 2012

Creating a Composite Foreign Key

In previous post we discussed various types of primary keys such as natural key, surrogate key and composite key. To recap, a composite primary key is a combination of more than one columns. Ordinarily a single column primary key is preferable but there are situation as we discussed previously when it makes sense to create a composite primary key.

Naturally, if you want to use composite primary key as a foreign key in other tables, you will have to create the columns in the secondary table that corresponds to the primary table. For example, you have a table called Customers and you have a composite primary key "FirstName, LastName" in this table. To use this key as a foreign key in another table say "CustomerAddresses" you will have to create the FirstName and LastName columns in CustomerAddresses table and then add the referential integrity. (I am not saying this is a good way to design your table schema, this is just an example).

Below is the script that you can run to create a foreign key in CustomerAddresses table.
ALTER TABLE dbo.CustomerAddresses
   ADD CONSTRAINT FK_Customer
   FOREIGN KEY(FirstName, LastName)
   REFERENCES dbo.Customers(FirstName, LastName)


You can use the same script to create foreign key constraint on any other columns. For example, if your Customers table has "CustomerID" as primary key, then you would add "CustomerID" column in CustomerAddresses table and then use the script above replacing "FirstName, LastName" with "CustomerID".

Thank you.