Matthew Mouras Development: SQL Server

Showing posts with label SQL Server. Show all posts

Working with XML data in SQL Server

The XML data type and some associated functions were added back in SQL Server 2005. I'm looking forward to digging into this more deeply as I'm told Measurement Inc uses it extensively in their systems. Today I'll be writing some simple scripts to test out some of the functionality.

You can retrieve the results of a query directly into XML using the FOR XML statement. There are a few different modes that format the returned data differently. I tried a number of them below:

The results window displays links to the full XML that was returned with each query.

The full XML returned with the PATH mode.

This is cool stuff, but what if we want to store this XML natively in a table? Store it in an XML column like so:

Statement 1 creates a table with a single XML column. A variable is declared with the XML data type in statement 2. The third statement saves the information from two tables in that variable. Statement 4 inserts a row into the table using the variable and the final query returns the column without using the FOR XML clause. The data is stored in the XML format, so the results look just like the SELECT statement from the first script in this post.

More UDFs and Stored Procedures

Today I'm working with functions again and examining the differences between User Defined Functions (UDF) and stored procedures (SP).

Both take parameters, but stored procedures also accept OUTPUT parameters. These are used to get modified values from stored procedures.
Return values are optional in stored procedures.
Stored procedures can be used to create other objects.
Stored procedures can update data.
Stored procedures can call a proc.

I've been told by other developers that anything client facing should use stored procedures. Why is that? The reasoning is pretty simple: Processes outside of SQL Server should not have direct access to tables, views, or functions. This helps to prevent things like SQL injection attacks.

But there are other benefits as well. They allow for more modular development by separating the data application layer from other programming logic. A stored procedure can be created once at the database layer and then other developers can concentrate on their code instead of SQL. The database logic can then be modified independently of the program's source code.

Stored procedures can also be used to reduce network traffic. They are generally faster because they are compiled when first run and the query plans are cached by SQL Server's optimizer. You can perform an operation that requires many lines of SQL code through a single statement that executes the code in a procedure, rather than by sending hundreds of lines of code over the network.

Here's a simple refresher on constructing stored procedures in T-SQL:

As usual, I'm working with the MS produced "AdventureWorks" database in SQL Server 2008 R2. This script creates a stored procedure with a default value for @CustomerID. There are two calls to the procedure. When a parameter value is specified, the SP returns the single corresponding record. When no value is passed in, all the records are returned.

Error Handling in SQL Server

I'm looking into error handling in SQL Server this morning. As of SQL Server 2005, you can use a TRY... CATCH statement that is similar to other .NET languages. There are a number of built-in functions that you can use within the CATCH block. The values produced by these functions persist within the CATCH block and can be accessed as many times as needed. The values revert to NULL outside the CATCH block:

ERROR_NUMBER() Provides the error number.
ERROR_SEVERITY() Provides the severity of the error. The severity must exceed 10 in order to be trapped.
ERROR_STATE() Provides the state code of the error. This refers to the cause of the error.
ERROR_PROCEDURE() Returns the name of a stored procedure or trigger that caused the error.
ERROR_LINE() Returns the line number that caused the error.
ERROR_MESSAGE() Returns the actual text message describing the error.

A simple test of the TRY... CATCH block

The TRY...CATCH can't trap every error. If a database is not available or a table is typed in incorrectly, the batch will simply fail. A TRY...CATCH is valuable when working with transactions. If the transaction in the TRY block fails, it can be rolled back in the CATCH. Here's some code:

Transaction rolled back within TRY...CATCH block

Part 2.1.2 of the TRY block commits an error by trying to update a field with a unique constraint to a value that already exists. If the entire transaction had been successful, the COMMIT statement would have gone through. Since there was an error, the CATCH block is triggered and it gives us a chance to fire a message to the user and rollback the entire transaction.

SQL Server Programming Logic

I'm working through some logic elements in SQL Server this morning. It's been a good review of how to use some basic SQL logic to manipulate data. It's also helped me to become more familiar with some of the built-in functions of SQL Server.

Working with dates is always a bit different in every language, so I worked through some scripts that used them in T-SQL. Here's an example of some of the IF ELSE statements I executed:

We have a variable that is holding the count of all customers in the Sales table. Then there is some conditional logic that outputs different responses based on the total count and the result of a call to the built-in DATEPART function.

Next up I worked through a slightly more complex problem with some nested logic:

The outer block checks to see if the month is August or September. The inner block checks to see if the year is even or odd using a modulo.

Next up this afternoon, I'll run through some loops and start reviewing the syntax for stored procedures in SQL Server.

User Defined Functions (UDF)

Today I'll be working with User Defined Functions (UDF) in T-SQL. Microsoft defines User Defined Functions as "routines that perform calculations/computations and return a value - scalar (singular) or a table". That sounds simple enough. UDF's can be developed using T-SQL or .NET languages in the CLR. CLR UDFs are good for tasks that T-SQL cannot perform efficiently like procedural logic, complex calculations, string manipulation, etc... Though .NET should be avoided when the task is mainly set-based.

I'm specifically looking at nondeterministic functions. They are functions that are not guaranteed to return the same output when invoked multiple times with the same input. Built in nondeterministic functions like RAND and GETDATE are invoked once for the whole query and not once per row. The NEWID function is different. It generates a globally unique identifier (GUID).

I wrote a script to test this behavior:

The Random and Date columns were both only invoked once for the whole query - the data repeats for each row. The guid column was populated by the NEWID function and was invoked on each row. A new value was generated for each record.

T-SQL Elegance

I've heard of Itzik Ben-Gan and have wanted to attend one of his conferences for some time. One of my former coworkers was able to see him not long ago and came back professing his genius. What's the next best thing to witnessing him solve programming problems in person? I hope it's reading his books, because I've been going through them this week. Here's what is on my desk:

Inside Microsoft SQL Server 2008: T-SQL Querying
Inside Microsoft SQL Server 2008: T-SQL Programming

The books are a companion set and Ben-Gan recommends reading them in order. I began to plow into the companion Programming book until I found the first book was constantly referenced. The Querying book will be a good review with some suggestions for how to elegantly use the features of SQL Server 2008 at an enterprise level.

Performance in T-SQL

I'm taking a look at performance issues in T-SQL this morning. How do indexes and functions affect performance? I'm working with the "AdventureWorks" test database from Microsoft. It can be downloaded here: link

Since there isn't currently an index on the Sales.SalesOrderHeader table, I'm going to create one with the following script:

Now that a nonclustered index exists for the SalesOrderHeader, I'm going to run some test scripts. You can see the queries and the results below:

These queries return all the orders placed in 2001. Query 1 accomplishes this without the use of a function, while query 2 uses the YEAR function. So what's the lesson here? Take a look at the execution plans. Query 1 has a cost of only 7 percent. The database engine can use the index I set up with this query because it is comparing actual indexed values. Query 2 uses a function that requires the engine to scan the entire index in order to see whether the results of the YEAR function applied to each value meets the criteria.

I dropped the index and ran the two queries again. Now each query takes the same amount of time. How you construct queries and how you implement indexes makes a big difference in performance. It's not enough to just retrieve the desired results.

SQL Server Management Studio

I'm working in SQL Server Management Studio (SMSS) this afternoon. All of my previous SQL Server work has been done from within Visual Studio. There are some nice features in SSMS:

Query window uses IntelliSense to assist in writing code.
Individual queries can be highlighted within the window and executed.
Easy access to lots of administrative tasks.
Integrated Object Explorer window that is similar to VS. It allows for some useful filtering. With a few clicks of the mouse, you can browse to specific objects to create simple scripts in the query editor. These can then be easily edited.
Enable "Execution Plan" in order to create additional tab in query results window that will give the developer detailed information regarding query performance... cool.
Immediate feedback on executed queries. Just click the "Messages" tab on the query results window and you will see the number of rows affected and any pertinent errors.

The Execution Plan results

The Messages tab in the query results window

Something Old, Something New

I've worked with Oracle exclusively in my past professional experience. School exposed me to a small amount of T-SQL and MySQL. Measurement Inc uses SQL Server, so it's time to beef up my knowledge of T-SQL. I'm looking forward to making the switch. Being a .NET developer with limited exposure to SQL Server is like being a movie director with limited exposure to editing equipment - it just doesn't make sense.

There are tons of great resources online and many thick books published on the subject. I've surfed to a few sites and bought a stack of books. Reading through the material gets me excited to do more with SQL Server than the simple implementations I've used while developing the back end of web applications in Visual Studio.

More to come soon.

Pages