Tag Archives: SQL

The curious case of trailing spaces in SQL

A while ago I was quite surprised to see that the following query returns 1 instead of 0:


SELECT COUNT(*) WHERE N'Tim' = N'Tim '; -- notice the trailing space

Apparently this is just standard behaviour. Here is an extract from sql1992.txt (Section 8.2 Paragraph 3):

     3) The comparison of two character strings is determined as fol-
            lows:

            a) If the length in characters of X is not equal to the length
              in characters of Y, then the shorter string is effectively
              replaced, for the purposes of comparison, with a copy of
              itself that has been extended to the length of the longer
              string by concatenation on the right of one or more pad char-
              acters, where the pad character is chosen based on CS. If
              CS has the NO PAD attribute, then the pad character is an
              implementation-dependent character different from any char-
              acter in the character set of X and Y that collates less
              than any string under CS. Otherwise, the pad character is a
              .

            b) The result of the comparison of X and Y is given by the col-
              lating sequence CS.

            c) Depending on the collating sequence, two strings may com-
              pare as equal even if they are of different lengths or con-
              tain different sequences of characters. When the operations
              MAX, MIN, DISTINCT, references to a grouping column, and the
              UNION, EXCEPT, and INTERSECT operators refer to character
              strings, the specific value selected by these operations from
              a set of such equal values is implementation-dependent.

Sample query to demonstrate influence of collation in Sql Server

Lately I had the pleasure to investigate collations and here is a sample query that demonstrates how a collation impacts the behaviour of a query:

WITH [Words] AS (
	SELECT N'Een' AS [word]
	UNION ALL
	SELECT N'Eèn'
	UNION ALL
	SELECT N'EEN'
)
	SELECT [word]
	FROM [Words]
	WHERE [word] 
		--COLLATE Latin1_General_CS_AS -- returns Een
		--COLLATE Latin1_General_CI_AI -- returns Een, Eèn and EEN
		--COLLATE LAtin1_General_CI_AS -- returns Een and EEN
		COLLATE Latin1_General_CS_AI -- returns Een and Eèn
		= N'Een';

An example of Common Table Expression and Window function usage…

Earlier this week some colleague had been assigned a maintenance task and asked me how I would solve it. Every customer is permitted to have an amount of publications. All excess publications should be removed from the system (only the n most recent ones are permitted to remain on the system).

Here is an example of the Customer table:

CREATE TABLE [dbo].[Customer](
	[CustomerId] [int] IDENTITY(1,1) NOT NULL,
	[CustomerName] [nvarchar](50) NOT NULL,
	[PermittedPublications] [int] NOT NULL
);

INSERT INTO [dbo].[Customer] 
			([CustomerName], [PermittedPublications]) 
VALUES	
			('timvw', 2),
			('mike', 3);

An example of the customer publications table:

CREATE TABLE [dbo].[Publication](
	[PublicationId] [int] IDENTITY(1,1) NOT NULL,
	[CustomerId] [int] NOT NULL,
	[PublicationName] [nvarchar](50) NOT NULL,
	[PublicationTime] [datetime2] NOT NULL
);
                         
INSERT INTO [dbo].[Publication] 
			([CustomerId], [PublicationName],[PublicationTime])
VALUES      
			((SELECT [CustomerId] FROM [dbo].[Customer] WHERE [CustomerName] = 'timvw'), 'tim pub1', SYSUTCDATETIME()),
			((SELECT [CustomerId] FROM [dbo].[Customer] WHERE [CustomerName] = 'timvw'), 'tim pub2', SYSUTCDATETIME()),
			((SELECT [CustomerId] FROM [dbo].[Customer] WHERE [CustomerName] = 'timvw'), 'tim pub3', SYSUTCDATETIME()),
			((SELECT [CustomerId] FROM [dbo].[Customer] WHERE [CustomerName] = 'timvw'), 'tim pub4', SYSUTCDATETIME()),
			((SELECT [CustomerId] FROM [dbo].[Customer] WHERE [CustomerName] = 'mike'), 'mike pub1', SYSUTCDATETIME()),
			((SELECT [CustomerId] FROM [dbo].[Customer] WHERE [CustomerName] = 'mike'), 'mike pub2', SYSUTCDATETIME());

My colleague was keen on using some cursor logic, but I demonstrated him how a set-based alternative:

		
WITH [RankedPublication] AS (
	SELECT [CustomerId]
	      ,[PublicationId]
	      ,[PublicationName]
	      ,[PublicationTime]
	      ,ROW_NUMBER() OVER(PARTITION BY [CustomerId] ORDER BY [PublicationTime]) AS [PublicationRank]
	FROM [dbo].[Publication]
), [ExcessPublication] AS (
	SELECT [PublicationId]
	FROM [RankedPublication]
	INNER JOIN [dbo].[Customer] ON [Customer].[CustomerId] = [RankedPublication].[CustomerId]
	WHERE [PublicationRank] > [Customer].[PermittedPublications]
)
	DELETE FROM [dbo].[Publication]
	WHERE [PublicationId] IN (SELECT [PublicationId] FROM [ExcessPublication]);

Use SQL Server Profiler to see if a connection is pooled

It took me a couple of websearches to discover how i can see in SQL Server Profiler whether or not a connection is pooled. Apparently you have to check ‘Show all columns’ and then you can check the ‘EventSubClass’ column:

This is how it looks like in your trace window:

Building a Nums table (quickly)

A while ago i presented my approach to generate a nums table here.

DECLARE @count INT = 1000;
 
WITH
    [Nums1] AS ( SELECT 1 AS [Value] UNION SELECT 2 AS [Value] )
  , [Nums2] AS ( SELECT A.* FROM [Nums1] AS A, [Nums1] AS B, [Nums1] AS C)
  , [Nums3] AS ( SELECT A.* FROM [Nums2] AS A, [Nums2] AS B, [Nums2] AS C)
  , [Nums4] AS ( SELECT A.* FROM [Nums3] AS A, [Nums3] AS B )
  , [Numbers] AS ( SELECT TOP(@count) -1 + ROW_NUMBER() OVER(ORDER BY [Value]) AS [Value] FROM[Nums4] )
SELECT * FROM [Numbers];

Because we only use this code once to fill the table we don’t really care that it is not very fast. Today i discovered there is a way to speed it up in this wonderful book: Inside Microsoft® SQL Server® 2008: T-SQL Querying:

WITH
    [Nums1] AS ( SELECT 1 AS [Value] UNION SELECT 2 AS [Value] )
  , [Nums2] AS ( SELECT A.* FROM [Nums1] AS A, [Nums1] AS B, [Nums1] AS C)
  , [Nums3] AS ( SELECT A.* FROM [Nums2] AS A, [Nums2] AS B, [Nums2] AS C)
  , [Nums4] AS ( SELECT A.* FROM [Nums3] AS A, [Nums3] AS B )
  , [Numbers] AS ( SELECT TOP(@count) -1 + ROW_NUMBER() OVER(ORDER BY (SELECT 0)) AS [Value] FROM[Nums4] )
SELECT * FROM [Numbers];

Yay for features like ORDER BY (SELECT <Constant>).

Invoke parameterized stored procedure with PowerShell

Here is a quick snippet that demonstrates how you can invoke a parametrized stored procedure with PowerShell:

$CreateTraceCommand = $SqlConnection.CreateCommand();
$CreateTraceCommand.CommandType = [System.Data.CommandType] "StoredProcedure";
$CreateTraceCommand.CommandText = "sp_trace_create";
$TraceIdParameter = $CreateTraceCommand.Parameters.Add("@traceid", [System.Data.SqlDbType] "Int");
$TraceIdParameter.Direction = [System.Data.ParameterDirection] "Output";
$OptionParameter = $CreateTraceCommand.Parameters.Add("@options", [System.Data.SqlDbType] "Int");
$OptionParameter.Value = [int] 2;
$TraceFileParameter = $CreateTraceCommand.Parameters.Add("@tracefile", [System.Data.SqlDbType] "NVarChar");
$TraceFileParameter.Value = [string] "c:\temp\test";
[void] $CreateTraceCommand.ExecuteNonQuery();
$TraceId = $TraceIdParameter.Value;
Write-Host "You created a trace with id: $TraceId";

Techniques learned in SQLtopia: Look Ma, no loops!

I have been programming in c-based languages for more than 10 years now. Lately i have spent quite a bit of time in SQLtopia and learned a couple of techniques that are quite different from what i was used to.

Let me explain with an example: The boss comes in and asks for a report that contains all the days on which i have worked this year. As a typical c# programmer i come up with the following solution: Iterate over all the days in the year and skip the days i was out on holidays:

IEnumerable<DateTime> FindWorkingDays()
{
 var begin = new DateTime(2010, 01, 01);
 var end = new DateTime(2010, 12, 31);

 var holidays = new[]
 {
  new Holiday { Begin = new DateTime(2010, 07, 01), End = new DateTime(2010, 07, 31) }, 
  new Holiday { Begin = new DateTime(2010, 09, 01), End = new DateTime(2010, 09, 15) }
 };             

 for (var date = begin; date <= end; date = date.AddDays(1))
 {
  if(holidays.Any(holiday => holiday.Begin <= date && date <= holiday.End)) continue;
  yield return date;
 }
}

In SQLtopia it is recommended to use set-based solutions instead of loops so we need a radically different solution.

Here is how we generate rows:

WITH 
    [Nums1] AS ( SELECT 1 AS [Value] UNION SELECT 2 AS [Value] )
  , [Nums2] AS ( SELECT A.* FROM [Nums1] AS A, [Nums1] AS B, [Nums1] AS C)
  , [Nums3] AS ( SELECT A.* FROM [Nums2] AS A, [Nums2] AS B, [Nums2] AS C)
  , [Nums4] AS ( SELECT A.* FROM [Nums3] AS A, [Nums3] AS B )
SELECT * FROM [Nums4];

With rows we can generate numbers:

DECLARE @count INT = 1000;

WITH 
    [Nums1] AS ( SELECT 1 AS [Value] UNION SELECT 2 AS [Value] )
  , [Nums2] AS ( SELECT A.* FROM [Nums1] AS A, [Nums1] AS B, [Nums1] AS C)
  , [Nums3] AS ( SELECT A.* FROM [Nums2] AS A, [Nums2] AS B, [Nums2] AS C)
  , [Nums4] AS ( SELECT A.* FROM [Nums3] AS A, [Nums3] AS B )
  , [Numbers] AS ( SELECT TOP(@count) -1 + ROW_NUMBER() OVER(ORDER BY [Value]) AS [Value] FROM[Nums4] ) 
SELECT * FROM [Numbers];

And with numbers we can generate dates:

DECLARE @begin DATETIME =  '2010-01-01';
DECLARE @end DATETIME =  '2010-12-31';
DECLARE @count INT = DATEDIFF(DAY, @begin, @end) + 1;

WITH 
    [Nums1] AS ( SELECT 1 AS [Value] UNION SELECT 2 AS [Value] )
  , [Nums2] AS ( SELECT A.* FROM [Nums1] AS A, [Nums1] AS B, [Nums1] AS C)
  , [Nums3] AS ( SELECT A.* FROM [Nums2] AS A, [Nums2] AS B, [Nums2] AS C)
  , [Nums4] AS ( SELECT A.* FROM [Nums3] AS A, [Nums3] AS B )
  , [Numbers] AS ( SELECT TOP(@count) -1 + ROW_NUMBER() OVER(ORDER BY [Value]) AS [Value] FROM[Nums4] )
  , [Dates] AS ( SELECT DATEADD(DAY, [Value], @begin) AS [Date] FROM [Numbers] )    
SELECT * FROM [Dates];

With dates we can build the dates on which we were out on holidays:

DECLARE @begin DATETIME =  '2010-01-01';
DECLARE @end DATETIME =  '2010-12-31';
DECLARE @count INT = DATEDIFF(DAY, @begin, @end) + 1;

WITH 
    [Nums1] AS ( SELECT 1 AS [Value] UNION SELECT 2 AS [Value] )
  , [Nums2] AS ( SELECT A.* FROM [Nums1] AS A, [Nums1] AS B, [Nums1] AS C)
  , [Nums3] AS ( SELECT A.* FROM [Nums2] AS A, [Nums2] AS B, [Nums2] AS C)
  , [Nums4] AS ( SELECT A.* FROM [Nums3] AS A, [Nums3] AS B )
  , [Numbers] AS ( SELECT TOP(@count) -1 + ROW_NUMBER() OVER(ORDER BY [Value]) AS [Value] FROM[Nums4] )
  , [Dates] AS ( SELECT DATEADD(DAY, [Value], @begin) AS [Date] FROM [Numbers] )    
  , [Holidays] AS ( SELECT '2010-07-01' AS [Begin], '2010-07-31' AS [End]
                    UNION
                    SELECT '2010-09-01' AS [Begin], '2010-09-15' AS [End]
                  )
SELECT [Date] FROM [Holidays],[Dates] WHERE [Date] BETWEEN [Begin] AND [End] ORDER BY [Date];

And now we can easily select the dates that are not holiday dates:

DECLARE @begin DATETIME =  '2010-01-01';
DECLARE @end DATETIME =  '2010-12-31';
DECLARE @count INT = DATEDIFF(DAY, @begin, @end) + 1;

WITH 
	[Nums1] AS ( SELECT 1 AS [Value] UNION SELECT 2 AS [Value] )
  , [Nums2] AS ( SELECT A.* FROM [Nums1] AS A, [Nums1] AS B, [Nums1] AS C)
  , [Nums3] AS ( SELECT A.* FROM [Nums2] AS A, [Nums2] AS B, [Nums2] AS C)
  , [Nums4] AS ( SELECT A.* FROM [Nums3] AS A, [Nums3] AS B )
  , [Numbers] AS ( SELECT TOP(@count) -1 + ROW_NUMBER() OVER(ORDER BY [Value]) AS [Value] FROM[Nums4] )
  , [Dates] AS ( SELECT DATEADD(DAY, [Value], @begin) AS [Date] FROM [Numbers] WHERE [Value] <= DATEDIFF(DAY, @begin, @end))    
  , [Holidays] AS ( SELECT '2010-07-01' AS [Begin], '2010-07-31' AS [End]
                    UNION
                    SELECT '2010-09-01' AS [Begin], '2010-09-15' AS [End]
                  )
  , [HolidayDates] AS (SELECT [Date] FROM [Holidays],[Dates] WHERE [Date] BETWEEN [Begin] AND [End] )
                  
SELECT [Dates].[Date] FROM [Dates] 
					  LEFT OUTER JOIN [HolidayDates] ON [Dates].[Date] = [HolidayDates].[Date]
					  WHERE [HolidayDates].[Date] IS NULL;

Look Ma, no loops!

Exploring window functions

Here is a sample query that allows you to explore the behavior of various aggregate window functions:

WITH 
    [Nums1] AS ( SELECT 1 AS [Value] UNION SELECT 2 AS [Value] )
  , [Nums2] AS ( SELECT A.* FROM [Nums1] AS A, [Nums1] AS B, [Nums1] AS C)
  , [Nums3] AS ( SELECT A.* FROM [Nums2] AS A, [Nums2] AS B, [Nums2] AS C)
  , [Nums4] AS ( SELECT A.* FROM [Nums3] AS A, [Nums3] AS B )
  -- Build numbers from 1 to 1000
  , [Numbers] AS ( SELECT TOP(1000) ROW_NUMBER() OVER(ORDER BY [Value]) AS [Value] FROM[Nums4] )
SELECT 
    [Value]
  , ROW_NUMBER() OVER(PARTITION BY [Value] / 100 ORDER BY [Value] % 10) AS [RowNumber]
  , RANK() OVER(PARTITION BY [Value] / 100 ORDER BY [Value] % 10) AS [Rank]
  , DENSE_RANK() OVER (PARTITION BY [Value] / 100 ORDER BY [Value] % 10) AS [DenseRank]
  , NTILE(4) OVER (PARTITION BY [Value] / 100 ORDER BY [Value] % 10) AS [Tile100]
FROM	
    [Numbers]
ORDER BY 
    [Value];

Deploying a Database Project does not create artefacts for custom schemas

Last week we noticed that the artefacts for a custom schema in our Database Project were not created (or updated). It took us a while to figure out what was wrong. When you add a script via the ‘Add Existing item’ menu in VS2010 the file’s Build Action is set to Not in Build for some unknown reason. As soon as we changed the value to Build and Deployed again our artefacts were created.

T-SQL: Remove all rows from all tables in your schema

Sometimes i want to quickly clean up a database and start from fresh. Here is a small script that does exactly that.. (I just run the script a couple of times, untill no affected rows remain… Far more efficient than figuring out which constraints exist, building up a dependency tree, and cleanly deleting all rows):

-- Remove all rows from all tables
DECLARE @tableName VARCHAR(255)
DECLARE tableNames CURSOR FOR SELECT name FROM sys.Tables;
OPEN tableNames
FETCH NEXT FROM tableNames INTO @tableName
WHILE @@FETCH_STATUS = 0 BEGIN
	EXEC('DELETE FROM [' +  @tableName + ']')
	FETCH NEXT FROM tableNames INTO @tableName
END
CLOSE tableNames
DEALLOCATE tableNames