Developing Matt

My Technical Journal

Breaking up is hard to do

leave a comment »

There are a lot of great posts about splitting a table into a list or splitting a list into a table.  Do a search and you will be a little overwhelmed by the good and the bad advice.  Let me turn you on to a couple of excellent articles.  Anith wrote an excellent must read article titled Faking Arrays where he explores the various  sundry ways to do this.  Itzik Ben-Gan (my performance hero) has spent a lot of time determining the most efficient way to split data in a whitepaper titled Arrays and Lists in Sql Server 2005 (it still applies in 2008). 

After gleaning over these articles and countless others I’ve surmised the following four methods. 


The first is splitting your table to a character list.  It’s pretty simple, really.  Just declare your variable and set it to an empty string.  Setting it to an empty string is required unless you want to mess with the ‘SET CONCAT_NULL_YIELDS_NULL’ setting which I don’t recommend…some settings actually modify the database and cause your stats to restart…not sure about all this particular setting though and I’m not really up on this little fact.  Why is it required?  Well, you can’t add a string to a null value, can you.  1 + null just doesn’t make any sense and will always return you a null (unless you mess with the setting above of course.  Anyway, this is how you do it.

declare @d varchar(100)=” 
select @d = @d + ‘ your delimeter ‘ +  field from yourtable
print @d


The second method is splitting your table into a character list utilizing the new 2005 ‘for xml’ command.  From what I understand this performs better than the first, but it’s not as readable:

select yourfield= STUFF(
             (SELECT  ‘ ‘ + yourfield AS [text()]
              from yourtable 
              for xml PATH(”)),1,1,’,’)


The third method we broach the discussion of splitting a list into a table.  The most efficient method of doing this is using a custom clr.  Text-parsing isn’t sql server’s forte, but it is for iterative-based languages like vb, c#, etc.  But this isn’t the way I do it because it requires modifying the database to allow clr operations and this means getting permission and jumping through hoops, etc.  So the next best method is using a numbers table with a table-valued function that joins to this table.  This is Itzik-tested because it comes from Itzik.  As far as creating a creating a numbers table, don’t waste your time searching, just use this:

CREATE TABLE [dbo].[Numbers](
    [Number] [int] NOT NULL
declare @fromNumber int =0, @toNumber int=100,
@byStep int=1


WITH CTE_NumbersTable AS (
SELECT @fromNumber AS i
SELECT i + @byStep
FROM CTE_NumbersTable
(i + @byStep) <= @toNumber
insert into numbers SELECT * FROM CTE_NumbersTable

Now that this is done here is the function that can be used to split a list into a table (which utilizes the numbers table):

CREATE FUNCTION [dbo].[split_to_table](@array AS VARCHAR(1000), @delimiter as varchar(6)) RETURNS TABLE

      ltrim(rtrim(SUBSTRING(@array, number,
      CHARINDEX(@delimiter, @array + @delimiter, number) – number))) AS element
  FROM dbo.Numbers
  WHERE number <= LEN(@array) AND SUBSTRING(@delimiter + @array, number, 1) = @delimiter


Method number 4 applies when you don’t have access to create a numbers table.  Lots of ways to do this and this is one of them.  This was put together with the first article I mentioned by Anith.  I couldn’t get the max recursion option to work because I was in sql 2000 but I left it in.

@p VARCHAR(500), @delimeter varchar(3)
–select * from split(”,’|’)
RETURNS @tblSplit TABLE (value varchar(100))
IF ISNULL(@P,”) <> ”
        WITH CTE ( pos, pos_begin, pos_end ) AS (
                SELECT 0, 1, CHARINDEX( @delimeter, @p + ‘,’ )
                 UNION ALL
                SELECT pos + 1, pos_end + 1, CHARINDEX( @delimeter, @p + @delimeter, pos_end + 1 )
                  FROM CTE
                 WHERE CHARINDEX( @delimeter, @p + @delimeter, pos_end + 1 ) > 0 )
            insert @tblSplit SELECT  SUBSTRING( @p, pos_begin , pos_end – pos_begin ) AS "value"
              FROM CTE
            –OPTION ( MAXRECURSION 0 ) ;


Lots of ways to do this….these are my top four.  If you have a better way let me know.


Written by matt

June 23, 2009 at 11:39 am

Posted in T-Sql

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )


Connecting to %s

%d bloggers like this: