Developing Matt

My Technical Journal

SQLclr or T-SQL? Performance Options

leave a comment »

Adam Machanic, out of Boston, taught this class on the sql 24 pass online training class promoted and supported by sqlpass.  THANK YOU sqlpass and THANK YOU Adam Machanic..  I didn’t know that Adam Machanic hosted the http://sqlblog.com site. 

First off, some great quotes:

“Performance tuning is an art as well as a science.”
“If you don’t have the common sense, ask someone who does”
”If you have a solution in your software project that isn’t the right solution for the situation, you will feel it.”
”As a developer you have to develop a sense of what is good and what is not good and apply it on a situational basis.”
”Iterative logic shines with SQL-CLR”

SQL-CLR should

  • Operate on data either headed for or already in the database.
  • Should be used for data-centric operations (churning data), not application or ui-centric

Performance tradeoffs to consider

  • Adding indexes improves read performance and hurts write performance
  • Rewriting queries may make them more complicated

T-SQL is the only choice for reading data and doing standard data operations.  But more elaborate uses like the following should be considered for CLR:

  • math or string operations
  • linear and time series computations
  • heavy xml
  • manipulation of different kinds of data (like images)

The question being attempted to answer: Does passing data to CLR carry huge overhead, considering that clr is jitted (compiled down to machine code) and T-SQL is interpreted?

TESTS:

Test 1 Return Len on a string.  His first test:He created t-sql and slq clr function that return the length of the input string.  he ran it 100K times and passed in different strengths.  t-sql did better, but not by much.  but when it reached over 50000 characters there was a difference in about 5 seconds.

Test 2 math: he found prime numbers from x to x.  clr : 24 seconds.  t-sql : 43 minutes, 2 seconds.  clr performed better.  Simple inline/non-procedure math may not be an issue (counting on a math).  bottom line: if the formula needs to be a udf, then consider converting it to an inline procedure or consider clr.

Test 3.  Non-math.  string operations.  He created two functions, one using t-sql charindex and the other (clr) using .net string.indexof.  and then he asked to find the character Y..and called it 200 K times.  When it was around 100K character they did the same.  but when it went above that ..up to 50000 characters then clr performed 3x better.

Test 4. replace.  pretty much the same before.  created two functions, one using t-sql replace and the other using .net string.replace.  ran it 100K again.  at the low end around 2500 they performed the same, but at around 50000 characters clr performed 7x better (clr ran at 100 seconds, t-sql ran at 700 seconds)

Test 5. linear and time series (linear means any query where the current value depends on the previous value(s)).  His example went something like his..’select t1.y, sum(t2.x) as running_x from t as t1 inner join t as t2 on t1.y >=t2.y group by t1.y)  for each input row look at all the proceeding rows once …every single time. 

You can use a cursor for this but the fastest guaranteed reliable method is CLR.  Use an ado.net datareader (forward only), use an order by so that the data is sorted by the query engine and then loop with the data reader…use local variables to calculate and store running values. Output the data using sqlpipe (allows us to pass a rowset back to the caller…he called them row by row– no temp tables are needed).  You can see the code on his blog )search for running sums).  The results:

  • set based sql: 15:43
  • sql cursor:  0:04
  • clr stored procedure:  0:01.5

Test 6. xml manipulation.  .net has an xml datareader.  sql server’s model is iterator based sqlxml model and simply doesn’t perform well.  CLR is about 5-7xs faster. and it’s easier.  3 hours on t-sql with someone from microsoft xml ..about an hour on CLR for him to create.  (you can download extended events code generator on his blog here)

Q/A

Q. Does CLR add memory overhead to the server?
A.  It can.  You want to watch memory very closely and use screaming algorithms (use datareader not a datatable for example).  memory is the most expensive resource.  bottom line: follow best practices and know what you are doing.

Q.  Which is better for recursion for a hierarchy. 
A.  Neither.  In 2005 the best way is to use materialized paths used for hierarchy.  In 2008 use hierarchy id (which uses sql clr under the hood).  Recursion is not the right solution.

Q.  How was he calling charindex?
A.  He used a clr function calling string.indexof  He put it in a while loop and called it 100x.  For the sql version he used charindex directly.

Q.  We found that udfs are a serious drag on performance vs inline.   Are you suggesting that we might increase performance if we use CLR instead?
A.  Yes.  Multi statement udfs are a drag and not good for performance.  Look up Alex’s articles on sql blog (or any search on sqlblog on ‘inline’) to determine if it’s better for inline.  Heavy calculations and functions are better in CLR..but generally inline is better.

Q.  Why do some DBA’s not like CLR and think it’s not safe.
A.  There’s a great acronym FUD:  It stands for fear uncertainty and doom.  what they don’t know they fear. 

Q.  Would you recommend using clr to replace a trigger table that receives 20 transactions per second.  That trigger includes update and insert statements.
A.  If you are just moving the data somewhere else like a log, then clr is not your best choice.  CLR is not the best choice for standard data manipulation.  If you are moving it to a log file on a disk or across servers then it might be interesting.  If you are just using it for a log  by the way, check out sql server 08 ‘change data capture’

Advertisements

Written by matt

September 15, 2009 at 1:31 am

Posted in Sql Performance, Sql Server

Tagged with

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: