Developing Matt

My Technical Journal

Query Programming and Tuning – Itzik Ben-Gan

leave a comment »

Query Programming and Tuning – Itzik Ben-Gan

 

I can’t express enough how much fun it was to take Itzik’s class.  It was my first class with him, and now I know why he has such a huge following.  He has a grasp for understanding the structure of the query language and a passion for database logic.  I took a 6 hour class and have attempted to encapsulate my learnings in 25 pages of notes here.  Also, you can find the sql for this class on the references section of his blog (www.insidetsql.com) or by clicking here.

 

Highlights:

THE OVER CLAUSE

“The over clause is the single most powerful thing in sql language”  because it bridges the gap between iterative logic and the logic of sets.  It is optimized well, orders well, and has a group of ranking functions.  The different ranking functions are covered in good detail.

DATA AGGREGATION TECHNIQUES

Subqueries: expensive.  Using subqueries is a separate scan for each subquery.  Multiple joins are better than subqueries as they do one scan fo all.

Cursors: sometimes you have to use (benefits: can control order of elements and doesn’t impose restrictions)  However the cost is extensive.  It is several times slower than a basic aggregate

Grouping:  single pass through data.  This is a good solution unless you need a running aggregate.  However, the over clause will give us better performance for this case.

Sliding aggregates: similar to grouping but we fixa  a low point and an upper point (1 month ago to 1 month ahead)

Custom Aggregates: most common forms: string concatenation, median, mode

Pivot: Using pivot is useful but it’s not any better performance than grouping…it isn’t useful if there are large elements in the group….and it’s not straitforward (for someone else to administer later)

CLR User defined aggregates: 2005 is restricted to only 1 aggregate function with code and are restricted to 8000 bytes.  Using clrs is the fastes way that he knows to do sting concatenation, but it won’t be easy to maintain.

AN AUXILILARY TABLE OF NUMBERS (what is it? a single column with a sequence of integers)

TIP: in 2005 you can use a synonym to make it look like a local object. 

Use this table To SPLIT ARRAYS  (the best way to split an array as far as performance is to use a clr).  This is a great section that you should be sure to read as EVERY organization finds a need for this, and most of the time we end up writing functions that use poorly optimized resultsets. 

TOP

The problem with top is how it is used: Select top 3 * from sales order by qty     Here TOP needs to use ORDER BY  for filtering purposes (which 3 rows to pick).  The order by clause now has two meanings: display and filtering.  There is no guarantee of presentation of the data.  Instead use the OVER clause.  (lots of examples in notes)

T-SQL VS CLR

If it’s data manipulation, use t-sql

If it’s iterative logic, algorithmic logic,k computational intensive clculationrs, string manipulation, parsing then use .net. 

 

Advertisements

Written by matt

May 7, 2008 at 7:07 am

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: