Monday, November 29, 2010

Default Join Type

I'm always ask what the default join in SQL Server is, when no join type is specified. My immediate reaction is Google it. The problem is trying to find it among all the article that will show up as search result.

Now that I found it, I decided to put it here...

Here is the actual article (MSDN BOL) where I found it. And just in case Microsoft decides to remove it, here's an excerpt:

INNER
Specifies that all matching pairs of rows are returned. Discards unmatched rows from both tables. This is the default if no join type is specified.

LEFT [ OUTER ]
Specifies that all rows from the left table not meeting the specified condition are included in the result set in addition to all rows returned by the inner join. Output columns from the left table are set to NULL.

RIGHT [ OUTER ]
Specifies that all rows from the right table not meeting the specified condition are included in the result set in addition to all rows returned by the inner join. Output columns from the right table are set to NULL.

FULL [ OUTER ]
If a row from either the left or right table does not match the selection criteria, specifies the row be included in the result set, and output columns that correspond to the other table be set to NULL. This is in addition to all rows usually returned by the inner join.


~~ CK

Thursday, November 25, 2010

Convert Rows into A Delimited String, Part 3

Here's another sample of dataset rows converted into delimited string based on a unique ID. Looking at the sample data is better than explaining how this works... Populate sample data:
declare @tbl_email table (id int, email varchar(20))
insert into @tbl_email
select 1, 'A'
union all
select 2, 'A0'
union all
select 2, 'A1'
union all
select 3, 'A2'
union all
select 3, 'A3'
union all
select 3, 'A4'
union all
select 4, 'A5'

select * from @tbl_email

id          email
----------- --------------------
1           A
2           A0
2           A1
3           A2
3           A3
3           A4
4           A5
Here's the required output:
id          email_list
----------- -----------
1           A
2           A0, A1
3           A2, A3, A4
4           A5
And the code...
;with Numbered
as 
   (
      select       
         rownum = row_number() over(partition by id order by id, email),
         id, email
      from @tbl_email
   )
,FirstRecords
as 
   (
      select rownum, id, email from Numbered where rownum = 1
   )
,ReturnAllRecords
as
   (
      select rownum, id, cast(email as varchar(500)) as email_list from FirstRecords  
      union all      
      select Numbered.rownum, Numbered.id,  cast(email_list + ', ' + Numbered.email as varchar(500))
      from Numbered
         inner join ReturnAllRecords 
            on ReturnAllRecords.id = Numbered.id and
               ReturnAllRecords.rownum + 1 = Numbered.Rownum
   )
select id, max(email_list) as email_list
from ReturnAllRecords
group by id
Comments are always welcome...


 ~~ CK

Wednesday, November 17, 2010

List tables, index and keys

Here's a code that list the columns used in each index for each table.

with table_indexes
as
(
   select 
   row_num = row_number() over(partition by t.name, i.name order by t.name, i.name, ci.key_ordinal),
   t.name as table_name, i.name as table_index, c.name as index_keys
   from sys.tables t
      left join sys.indexes i on t.object_id = i.object_id and i.index_id > 0
      left join sys.index_columns ci on i.object_id = ci.object_id and i.index_id = ci.index_id
      left join sys.columns c on c.object_id = ci.object_id and c.column_id = ci.column_id
)  
select 
   table_name = case when row_num = 1 then table_name else '' end,
   table_index = case when row_num = 1 then table_index else '' end,
   index_keys
from table_indexes
As always, comments, suggestions, objections :) are all welcome...


~~ CK

Shrink Logfile in SQL 2008

I know that there's a great deal of debate on whether Shrinking a database and it's log is healthy. However, there will always be a situation that shrinking the log file is the only option you have. Here's the code that will shrink the database log in SQL 2008
declare @LogFileLogicalName sysname
select @LogFileLogicalName=Name from sys.database_files where Type=1

select @LogFileLogicalName

DBCC Shrinkfile(@LogFileLogicalName,100)


~~ CK

Convert Rows into A Delimited String, Part 2

Read the first part of the article, here. Here's another application of how to concatenate rows into a single string column. This script group concatenate the rows based on a PRIMARY KEY (in this case column ID). Here's the sample data that will be summarized:
id          speak_flag  read_flag   ia_flag     lit1_flag   lang
----------- ----------- ----------- ----------- ----------- ----
1           1           1           1           1           fi
1           1           1           0           0           sv
1           1           1           1           0           en
1           1           1           0           1           de
2           1           1           0           0           fi
2           1           0           1           1           sv
2           1           1           0           1           en
Here's the required output
id          speak_lang    read_lang     ia_lang       lit1_lang
----------- ------------- ------------- ------------- ----------
1           de,en,sv,fi   de,en,sv,fi   en,fi         de,fi
2           en,sv,fi      en,fi         sv            en,sv
Here's the code 1. Populate the sample data
create table langdtl
(
id int,
speak_flag int,
read_flag int,
ia_flag int,
lit1_flag int,
lit2_flag int,
lit3_flag int,
lang nvarchar(2)
)

insert into langdtl values(1,1,1,1,1,1,0,'fi')
insert into langdtl values(1,1,1,0,0,0,1,'sv')
insert into langdtl values(1,1,1,1,0,1,0,'en')
insert into langdtl values(1,1,1,0,1,1,1,'de')

insert into langdtl values(2,1,1,0,0,1,1,'fi')
insert into langdtl values(2,1,0,1,1,1,0,'sv')
insert into langdtl values(2,1,1,0,1,0,1,'en')

select 
   id ,speak_flag ,read_flag ,ia_flag ,lit1_flag ,lang 
from langdtl
2. Summarized and Group the Rows
;with language_table
as 
(select
   id, 
   rownum = row_number() over(partition by id order by id),
   speak_lang = isnull(lang + space(nullif(0, speak_flag)) + ',',''),
   read_lang = isnull(lang + space(nullif(0, read_flag))+ ',',''),
   ia_lang = isnull(lang + space(nullif(0, ia_flag))+ ',',''),
   lit1_lang = isnull(lang + space(nullif(0, lit1_flag))+ ',',''),
   lit2_lang = isnull(lang + space(nullif(0, lit2_flag))+ ',',''),
   lit3_lang = isnull(lang + space(nullif(0, lit3_flag))+ ',',''),
   speak_flag, read_flag, ia_flag, lit2_flag, lit3_flag
 from langdtl  
),
sorted_table 
as
(
select TOP 10000000 * from language_table order by id asc, rownum desc
), 
concatenated
as
(  
select 
   id,   
   speak_lang = 
   COALESCE( MIN( CASE WHEN rownum = 4 THEN speak_lang END) , '' )  +
   COALESCE( MIN( CASE WHEN rownum = 3 THEN speak_lang END) , '' ) + 
   COALESCE( MIN( CASE WHEN rownum = 2 THEN speak_lang END) , '' ) +  
   COALESCE( MIN( CASE WHEN rownum = 1 THEN speak_lang END) , '' ),
 
   read_lang = 
   COALESCE( MIN( CASE WHEN rownum = 4 THEN read_lang END) , '' )  +
   COALESCE( MIN( CASE WHEN rownum = 3 THEN read_lang END) , '' ) + 
   COALESCE( MIN( CASE WHEN rownum = 2 THEN read_lang END) , '' ) +  
   COALESCE( MIN( CASE WHEN rownum = 1 THEN read_lang END) , '' ),
 
   ia_lang = 
   COALESCE( MIN( CASE WHEN rownum = 4 THEN ia_lang END) , '' )  +
   COALESCE( MIN( CASE WHEN rownum = 3 THEN ia_lang END) , '' )  +
   COALESCE( MIN( CASE WHEN rownum = 2 THEN ia_lang END) , '' )  +
   COALESCE( MIN( CASE WHEN rownum = 1 THEN ia_lang END) , '' ), 

   lit1_lang = 
   COALESCE( MIN( CASE WHEN rownum = 4 THEN lit1_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 3 THEN lit1_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 2 THEN lit1_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 1 THEN lit1_lang END) , '' ), 

 
   lit2_lang = 
   COALESCE( MIN( CASE WHEN rownum = 4 THEN lit2_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 3 THEN lit2_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 2 THEN lit2_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 1 THEN lit2_lang END) , '' ), 

   lit3_lang = 
   COALESCE( MIN( CASE WHEN rownum = 4 THEN lit3_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 3 THEN lit3_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 2 THEN lit3_lang END) , '' ) +
   COALESCE( MIN( CASE WHEN rownum = 1 THEN lit3_lang END) , '' ) 
 
from  sorted_table
group by id
)
select id, 
   speak_lang = rtrim(substring(speak_lang,1, len(speak_lang)-1)),
   read_lang = rtrim(substring(read_lang,1, len(read_lang)-1)),
   ia_lang =  rtrim(substring(ia_lang,1, len(ia_lang)-1)),
   lit1_lang = rtrim(substring(lit1_lang,1, len(lit1_lang)-1))
from concatenated   


~~ CK

Thursday, August 5, 2010

What SQL Statements are currently running?

Whenever the server is running slow, my initial move is to check what query is running. Using the Activity Monitor shows information about the different connections and the t-sql running. However, in a batch, it will not display what specific sql statement is running within the batch. I found this code snippet that's very helpful to identify the specific SQL Statement that are currently running.

I forgot where I grab this, but I'm very thankful for this code.
SELECT 
[Server] = @@servername
, [Spid] = session_Id
, ecid
, [Database] = DB_NAME(sp.dbid)
, [User] = nt_username
, [Status] = er.status
, [Wait] = wait_type
, [Individual Query] =
  SUBSTRING (qt.text, er.statement_start_offset/2,
  (CASE WHEN er.statement_end_offset = -1
           THEN LEN(CONVERT(NVARCHAR(MAX), qt.text)) * 2
    ELSE er.statement_end_offset END -
   er.statement_start_offset)/2)
,[Parent Query] = qt.text
, Program = program_name
, Hostname
, nt_domain
, start_time
, elapsed_hours = datediff(hh, start_time, getdate())
, elapsed_mins =
  datediff(mi, dateadd(hh,datediff(hh, start_time, getdate()),start_time), getdate())
FROM sys.dm_exec_requests er
INNER JOIN sys.sysprocesses sp ON er.session_id = sp.spid
CROSS APPLY sys.dm_exec_sql_text(er.sql_handle)as qt
WHERE session_Id > 50             
AND session_Id NOT IN (@@SPID)    
ORDER BY 1, 2
I would thank the origin of this code, but I forgot where I got it from... Thank you, nevertheless...


~~ CK

Wednesday, August 4, 2010

A Sample Table-to-XML conversion.

A question was posted on the forum that I am a member of to convert a table into an XML data. The content of table look like this:
STUDENT_ID   grade
---------- ---------
 abc123     0.2
 abc123     0.4
 abc123     0.7
 def123     0.1
 def123     0.4
 def123     0.5

The request is to convert the above data into:
<Students StudentID="abc123">
  <StudentGrade Grades="0.2" />
  <StudentGrade Grades="0.4" />
  <StudentGrade Grades="0.7" />
</Students>
<Students StudentID="def123">
  <StudentGrade Grades="0.1" />
  <StudentGrade Grades="0.4" />
  <StudentGrade Grades="0.5" />
</Students>

Here's the code I posted as a reply:
declare @TestData Table
(
STUDENT_ID VARCHAR(6),
grade Decimal(2,1)
)

insert into @TestData (student_id, grade)
select 'abc123', 0.2 UNION
select 'abc123', 0.7 UNION
select 'abc123', 0.4 UNION
select 'def123', 0.1 UNION
select 'def123', 0.5 UNION
select 'def123', 0.4

select * from @TestData

;with tag1
as
(
  select distinct student_id
  from @testdata
)
select
  1 as tag,
  0 as parent,
  student_id as [Students!1!StudentID],
  null as [StudentGrade!2!Grades]
from tag1
union all
select
  2 as tag,
  1 as parent,
  t1.student_id as [Students!1!StudentID],
  grade as [StudentGrade!2!Grades]
from tag1 t1
  left join @testdata t2 on t1.student_id = t2.student_id
order by [Students!1!StudentID], [StudentGrade!2!Grades]
for xml explicit

As always, comments are welcome...


~~CK

Returning Sequence of Numbers

For quite sometime, I use COLORDER column of syscolumns table to return a sequential number. I will usually pick a table with large number of columns enough to return my requirement. Until I come across Common Table Expression (CTE). One of the major difference between a CTE and a derived table or sub-query, is that a CTE can be self-referencing or recursive.

Consider returning a series of numbers 1-31.

Here's the code:
WITH Number_Sequence
AS
(
   SELECT 1 AS sequence
   UNION ALL
   SELECT sequence+1
   FROM Number_Sequence W
      WHERE sequence < 31
)
SELECT *
FROM Number_Sequence

Now that it's a result set, it may be used to join to another table. Since this is a CTE and not a physical table, it's not indexed and could degrade processing. Table size should always be considered before joining this CTE to another table.

Here's another sample code that simulate bit positions in a byte:
with GroupNumbers
as
(
   select 1 as byte, 1 as from_bit, 8 as to_bit
   union all
   select byte + 1, from_bit + 8, to_bit + 7
   from GroupNumbers
   where to_bit < 64
)  select * from GroupNumbers

As always, comments are welcome...


~~CK

Sunday, March 7, 2010

Returning a random set of data

A number of times, I've been requested to provide a random data for various purposes, including testing and sampling. I have found a number of ways of doing it which mostly include creating a multiple layer of sub-query and even creating temporary tables. My usual mindset is that I can not really use the TOP n keyword because, most of the time, it will return the same records every time. I also tried to use the RAND() function that can generate random number. Closer look with this function, however, says repetitive calls of RAND() with the same seed value return the same results (BOL). When I received another project that requires a randomly generated set of data, I started searching the net on how to do this in a simpler way. This time, I leverage an existing function that will always generate a random value. The NEWID() function. This function always returns a unique value of uniqueidentifier data type.

Here's a sample table:
BookId         BookTitle
------      ---------------
1          Yesterday
2          In My Life
3          Hey Junde
4          Fool On the Hill
5          If I Fell
6          Let It Be
7          Till There Was You
8          Yellow Submarine
9          I Should Have Known Better
Running a simple TOP X% query always return the same set of records.
select  TOP 30 PERCENT BookId, BookTitle from books
Using the RAND() function always returns the same set of records:
select  TOP 30 PERCENT BookId, BookTitle from books
order by rand(rand(1))

BookId         BookTitle
------      ---------------
1          Yesterday
2          In My Life
3          Hey Junde
Using the NEWID() function always returns a random set of records
select  TOP 30 PERCENT BookId, BookTitle from books
order by newid()

BookId         BookTitle
------      ---------------
4          Fool On the Hill
5          If I Fell
1          Yesterday
Here's another run
select  TOP 30 PERCENT BookId, BookTitle from books
order by newid()

BookId         BookTitle
------      ---------------
5          If I Fell
8          Yellow Submarine
2          In My Life
So far, I have not seen any issue using this function in a big table.


~CK