Tuesday, June 2, 2015

Convert Delimited String into Rows

Here's another way of converting a delimited string into a result set or rows. This technique is much more straightforward and does not involve xml processing. It saves memory, specially if this step needs to be executed inside a loop.

Say you have this string:
x
----------------------------------------------------------
one, two, three, four, five, six, seven, eight, nine, ten
Which needs to be converted into:
xRow
--------
one
two
three
four
five
six
seven
eight
nine
ten
Here's the code...

First,  all spaces in the string must be removed and an extra delimiter needs to be added in the end of the string.
declare @x as varchar(150)

set @x = 'one, two, three, four, five, six, seven, eight, nine, ten'

set @x = replace(@x, ' ','') + ','

select @x as x
Second, using CTE, the string can be parse using the delimiter to identify where to break...
;with ToTable
as
(
   select cast(substring(@x,1, charindex(',',@x)-1) as varchar(150)) as xRow, cast(substring(@x,charindex(',',@x)+1, 150) as varchar(150)) as xValue2
   union all
   select cast(substring(xValue2,1, charindex(',',xValue2)-1) as varchar(150)) as xValues,  cast(substring(xValue2,charindex(',',xValue2)+1, 150) as varchar(150)) as xValue2
   from ToTable where xValue2 <> ''
)
select xRow
from ToTable
Now that it's a result set, it can be used to JOIN with other tables. Consideration should always be done how big the tables are.

If ever there's a need to use variable values inside an IN clause, this technique is very useful. Check this out on how to do this through XML.

As always, comments are welcome...


~~ CK

Tuesday, March 17, 2015

Converting Bytes to KB, MB or GB

I was looking for some algorithm on how to convert Bytes to KB or MB or GB depending on the value. Most of the algorithm that I saw are using case statement to implement the breakdown as brackets.

Here's my own version. This will generate some random number that will be converted to KB, MB or GB. But this should help me in the future.
;with Sorted
as
(
   select 1 as base
   union all
   select base = base + 1
   from Sorted
   where base < 20
), Randomized
as
(
   select top 5 base
   from Sorted
   order by newid()
), SizeInStorage
as
(
select
   ByteSize = cast(abs(checksum(newid())) as bigint) 
from randomized
)
select
   ByteSize,
   Formatted = replace(convert(varchar(20), cast(ByteSize as money),1),'.00',''),
   Converted = coalesce(nullif(ByteSize/power(cast(1024 as bigint),4),0),
                        nullif(ByteSize/power(cast(1024 as bigint),3),0),
                        nullif(ByteSize/power(cast(1024 as bigint),2),0),
                        nullif(ByteSize/power(cast(1024 as bigint),1),0), 
                        ByteSize
                        ),
   Unit = coalesce('TB' + space(nullif(ByteSize/power(cast(1024 as bigint),4),0)), 
                   'GB' + space(nullif(ByteSize/power(cast(1024 as bigint),3),0)), 
                   'MB' + space(nullif(ByteSize/power(cast(1024 as bigint),2),0)), 
                   'KB' + space(nullif(ByteSize/power(cast(1024 as bigint),1),0)), 
                   'Bytes'
                  )
from SizeInStorage
Here's a sample result set:
ByteSize             Formatted       Converted   Unit
-------------------- --------------- ----------- -----
926528091            926,528,091     883         MB 
105563351            105,563,351     100         MB 
1603808599           1,603,808,599   1           GB 
866882480            866,882,480     826         MB 
426406913            426,406,913     406         MB 
Update...I want to convert bytes with two decimal places.
;with Sorted
as
(
   select 1 as base
   union all
   select base = base + 1
   from Sorted
   where base < 20
), Randomized
as
(
   select top 5 base
   from Sorted
   order by newid()
), SizeInStorage
as
(
select
   ByteSize = cast(abs(checksum(newid())) as bigint)
from randomized
union all
select top 1
   ByteSize = substring(cast(abs(checksum(newid())) as varchar(10)),1,5)
from randomized
union all
select top 1
   ByteSize = substring(cast(abs(checksum(newid())) as varchar(10)),1,3)
from randomized
)
select
   ByteSize,
   Formatted = replace(convert(varchar(20), cast(ByteSize as money),1),'.00',''),
   Converted = convert(varchar(30), isnull(cast(ByteSize as money)/ cast(coalesce((nullif((ByteSize/power(cast(1024 as bigint),4)),0)/nullif((ByteSize/power(cast(1024 as bigint),4)),0)) * power(cast(1024 as bigint),4),
                        (nullif((ByteSize/power(cast(1024 as bigint),3)),0)/nullif((ByteSize/power(cast(1024 as bigint),3)),0)) * power(cast(1024 as bigint),3),
                        (nullif((ByteSize/power(cast(1024 as bigint),2)),0)/nullif((ByteSize/power(cast(1024 as bigint),2)),0)) * power(cast(1024 as bigint),2),
                        (nullif((ByteSize/power(cast(1024 as bigint),1)),0)/nullif((ByteSize/power(cast(1024 as bigint),1)),0)) * power(cast(1024 as bigint),1)) as money)                                               
                        ,
                        ByteSize * 1.00
                           ), 1),
   Unit = coalesce('TB' + space(nullif(ByteSize/power(cast(1024 as bigint),4),0)),
                   'GB' + space(nullif(ByteSize/power(cast(1024 as bigint),3),0)),
                   'MB' + space(nullif(ByteSize/power(cast(1024 as bigint),2),0)),
                   'KB' + space(nullif(ByteSize/power(cast(1024 as bigint),1),0)),
                   'Bytes'
                  )
from SizeInStorage
 
Here's the resultset:
ByteSize      Formatted         Converted   Unit
------------  ----------------- ----------- ----
1582690891    1,582,690,891     1.47         GB
786337767     786,337,767       749.91       MB
431093153     431,093,153       411.12       MB
1067677070    1,067,677,070     1,018.22     MB
626880896     626,880,896       597.84       MB
28760         28,760            28.09        KB
308           308               308.00       Bytes

Comment?


~~ CK


Thursday, February 12, 2015

Some Unexpected Result of LEN() function on REVERSED() string

I have a project that involves a number of string manipulation. The LEN() and REVERSE() functions are two of the most common functions that I use to manipulate strings for various purposes.

I have not noticed it anything wrong with it until lately. Check out the following codes:
select CharVar = ''123' - char(9) - no space', 
       CharLen = len(cast(''123'' as char(9))), 
       ReverseLen = len(reverse(cast(''123'' as char(9))))
union all
select CharVar = ''123' - char(9) - trailing 1 space', 
       CharLen = len(cast(''123' ' as char(9))), 
       ReverseLen = len(reverse(cast(''123' ' as char(9))))
union all
select CharVar = ''123' - char(9) - trailing 2 space', 
       CharLen = len(cast(''123'  ' as char(9))), 
       ReverseLen = len(reverse(cast(''123'  ' as char(9))))
As expected, the LEN() function ignored the trailing spaces. The LEN() of the returned string by the REVERSE() function, however, returned the full size of the character.
CharVar                                CharLen     ReverseLen
--------------------------------       ----------- -----------
''123'' - char(9) - no space           3           9
''123'' - char(9) - trailing 1 space   3           9
''123'' - char(9) - trailing 2 space   3           9
Now it starts to get weird...
select CharVar = ''123' - char(9) - leading 1 space', 
       CharLen = len(cast(' '123'' as char(9))), 
       ReverseLen = len(reverse(cast(' '123'' as char(9))))
union all
select CharVar = ''123' - char(9) - leading 2 space',
       CharLen = len(cast('  '123'' as char(9))), 
       ReverseLen = len(reverse(cast('  '123'' as char(9))))
union all
select CharVar = '123 - char(9) - leading 3 space', 
       CharLen = len(cast('   123' as char(9))), 
       ReverseLen = len(reverse(cast('   123' as char(9))))
Check out the results:
CharVar                            CharLen     ReverseLen
-------------------------------    ----------- -----------
'123' - char(9) - leading 1 space  4           8
'123' - char(9) - leading 2 space  5           7
'123' - char(9) - leading 3 space  6           6
The LEN() function counted the leading spaces. After the REVERSE() function, however, it seems the LEN() returns the size of the char variable less the number of trailing spaces. I was under the impression that LEN() will ignore those spaces as, after the REVERSE(), it became trailing spaces.

After further research, it looks like REVERSE() function returns the same datatype of the string being passed to it, as expected. If the character of the string passed is less than the full length of the character, reverse automatically returns a value with leading spaces.

To illustrate this, check this code:
select CharVar = 'x' + cast('123' as char(9)), 
       ReversedChar = 'x' + reverse(cast('123' as char(9)))
Here's the result:
CharVar    ReversedChar
---------- ------------
x123       x      321
The returned ReversedChar value reversed the string with the same data type. Since it's a char data type, the returned string included the expected spaces in the char variable.

I guess this is not unexpected, maybe am just not aware...

To count the number of bytes in a var/character string, use the DATALENGTH() instead.

Comment?


~~ CK

Monday, February 9, 2015

Why IsNumeric() returns unexpected results?

IsNumeric() is a handy function that returns 1 when the input expression evaluates to a valid numeric data type; otherwise it returns 0.

Reading carefully, it can sometimes return unexpected value. Book Online notes that it returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($).

There are also some accounting symbols that evaluates to a number, hence the function returns 1. The most commonly forgotten reason is "Scientific Notation". The letters "e" and "d" are used for different numeric notations. These letters, as is, do not evaluate as a number and the function will return 0.

As these samples shows:
SELECT Expression = '''d''', IsItaNumber = ISNUMERIC('d')
SELECT Expression = '''e''', IsItaNumber = ISNUMERIC('e')

Here are the results:
Expression IsItANumber
---------- -----------
'd'        0

Expression IsItANumber
---------- -----------
'e'        0

Included in numbers, however will return a different result
SELECT Expression = '''097d89''',  IsItANumber = ISNUMERIC('097d89')
SELECT Expression = '''489e239''', IsItANumber = ISNUMERIC('489e239')

Here are the results:
Expression IsItANumber
---------- -----------
'097d89'   1

Expression IsItANumber
---------- -----------
'489e239'  1

Enter Regular Expression...
SELECT Expression = '''097d89''',  IsItANumber = ISNUMERIC('097d89'),
     RegExIsItANumber = case when '097d89' NOT LIKE '%[^0-9]%' then 1 else 0 end

SELECT Expression = '''489e239''', IsItANumber = ISNUMERIC('489e239'), 
     RegExIsItANumber = case when '489e239' NOT LIKE '%[^0-9]%' then 1 else 0 end

Here are the results:
Expression IsItANumber RegExIsItANumber
---------- ----------- ----------------
'097d89'   1           0

Expression IsItANumber RegExIsItANumber
---------- ----------- ----------------
'489e239'  1           0

The double negative expression looks confusing. Simply put, the condition below will return TRUE if all the character in the expression is a digit from 0 through 9.
[CharExpression] NOT LIKE '%[^0-9]%' 

Be careful, this regular expression only checks for digits. So if the input has a decimal point, it will still return FALSE.

UPDATE:

Starting SQL 2012, a new function is provided for checking if a value may be converted to a specific data type. TRY_CONVERT() is a function that converts a value to a specific data type and returns a NULL value if the value cannot be converted to the expected data type.

Here's a sample:
select ValueToBeConverted = '1e35', IsNumericFunction = isnumeric(1e35), 
       ConvertedValue = try_convert(int, 1e35), ConvertedValue = try_convert(float, 1e35)
Here's the resultset:
ValueToBeConverted IsNumericFunction ConvertedToInt ConvertedToFloat
------------------ ----------------- -------------- ----------------------
1e35               1                 NULL           1E+35

Notice that the fourth column returned a valid value as the letter "e" is a part of numeric value.

As always, comments are welcome...


~~ CK