Thursday, February 12, 2015

Some Unexpected Result of LEN() function on REVERSED() string

I have a project that involves a number of string manipulation. The LEN() and REVERSE() functions are two of the most common functions that I use to manipulate strings for various purposes.

I have not noticed it anything wrong with it until lately. Check out the following codes:
select CharVar = ''123' - char(9) - no space', 
       CharLen = len(cast(''123'' as char(9))), 
       ReverseLen = len(reverse(cast(''123'' as char(9))))
union all
select CharVar = ''123' - char(9) - trailing 1 space', 
       CharLen = len(cast(''123' ' as char(9))), 
       ReverseLen = len(reverse(cast(''123' ' as char(9))))
union all
select CharVar = ''123' - char(9) - trailing 2 space', 
       CharLen = len(cast(''123'  ' as char(9))), 
       ReverseLen = len(reverse(cast(''123'  ' as char(9))))
As expected, the LEN() function ignored the trailing spaces. The LEN() of the returned string by the REVERSE() function, however, returned the full size of the character.
CharVar                                CharLen     ReverseLen
--------------------------------       ----------- -----------
''123'' - char(9) - no space           3           9
''123'' - char(9) - trailing 1 space   3           9
''123'' - char(9) - trailing 2 space   3           9
Now it starts to get weird...
select CharVar = ''123' - char(9) - leading 1 space', 
       CharLen = len(cast(' '123'' as char(9))), 
       ReverseLen = len(reverse(cast(' '123'' as char(9))))
union all
select CharVar = ''123' - char(9) - leading 2 space',
       CharLen = len(cast('  '123'' as char(9))), 
       ReverseLen = len(reverse(cast('  '123'' as char(9))))
union all
select CharVar = '123 - char(9) - leading 3 space', 
       CharLen = len(cast('   123' as char(9))), 
       ReverseLen = len(reverse(cast('   123' as char(9))))
Check out the results:
CharVar                            CharLen     ReverseLen
-------------------------------    ----------- -----------
'123' - char(9) - leading 1 space  4           8
'123' - char(9) - leading 2 space  5           7
'123' - char(9) - leading 3 space  6           6
The LEN() function counted the leading spaces. After the REVERSE() function, however, it seems the LEN() returns the size of the char variable less the number of trailing spaces. I was under the impression that LEN() will ignore those spaces as, after the REVERSE(), it became trailing spaces.

After further research, it looks like REVERSE() function returns the same datatype of the string being passed to it, as expected. If the character of the string passed is less than the full length of the character, reverse automatically returns a value with leading spaces.

To illustrate this, check this code:
select CharVar = 'x' + cast('123' as char(9)), 
       ReversedChar = 'x' + reverse(cast('123' as char(9)))
Here's the result:
CharVar    ReversedChar
---------- ------------
x123       x      321
The returned ReversedChar value reversed the string with the same data type. Since it's a char data type, the returned string included the expected spaces in the char variable.

I guess this is not unexpected, maybe am just not aware...

To count the number of bytes in a var/character string, use the DATALENGTH() instead.

Comment?


~~ CK

Monday, February 9, 2015

Why IsNumeric() returns unexpected results?

IsNumeric() is a handy function that returns 1 when the input expression evaluates to a valid numeric data type; otherwise it returns 0.

Reading carefully, it can sometimes return unexpected value. Book Online notes that it returns 1 for some characters that are not numbers, such as plus (+), minus (-), and valid currency symbols such as the dollar sign ($).

There are also some accounting symbols that evaluates to a number, hence the function returns 1. The most commonly forgotten reason is "Scientific Notation". The letters "e" and "d" are used for different numeric notations. These letters, as is, do not evaluate as a number and the function will return 0.

As these samples shows:
SELECT Expression = '''d''', IsItaNumber = ISNUMERIC('d')
SELECT Expression = '''e''', IsItaNumber = ISNUMERIC('e')

Here are the results:
Expression IsItANumber
---------- -----------
'd'        0

Expression IsItANumber
---------- -----------
'e'        0

Included in numbers, however will return a different result
SELECT Expression = '''097d89''',  IsItANumber = ISNUMERIC('097d89')
SELECT Expression = '''489e239''', IsItANumber = ISNUMERIC('489e239')

Here are the results:
Expression IsItANumber
---------- -----------
'097d89'   1

Expression IsItANumber
---------- -----------
'489e239'  1

Enter Regular Expression...
SELECT Expression = '''097d89''',  IsItANumber = ISNUMERIC('097d89'),
     RegExIsItANumber = case when '097d89' NOT LIKE '%[^0-9]%' then 1 else 0 end

SELECT Expression = '''489e239''', IsItANumber = ISNUMERIC('489e239'), 
     RegExIsItANumber = case when '489e239' NOT LIKE '%[^0-9]%' then 1 else 0 end

Here are the results:
Expression IsItANumber RegExIsItANumber
---------- ----------- ----------------
'097d89'   1           0

Expression IsItANumber RegExIsItANumber
---------- ----------- ----------------
'489e239'  1           0

The double negative expression looks confusing. Simply put, the condition below will return TRUE if all the character in the expression is a digit from 0 through 9.
[CharExpression] NOT LIKE '%[^0-9]%' 

Be careful, this regular expression only checks for digits. So if the input has a decimal point, it will still return FALSE.

UPDATE:

Starting SQL 2012, a new function is provided for checking if a value may be converted to a specific data type. TRY_CONVERT() is a function that converts a value to a specific data type and returns a NULL value if the value cannot be converted to the expected data type.

Here's a sample:
select ValueToBeConverted = '1e35', IsNumericFunction = isnumeric(1e35), 
       ConvertedValue = try_convert(int, 1e35), ConvertedValue = try_convert(float, 1e35)
Here's the resultset:
ValueToBeConverted IsNumericFunction ConvertedToInt ConvertedToFloat
------------------ ----------------- -------------- ----------------------
1e35               1                 NULL           1E+35

Notice that the fourth column returned a valid value as the letter "e" is a part of numeric value.

As always, comments are welcome...


~~ CK