Monday, October 20, 2008

Careful EXISTS

In one of my old notes, I wrote how JOIN, IN and EXISTS are all affected by NULL values in the columns used to join tables, or to check for existence of relationship (parent-child) between tables.

Here are some queries that use EXISTS that will not give any syntax error but could give unexpected results

Consider this sample table
select * from Authors
select * from Books

AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
3 George San Diego
4 Ringo Oakland

aid BookID BookTitle BookPrice
----------- ----------- ------------------------------ -----------
1 1 Yesterday 5.00
1 2 In My Life 7.50
1 3 Hey Junde 4.45
2 4 Fool On the Hill NULL
2 6 If I Fell 7.80
4 7 Let It Be NULL
4 8 Till There Was You 0.00
2 9 Yellow Submarine 34.65
1 10 I Should Have Known Better 65.33
Now if I execute these queries:
select * 
from authors
where exists
(select Authorid from Books
where BookTitle = 'Yesterday')

select *
from authors
where exists
(select aid from Books
where BookTitle = 'Let It Be')

Here are the result sets:
AuthorID    AuthorName                     AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
3 George San Diego
4 Ringo Oakland

AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
3 George San Diego
4 Ringo Oakland

The first query did not result any error. This, in spite the column AuthorID does not exist in Books table.

Separately executed, the subquery will return an error:
select Authorid from Books  
where BookTitle = 'Yesterday'

Msg 207, Level 16, State 1, Line 1
Invalid column name 'Authorid'.
The parser did not return any error even though AuthorID does not exists in Books. This is because it exists in Authors. In this case, the query will use the column in Authors table instead. This could return unexpected results.

To illustrate this query in another way:
declare @myVar int

set @myVar = 1

select @myVar myVar from Books where BookTitle = 'Yesterday'

set @myVar = 2

select @myVar myVar from Books where BookTitle = 'Yesterday'

Here are the result sets:

myVar
-----------
1

myVar
-----------
2
I also noticed that the queries returned all rows. This is due to the nature of how EXISTS works. It checks if the subquery clause returns at least a single row. As illustrated in another way, the above query always returns a valid row regardless of the value of @myVar. The query itself is incomplete. The filter condition that should link the two table is missing. If I am to use EXISTS to relate the two tables, the query should be written this way:
select * 
from authors a
where exists
(select 1 from Books b
where a.AuthorID = b.aid and BookTitle = 'Yesterday')
Here’s the result set:
AuthorID    AuthorName                     AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco


~~CK

No comments:

Post a Comment