Consider this sample table:
set nocount onHere's how these three works in checking for existence of parent-child relationship.
create table Authors
(AuthorID int, AuthorName varchar(30), AuthorCity varchar(15))
create table Books
(aid int, BookID int, BookTitle varchar(30), BookPrice money)
insert into Authors values (1, 'Jhon', 'San Francisco')
insert into Authors values (2, 'Paul', 'Los Angeles')
insert into Authors values (3, 'George', 'San Diego')
insert into Authors values (4, 'Ringo', 'Oakland')
insert into Books values (1, 1, 'Yesterday',5.00)
insert into Books values (1, 2, 'In My Life',7.50)
insert into Books values (1, 3, 'Hey Junde',4.45)
insert into Books values (2, 4, 'Fool On the Hill',NULL)
insert into Books values (2, 6, 'If I Fell',7.80)
insert into Books values (4, 7, 'Let It Be',NULL)
insert into Books values (4, 8, 'Till There Was You',0.00)
insert into Books values (2, 9, 'Yellow Submarine',34.65)
insert into Books values
(1, 10, 'I Should Have Known Better',65.33)
select * from Authors
select * from Books
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
3 George San Diego
4 Ringo Oakland
aid BookID BookTitle BookPrice
----------- ----------- ------------------------------ ---------------------
1 1 Yesterday 5.00
1 2 In My Life 7.50
1 3 Hey Junde 4.45
2 4 Fool On the Hill NULL
2 6 If I Fell 7.80
4 7 Let It Be NULL
4 8 Till There Was You 0.00
2 9 Yellow Submarine 34.65
1 10 I Should Have Known Better 65.33
select *Here are the result sets:
from authors a
where exists (select aid from Books b where authorid = aid)
select *
from authors a
where authorid in (select aid from books)
select distinct a.*
from authors a
join books b on a.authorid = b.aid
where b.aid is not null
AuthorID AuthorName AuthorCityIn the case of JOIN, I have to use DISTINCT to avoid returning duplicate rows. So far these queries are behaving the way I expected them to be.
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
4 Ringo Oakland
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
4 Ringo Oakland
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
4 Ringo Oakland
Now I'll try and find parent records with no child using the NOT operator.
select *Here are the result sets:
from authors a
where not exists (select aid from Books b where authorid = aid)
select *
from authors a
where authorid not in (select aid from books)
select distinct a.*
from authors a
left join books b on a.authorid = b.aid
where b.aid is null
AuthorID AuthorName AuthorCityThe queries returned the expected results. Now, see what happen if I have a row with AuthorID = NULL on the parent table.
----------- ------------------------------ ---------------
3 George San Diego
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
3 George San Diego
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
3 George San Diego
First I'll insert the dummy record.
insert into Authors values (NULL, 'Dummy', 'Daly City')Running the queries again:
select * from Authors
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
3 George San Diego
4 Ringo Oakland
NULL Dummy Daly City
select *Here are the result sets:
from authors a
where not exists (select aid from Books b where authorid = aid)
select *
from authors a
where authorid not in (select aid from books)
select distinct a.*
from authors a
left join books b on a.authorid = b.aid
where b.aid is null
AuthorID AuthorName AuthorCityNotice that IN did not return the expected result. It excluded the NULL on the parent table. To include that record, I actually have to add an OR condition.
----------- ------------------------------ ---------------
3 George San Diego
NULL Dummy Daly City
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
3 George San Diego
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
NULL Dummy Daly City
3 George San Diego
select *Now what if I don't have a row with AuthorID = NULL on the parent table, but instead it's on the child table.
from authors a
where authorid not in (select aid from books)
or authorid is null
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
3 George San Diego
NULL Dummy Daly City
First remove the record from the Authors table :
delete from Authors where AuthorId is nullthen insert the record in Books table:
select * from Authors
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
3 George San Diego
4 Ringo Oakland
insert into Books valuesRunning the queries again:
(NULL, 5, 'The Long and Winding Road',5.65)
select * from Books
aid BookID BookTitle BookPrice
----------- ----------- ------------------------------ ---------------------
1 1 Yesterday 5.00
1 2 In My Life 7.50
1 3 Hey Junde 4.45
2 4 Fool On the Hill NULL
2 6 If I Fell 7.80
4 7 Let It Be NULL
4 8 Till There Was You 0.00
2 9 Yellow Submarine 34.65
1 10 I Should Have Known Better 65.33
NULL 5 The Long and Winding Road 5.65
select *Here are the result sets:
from authors a
where not exists (select aid from Books b where authorid = aid)
select *
from authors a
where authorid not in (select aid from books)
select distinct a.*
from authors a
left join books b on a.authorid = b.aid
where b.aid is null
AuthorID AuthorName AuthorCityNotice that IN did not return any result at all. Now what will happen if both the parent and the child tables have AuthorId = NULL and AID = NULL.
----------- ------------------------------ ---------------
3 George San Diego
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
3 George San Diego
First I'll insert back the row on Authors table:
insert into Authors values (NULL, 'Dummy', 'Daly City')Now the tables look like this:
select * from AuthorsFirst I'll check for existence of parent-child relationship
select * from Books
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
3 George San Diego
4 Ringo Oakland
NULL Dummy Daly City
aid BookID BookTitle BookPrice
----------- ----------- ------------------------------ ---------------------
1 1 Yesterday 5.00
1 2 In My Life 7.50
1 3 Hey Junde 4.45
2 4 Fool On the Hill NULL
2 6 If I Fell 7.80
4 7 Let It Be NULL
4 8 Till There Was You 0.00
2 9 Yellow Submarine 34.65
1 10 I Should Have Known Better 65.33
NULL 5 The Long and Winding Road 5.65
select *Here are the result sets:
from authors a
where exists (select aid from Books b where authorid = aid)
select *
from authors a
where authorid in (select aid from books)
select distinct a.*
from authors a
join books b on a.authorid = b.aid
where b.aid is not null
AuthorID AuthorName AuthorCitySo even if there is a row in both tables with AuthorID = NULL and AID = NULL the result set did not return it. This is because SQL Server can not compare two NULLs. So NULL = NULL is neither TRUE nor FALSE. Since the condition is not TRUE, which is the primary requirement for the filter to include the row in the result set, it was excluded.
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
4 Ringo Oakland
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
4 Ringo Oakland
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
4 Ringo Oakland
Now, if I try and see if there are records on the parent table that does not have a child record…
select *Here are the result sets:
from authors a
where not exists (select aid from Books b where authorid = aid)
select *
from authors a
where authorid not in (select aid from books)
select distinct a.*
from authors a
left join books b on a.authorid = b.aid
where b.aid is null
AuthorID AuthorName AuthorCityThe query that uses IN did not return any result set. This is because IN will not return any rows if the list contains a NULL value. This is true whether the list is a list of values or a subquery.
----------- ------------------------------ ---------------
3 George San Diego
NULL Dummy Daly City
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
NULL Dummy Daly City
3 George San Diego
select * from AuthorsHere's the result set:
AuthorID AuthorName AuthorCity
----------- ------------------------------ ---------------
1 Jhon San Francisco
2 Paul Los Angeles
3 George San Diego
4 Ringo Oakland
NULL Dummy Daly City
select *
from authors a
where authorid not in (1,4,NULL)
AuthorID AuthorName AuthorCityBased on these samples IN is the most unreliable way among the three in terms of checking parent-child relationship, specially if either or both the parent and child tables have a NULL on value on the column used as join key.
----------- ------------------------------ ---------------
Although some of these situations may be avoided using the Primary Key and Foreign Key Constraint built in SQL Server, these inconsistencies could happen while using these queries during data analysis stage.
~~CK
No comments:
Post a Comment