Thursday, March 17, 2016

Correlated or Derived JOIN in SQL Server


You must be wondering what is this correlated join in SQL Server? Let me clarify it’s not a type of join like INNER or OUTER, it is a kind of usage or play around with joins. Some call it derived join too. Anyway it really does not matter what it is called. The concept is important to understand here. Different types of JOINs in SQL Server
Couple of months ago, I was interviewing a guy. My first question was, find the top 3 business by Amount from Business table, he was kind of happy because he thought it’s a very silly and common question. I saw that happiness and tweaked the question a little bit, I asked him to get me the same out put but by doing a LEFT JOIN with Employee then INNER JOIN between Employee and Location tables.
Let’s do the practical. We have 3 tables, named Employee, Location and Business. In Business table, we have the business data. Let’s find the top 3 businesses by Amount made so far by our employees using the joins mentioned above.
Business:
BusinessID
BusinessName
EmpID
Amount
1
92 Per-cents %% team
7
4750
2
K-Land fund trust
2
16750
3
Robert@BigStarBank.com
7
18100
5
BIG 6’s Foundation%
4
22090
6
TALTA_Kishan International
3
18100
7
Ben@MoreTechnology.com
10
41000
8
www.@-Last-U-Can-Help.com
7
25000
9
Thank you @.com
11
21500
10
Just Mom
5
9900
11
Big Giver Tom
7
19000
12
Mega Mercy
9
55000
13
Hope Reaches
7
29000
14
Everyone Wins
4
12500
Employee:
EmpID
LastName
FirstName
LocationID
ManagerID
2
Brown
Barry
1
11
3
Osako
Lee
2
11
4
Kinnison
Dave
1
11
5
Bender
Eric
1
11
7
Lonning
David
NULL
11
9
Newton
James
2
3
10
O’Haire
Terry
2
3
11
Smith
Sally
1
NULL
Location
LocationID
street
city
state
1
545 Pike
Seattle
WA
2
222 Second AVE
Boston
MA
4
444 Ruby ST
Spokane
WA
5
1595 Main
Philadelphia
PA
6
915 Wallaby Drive
Sydney
NULL
TOP 3 Business by Amount, our desired output:
BusinessName
Amount
EmpID
Mega Mercy
55000
9
Ben@MoreTechnology.com
41000
10
Hope Reaches
29000
7
To get the above result, we can simply write a SELECT TOP(3) statement on Business table and then ORDER BY Amount DESC like shown below.
SELECT TOP (3) BusinessName, Amount from [dbo].[Business]
ORDER BY Amount DESC
Remember, I had tweaked the question and asked him to write the query by doing LEFT JOIN between Business and Employee tables then INNER JOIN between Employee and Location tables. Usually someone will write a query like below.
--Wrong
SELECT TOP (3) b.BusinessName, b.Amount FROM Business b

LEFT JOIN Employee e
    ON e.EmpID = b.EmpID

INNER JOIN Location l
    ON l.LocationID = e.LocationID

ORDER BY b.Amount DESC
Output: (Wrong)
BusinessName
Amount
EmpID
Mega Mercy
55000
9
Ben@MoreTechnology.com
41000
10
BIG 6’s Foundation%
22090
4
We know BIG 6’s Foundation% (22090) is not our 3rd highest business. The reason Hope Reaches(29000) did not come in the result set because EmpID 7  owns that business and EmpID 7 has no relative data in the location table,  LocationID is NULL for EmpID 7 in the Employee table.
Note: In the above query we are fine up to LEFT JOIN between Business and Employee tables but the INNER JOIN with Location phrased out the whole result set at the end with only matching records between Employee and Location tables. Our EmpID 7 is the owner of the 3rd highest business unfortunately he works from home so his business is filtered out and the next highest business is displayed.
Now answering the question.. There are number of ways to get the desired result set but probably the most natural is to use an interesting capability of joins.
--Right
SELECT TOP (3) b.BusinessName, b.Amount, b.EmpID FROM Business b

LEFT JOIN (
       Employee e

       INNER JOIN Location l
       ON l.LocationID = e.LocationID

       )
     ON e.EmpID = b.EmpID

ORDER BY b.Amount DESC
Output: (Right)
BusinessName
Amount
EmpID
Mega Mercy
55000
9
Ben@MoreTechnology.com
41000
10
Hope Reaches
29000
7
In the above query, we isolated the EMPLOYEE and LOCATION join within parenthesis to their own pendent logical phase. Which is why the LEFT JOIN is not filtered out by the location anymore.