Tag Archives: JOIN TROUBLESHOOT

Adding Limiting Criteria to a SQL OUTER JOIN

When developers first start using and writing SQL queries, JOIN syntax can be confusing. There are many explanations about the basic syntax available all over the web so this post doesn’t reiterate those. This post explains what happens when you start adding limiting criteria to your JOIN.

Even if the basic INNER and OUTER (LEFT and RIGHT) join syntax is understood, adding limiting criteria to the JOIN has tripped up more than one developer.

Examples

The following examples use the AdventureWorksLT sample database. It is currently available here – SQL Server Sample Databases.

USE AdventureWorksLT

 

Basic INNER JOIN

This is a basic INNER JOIN, so only matching rows are returned. If a customer doesn’t have any orders in the database it won’t be displayed. This returns 32 results.

SELECT *
FROM SalesLT.Customer C
    INNER JOIN SalesLT.SalesOrderHeader SOH
        ON C.CustomerID = SOH.CustomerID

EXAMPLE.JOIN.OUTER.With.Criteria.Example 1

Basic OUTER JOIN

This is a basic OUTER JOIN, so all rows are returned from the primary table (Customer) and only the matching rows from the JOINed table. This returns 847 results.

SELECT *
FROM SalesLT.Customer C
    LEFT OUTER JOIN SalesLT.SalesOrderHeader SOH
        ON C.CustomerID = SOH.CustomerID

EXAMPLE.JOIN.OUTER.With.Criteria.Example 2

OUTER JOIN With Criteria – try 1

So now you have your basic query created and you need to restrict the rows that are returned. If you get a requirement stating to exclude orders over $1,000 but you still want to see all of the Customer records you need to use an OUTER join. If you are new to SQL or you forget that the Total Due column is in the OUTER table you might put the exclusion criteria in the WHERE clause. This example returns 8 rows.

SELECT *
FROM SalesLT.Customer C
    LEFT OUTER JOIN SalesLT.SalesOrderHeader SOH
        ON C.CustomerID = SOH.CustomerID
WHERE TotalDue < 1000

EXAMPLE.JOIN.OUTER.With.Criteria.Example 3

OUTER JOIN With Criteria – try 2

Putting the exclusion criteria in the WHERE clause was too restrictive. Another way I’ve seen developers approach this problem is to check for NULLs. Checking for NULL does take care of part of the problem, but you can see this only returns 823 rows, so some customers are still excluded.

SELECT *
FROM SalesLT.Customer C
    LEFT OUTER JOIN SalesLT.SalesOrderHeader SOH
        ON C.CustomerID = SOH.CustomerID
WHERE (TotalDue < 1000
        OR TotalDue IS NULL)

EXAMPLE.JOIN.OUTER.With.Criteria.Example 4

OUTER JOIN With Criteria – Solution
The last option remaining is to put the exclusion criteria in the JOIN. This query matches the criteria – all of the customer records are returned and only orders with total due less than $1,000. This returns 847 results again – exactly what is expected.

SELECT *
FROM SalesLT.Customer C
    LEFT OUTER JOIN SalesLT.SalesOrderHeader SOH
        ON C.CustomerID = SOH.CustomerID
        AND TotalDue < 1000

EXAMPLE.JOIN.OUTER.With.Criteria.Example 5

OUTER JOIN Equivalent to an INNER JOIN

The last example shows what happens when the query has criteria in WHERE clause checking for a column in the JOIN table – IS NOT NULL. If the check is on the JOIN key field (SOH.CustomerID) it will be equivalent to an INNER JOIN. If you check another field (TotalDue) it is equivalent to an INNER JOIN with the addition of a WHERE clause specifying that the field also cannot be NULL. In this example TotalDue is always populated so it is the same as an INNER JOIN. If TotalDue could be NULL and was null – it would be missing additional records. The example below returns 32 results.

SELECT *
FROM SalesLT.Customer C
    LEFT OUTER JOIN SalesLT.SalesOrderHeader SOH
        ON C.CustomerID = SOH.CustomerID
WHERE TotalDue IS NOT NULL

EXAMPLE.JOIN.OUTER.With.Criteria.Example 6

You can see that the query is processed as an INNER JOIN by the query processor too – so not only is it logically equivalent it is identical to the query engine. Look at the join graphic in the middle in both examples. It is a Merge Join (Inner Join) in both.

INNER JOIN Query Plan (partial)
EXAMPLE.JOIN.OUTER.With.Criteria.Example 7 - INNER JOIN Query Plan

OUTER JOIN Query Plan (partial)
EXAMPLE.JOIN.OUTER.With.Criteria.Example 8 - LEFT JOIN Query Plan

Explanation

So why does an OUTER JOIN behave like an INNER JOIN sometimes and not others? When you put your limiting criteria inside of the OUTER JOIN syntax it is part of the JOIN. So if any records are limited it only impacts that JOIN and the table being JOINed. You only want orders placed in Iowa but you want to see all customers? No problem – put the criteria in the JOIN (AND State = ‘IA’). You only want to see orders placed in Iowa and the customers for those orders? No problem again – just put the criteria in the WHERE ( WHERE State = ‘IA).

If you remember that the criteria in the WHERE clause impacts ALL records it’s easy to know where to put the limiting clause.