The SQL Join

There are three categories of joins that you can make use of in SQL:

  • Inner Join
  • Outer Join
  • Cross Join

But before we dive into the categories of joins, we first need to have an understanding of what a join really is.
Joins are used in SQL to bring together all the relevant data from multiple database tables. Remember that we've broken data down into multiple tables and established relationships between the tables.

An example we've used in the past was the tables author and book. We are going to switch things up in this post and change the relationship of the author and book tables to be a one-to-many relationship (as opposed to the previous many-to-many relationship)

So what does the relationship between author and book look like as a one-to-many relationship?

Author
<<PK>> author_id [int (11)]
first_name [varchar(20)]
last_name [varchar(20)]
Book
<<PK>> book_id [int (11)]
<<FK>> author_id [int (11)]
book_title [varchar(255)]
ISBN [varchar(255)]
version [varchar(10)]

The relationship between these two tables is based on the author_id column. So how does this relate to the topic of joins?

Let's say we're in a situation where we want to ask the question “Which books did which author create?” We answer this question with a join. When we join the tables together, we marry the data together (smash it together) based on the existing relationship the tables have with each other. If there is no relationship, then we can't join the tables… make sense?

SQL Inner Join

So, the first type of join I want to talk about is one of the most commonly used joins in SQL, and that's the inner join. The inner join matches data from both tables with each other, and if there isn't a match, then there is no data returned. This is an important concept to understand, as it's what separates the inner join from the other types. So I'll say it again, the SQL inner join will not return a row if there is no match from one table to the other.

Here's an example that should illustrate this concept nicely:

Author

Author_id First_Name Last_Name
1 Trevor Page
2 Jane Doe
3 Jack Johnson

Book

Book_id Author_id Book_title
1 1 How to Program with Java 2nd Edition
2 1 How to Program with Java 1st Edition
3 3 How to Play Guitar

Let's assume that the data above is what's stored in our database (in the author and book tables). What would happen if we invoked an inner join like so:

select * from author
inner join book on author.author_id = book.author_id;

The results of this query will follow the rules that I outlined above… if there is no match (based on the relationship) then it won't return any results. So because of this fact, there will only be three rows returned (instead of 4) because the author “Jane Doe” doesn't have a corresponding book.

Let's see what we get when we run this sql script:

Author_id First_Name Last_Name Book_id Book_title
1 Trevor Page 1 How to Program with Java 2nd Edition
1 Trevor Page 2 How to Program with Java 1st Edition
3 Jack Johnson 3 How to Play Guitar

So as you can see, since we've used an inner join, author Jane Doe doesn't even show up as a result because she doesn't have a matching row in the book table.

Make sense?

SQL Outer Join

Okay, so now let's move onto the second most used join type, the SQL outer join.

Let's say that we are in a situation where we want to see all the rows from the author table as well as any pertinent “joined” data from the book table. How can we accomplish that?

Outer joins of course!

SQL outer joins actually break down into three categories:

  1. Left Outer Join
  2. Right Outer Join
  3. Full Outer Join

The most commonly used of these three (by far) is the left outer join, so that's the one that we will be focusing our attention on.

As I mentioned above, the left outer join is useful when you don't want to exclude any rows from one table if there doesn't happen to be any data to join to. So let's say we want to see all the authors (including the authors that haven't yet published a book), and we also want to see the book titles that belong to those authors who HAVE published a book. This requirement demands the use of a left join. Here's what it would look like:

select * from author
left join book on author.author_id = book.author_id;

Here's what the results of this query would be:

Author_id First_Name Last_Name Book_id Book_title
1 Trevor Page 1 How to Program with Java 2nd Edition
1 Trevor Page 2 How to Program with Java 1st Edition
2 Jane Doe null null
3 Jack Johnson 3 How to Play Guitar

Can you spot the difference in the results that were returned from the left join vs the inner join?

The difference is that with the left join we actually get the Jane Doe row to display… however, you'll notice that instead of displaying information about Jane's book, you see null. This is because Jane doesn't have any matching rows in the book table, so there's nothing for SQL to display in these book columns.

So to sum up, the when there's no matching data, the left join will display nulls but still return a result, whereas the inner join will exclude the entire row altogether.

Links Mentioned in Podcast

Free Java Roadmap

Discover exactly what you need to learn and where to start in order to become a professional coder.