Creating Self-Joins

You can join a table to itself — that is, create a self-join — if you want to find rows in a table that have values in common with other rows in the same table. For example, you can use a self-join to find pairs of authors who live in the same postal code.

As with any join, a self-join requires at least two tables. The difference is that, instead of adding a second table to the query, you add a second instance of the same table. That way, you can compare a column in the first instance of the table to the same column in the second instance, which allows you to compare the values in a column to each other. Each instance of the table must be unique, so the Query Designer assigns an alias to the second instance of the table.

For example, if you are creating a self-join to find all authors with the same postal code, you compare the zip column in the first instance of the table against the zip column in the second instance. The resulting join condition might look like the following:

FROM authors INNER JOIN authors authors1 ON 
   authors.zip = authors1.zip

Creating a self-join often requires multiple join conditions. The primary join condition is the one on which the join is based. In the example of the authors’ postal code, the primary join condition is based on finding an exact match in the zip column.

However, if you join based on only this condition, each row in the table appears in the result set at least twice. Each row matches itself, resulting in a duplicate. In addition, the join results are reversed for rows that are identical except for the order of the join values.

To eliminate these duplicates, you can include a second join condition to filter out duplicated rows. The second join condition might, for example, compare the primary key (in this example, the au_id column) with a less than (<) operator. The resulting join condition might look like this:

FROM authors INNER JOIN authors authors1 ON 
   authors.zip = authors1.zip AND
   authors.au_id < authors1.au_id

When you create a self-join, the Query Designer usually creates the second join condition automatically, because it is based on a primary key. You can then manually add the primary join condition.

To create a self-join

  1. Add to the Diagram pane the table you want to work with.

  2. Add the same table again, so that the Diagram pane shows the same table in two input source windows.

    The Query Designer assigns an alias to the second instance by adding a sequential number to the table name. In addition, the Query Designer creates a join line between the input source windows based on the primary key. In most cases, this join can function as the second join condition.

  3. Right-click the join line between the tables, choose Properties from the shortcut menu, and then change the comparison operator between the primary keys as required. For example, you might change the operator to less than (<).

  4. Create the primary join condition by dragging the name of the primary join column in the first input source window and dropping it on the corresponding column in the second input source window.

  5. Specify other options for the query such as output columns, search conditions, and sort order.