A CTE can be thought of as a temporary
result set and are similar to a derived table in that it is not stored
as an object and lasts only for the duration of the query. A CTE is
generally considered to be more readable than a derived table and does
not require the extra effort of declaring a Temp Table while providing
the same benefits to the user. However; a CTE is more powerful than a
derived table as it can also be self-referencing, or even referenced
multiple times in the same query.
The basic syntax structure for a CTE is shown below:
WITH MyCTE
AS ( SELECT EmpID, FirstName, LastName, ManagerID
FROM Employee
WHERE ManagerID IS NULL )
SELECT *
FROM MyCTE
Building a Recursive CTE
In the following examples, you will
learn how to harness the power of a recursive CTE query by fulfilling a
common business requirement, retrieving hierarchical data. By the time
the final query is complete you will be able to easily determine how
many levels from the top executive each employee is.
A recursive CTE requires four elements in order to work properly.
- Anchor query (runs once and the results ‘seed’ the Recursive query)
- Recursive query (runs multiple times and is the criteria for the remaining results)
- UNION ALL statement to bind the Anchor and Recursive queries together.
- INNER JOIN statement to bind the Recursive query to the results of the CTE.
WITH MyCTE
AS ( SELECT EmpID, FirstName, LastName, ManagerID
FROM Employee
WHERE ManagerID IS NULL
UNION ALL
SELECT EmpID, FirstName, LastName, ManagerID
FROM Employee
INNER JOIN MyCTE ON Employee.ManagerID = MyCTE.EmpID
WHERE Employee.ManagerID IS NOT NULL )
SELECT *
FROM MyCTE
Identify the Anchor and Recursive Query
Anyone who does not have a boss is
considered to be at the top level of the company and everyone who does
have a boss either works for the person(s) at the top level (upper
management), or the people that work for them (mid-management thru base
employees).
For example, a CEO is at the top level
and thus has a ManagerID of null. Likewise, everyone below the CEO will
have a ManagerID. This is demonstrated in the two queries below:
The first SELECT statement will become
your Anchor query as it will find the employee that has a ManagerID of
null (representing Level 1 of the organization). The second SELECT
statement will become your Recursive query and it will find all
employees that do have a ManagerID (representing Level 2-3 of this
organization).
As you can see from the results so far,
these queries are unable to give hierarchical data on which level each
employee is at within the organization.
Add the Anchor and Recursive query to a CTE
Begin transforming this entire query
into a CTE by placing a UNION ALL statement between the Anchor and
Recursive queries. Now add parentheses around the entire query,
indenting it, moving it down, and adding the declaration WITH
EmployeeList AS before the open parenthesis, and then add SELECT * FROM
EmployeeList on the next line after the close parenthesis.
Your query should now look like the screenshot below:
As you can see, the results from your
CTE are exactly the same as the results returned from running the anchor
and Recursive queries simultaneously in the previous example.
Add an expression to track hierarchical level
The Anchor query (aliased as ‘Boss’)
inside the CTE represents everyone at Level 1 (i.e. Sally Smith). The
Recursive query (aliased as ‘Emp’) represents everyone at Levels 2 and
3. In order to visualize each level in a result set, you will need to
add an expression field to each query.
Add the expression “1 AS EmpLevel” to
the Anchor query and the expression “2 AS EmpLevel” to the Recursive
query. Before executing the entire query, look closely at the expression
field. The EmpLevel expressions in the Anchor query will hard-code the
numeral 1 (for Sally Smith’s level), while the EmpLevel expressions in
the Recursive query will hard-code the numeral 2 for everyone else.
Your query should now look like the screenshot below:
The two new expression fields were a
helpful step. In fact, they show the correct EmpLevel information for
Sally Smith and for the people at Level 2 (i.e., Adams, Bender, Brown,
Kennson, Lonning and Osako). However, the 2 is just a hard-coded
placeholder to help visualize your next step. Lisa Kendall and several
other employees need to be at Level 3.
Ideally you would like to make the expression dynamic by replacing “2 AS EmpLevel” with the expression “EmpLevel + 1”.
Add a self-referencing INNER JOIN statement
Let’s take a moment and recognise why
this is not going to work quite so simply. The idea to increment
EmpLevel in the recursive query of the CTE is on the right track.
Unfortunately, the recursive query is trying to reference a field called
EmpLevel but can’t find one, since it has only been materialized in the
result set of the Anchor query and does not yet exist in the recursive
set.
How can you materialize the EmpLevel field for the recursive query? We can use the CTE for this!
Remember, a recursive CTE requires an INNER JOIN to connect the
recursive query to the CTE itself. Go ahead and write an INNER JOIN
statement binding the recursive query ‘Emp’ to the CTE ‘EmployeeList AS
EL’ ON Emp.ManagerID = EL.EmpID.
Your query should now look like the screenshot below:
Success! You
can now see that Sally is at the first level, Alex is at the second
level and Lisa appears at the third level. Since a CTE can reference
itself, the ‘Emp’ recursive query can now access the EmpLevel field
materialized in the EmployeeList CTE.