If you've ever worked with Excel, you may be familiar with nested functions. These allow you to set other functions
as the arguments to a parent function. This is the basic premise behind subqueries. This is a common topic for technical
interviews as it tests how you think about queries. This tutorial will cover the following learning objectives:
What is a Subquery?
Subqueries in the FROM Clause
What is a Correlated Subquery?
Subqueries in the WHERE Clause
Subqueries in the SELECT Clause
What is a Subquery?
Summary
A Subquery is a query nested inside another query.
Subqueries should be used when the results from one query need to be used in another query. Example: SELECT * FROM products WHERE price > (SELECT AVG(price) FROM products);
Although subqueries can sometimes be used to replace a join, it's recommended to use a join to help mkae the query more readable.
NOTE: It should be mentioned that you need to have a deep understanding of the semantics behind where each subquery is placed to avoid semantic errors in your query engine. We'll discuss these semantics in the next few subsections.
NOTE: In the video, it showed the IN operator. This allows you to specify items in a list that meet a condition. For instance, a query that answers a questions like "How many customers live in either Utah, Colorado, or Arizona" would look something like: SELECT COUNT(*) FROM customers WHERE state IN('Utah', 'Colorado', 'Arizona').
NOTE: One major disadvantage of Subqueries is that they can be very difficult to read. Thus, SQL provides another option that we will explore in the next tutorial.
Subqueries in the FROM Clause
Summary:
Subqueries in the FROM Clause are used to create a temporary table based on a query. This enables you
to run aggregates on aggregated columns. To write a subquery in the FROM Clause, use the following syntax:
SELECT [col1], [col2], [col3]
FROM (SELECT * FROM [table_name] WHERE [list of conditions]);
As mentioned above, this type of subquery can be used to run aggregates on aggregated columns. To write
this type of subquery, use the following syntax:
SELECT MAX(aggregated_column)
FROM (SELECT [col1], SUM([col2]) AS aggregated_column FROM [table_name] GROUP BY [col1]);
These types of Subqueries are commonly referred to as Derived Tables.
NOTE: As mentioned in the previous subsection, there is an easier and more preferred way of nesting subqueries used in SQLcalled Common Table Expressions (CTEs). We will explore these in the next tutorial.
What is a Correlated Subquery?
Summary
A Correlated Subquery is a subquery that depends on an object referenced in the outer query.
Subqueries should be used when the results from one query need to be used in another query. Example: SELECT * FROM products WHERE price > (SELECT AVG(price) FROM products);
Although subqueries can sometimes be used to replace a join, it's recommended to use a join to help mkae the query more readable.
NOTE: It should be mentioned that you need to have a deep understanding of the semantics behind where each subquery is placed to avoid semantic errors in your query engine. We'll discuss these semantics in the next few subsections.
NOTE: In the video, it showed the IN operator. This allows you to specify items in a list that meet a condition. For instance, a query that answers a questions like "How many customers live in either Utah, Colorado, or Arizona" would look something like: SELECT COUNT(*) FROM customers WHERE state IN('Utah', 'Colorado', 'Arizona').
NOTE: One major disadvantage of Subqueries is that they can be very difficult to read. Thus, SQL provides another option that we will explore in the next tutorial.
Subqueries in the WHERE Clause
Summary:
Subqueries used in the WHERE Clause allow you to filter the result set of the
outer query by the values of the subquery. To write a subquery in the WHERE Clause in conjunction with the IN operator, use the following syntax:
SELECT [col1], [col2]
FROM [table_name1]
WHERE [col1] IN(SELECT [col] FROM [table_name2]);
Although rarely used, you can write a subquery in the WHERE Clause where the condition outputs true
for a single value. To write this type of subquery, use the following syntax:
SELECT [col1], [col2]
FROM [table_name1]
WHERE [col1] = (SELECT SUM([col] FROM [table_name2]));
NOTE: When comparing a numeric column to an aggregate used in a subquery in the WHERE clause, the subquery result set must have only a single value. For example: SELECT *
FROM products
WHERE price > (SELECT AVG(price) FROM products);
Subqueries in the SELECT Clause
Summary
This particular type of subquery is commonly used in conjunction with aggregate functions since they
return only one value. To write a subquery in the SELECT Clause, use the following syntax:
SELECT [col1], [col2], (SELECT SUM([col3]) FROM [table_name2]) AS [alias]
FROM [table_name1];
NOTE: As mentioned in the video, a subquery in the SELECT clause must only return a single row and a single column.
Exercise
Congratulations! You just completed the Subqueries Tutorial! To help test your knowledge, let's
practice using them with some real-world applications. This exercise is based on the exercise of
the previous tutorial. **It's highly recommended that you
complete the exercise outlined in the previous tutorial before beginning this exercise.**
Instructions:
Open your RDBMS client.
Using the "Sales" database, solve the following problems:
Write a query that returns revenue by state. Use a subquery in the SELECT Clause.
Write a query that returns the highest purchase_price by order. Use a subquery in the FROM Clause.
Write a query that returns revenue by order_date. Filter to include the following states: "CA", "CO", "CT".
Use a subquery in the WHERE Clause.
Exercise Completed! Click here to view the answers.
Have any issues with the above exercise? Post your question on Discord!