DISCLAIMER: ANSI SQL (The SQL that's used universally) includes everything taught in this section. However, some of the syntax shown will vary between your Relational Database Management Service (RDBMS).
Isn't querying fun? Data Analysis is all about asking and answering questions using querying techniques such as
SELECT statements. Let's expand our querying skills by learning how to work with numerical data. This lesson will
cover the following learning objectives:
Arithmetic Operators
Common Aggregate Functions
GROUP BY and HAVING Clauses
Arithmetic Operators
Summary:
If you want to add two columns together, add a column by a certain value, or concatenate two or more columns, use the + symbol with the following syntax:
SELECT [column1] + [column2] FROM [table_name];
If you want to subtract one column from another or subtract the values in a column by a certain amount, use the - symbol with the following syntax:
SELECT [column1] - [column2] FROM [table_name];
If you want to multiply one column by another or multiply the values in a column by a certain amount, use the * symbol with the following syntax:
SELECT [column1] * [column2] FROM [table_name];
If you want to divide one column from another or divide the values in a column by a certain amount, use the / symbol with the following syntax:
SELECT [column1] / [column2] FROM [table_name];
If you are performing a division task and want to get the remainder, use the % symbol with the following syntax:
SELECT [column1] % [column2] FROM [table_name];
NOTE: Although SQL isn't a programming language by nature, you can still declare variables and perform arithmetic operations on them. We will discuss these in a leter tutorial.
NOTE: When creating new columns from arithmetic operators, it's smart to give the new column an alias. This can be done using the AS keyword with the following syntax: SELECT [column1] + [column2] AS [new_column_name] FROM [table_name];
Common Aggregate Functions
Summary:
The COUNT() function is used to count the number of values either in a specific column or an entire table. This function is used with the following syntax:
SELECT COUNT([column_name]) FROM [table_name];
The SUM() function is used to add together all values in the specified column to get a grand total. This is used with the following syntax:
SELECT SUM([column_name]) FROM [table_name];
The MIN() function is used to retrieve the smallest value in the specified column. This function is used with the following syntax:
SELECT MIN([column_name]) FROM [table_name];
The MAX() function is used to retrieve the largest value in the specified column. This function is used with the following syntax:
SELECT MAX([column_name]) FROM [table_name];
If you want to view the range of values contained in a specific column, use the following syntax:
SELECT MIN([column_name]), MAX(column_name) FROM [table_name];
The AVG() function is used to get the average of all values in the specified column. This is used with the following syntax:
SELECT AVG([column_name]) FROM [table_name];
GROUP BY and HAVING Clauses
Summary:
The GROUP BY clause is used to group all non-aggregate fields in an aggregated SQL query. When used with the SUM() function, it is used with the following syntax:
SELECT [column_name1], [column_name2], SUM([column_name3]) FROM [table_name] GROUP BY [column_name1], [column_name2];
The HAVING clause is used in conjunction with the GROUP BY clause. This is used to filter aggregated fields similar to the WHERE clause. This is used with the following syntax:
SELECT [column_name1], [column_name2], SUM([column_name3]) FROM [table_name] GROUP BY [column1], [column2] HAVING SUM(column_name3) > 1;
The WHERE clause is used to filter all non-aggregate fields, whereas the HAVING clause is used to filter all aggregate fields being used in the query.
If you want to save some time when writing your queries, you can write the column index number in the GROUP BY and/or ORDER BY clauses. In the example below, the "first_name" column would have an index of 1, whereas the "last_name" column would have an index of 2. Example:
SELECT first_name, last_name FROM customers ORDER BY 2, 1;
Exercise
Congratulations! You just completed the Aggregate Functions Tutorial! To help test your knowledge, let's use
the database you populated in the previous tutorial to run some intermediate queries. This exercise is based on the exercise of
the previous tutorial. **It's highly recommended that you
complete the exercise outlined in the previous tutorial before beginning this exercise.**
Instructions:
Open your RDBMS Client (e.g., MySQL, PostgreSQL, Microsoft SQL Server).
Using the "earthquake" table, answer the following questions:
What are the Top 5 Places with the Highest Average Magnitude of Earthquakes?
What was the Magnitude of the Largest Earthquake that took place in the 21st Century?
What are the Average Depths for each Cause?
What was the Total Depth for All Earthquakes Caused by a Nuclear Explosion?
Write a Query to Show How Many Earthquakes Occurred at each Place. Limit the Result Set to Include Only Places with at Least 3 Confirmed Earthquakes.
Exercise Completed! Click here to view the answers.
Have any issues with the above exercise? Post your question on Discord!