SQL TUTORIALS

SQL Tutorials: String Functions

[Enter image here]

Strings are one of the most common data types and can be difficult to work with. In this tutorial, we will go over some basic string manipulation techniques found in most RDBMSs. This tutorial will cover the following learning objectives:

  • String Concatenation
  • UPPER, LOWER, and INITCAP Functions
  • LEFT and RIGHT Functions
  • String Formatting Functions
  • Miscellaneous String Functions

String Concatenation

In SQL, String Concatenation is the process of combining two or more columns of data type "VARCHAR" or "CHAR" to create a more comprehensible group of data in a single column. Watch the following video to get a better understanding of this concept:

Summary

  • Concatenation is the process of joining two or more strings. To concatenate two or more columns, use the following syntax (differs between RDBMSs). It should be noted that you need to manually enter a space between the strings.
    SELECT [col1] || ' ' || [col2] AS [alias]
    FROM [table_name];
  • Another way to concatenate strings in SQL is to use either the CONCAT() or CONCATENATE() functions (differs between RDBMSs). To concatenate strings using one of these functions, use the following syntax:
    SELECT CONCAT([col1], ' ', [col2]) AS [alias]
    FROM [table_name];
  • A third way to concatenate strings in SQL is by using the "+" operator. This is commonly used in certain RDBMSs to make reading queries easier. To concatenate strings with this operator, use the following syntax:
    SELECT [col1] + ' ' + [col2] AS [alias]
    FROM [table_name];

UPPER, LOWER, and INITCAP Functions

When presenting query results to stakeholders in a tabular format, it helps to format the strings the way they were designed to be presented. Watch the following video to get a better understanding of this concept:

Summary:

  • The LOWER() function makes all characters in the string lowercase. This is commonly used when formatting product descriptions or other long strings. To use this function, use the following syntax:
    SELECT LOWER([col1]) AS [alias], LOWER([col2]) AS [alias]
    FROM [table_name];
  • The UPPER() function makes all characters in the string uppercase. This is commonly used when formatting acronyms, categories, or stock market abbrevations. To use this function, use the following syntax:
    SELECT UPPER([col1]) AS [alias], UPPER([col2]) AS [alias]
    FROM [table_name];
  • The INITCAP() function makes the first letter of every word in the string uppercase. This is commonly used when formatting formal names such as first and last names, product names, and city names. To use this function, use the following syntax:
    SELECT INITCAP([col1]) AS [alias], INITCAP([col2]) AS [alias]
    FROM [table_name];

LEFT and RIGHT Functions

Although SQL isn't built for automation like other languages such as Python and Java, it does provide some automated features that allow you to quickly manipulate data. Watch the following video to get a better understanding of this concept:

Summary:

  • The LEFT() function gets the first "n" number of characters from the start of the string. This is commonly used to create abbreviations or codes. To use this function, use the following syntax:
    SELECT LEFT([col1], 4) AS [alias], LEFT([col2], 2) AS [alias]
    FROM [table_name];
  • The RIGHT() function gets the first "n" number of characters from the end of the string. This is commonly used to preview long strings. To use this function, use the following syntax:
    SELECT RIGHT([col1], 4) AS [alias], RIGHT([col2], 6) AS [alias]
    FROM [table_name];
  • These functions are common interview questions and should not be overlooked.

String Formatting Functions

When verifying outputs, it helps to present data in a comprehensible format. For example, if you're working with financial data, it may be helpful to add dollar signs to add some context to your result sets. Read the following article to get a better understanding of this concept:

Summary:

  • The FORMAT() function is used to customize strings in a column. This is commonly used when creating mail merge scripts or backend application placeholder text. To use this function to format a string, use the following syntax:
    SELECT FORMAT('[new_string] %s', [col_name]) AS [alias]
    FROM [table_name];
  • To make your result sets more readable, you can specify flags to limit the characters returned or specify the string's alignment. To return a column of string data type with a maximum of 10 characters and left-align the contents, use the following syntax:
    SELECT FORMAT('|%-10s|', [col_name]) AS [alias]
    FROM [table_name];
  • When working with a list of string values, either when uploading data from a semi-structured format such] as JSON or XML, it's helpful to format strings with multiple parameters. To format strings based on certain categories, use the following syntax:
    SELECT FORMAT('%1$s [value], %2$s [value], %1$s [value]', '[string1]', '[string2]') AS [alias]
    FROM [table_name];

Miscellaneous String Functions

While the functions we've learned so far in this tutorial cover about 80% of all the string manipulation you'll ever need when querying data, there are some other functions available that cover the other 20%. View the following resources to get a better understanding of this concept:

Summary:

  • The SPLIT_PART() function splits a string based on a specified delimiter and allows you to choose which chunk you would like to return. This function is specific to PostgreSQL, though may have variants in other RDBMSs. To use this function, use the following syntax:
    SELECT SPLIT_PART([col_name], '[delimiter]', [chunk_number]) AS [alias]
    FROM [table_name];
  • The LENGTH() function returns the number of characters in the string. This function is also called the LEN() function depending on your RDBMS. To use this function, use the following syntax:
    SELECT LENGTH([col_name]) AS [alias]
    FROM [table_name];
  • The TRIM() function removes both leading and trailing spaces from a string. This is commonly used when cleaning datasets. To use this function, use the following syntax:
    SELECT TRIM([col1]) AS [alias], TRIM([col2]) AS [alias]
    FROM [table_name];
  • The LTRIM() function is used to remove only leading spaces from a string. This is commonly used after splitting string columns based on a delimiter. To use this function, use the following syntax:
    SELECT LTRIM([col1]) AS [alias], LTRIM([col2]) AS [alias]
    FROM [table_name];
  • The RTRIM() function is used to remove only trailing spaces from a string. This is also commonly used after splitting string columns based on a delimiter. To use this function, use the following syntax:
    SELECT RTRIM([col1]) AS [alias], RTRIM([col2]) AS [alias]
    FROM [table_name];
  • The REPLACE() function is used to replace specific characters in a string with new characters. This is commonly used when cleaning datasets. To use this function, use the following syntax:
    SELECT REPLACE([col_name], '[old_string]', '[new_string]') AS [alias]
    FROM [table_name];
  • The SUBSTRING() function is used to retrieve a specified range of characters from the middle of a string. This is commonly used when removing certain characters from strings or extracting parts of a string. To use this function, use the following syntax:
    SELECT SUBSTRING([col_name], [start_position], [end_position]) AS [alias]
    FROM [table_name];

Exercise

Congratulations! You just completed the String Functions Tutorial! To help test your knowledge, let's see how these can be utilized by completeing a few practice problems. This exercise is based on the exercise of the previous tutorial.
**It's highly recommended that you complete the exercise outlined in the previous tutorial before beginning this exercise.**

Instructions:

  1. Open your RDBMS Client.
  2. Insert the Following Data into the "Orders" Schema of the "Sales" Database:
  3. To keep things interesting, let's add some more data to our Sales database:
    • Customers
      1011, [enter 4 spaces] Trevor, Wilkins, 700 W 1800 N, OREM, ut, 84058
      1012, Samantha, Grettel [enter 4 spaces], 121 Center Street, provo, UT, 84064
      1013, Paul, HIDDICK, [enter 4 spaces] 443 North Wind Drive [enter 4 spaces], AtlanTA, gA, 21321
      1014, simone, fredericks - StringError, 9008 Pratt Corner, DES Moines, Ia, 56217
      1015, THADIUS, clements, 3900 S Main Street, murray, ut, 84048

      Orders
      190001137, 2023-02-02, 2023, 2, 1011
      190001138, 2023-02-02, 2023, 2, 1011
      190001139, 2023-02-02, 2023, 2, 1012
      190001140, 2023-02-03, 2023, 2, 1013
      190001141, 2023-02-04, 2023, 2, 1013
      190001142, 2023-02-04, 2023, 2, 1014
      190001143, 2023-02-04, 2023, 2, 1015

      Order_Items
      190001137, 1, Wet Suit (Mens), XXL Mens Wet Suit for Scuba Diving and Snorkeling, 68.99
      190001137, 2, Swimming Goggles, Unisex Adjustable Eyewear for Water use, 12.99
      190001137, 3, Beach Towel (2-Pack), Beach Towels with Tropical pattern. 100% Cotton, 21.29
      190001137, 4, Sunscreen SPF 50, Light Duty Sunscreen for Outdoor Use, 11.49
      190001138, 1, Ski Poles, Unisex Poles for Powder, 29.99
      190001138, 2, Winter Coat (Womens), Heavy Duty Winter Coat for Women, 54.99
      190001139, 1, USB-C Cable, USB-C Cable for Phones and Laptops, 3.29
      190001140, 1, Lightning Cable, Apple iPhone Charger, 21.49
      190001141, 1, Water Backpack, Outdoor Backpack with Water Pouch (up to 50 ML), 54.49
      190001141, 2, Cooling Towel, Heavy Duty Cooling Towel for Outdoor Use, 31.99
      190001141, 3, Wool Socks (Mens), Thick Wool Socks for Outdoor Use, 9.99
      190001142, 1, Pillow Case, Cotton Pillow Case, 9.99
  4. Using the "Sales" Database, solve the following problems:
    1. Write a query that cleans all text columns in the "customers" table from leading and trailing spaces. Return all customers with a customer_id greater than 1010.
    2. Write an appropriate query to clean all the columns in the "customers" table in the correct string format according to the column's previous string format.
    3. Write a query that replaces all instances of "mens" or "womens" in the "item_name" column of the "order_items" table with "Unisex".
    4. Write a query that returns all customers with the substring "Sam" is in the first 3 letters of their first name or that live in "UT" or "CO". Use the SUBSTRING Function.
    5. Add a new column to the "customers" table titled "email" with a data type of VARCHAR(50). This column will be populated as a single string with the first 4 letters of the customer's first name, the last 3 letters of the customer's last name, followed by "@gmail.com".
  5. Exercise Completed! Click here to view the answers.
  6. Have any issues with the above exercise? Post your question on Discord!