In this article, we’ll examine 40 essential SQL questions and answers for intermediate practitioners. These will cover useful functions in SQL (e.g., aggregate and scalar functions, built-in and user-defined functions), advanced commands (e.g., joins, primary and foreign keys, indexes, and SQL relationships), database design (e.g., normalization, denormalization, and the differences between various SQL statements like DELETE, TRUNCATE, and DROP), and advanced queries (e.g., subqueries, both nested and correlated, specific tasks like finding the nth highest value in a column). These questions and answers will help you better prepare for the interview and know what to expect from your interviewer/interviewee.
In this section, we take a look at the 40 most popular intermediate SQL questions and answers, so that we’ll know what to expect from our interviewer. These questions are more suited to SQL practitioners with a few years of experience.
A database object representing a set of SQL statements frequently used for a certain task. A function takes in some input parameters, performs calculations or other manipulations on them, and returns the result. Functions help improve code readability and avoid repetition of the same code snippets.
Case manipulation functions represent a subset of character functions, and they’re used to change the case of the text data. With these functions, we can convert the data into the upper, lower, or title case.
Character manipulation functions represent a subset of character functions, and they’re used to modify the text data.
Local variables can be accessed only inside the function in which they were declared. Instead, global variables, being declared outside any function, are stored in fixed memory structures and can be used throughout the entire program.
By default, the order is ascending. To change it to descending, we need to add the DESC keyword as follows:
1 | SELECT * FROM table_name |
The LIKE operator in combination with the % and _ wildcards. The % wildcard represents any number of characters including zero, while _ represents strictly one character.
While both types of keys ensure unique values in a column of a table, the first one uniquely identifies each record of the table, and the second one prevents duplicates in that column.
The primary key of a table, based on multiple columns.
SELECT – FROM – JOIN – ON – WHERE – GROUP BY – HAVING – ORDER BY – LIMIT
FROM – JOIN – ON – WHERE – GROUP BY – HAVING – SELECT – ORDER BY – LIMIT
A virtual table containing a subset of data retrieved from one or more database tables (or other views). Views take very little space, simplify complex queries, limit access to the data for security reasons, enable data independence, and summarize data from multiple tables.
Yes. This is also known as nested views. However, we should avoid nesting multiple views since the code becomes difficult to read and debug.
No. Any views based on that table will become invalid after deleting the base table. If we try to use such a view anyway, we’ll receive an error message.
In some SQL flavors, such as PostgreSQL, the BOOLEAN data type exists explicitly and takes values TRUE, FALSE, or NULL. In other flavors, such as Microsoft SQL Server, the BIT datatype is used to store Boolean values as integers 1 (true) or 0 (false).
Normalization is a process of database design that includes organizing and restructuring data in a way to reduce data redundancy, dependency, duplication, and inconsistency. This leads to enhanced data integrity, more tables within the database, more efficient data access and security control, and greater query flexibility.
Denormalization is the process opposite of normalization. It introduces data redundancy and combines data from multiple tables. Denormalization optimizes the performance of the database infrastructure in situations when read operations are more important than write operations since it helps avoid complex joins and reduces the time of query running.
Renaming a column means permanently changing its actual name in the original table. Giving an alias to a column means giving it a temporary name while executing an SQL query, with the purpose to make the code more readable and compact.
A correlated subquery is an inner query nested in a bigger (outer) query that refers to the values from the outer query for its execution, meaning that a correlated subquery depends on its outer query. Instead, a non-correlated subquery doesn’t rely on the data from the outer query and can be run independently of it.
While a clustered index defines the physical order of records of a table and performs data searching based on the key values, a non-clustered index keeps the order of records that do not match the physical order of the actual data on the disk. A table can have only one clustered index but many non-clustered ones.
The way to implement the if-then-else logic in SQL. This function sequentially checks the provided conditions in the WHEN clauses and returns the value from the corresponding THEN clause when the first condition is satisfied. If none of the conditions is satisfied, the function returns the value from the ELSE clause in case it’s provided, otherwise, it returns NULL. The syntax is:
1 | CASE |
DELETE is a reversible DML (Data Manipulation Language) command used to delete one or more rows from a table based on the conditions specified in the WHERE clause. Instead, TRUNCATE is an irreversible DDL (Data Definition Language) command used to delete all rows from a table. DELETE works slower than TRUNCATE. Also, we can’t use the TRUNCATE statement for a table containing a foreign key.
DROP deletes a table from the database completely, including the table structure and all the associated constraints, relationships with other tables, and access privileges. TRUNCATE deletes all rows from a table without affecting the table structure and constraints. DROP works slower than TRUNCATE. Both are irreversible DDL (Data Definition Language) commands.
The first one works on aggregated data after they are grouped, while the second one checks each row individually. If both statements are present in a query, they appear in the following order: WHERE – GROUP BY – HAVING. The SQL engine interprets them also in the same order.
Using the INSERT INTO statement in combination with VALUES. The syntax is:
1 | INSERT INTO table_name |
We can also delete multiple records if they satisfy the provided condition. Using the DELETE statement. The syntax is:
1 | DELETE FROM table_name |
Using the ALTER TABLE statement in combination with ADD. The syntax is:
1 | ALTER TABLE table_name |
Using the ALTER TABLE statement in combination with RENAME COLUMN … TO … The syntax is:
1 | ALTER TABLE table_name |
Using the ALTER TABLE statement in combination with DROP COLUMN. The syntax is:
1 | ALTER TABLE table_name |
By checking the remainder of the division by 2. In some SQL versions (e.g., PostgreSQL and My SQL), we use the MOD function, in the others (Microsoft SQL Server and SQLite) – the modulo operator (%). To select all even records using MOD:
1 | SELECT * FROM table_name |
To select all even records using %:
1 | SELECT * FROM table_name |
To select all odd records, the syntax is identical in both cases, only that we would use the inequality operator <> instead of =.
Using the DISTINCT statement in combination with SELECT or creating a unique key for that table.
Using the INSERT INTO statement in combination with VALUES. The syntax is:
1 | INSERT INTO table_name |
Using the OFFSET clause. For example, to find the 6th highest value from a column, we would use the following syntax:
1 | SELECT * FROM table_name |
Using the LIKE operator in combination with the % and _ wildcards. For example, we need to find all surnames in a table that start with “A”. The query is:
1 | SELECT * FROM table_name |
Here, we assume that a surname must contain at least two letters. Without this assumption (meaning that a surname can be just A), the query is as follows:
1 | SELECT * FROM table_name |
Using the MAX() function. Otherwise, in many SQL versions, we can use the following syntax:
1 | SELECT id |
or in Microsoft SQL Server:
1 | SELECT TOP 1 id |
Using the RAND() function in combination with ORDER BY and LIMIT. In some SQL flavors, such as PostgreSQL, it’s called RANDOM(). For example, the following code will return five random rows from a table in MySQL:
1 | SELECT * FROM table_name |
To sum up, we discussed 40 essential intermediate SQL interview questions and answers. Hopefully, this information will help you to get ready for the interview and feel more confident, whether you’re looking for a job in SQL or hiring candidates for an intermediate SQL position.
Made with ❤️ and ☀️ on Earth.