Verified concept notes

Review core SQL concepts before solving problems.

These notes summarize SQL concepts using official documentation from KDATA, PostgreSQL, MySQL, and Oracle as the baseline. Read the concept first, then move directly into matching practice problems.

Solve Problems Review Concepts

Source baseline

Built from official documentation, not unverified notes.

The goal is to give learners a reliable summary before practice, while linking back to the original documentation for deeper study.

KDATA SQLD Scope

Used as the exam-oriented baseline for data modeling, SELECT, WHERE, JOIN, GROUP BY, subqueries, set operators, window functions, and management statements.

KDATA SQLD guide

PostgreSQL Docs

Used for SELECT, WHERE, ORDER BY, joins, aggregate functions, GROUP BY, HAVING, and window function behavior.

PostgreSQL SELECT tutorial

MySQL Docs

Used for SELECT syntax, NULL behavior, and window function concepts such as OVER, PARTITION BY, and window ordering.

MySQL SELECT reference

Oracle SQL Reference

Used for set operator rules, including UNION, UNION ALL, INTERSECT, EXCEPT, and MINUS compatibility requirements.

Oracle set operators

KDATA SQLD scope

Use the official scope as a map for practice.

KDATA's SQLD information groups the exam around data modeling and SQL fundamentals. SQLHub follows that direction by connecting modeling, table relationships, SELECT, functions, WHERE, GROUP BY, HAVING, ORDER BY, joins, subqueries, set operators, window functions, and management statements to practical query problems.

Official source: KDATA SQLD guide

Start with WHERE, JOIN, and GROUP BY. Then move to subqueries and window functions once the row-level logic is clear.

SELECT, WHERE, ORDER BY, DISTINCT

SELECT chooses columns, WHERE filters rows, and ORDER BY controls final order.

Official SQL documentation treats SELECT as a statement that defines the output columns, source tables, and optional conditions. WHERE keeps rows that satisfy a condition. ORDER BY is what makes output order predictable. DISTINCT removes duplicate result rows, but it should not be treated as a replacement for ORDER BY.

SELECT DISTINCT category
FROM products
WHERE price >= 50
ORDER BY category;

Select only the columns the result needs.
Use WHERE for row-level conditions before grouping.
Use ORDER BY whenever the visible row order matters.

Official source: PostgreSQL querying a table Official source: MySQL SELECT statement

Practice SELECT

JOIN and relationships

JOIN combines rows through table relationships.

PostgreSQL's documentation explains joins as queries that process multiple tables, or multiple instances of the same table, at once. INNER JOIN returns matching rows. LEFT OUTER JOIN keeps rows from the left table and fills missing right-side values with NULL. A self join uses the same table with different aliases.

SELECT c.name, o.id AS order_id
FROM customers AS c
LEFT JOIN orders AS o
  ON c.id = o.customer_id;

Find the primary-key and foreign-key path first.
Use LEFT JOIN when unmatched rows from the left table must remain.
Use aliases when multiple tables share column names such as id.

Official source: PostgreSQL joins between tables

Practice JOIN

GROUP BY, HAVING, aggregates

WHERE filters input rows; HAVING filters grouped results.

Aggregate functions calculate one result from multiple input rows. WHERE decides which input rows enter the grouping step. GROUP BY forms groups. HAVING filters those groups after aggregate values such as COUNT, SUM, and AVG have been calculated.

SELECT customer_id, COUNT(*) AS order_count
FROM orders
WHERE total_amount > 0
GROUP BY customer_id
HAVING COUNT(*) >= 2;

Use HAVING for conditions based on aggregate results.
Use WHERE for raw row conditions before grouping.
Remember that grouped output is group-level, not row-level.

Official source: PostgreSQL aggregate functions

Practice GROUP BY

Subqueries

Use subqueries when a condition depends on another result.

A subquery is useful when the outer query needs a value or set of values computed by another query. Use scalar subqueries for one value, IN for a list of candidate values, and EXISTS or NOT EXISTS when the question is about whether related rows exist.

SELECT *
FROM orders
WHERE total_amount > (
  SELECT AVG(total_amount)
  FROM orders
);

Check whether the subquery returns one value or multiple rows.
Use IN or EXISTS instead of = when multiple rows may be returned.
Prefer NOT EXISTS when NULL values could make NOT IN confusing.

Official source: PostgreSQL aggregate functions Official source: MySQL SELECT statement

Practice Subqueries

NULL handling

NULL means missing or unknown, so normal equality checks do not work.

MySQL's documentation explains that NULL has special behavior in DISTINCT, GROUP BY, ORDER BY, and aggregate functions. COUNT(column) ignores NULL values, while COUNT(*) counts rows. For conditions, use IS NULL or IS NOT NULL instead of = NULL.

SELECT id, COALESCE(nickname, name) AS display_name
FROM users
WHERE deleted_at IS NULL;

Use IS NULL and IS NOT NULL for NULL checks.
COUNT(column) skips NULL values; COUNT(*) counts rows.
COALESCE returns the first non-NULL value in its argument list.

Official source: MySQL NULL behavior

Practice NULL

UNION, INTERSECT, EXCEPT

Set operators combine compatible SELECT results.

Oracle's SQL Reference explains that set operators combine result sets and require matching numbers of expressions with compatible data type groups. UNION removes duplicates. UNION ALL preserves duplicates.

SELECT email FROM customers
UNION
SELECT email FROM newsletter_signups
ORDER BY email;

Each SELECT must return the same number of columns.
Corresponding columns should have compatible data types.
Use UNION ALL when duplicates are meaningful and should remain.

Official source: Oracle set operators

Practice Set Operators

Window functions and Top N

Window functions calculate across related rows without collapsing them.

PostgreSQL's documentation explains that window functions do not group output rows into a single row. MySQL's documentation shows that functions such as ROW_NUMBER use OVER, and that PARTITION BY plus window ORDER BY defines how rows are numbered or ranked.

SELECT id, customer_id, total_amount,
  ROW_NUMBER() OVER (
    PARTITION BY customer_id
    ORDER BY total_amount DESC
  ) AS customer_order_rank
FROM orders;

GROUP BY reduces rows; window functions keep row detail.
PARTITION BY restarts the calculation for each group.
Use window functions for top item per category, latest row per user, and per-group ranking.

Official source: PostgreSQL window functions Official source: MySQL window function concepts

Practice Window Functions