← Back First drill: JOIN + GROUP BY →

SQL interview patterns

Structure first — match the question shape to a clause shape, then plug in columns.

Interviews reward two scopes: GROUP BY collapses rows to one row per bucket. OVER (PARTITION BY … ORDER BY …) keeps one row per input row and adds a column from a window. If you know which of those you need, picking HAVING, DENSE_RANK, LAG, or date truncation follows naturally.

GROUP BY = “one result row per group”  ·  PARTITION BY (in OVER) = “reset the window per group; still one output row per source row”

Decision flow

1Need totals / counts / averages per month, per dept, per user? → GROUP BY (+ aggregates). Filter those groups with HAVING, not WHERE.
2Need rank or row order within each dept / user but still list base rows? → OVER (PARTITION BY … ORDER BY …) + RANK / DENSE_RANK / ROW_NUMBER.
3Need previous or next value in time order (per partition)? → LAG / LEAD with the same PARTITION BY + ORDER BY.
4Need calendar month / year as the bucket? → normalize dates first (DATE_TRUNC, EXTRACT, or dialect DATE_FORMAT), then GROUP BY or window ORDER BY.

1. GROUP BY + aggregate

When: “Per department / per month / per customer, what is SUM / COUNT / AVG …?”

SELECT   dept_id,
         SUM(amount) AS revenue
FROM     orders
GROUP BY dept_id;

One output row per value of the grouped columns.

2. GROUP BY + HAVING

When: “Only groups where COUNT / SUM … meets a condition” (filter after aggregation).

SELECT   customer_id, COUNT(DISTINCT product_id) AS n
FROM     orders
GROUP BY customer_id
HAVING   COUNT(DISTINCT product_id) > 5;

WHERE filters rows before grouping; HAVING filters groups.

3. Date grain (month / year bucket)

When: Facts are timestamps but the question is “by calendar month” or “by year”.

-- PostgreSQL-style
SELECT DATE_TRUNC('month', order_date) AS m, SUM(amount) AS total
FROM orders
GROUP BY DATE_TRUNC('month', order_date);

Same grain must appear in GROUP BY or be derivable consistently; use the dialect cheatsheet for MySQL / SQL Server.

4. DENSE_RANK + PARTITION BY (top N per group)

When: “Top 3 salaries per department” — rank inside each partition, then filter rank.

SELECT *
FROM (
  SELECT emp_id, dept_id, salary,
         DENSE_RANK() OVER (
           PARTITION BY dept_id
           ORDER BY salary DESC
         ) AS rnk
  FROM employees
) t
WHERE rnk <= 3;

Interviewers often want this subquery / CTE shape: window in inner query, WHERE rnk <= N outside (standard SQL cannot filter window aliases in the same SELECT level).

5. LAG / LEAD + PARTITION BY + ORDER BY

When: “Previous login”, “month-over-month delta”, “gap between events” — ordered comparison within a group.

SELECT user_id, login_at,
       LAG(login_at) OVER (
         PARTITION BY user_id
         ORDER BY login_at
       ) AS prev_login
FROM sessions;

Define PARTITION BY (whose timeline?) and ORDER BY (what “previous” means). For month totals, aggregate to one row per month first, then LAG on that series.