Module 2

SQL Basics

SQL is how you talk to databases — and it's the most in-demand skill in data. Here's how it works, what you can do with it, and how to write your first queries.

The spreadsheet that broke

A marketing team at a mid-size SaaS company tracked everything in spreadsheets. Customer data, campaign results, revenue — all in interconnected Google Sheets. It worked when they had 500 customers.

Then they hit 50,000 customers. The spreadsheets froze, crashed, and gave wrong answers because formulas broke when rows were added. Reports that took 10 minutes now took 2 hours. Nobody trusted the numbers.

The fix wasn't a bigger spreadsheet. It was a database — and SQL to talk to it. Within a month, the same reports ran in seconds. Queries that would have taken an analyst all day took 3 lines of SQL.

By the end of this module, you'll be able to write SQL queries that pull, filter, sort, and summarize data from a database — the same commands analysts use hundreds of times a day.

The track starts with SQL because it's the universal language of databases — you'll apply these same data concepts in Excel starting in Module 4. In the previous module, you saw that SQL is one of the core tools in any analyst's toolkit. Now you'll learn to use it.

🔑Why SQL matters
SQL consistently ranks as one of the most in-demand data analyst skills (LinkedIn/Indeed job posting analyses), and 58.6% of developers use it regularly ([Stack Overflow Developer Survey, 2023](https://survey.stackoverflow.co/2023/)). It's the lingua franca of data — whether you're in marketing, finance, product, or engineering, SQL lets you get answers from data without waiting for someone else to pull a report.

What SQL actually is

SQL (Structured Query Language, pronounced "sequel" or "S-Q-L" — both are correct) is a language for managing and querying data stored in relational databases.

A relational database is just organized tables — like spreadsheets, but with superpowers:

Spreadsheet

  • Rows and columns in a file
  • Manual formulas for calculations
  • Breaks with large data
  • One person edits at a time
  • Data gets messy over time

Database + SQL

  • Rows and columns in structured tables
  • Queries for calculations (instant, reusable)
  • Handles millions of rows easily
  • Many people query simultaneously
  • Rules enforce data quality

The key concepts

ConceptWhat it meansSpreadsheet equivalent
TableA collection of related dataA sheet/tab
RowA single recordA row in your sheet
ColumnA specific attributeA column header
QueryA question you ask the databaseA formula or filter
SchemaThe structure of your tablesYour column headers and data types

There Are No Dumb Questions

Do I need to install anything to learn SQL?

No. Free browser tools like SQLite Online, DB Fiddle, or Mode Analytics let you write queries immediately. No setup required.

Is SQL a programming language?

It's a query language — simpler than Python or JavaScript. You're describing WHAT data you want, not HOW to get it. Most people learn the basics in a weekend.

Which database should I learn?

The SQL syntax is 90% the same across all databases. Start with any. The most common are PostgreSQL (most popular), MySQL, SQLite (simplest), and SQL Server (enterprise).

Your first SQL queries

SELECT — Getting data

The most fundamental command. "Show me this data."

sql
SELECT name, email, signup_date
FROM customers;

This says: "From the customers table, show me the name, email, and signup_date columns." That's it.

Want everything? Use *:

sql
SELECT * FROM customers;

WHERE — Filtering data

"Show me this data, but only rows that match a condition."

sql
SELECT name, email
FROM customers
WHERE country = 'United States';

You can combine conditions:

sql
SELECT name, email, total_spent
FROM customers
WHERE country = 'United States'
  AND total_spent > 100
  AND signup_date > '2025-01-01';

ORDER BY — Sorting results

sql
SELECT name, total_spent
FROM customers
ORDER BY total_spent DESC;

DESC = descending (highest first). ASC = ascending (lowest first, default).

LIMIT — Capping results

sql
SELECT name, total_spent
FROM customers
ORDER BY total_spent DESC
LIMIT 10;

"Show me the top 10 customers by spending."

🔒

Write your first query

25 XP

You have a table called `orders` with columns: `id`, `customer_name`, `product`, `amount`, `order_date`. Write SQL queries for each request: 1. Show all orders: 2. Show only orders where the amount is greater than $50: 3. Show the 5 most expensive orders: 4. Show orders from 2026, sorted by date:

Sign in to earn XP

Aggregating data

This is where SQL gets powerful — summarizing thousands of rows into useful numbers.

COUNT, SUM, AVG, MIN, MAX

sql
-- How many customers do we have?
SELECT COUNT(*) FROM customers;

-- What's our total revenue?
SELECT SUM(amount) FROM orders;

-- What's the average order value?
SELECT AVG(amount) FROM orders;

-- What's the biggest single order?
SELECT MAX(amount) FROM orders;

GROUP BY — Aggregating by category

"Show me the total revenue broken down by country."

sql
SELECT country, SUM(amount) as total_revenue, COUNT(*) as order_count
FROM orders
GROUP BY country
ORDER BY total_revenue DESC;

This is the equivalent of a pivot table — but faster and reproducible.

COUNT — How many rows match?

SUM — What's the total of a numeric column?

AVG — What's the average?

GROUP BY — Break the results into categories

HAVING — Filter the groups (like WHERE but for aggregated data)

How SQL executes your query

SQL doesn't run top to bottom like code. The database processes clauses in this order:

This is why you can't use a column alias from SELECT inside WHERE — the database hasn't processed SELECT yet when it runs WHERE.

There Are No Dumb Questions

What's the difference between WHERE and HAVING?

WHERE filters individual rows BEFORE grouping. HAVING filters groups AFTER grouping. Example: WHERE filters out small orders; HAVING filters out countries with few orders.

Can I save a query and reuse it?

Yes — that's called a "view" or a "saved query." Most database tools let you save queries. In production, developers create views that act like virtual tables.

🔒

Match the SQL clause to its purpose

25 XP

Match 6 items to their pairs.

Sign in to earn XP

JOINs — Combining tables

Real databases have multiple related tables. JOINs connect them.

Imagine two tables:

  • customers — id, name, email
  • orders — id, customer_id, product, amount

To see customer names with their orders:

sql
SELECT customers.name, orders.product, orders.amount
FROM orders
JOIN customers ON orders.customer_id = customers.id;

This "joins" the two tables on the matching IDs. Every analyst uses JOINs daily.

🔒

Analyze a dataset

50 XP

You have two tables: - `employees` (id, name, department, salary) - `departments` (id, dept_name, budget) Write queries for: 1. Find the average salary per department 2. Find departments where the average salary is above $80,000 3. Show employee names alongside their department names (hint: JOIN) 4. Find the department with the highest total salary cost

Sign in to earn XP

Real-world SQL use cases

RoleHow they use SQL
Marketing analystPull campaign performance, segment customers, track conversion funnels
Product managerAnalyze feature usage, track retention, measure A/B test results
Financial analystGenerate revenue reports, track expenses, audit transactions
Data scientistExtract training data, validate model predictions, create features
Software developerBuild application backends, optimize queries, manage data
OperationsTrack inventory, monitor KPIs, generate compliance reports

53%of data jobs require SQL (widely cited job posting analysis; verify with current LinkedIn/Indeed data as requirements shift)

58.6%of developers use SQL (Stack Overflow Developer Survey, 2023)source ↗

120Kaverage SQL-skilled salary (US, ~2023–2024; verify with current Glassdoor/BLS data as figures vary by role and region)

Getting started today

Day 1: Try SQLite Online or DB Fiddle — write SELECT, WHERE, ORDER BY

Day 2-3: Practice COUNT, SUM, AVG, GROUP BY on sample datasets

Day 4-5: Learn JOINs — connect two tables together

Week 2: Practice on real datasets (Kaggle has free ones) or your company's data

Week 3+: Learn subqueries, window functions, and CTEs (common table expressions)

🔑The 80/20 of SQL
SELECT, WHERE, GROUP BY, ORDER BY, JOIN, and aggregate functions (COUNT, SUM, AVG) cover 80% of real-world SQL work. You can learn these in a weekend. The remaining 20% (window functions, CTEs, optimization) comes with practice over months.

Back to the broken spreadsheet

Remember the marketing team drowning in crashing spreadsheets? The SQL queries they replaced those spreadsheets with are exactly the commands you learned in this module. A single SELECT ... GROUP BY ... ORDER BY query now pulls their campaign performance in seconds — the same report that used to take two hours of manual spreadsheet wrangling. The spreadsheet didn't need to get bigger. The team needed a better tool.

Key takeaways

  • SQL is how you query databases — the most in-demand data skill across all industries
  • SELECT gets data, WHERE filters it, GROUP BY summarizes it, JOIN connects tables
  • Aggregate functions (COUNT, SUM, AVG) turn thousands of rows into useful numbers
  • SQL appears in over half of data job postings and 58.6% of developers use it regularly (Stack Overflow Developer Survey, 2023)
  • You can learn the core commands in a weekend with free browser-based tools
  • Every role that touches data benefits from SQL — marketing, product, finance, engineering

What's next

You've learned the 80% of SQL that covers daily analyst work. Later in this track, Advanced SQL (Module 6) adds JOINs across multiple tables, CTEs for readable complex queries, and window functions for rankings and running totals — the skills that separate junior from senior analysts.

?

Knowledge Check

1.What does the SQL keyword SELECT do?

2.What is the difference between WHERE and HAVING?

3.What does a JOIN do in SQL?

4.Which SQL command would you use to find the average order amount per country?

Want to go deeper?

🧠 AI & Machine Learning Master Class

Understand AI, use it in your job, and build AI-powered products.

View the full program