Built-In Course
Learn SQL while querying real files
SQL for Files includes 64 guided lessons that open directly in the editor. Load sample tables, run SQL with autocomplete, and solve challenges without installing a database.
Practice in the editor
Lessons use the same Monaco editor, DuckDB engine, and results panel as your own imported files.
Load sample data
One-click sample tables make it easy to run examples before bringing your own CSV, JSON, or Parquet files.
Keep data local
Use AI to draft SQL if you want, then run it locally without uploading sensitive file data to an AI provider.
Understanding Data
Lesson 01
Tables, Rows, and Columns
Before writing any SQL, it helps to understand how data is organized in a database.
Lesson 02
Data Types
Every column in a table stores a specific kind of data. These are called data types. The most common ones are:
Lesson 03
Missing Values with NULL
Real data often has missing or unknown values. SQL represents those values with `NULL`.
Introduction to SQL
Lesson 01
Your First Query
SQL (Structured Query Language) is the standard language for working with data in databases. The most fundamental SQL statement is `SELECT`, which retrieves data from a table.
Lesson 02
Selecting Columns
Instead of selecting all columns with `*`, you can pick specific columns by listing their names separated by commas:
Lesson 03
Naming Results with Aliases
Aliases let you give columns or tables a temporary name inside a query.
Lesson 04
Sorting Results
By default, SQL doesn't guarantee a particular row order. To sort your results, use `ORDER BY`:
Filtering Data
Lesson 01
WHERE Clause Basics
The `WHERE` clause filters rows based on a condition. Only rows that satisfy the condition are included in the result:
Lesson 02
Combining Conditions
You can combine multiple conditions using `AND` and `OR`:
Lesson 03
Pattern Matching with LIKE
The `LIKE` operator matches text patterns using two wildcards:
Simple Aggregates
Lesson 01
COUNT Rows
`COUNT` tells you how many rows match a condition. It is one of the fastest ways to answer questions like “How many sales did we make?” or “How many sales came from the West region?”
Lesson 02
SUM Values
`SUM` adds the values in a numeric column. It is useful for totals like revenue, quantity, or cost.
Lesson 03
COUNT and SUM Together
Once you understand single aggregates, the next step is combining them. SQL lets you calculate several summaries in the same query.
Lesson 04
MIN Finds the Smallest Value
`MIN` returns the smallest value in a column.
Lesson 05
MAX Finds the Largest Value
`MAX` returns the largest value in a column.
Lesson 06
AVG Computes the Mean
`AVG` calculates the average value of a numeric column.
Lesson 07
MIN, MAX, and AVG Together
After learning each aggregate separately, you can combine them to get a compact summary of the spread of your data.
DISTINCT and GROUP BY
Lesson 01
Removing Duplicates with DISTINCT
`DISTINCT` removes duplicate rows from a result set.
Lesson 02
Summarizing with GROUP BY
`GROUP BY` collects rows into groups before applying aggregate functions.
Lesson 03
Filtering Groups with HAVING
`WHERE` filters individual rows before grouping. `HAVING` filters the groups after aggregation.
Working with Dates
Lesson 01
Filtering by Date Ranges
Dates are easier to work with when you treat them as dates, not strings. SQL lets you compare them directly.
Lesson 02
Extracting Parts of a Date
You can pull specific parts out of a date with `EXTRACT`.
Lesson 03
Date Arithmetic with DATE_DIFF
SQL can calculate the distance between two dates. In DuckDB, `DATE_DIFF` is a convenient way to do that.
JOIN Types
Lesson 01
Why JOINs Exist
When a database is normalized, related facts live in different tables. That keeps the data clean, but it also means you need a way to bring those facts back together when you query.
Lesson 02
INNER JOIN Basics
An `INNER JOIN` keeps only rows that match on both sides.
Lesson 03
LEFT JOIN Keeps the Left Side
A `LEFT JOIN` keeps every row from the left table and fills missing right-side values with `NULL`.
Lesson 04
Finding Missing Matches
One common use of `LEFT JOIN` is to find rows that do not have a match.
Lesson 05
RIGHT JOIN and FULL OUTER JOIN
`RIGHT JOIN` is the mirror image of `LEFT JOIN`: it keeps every row from the right table.
Lesson 06
SELF JOIN for Hierarchies
A self join joins a table to itself. This is common when rows relate to other rows in the same table, such as employees and managers.
Lesson 07
CROSS JOIN for All Combinations
`CROSS JOIN` returns every possible combination of rows from both tables.
Window Functions
Lesson 01
OVER Keeps the Detail Rows
A window function calculates across a set of related rows without collapsing the result into one row per group.
Lesson 02
PARTITION BY Creates Mini Windows
`PARTITION BY` splits the result set into smaller windows before the function runs.
Lesson 03
Running Totals with ORDER BY
Adding `ORDER BY` inside the window makes the function care about row sequence.
Lesson 04
Ranking Rows with ROW_NUMBER
Window functions are also great for ranking.
Lesson 05
Looking Back with LAG
`LAG` lets you reach into an earlier row in the same partition.
Lesson 06
Other Common Window Functions
You have now seen the main shape of window functions: a function call followed by an `OVER (...)` clause.
Execution Order
Lesson 01
Written Order vs Logical Order
SQL is written one way but logically evaluated in another order. A useful mental model is:
Lesson 02
FROM and JOIN Build the Rows First
The first big step is `FROM` plus `JOIN`. SQL first decides which tables participate and how their rows connect.
Lesson 03
WHERE Filters Before GROUP BY
`WHERE` removes individual rows before any grouping or aggregation happens.
Lesson 04
HAVING Filters After GROUP BY
`HAVING` runs after `GROUP BY`. Instead of filtering individual rows, it filters the aggregated groups.
Lesson 05
Where Window Functions Fit
Window functions belong later in SQL's logical order. A good mental model is:
Lesson 06
Why Window Functions Cannot Go Directly in WHERE
A common beginner question is: “If window functions run late, can I filter on them in `WHERE`?”
Lesson 07
Why SELECT Aliases Work in ORDER BY
`SELECT` runs later than `WHERE` but earlier than `ORDER BY`.
Lesson 08
LIMIT Happens at the End
`LIMIT` is one of the last steps. It cuts down the final ordered result, not the raw table.
Normalization
Lesson 01
Why Normalization Matters
Normalization is the process of splitting data into related tables so each fact is stored in the right place only once. We do this to reduce duplication and keep data consistent.
Lesson 02
Feel the Update Anomaly
Let's make the update problem concrete. In the bad schema below, customer details are copied into every order row.
Lesson 03
One Row to Update in a Good Schema
Now compare that with a normalized design. Customer facts live in the `customers` table, while `orders` only stores `customer_id`.
Lesson 04
Feel Why Packed Lists Break 1NF
First normal form says each column should hold one value, not a comma-separated list of values.
Lesson 05
Order Items Feel Better
In a normalized schema, each ordered product gets its own row in `order_items`. That means products are queryable with normal joins and filters.
Lesson 06
Feel the Delete Anomaly
The flat table also has a delete problem. Product facts exist only because they are attached to order rows.
Lesson 07
Products Can Exist Without Orders
A normalized schema fixes both insert and delete anomalies by giving products their own table.
Lesson 08
1NF, 2NF, and 3NF
Most application databases aim for first, second, and third normal form. These are practical rules for organizing data well.
Lesson 09
Why We Build Tables in Normal Forms
Normal forms are not just theory. They give us concrete benefits when we design real systems.
GROUPING SETS, ROLLUP, and CUBE
Lesson 01
Multi-Dimension Aggregates with GROUPING SETS
Sometimes you need aggregates along several dimensions in one result. `GROUPING SETS` lets you specify exactly which groups to compute.
Lesson 02
ROLLUP for Hierarchical Totals
`ROLLUP` is a shorthand that creates progressively less detailed grouping sets, perfect for subtotals and a grand total.
Lesson 03
CUBE for All Combinations
`CUBE` goes further than `ROLLUP`: it produces every possible combination of the listed columns.
UNNEST
Lesson 01
Unnesting Lists into Rows
`UNNEST` turns each element of a list column into its own row. This is essential when working with JSON or other data that stores arrays inside columns.
Lesson 02
Unnesting Structs into Columns
When a column holds a `STRUCT`, `UNNEST` expands its fields into separate columns.
Lesson 03
Recursive UNNEST for Nested Data
When lists contain other lists, a single `UNNEST` only peels off one layer. Use the `recursive` parameter to flatten all levels at once.
Lesson 04
Controlling Depth with max_depth
Sometimes you want to peel off only a few layers of nesting rather than flattening everything. The `max_depth` parameter controls how many levels `UNNEST` removes.
Array Lambdas
Lesson 01
Transforming Lists with list_transform
`list_transform` applies a function to every element of a list and returns a new list with the results. Think of it like a map operation.
Lesson 02
Filtering Lists with list_filter
`list_filter` keeps only the elements that satisfy a condition and returns a shorter list.
Lesson 03
Reducing a List with list_reduce
`list_reduce` collapses a list into a single value by applying a two-argument `lambda` across all elements, one at a time.
Lesson 04
Combining Lambda Functions
Lambda functions can be nested. You can filter a list first, then transform or reduce the result.