Also at Deasil Works · txn2 · Plexara
Profiles GitHub · X · LinkedIn
Theme Light · Auto · Dark
Professional notes by Craig Johnston
long-form, short-form, working drafts · since 2008
VOL. XIX · MMXXVI
109 NOTES IN PRINT
FOLIO X 2018-05-07 · 7 MIN · SHORT-FORM

SQL Foundations

Selects, joins and aliases.

Diagram · folio x
erDiagram
    CUSTOMER ||--o{ ORDER : places
    ORDER ||--|{ ORDER_ITEM : contains
    PRODUCT ||--o{ ORDER_ITEM : ordered_as
    CUSTOMER ||--o{ ADDRESS : has

    CUSTOMER {
        int id PK
        string name
        string email
    }
    ADDRESS {
        int id PK
        int customer_id FK
        string street
        string city
    }
    ORDER {
        int id PK
        int customer_id FK
        date placed_at
        decimal total
    }
    ORDER_ITEM {
        int order_id FK
        int product_id FK
        int quantity
        decimal price
    }
    PRODUCT {
        int id PK
        string sku
        string name
        decimal price
    }

The following is an attempt at explaining the basics of an SQL query, and more importantly how I believe you can best think through them. All queries can be broken down into the basics of this declarative language.


§2026 Update

The good news for a post about fundamentals: none of this changed. SELECT, JOIN, and aliases work exactly as they did in 2018, and they will keep working long after whatever framework you are using today is gone. SQL is one of the safest things you can learn deeply.

What changed is who writes the query. A lot of SQL now comes out of an LLM. You describe what you want in English, and a model hands you a SELECT with six joins and a couple of aliases. That is genuinely useful, and it makes understanding the fundamentals more important, not less. If you cannot read the join, you cannot tell whether the model gave you the rows you asked for or quietly handed you a Cartesian product that happens to look right on a small result set. The whole point of this post, declaring what you want and aliasing for readability, doubles as the checklist for reviewing SQL you did not write. I get into that habit of verifying generated code in “AI on a Leash”.

One thing I want to fix while I am here: the original promised joins in the title and never actually showed one. There is a proper join section added below.


Original article below. Everything from here down is the post as originally written. The 2026 Update above covers what’s changed since.

I recently helped a co-worker read through a large SQL query with a few dozen joins and left joins, alias, and recursions. He is mostly a front-end integrator and although he has been tinkering with SQL for years, he never really understood the basics. I realize that unless you have to write SQL, many front-end developers work from the API layer, where database interaction has been highly abstracted, and with only brief interactions, many do not realize how easy it is to know the fundamentals. I do not address subqueries, stored procedures or vendor-specific syntax. This example is just the foundation, yet everything builds up from it and can be broken down into it.

§The Foundation

Declaring what you want by selecting, joining, and aliasing in the development of readable SQL queries.

SELECT [what] FROM [location] WHERE [a condition is true]

[what] is the columns you want back. The statement “SELECT *” litters hundreds of examples and is often scattered through sloppy code. “SELECT *” is generally an anti-pattern, it hides your dependencies, even if you genuinely want everything, it is not clear in that statement what everything is. SQL is declarative and so you should declare something, rather than being vague.

§Designing a Query

Start with the analog. A proper database schema is going to be as generic as possible, typically to accommodate a broader set of requirements than your specific query. Design the query to declare what you want from the database and use aliases to abstract the underlying implementation.

§The Analog

We are going to make a Paleo Bowl (I live on these things.) For this meal, we need meat, vegetable, and dressing. So I might start my query like this.

SELECT meat, price

I know what it is that I want, so I declare it. I know I can get it at the grocery store in the meat aisle, so I declare that as well.

SELECT meat FROM meat_aisle

I have some preferences, so I declare a specific as a condition.

SELECT meat, price FROM meat_aisle WHERE meat_type = "tri-tip"

The example above is an analog, stores in my area do not take orders directly via SQL, although they nearly all do in their inventory and point-of-sale systems. I can take this abstract query and make it concrete. In fact, this is how I design all my queries. I start with an abstract query based on my requirements, and refactor back to the physical entities. This use of abstraction is what makes aliases so great. However, I often see them used merely to make code smaller, and I think that is a shame. Today I saw a query on a Drupal module use an alias to reduce a table called field_collection_field_data_value to fcfdv. That is like Mr. Solzhenitsyn asking you to call him slzhntsyn for short; it got shorter but not easier.

§The Implementation

Here is how I use aliases, I start with them, by keeping my straw query above but tying it to a real table. The database has a table called “product” however to my analog example it is a meat aisle, so I declare it as such with an alias.

SELECT meat, price FROM product AS meat_aisle

“product” does not have a column called meat. In this database, it is called type. I use meat to alias type, since that is the condition I will be putting on it.

SELECT meat_aisle.type AS meat, meat_aisle.price AS price
FROM product AS meat_aisle
WHERE meat_aisle.type = "tri-tip"

Output:

meat, price
tri-tip, 699

It might seem strange at first to alias the beautifully generic product table to a specific like “meat.” However, I know I need other ingredients and may be joining the same product table multiple times, and calling “product,” “product2” is a terrible thing, but encountered too often. SQL is a declarative language for humans to interact with databases. The SQL should be readable by both.

My query should not be concerned about describing the underlying schema; the database does a great job at that. My query is best understood by clearly representing what I am trying to accomplish with it.

§The Join

The Paleo Bowl needs more than meat. In a real schema you assemble it from many rows across more than one table, so let’s drop the grocery analog and use the schema in the diagram at the top of this post: a product table, and an order_item table that records which products went into an order and at what price.

To list the items in order number 5, I join order_item to product on the product id, and I alias both tables so the query stays readable:

SELECT
    p.name      AS product,
    oi.quantity AS qty,
    oi.price    AS unit_price
FROM order_item AS oi
JOIN product    AS p ON p.id = oi.product_id
WHERE oi.order_id = 5

JOIN, an inner join, returns only the rows where the condition matches: an order_item that points at a real product. That is what you want most of the time.

A LEFT JOIN keeps every row from the left table even when the right side has no match, filling the missing columns with NULL. If an order_item referenced a product that had since been deleted, an inner join would silently drop that line from your order total, while a left join would keep it and show a NULL product name. That difference, a missing row versus a NULL, is exactly the kind of thing worth checking when you read a query, whether you wrote it or a model did.

The aliasing habit pays off here too. oi and p are short, but the AS product, AS qty, and AS unit_price on the output columns keep the result legible. Short where it is plumbing, clear where it is the answer.

§Mastery

If you are interested in really mastering SQL, I can highly recommend three books that got me through some of the most incredibly intricate database designs with challenging and unique requirements. Thanks to Joe Celko, I have successfully developed numerous queries that have been efficiently processing millions of records every day for over a decade.

celko

← back to all notes