Relational Database History

1983 called. They said SQL needs a makeover

I recently read 2 papers that I’d highly recommend if you ever wonder about the underpinnings of modern relational databases:

What Goes Around Comes Around (Stonebraker and Hellerstein, 2005)

This is an opinionated history of database architectures from roughly 1970 to 2005, and it’s got some serious credentials behind it – Michael Stonebraker was the driving force behind Postgres and many other things. It’s an easy read, and it’s a good explanation of how relational DBs and SQL came to (mostly) rule the world. It was written amidst a lot of hype over XML databases, and it (rightly, in retrospect) critiques XML DBs and suggests that they won’t see much commercial adoption.

One interesting bit: Stonebraker and Hellerstein do not like SQL, and they claim that it became dominant largely because of IBM’s dominance in the marketplace during the 1980’s. Which leads us to the next item…

A Critique of the SQL Database Language (Chris Date, 1983)

This was cited approvingly in the 1st paper as a “scathing critique of the semantics of SQL”, which caught my eye - virtually every database supports SQL, how bad can it be?

I won’t summarize the entire thing, it’s a laundry list of complaints, but this resonated with me:

Notice that it is just table-names that appear in the FROM clause. Completeness suggests that it should be table-expressions (as Gray puts it, “anything in computer science that is not recursive is no good”).

…a simple table-reference (i.e. a table-name) ought to be just a special case of a general table-expression

To give a (trivial) example, say we want to union 2 tables and extract a column. We’d write that like this in SQL:

SELECT EMPLOYEE_ID FROM EMPLOYEES
UNION
SELECT EMPLOYEE_ID FROM CONTRACTORS

But if table names were just special cases of table expressions, we could write it like this instead:

SELECT EMPLOYEE_ID FROM (EMPLOYEES UNION CONTRACTORS)

SQL gets the job done, and it could be a lot worse - but it is a little sad that many of the issues academics have known about for 35+ years still exist today. It’s a reminder of how entrenched something that’s only “good enough” can become.

Dark Mode on the Web

Using the prefers-color-scheme media selector

I recently rewrote most of this website (more on that soon!) and it’s now much easier to work on. Last week, I shipped a feature that I really like: dark mode using the new prefers-color-scheme media query. If you have dark mode enabled in your OS and a reasonably new browser, reillywood.com now shows you a dark UI using the Solarized colour scheme:

I’m taking a self-funded sabbatical after 8 years of working full-time. It feels like the right thing to do at this stage of my career.

I’ve had an unusually stable career for a software developer. When I graduated from university in 2011 I interviewed at a few places, including a mid-sized investment firm. I didn’t know anything about finance and the… rustic state of their website was a little concerning, but the people seemed great. So I took the job, and told myself I’d stick around for 2 or 3 years.

8 years later I was still at the same company. Orbis offers a lot of different opportunities; I wrangled big financial data systems, ran a small team, got my hands dirty building a modern web stack, and worked in London and Cape Town. It was a blast, and I’d highly recommend Orbis as an employer.

Still, after 8 years, it’s time to try something new. I miss the open-ended learning that’s so common in school, and I want to explore my technical interests with no regard for immediate relevance to my day job. Gianfranco Chicco’s description of a “serendipity break” really resonated with me:

In the note I sent out to my friends and network I mentioned that I’d be undertaking a Serendipity Break, which wasn’t a nice way to say that I wasn’t going to work for a few months but that I wanted to actively explore different possible paths.

In the book The Craftsman, sociologist Richard Sennett describes how “skill builds by moving irregularly, and sometimes by taking detours”, which is akin to keeping the Serendipity Engine in perpetual motion to encourage the strengthening of current skills and allowing the development of new ones.

Leaving a great job at a great company was a little scary, but I think it’s necessary for my long-term growth. Reading about Joel Spolsky’s sabbaticals helped a lot; it’s reassuring to see successful developers following similar paths.

My last day at Orbis was July 27th, and since then I’ve been trying all kinds of things. I’ve been diving into database internals, rewriting this website, and even learning Lisp/Scheme. I’m not exactly sure where my interests will take me next, but I’m looking forward to finding out.

Quick T-SQL Regression Testing

Using CHECKSUM() and CHECKSUM_AGG()

Here’s something I’ve found useful in SQL Server, but it should apply in any DBMS with checksum functions. Next time you’re refactoring some database objects and you want to do some regression testing, give the CHECKSUM (MSDN) and CHECKSUM_AGG (MSDN) functions a try.

CHECKSUM and CHECKSUM_AGG

They behave pretty much as you would expect; CHECKSUM returns 1 checksum given 1 row, and CHECKSUM_AGG is an aggregate function that returns 1 checksum given multiple checksum rows. Between the two of them, you can get a checksum for any arbitrary collection of data:

-- Returns 2 rows with 2 columns
SELECT Field1, Field2
FROM (VALUES ('foo','bar'),('baz','foo')) TempTable(Field1,Field2)

-- Returns 2 rows with 1 checksum column
SELECT CHECKSUM(Field1,Field2)
FROM (VALUES ('foo','bar'),('baz','foo')) TempTable(Field1,Field2)
 
-- Returns 1 row with 1 checksum column
SELECT CHECKSUM_AGG(CHECKSUM(Field1,Field2))
FROM (VALUES ('foo','bar'),('baz','foo')) TempTable(Field1,Field2)

Regression testing

These help with a common problem: when you’re refactoring a database object, how can you be sure that you haven’t changed the existing functionality? Sometimes it’s enough to spot check a few inputs, but often the logic is complex enough that this does not provide a high level of confidence. With CHECKSUM and CHECKSUM_AGG, you can quickly check large numbers of test cases.

For example, say we have a TVF called MyDateTVF that takes in a date parameter and returns columns Column1 and Column2. I want to refactor this function and then test that the functionality is unchanged for every date in 2018. If we have a table called DateList which contains every date (more useful than you might expect in a DB with a lot of temporal aspects), I can simply run the following before and after my change:

SELECT CHECKSUM_AGG(CHECKSUM(MDT.Column1, MDT.Column2))
FROM DateList DL
CROSS APPLY (
    SELECT Column1, Column2 from dbo.MyDateTVF(DL.Date)
) MDT
WHERE DL.Date BETWEEN '2018-01-01' AND '2018-12-31'

If that query returns the same checksum before and after my change, we can rule out any regressions for those inputs. Nice and easy!

Tailwind CSS

Web styles designed from scratch

I recently overhauled the UI for my letter builder web app, switching from Bootstrap to a neat framework named Tailwind CSS. It’s been great so far.

I dabble in web development, but it’s not “my thing”. Most of my time is spent on back-end systems and the occasional native UI. When I’m building a web UI, I usually spend a lot of time on MDN or W3Schools looking up syntax details.

Tailwind

Enter Tailwind. It’s a utility-first CSS framework, which in practice means they give you a whole bunch of “utility” classes which are effectively aliases for common styles. Here’s an example from when I was styling a button:

<button class="bg-blue-dark text-sm text-white rounded py-2 px-4 my-2"/>

With these classes, I’m saying that I want a dark blue background, small white text, 2 units of padding on the top and bottom (py-2), 4 units of padding on the left and right (px-4), 2 units of margin on the top and bottom (my-2), and rounded corners. Whew!

Right away, we can notice a few advantages relative to the normal way of doing things:

  1. This is fast to iterate on. I can add another utility class in-line without digging through my CSS file and reasoning about selectors.
  2. The class names are concise yet informative. py-2 is clearly operating on the Y axis (top and bottom), which is much easier to remember than a padding style with multiple unlabeled values.
  3. I have fewer choices to make. I didn’t need to worry about whether to use #0033CC or #0000FF for a dark blue, I just asked for dark blue. Same thing with padding+margins, I just chose from a small set of integers instead of wondering whether to do 0.5rem or 0.6rem.

Isn’t putting all your styles inline just asking for trouble?

Yes, this can get out of hand if you put too much stying inline – but Tailwind utility classes can be extracted into components as soon as you need some additional abstraction. The authors even recommend that:

Tailwind encourages a “utility-first” workflow, where new designs are initially implemented using only utility classes to avoid premature abstraction.

While we strongly believe you can get a lot further with just utilities than you might initially expect, we don’t believe that a dogmatic utility-only approach is the best way to write CSS.

Curation VS organic growth

I’m being a little hard on the “usual” way of writing styles in CSS. The web grew more-or-less organically, so it’s not fair to expect an overall design from the mess of styles available to modern browsers. Thankfully, that’s where projects like Tailwind come in.

It’s remarkable how much nicer it is to build web UIs when working with a thoughtfully curated and documented subset of styles. Sometimes a big step forward doesn’t need to come from a clever technical breakthrough – simply organizing existing information+symbols in a better way can reap massive benefits.

headshot

Cities & Code

Things that don't quite fit in 280 characters.

Top Categories

View all categories

About

I'm a software engineer in Vancouver, Canada. I'm interested in databases, urban planning, computing history, and whatever else catches my fancy.

Learn More / Contact me