Databases as file formats

The .mbtiles way

I’m building an interactive online map of all properties in Vancouver, and along the way there have been a few pleasant surprises. Most recently: the .MBTiles tileset format is surprisingly cool.

Background

Mapbox is one of the biggest players in the open source mapping space (especially now that Mapzen and Carto have thrown in the towel – Mapzen is closing and Carto is now using Mapbox tech). One of the many nice things about Mapbox is that they developed an efficient open standard for vector map tiles, appropriately named Mapbox Vector Tiles (read this if you’re not sure why vector tiles are great).

Map tiles are often pre-computed for each zoom level, and once you’ve done that you need to store them somewhere. Enter the .MBTiles tileset format.

Poking around under the hood

My first encounter with this file format occurred when I used Eric Fischer’s excellent tippecanoe tool to simplify my data set at lower zoom levels. Tippecanoe generates .mbtiles files, which are easy to serve to clients by uploading to Mapbox, using a third-party tile server, or even rolling your own server with something like the mbtiles Node.js package.
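To give a feel for what "rolling your own" involves, here is a minimal sketch of the lookup a tile server would do, using Python's built-in sqlite3 module. The helper names `xyz_to_tms_row` and `get_tile` are made up for illustration. The one non-obvious part is that the MBTiles spec stores rows in the TMS scheme, so the y coordinate from a typical XYZ tile URL has to be flipped before querying.

```python
import sqlite3


def xyz_to_tms_row(zoom, y):
    """Flip an XYZ row number into the TMS row number that MBTiles stores."""
    return (1 << zoom) - 1 - y


def get_tile(conn, zoom, x, y):
    """Fetch one tile blob for XYZ coordinates, or None if it is absent.

    `conn` is an open sqlite3 connection to an .mbtiles file.
    """
    row = conn.execute(
        "SELECT tile_data FROM tiles "
        "WHERE zoom_level = ? AND tile_column = ? AND tile_row = ?",
        (zoom, x, xyz_to_tms_row(zoom, y)),
    ).fetchone()
    return row[0] if row else None
```

A real server would wrap this in an HTTP handler. Vector tiles are typically stored gzip-compressed, so the response would usually need a matching Content-Encoding header (or the blob would need decompressing first).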

All great… but after setting up a server my Mapbox GL JS client refused to render the tiles. I tried a few things without much luck, and then as a last resort I decided to poke around in the .mbtiles file. I was expecting to need a hex editor or similar, but then I saw this beauty in the spec:

MBTiles is a specification for storing tiled map data in SQLite databases

The files themselves are just relational databases in a known schema – how cool is that? Emboldened, I grabbed a SQLite client and opened up my .mbtiles file.

Right away, I spotted my issue in the metadata table: my client was attempting to bind a style to a layer using the wrong layer ID.
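The same check can be scripted. The sketch below assumes the file follows the spec's `metadata` table (name/value rows) and that, as the spec describes for vector tilesets, a `json` row carries a `vector_layers` list with an `id` per layer. The function names and the file name in the usage note are hypothetical.

```python
import json
import sqlite3


def read_metadata(path):
    """Return the rows of the MBTiles `metadata` table as a dict."""
    conn = sqlite3.connect(path)
    try:
        rows = conn.execute("SELECT name, value FROM metadata").fetchall()
    finally:
        conn.close()
    return dict(rows)


def vector_layer_ids(metadata):
    """List the layer IDs found in the `json` metadata row, if present."""
    raw = metadata.get("json")
    if raw is None:
        return []
    layers = json.loads(raw).get("vector_layers", [])
    return [layer["id"] for layer in layers]
```

Calling `vector_layer_ids(read_metadata("properties.mbtiles"))` would list the layer IDs that a client style is allowed to reference, which is exactly the value I had wrong.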

Conclusion

Relational databases are a great way to store data for easy retrieval and manipulation – SQL is far and away the most popular query language, and most developers are already familiar with it. SQLite is probably the most frequently used relational database in the world and it’s supported by tons of clients as a result.

Mapbox has done something extremely simple but very clever here. By using SQLite databases as data files, they’ve established a tileset format that is:

  • Easy to query, even in complex ways
  • Already supported by hundreds of tools (there are a lot of SQLite clients out there)
  • Instantly familiar to most developers
  • Conceptually very simple
  • Flexible (from the spec: “the schemas outlined are meant to be followed as interfaces. SQLite views that produce compatible results are equally valid”)
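The "easy to query" and "flexible" points can be shown together in a small, self-contained sketch. It builds an in-memory database whose `tiles` is a view over two tables, then runs an ordinary aggregate query against it. The `map` and `images` table layout and the `tiles_per_zoom` helper are assumptions for illustration, not something the spec requires.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical storage layout: each tile blob is stored once
    -- and referenced by ID, so identical tiles are not duplicated.
    CREATE TABLE images (tile_id TEXT PRIMARY KEY, tile_data BLOB);
    CREATE TABLE map (zoom_level INTEGER, tile_column INTEGER,
                      tile_row INTEGER, tile_id TEXT);
    -- The view exposes the same columns as the spec's `tiles` table.
    CREATE VIEW tiles AS
        SELECT map.zoom_level, map.tile_column, map.tile_row,
               images.tile_data
        FROM map JOIN images ON images.tile_id = map.tile_id;
""")
conn.execute("INSERT INTO images VALUES (?, ?)", ("water", b"\x00\x01"))
conn.executemany(
    "INSERT INTO map VALUES (?, ?, ?, ?)",
    [(0, 0, 0, "water"), (1, 0, 0, "water"), (1, 1, 0, "water")],
)


def tiles_per_zoom(conn):
    """Return (zoom_level, tile count, total bytes) for each zoom level."""
    return conn.execute(
        "SELECT zoom_level, COUNT(*), SUM(LENGTH(tile_data)) "
        "FROM tiles GROUP BY zoom_level ORDER BY zoom_level"
    ).fetchall()
```

A reader that only knows the `tiles` interface cannot tell the difference between this view and a plain table, while the writer gets to store the repeated tile once.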

It seems like this is already paying off for them, as there are quite a few third-party implementations of the .mbtiles format.

More generally, I really love the idea of using databases as a file format. It wouldn’t be suitable for everything (maybe don’t write your next-gen word processor document format as a position int, word varchar(max) table…) but it seems great for general collections of data like this. Versioning of the underlying database format is theoretically a concern, but that’s significantly mitigated by SQLite’s exceptional focus on maintaining compatibility.

Cities & Code

Things that don't quite fit in 280 characters.

About

I'm a software engineer in Vancouver, Canada. I'm into databases, urban planning, and serverless development. I can be found on GitHub and Twitter for computery things, LinkedIn for professional things, and Twitter again for urbanism things. Sometimes these overlap!
