I recently started experimenting with building tooling for ETL systems. After many years of wrestling with ETL in industry, I had a few questions on my mind:
- Can we make common data issues quick to resolve?
- Can we make automated data transformations as easy to work with as spreadsheets?
Importing data at scale is painful. Your data providers will screw up the formatting, systems will experience connectivity issues, and your transformation logic will fail on cases you didn’t expect. What if our tools focused on helping with those failures, instead of assuming the happy path?
Speaking of transformations, how should we write them? Some systems take a code-first approach, which is great for coders and impenetrable for everyone else. Others take a GUI-driven approach, which usually becomes the stuff of nightmares. I think we can do better, by drawing inspiration from a tool that’s found in every office: