
This is a topic I’ve been musing about lately. The idempotent data load has been a source of much pain and suffering in the lives of many data engineers and data warehouse developers. Apparently some things don’t change with the passage of time. My first job in tech was on a data warehouse team with a classic Kimball-style model on SQL Server; back then, working out how to make data loads and ETL idempotent was the task of the hour. All these years later, working on data lakes in Databricks with Spark … guess what … I’m still worrying about idempotent ETL and data loads.