Anyone who’s been roaming around the forest of Data Engineering has probably run into the newish tools that have sprung up around the concepts of Data Warehouses, Data Lakes, and Lake Houses … the merging of old relational database functionality with TB- and PB-scale cloud-based file storage systems. Tools like Delta Lake, lakeFS, Hudi, and the like.
Sure, these tools have been around for some time, but their adoption has been growing rapidly. I use Delta Lake on a daily basis, taking advantage of the many wonderful features it provides to reduce complexity in data pipelines. But I’ve been waiting a long time for the plethora of “add-on” tooling to come out, stuff that will make my life easier. I recently saw one of the first tools like that for Delta Lake, namely mack.
Mack appears able to “do the hard work for you,” a concept that seems to be growing in popularity, but one I have a fraught relationship with. Double-edged sword? Let’s find out.