I am a glutton for punishment, a harbinger of tidings, a storm crow, a prophet of the data land. My sole purpose is to plumb the depths of the tools we use every day in Data Engineering. I find the good, the bad, and the ugly, splay them out before you, string ’em up, and quarter them.

Today, for the third time, we put that ole’ Duck to the test. I want to see if DuckDB has fixed its OOM (Out Of Memory) errors on commodity hardware … that age-old problem of “larger than memory” data sets.

There are some things you don’t need until you need them. I ran into that situation recently when I had to process some CSV / flat files on short notice. At first it appeared to be easy, but then I realized, as usual, that a little monkey wrench had been thrown into the middle of it.

It is nothing earth-shattering; it’s just something that comes up so rarely that I forget there are ways to deal with these inconveniences without jumping through unnecessary hoops.

I figured a few of us might need the WordPress drama explained like we are 5. So, here you go.

  • WordPress is the GOAT of internet website builders
  • WordPress was co-founded by Matt Mullenweg
  • With much of the internet running on WordPress … hosting WordPress is of course … lucrative and a big business.
  • The founder of WordPress, Matt Mullenweg, is CEO of a company called Automattic
  • WPEngine is the other big gorilla in the WordPress space.
    • hosting platform etc.
  • There is a lot of money involved
  • Mullenweg was/is unhappy with WPEngine
    • He went after WPEngine for being owned by an equity firm, for doing things with WordPress features to “save money,” and for confusing consumers about the WordPress trademark and what is official and what isn’t.
  • The fight turned very public, and now lawsuits are flying back and forth
  • The fight is also spilling over into the open-source community, as there are a myriad of developers and businesses who’ve built their companies around WordPress.

It reminds me of the Rust trademark hoopla. The whole thing has quickly devolved into what is supposed to be “open-source” software being controlled by money-hungry interests who lay claim to trademarks and other “stuff” surrounding brands. They then start telling tons of developers and companies (who’ve been happily doing things for years) that they are now all subject to x, y, z, and that they will be sued and destroyed if they don’t comply.

People take sides, and the open-source world and all the “things” attached to the “thing” in question descend into chaos.

Is there anything worse than the PR (Pull Request) process at most companies? Probably not. It’s the dreaded 600-pound gorilla in the room that no one wants to talk about. Everyone hates it, everyone has to do it. But it doesn’t have to be like that.

There are a few tried and true ways to make the perfect PR that takes all your problems away. Check out the video for more.

This is an interesting one indeed, one that teases and puzzles the brain to no end. Has the Data Warehouse finally died? Has that unruly upstart, the Lake House, finally taken its place atop the seething mass of data we call home? Can we say that, after all these decades, the Data Warehouse Toolkit and Kimball have finally gone the way of the dinosaurs? Maybe. Probably. I don’t know.

I’ve been hacking around with tools and programming since Perl was a thing. I’ve run the gamut of Data Platforms, from large organizations to tiny startups and everything in between. I’ve worked on Data Platforms that dropped ungodly amounts of money on SAP products, and at places where we would build our own massive data processing platforms on Kubernetes.

To each their own, I guess.
