
Hosted (SaaS) vs DIY Data Tools

I’ve been hacking around with tools and programming since Perl was a thing. I’ve run the gamut of Data Platforms, from large organizations to tiny startups and everything in between. I’ve worked on Data Platforms that dropped ungodly amounts of money on SAP products, and at places where we would build our own massive data processing platforms on Kubernetes.

Each to their own I guess.

To buy or build, that is the SaaS question.

Unlike most folk, I have no vested interest in the answer to this question being one thing or another. Buy, build, whatever, I don’t care that much. Well, maybe I do care if it involves me, but you can do whatever you want. It’s a free country, for now.

Anywho, this is an interesting question to reason about, buy vs build, and I think the answer is clear and totally depends on the makeup of your engineering team and the size of your company.

I’m just going to cut to it and tell you out of the gate when you should build and when you should buy.

  • If you’re a very small team, you should probably buy.
  • If you’re a very big team, you should probably buy.
  • If you’re in-between you can consider building.
  • If you can’t keep up with current tech debt, or never work on tech debt, don’t build.
  • If you do average and common things, you should buy.
  • If you’re working in a problem space that is fringe-ish (think geospatial, etc.), you can consider building.
  • If you have lots of time and not much work, you can build.
  • If you typically have more work than time, you should buy.
  • If you’re a very technical team, you can consider building.
  • If you’re not that technical, you should buy.
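For the code-minded, here is one rough way to write that checklist down. It’s a sketch, not a formula: the `Team` fields and the `recommend` function are names I made up for illustration, and treating any single "buy" signal as decisive is my own assumption about how to combine the bullets.

```python
from dataclasses import dataclass


@dataclass
class Team:
    # All fields are hypothetical stand-ins for the bullets above.
    size: str                      # "small", "medium", or "large"
    keeps_up_with_tech_debt: bool
    fringe_problem_space: bool     # e.g. geospatial
    has_spare_time: bool
    very_technical: bool


def recommend(team: Team) -> str:
    """Return "buy" or "consider building" for a given team."""
    if team.size not in ("small", "medium", "large"):
        raise ValueError(f"unknown team size: {team.size!r}")

    # Assumption: any one of these signals is enough to say "buy".
    buy_signals = [
        team.size in ("small", "large"),   # only in-between teams build
        not team.keeps_up_with_tech_debt,
        not team.fringe_problem_space,     # average, common work
        not team.has_spare_time,           # more work than time
        not team.very_technical,
    ]
    return "buy" if any(buy_signals) else "consider building"


if __name__ == "__main__":
    mid_team = Team("medium", True, True, True, True)
    print(recommend(mid_team))
```

Notice how few combinations come out as "consider building". That lopsidedness is kind of the point.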

That should get you down the right road I imagine. Kinda vague, kinda not.

Advantages and Disadvantages of Build vs Buy.

I tried to make it as simple as possible, but it can be hard to decide sometimes, depending on the use case, so let’s set some more of the record straight when it comes to building or buying software.

Firstly, if you are overworked and don’t have an abundance of time on your hands, it’s very unlikely you should build. People underestimate …

  • the amount of work it is to build and deploy your own solutions
  • the amount of upkeep and maintenance those systems require

If you buy a thing, you pay for it, but you also receive something in return: the ability to do more with less. If you buy well, your maintenance cost and time should be low. If you’re building your own thing, you have to keep it up to date, fix bugs, add features, write documentation, etc.

You can build things from scratch that work, I’ve done so. But you have to be ready to commit time and headcount to keep that thing running, train people to use it, make tweaks, etc. There is no free lunch. 
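If you want to put rough numbers on that tradeoff, a back-of-the-napkin comparison might look like the sketch below. The function names, the parameters, and the idea of measuring everything in engineer-months are my own assumptions for illustration; plug in your own figures.

```python
def build_cost(initial_engineer_months: float,
               upkeep_engineer_months_per_year: float,
               monthly_engineer_cost: float,
               years: float) -> float:
    """Rough total cost of building: upfront work plus ongoing upkeep."""
    upfront = initial_engineer_months * monthly_engineer_cost
    upkeep = upkeep_engineer_months_per_year * monthly_engineer_cost * years
    return upfront + upkeep


def buy_cost(annual_license: float,
             integration_engineer_months: float,
             monthly_engineer_cost: float,
             years: float) -> float:
    """Rough total cost of buying: license fees plus one-time integration."""
    integration = integration_engineer_months * monthly_engineer_cost
    return annual_license * years + integration


if __name__ == "__main__":
    # Placeholder inputs, not real figures from any team or vendor.
    print(build_cost(12, 6, 15_000, 3))
    print(buy_cost(100_000, 2, 15_000, 3))
```

The upkeep term is the one people leave out. It keeps growing for as long as the thing is alive.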

Custom-built tools are inherently more flexible, you can build what you want, how you want, and the sky is the limit. You can twist it and fold it to meet whatever needs arise. Not so much with the buying of a thing. It typically is what it is, and you are going to have to move yourself to fit inside its pre-built box.

You need to have the technical chops to build a thing, and to keep it going, as well as add new features. You better not be a bunch of milquetoast programmers if that’s the road you choose. You should be confident in your abilities and the problem space.

Team Size.

I wanted to talk about team size for a minute because it should play a big role in the decision-making process. If you work on a very small team, you should probably not be building major pieces of technology … why? Because if you’re a normal team with a normal workload you have better things to be doing.

You have to make tradeoffs … you need to focus and free time up to work on things that are important for the business. In the end, the business doesn’t care how you deliver what they need, they just want it delivered and working. There is a reason Databricks and Snowflake came in like juggernauts and took over the Data world.

Sure they can cost a pretty penny, but they pay back in dividends. No more teams of engineers working on EMR to make it sing. Databricks just works. You’re freed up to work on important business logic and other tasks. 

This is the idea.

On the other end of the spectrum, if you’re working on a huge engineering team … at a typical company … everything moves slowly, and it takes forever to do a small project … what’s the chance the giant new tool you want to build will get done on time and on budget, be useful, and get adopted? Small.

Medium-sized teams working on fringe problems? Now you might have an argument to build. I’ve been there before. You have the resources to build and maintain, nothing fits perfectly what you need, you have the chops. Build away.

If you buy, watch the costs. If you build, watch the complexity and drive ruthlessly towards completion. Nothing is perfect, but if you ask yourself the above questions, the answer is most likely clear.