How to Solve Data Engineering Problems

One thing I find myself doing these days (I am unsure how I feel about this) is teaching others to solve problems … Data Engineering problems, to be specific. It’s not a stretch for most to imagine that what a person does at Senior+ software-type levels is just write good code all day.

I assure you, this is typically not the case.

I mean, if you are at the Senior+ level of anything it’s kinda expected that you will be able to write good code as needed. You will learn you don’t need to be the best, but you need to be above average to be effective. So, if it isn’t about good code all day … what is it all about?

Hmm … if I had to sum it all up.

Teaching, mentoring, and demonstrating to others how to solve problems effectively and efficiently.

Honestly, the problem space itself isn’t all that important. I know it sounds strange, but after you write enough code for enough years, yeah … you get stumped, yeah there is new stuff you don’t know … but you begin to realize that in most cases, problems can be solved given enough time and teamwork.

First, I’m going to start with what I see most people do WRONG when solving a new problem.

What people do wrong when solving (hard) problems.

This is my experience, so take it for what it’s worth and what you’re paying for it.

  • decide on the solution 5 minutes after learning about the problem.
  • decide to start writing code immediately
  • do not allocate time to specifically think about the problem.
  • do not write anything down or document it (decisions, thought processes, ideas, etc)
  • work in isolation, don’t share ideas, ask questions, get feedback, etc
  • don’t flow chart or draw out solutions
  • don’t read enough about the problem space (docs, Reddit, blogs, Google, etc)
  • don’t walk through a detailed design before writing code.

I think if we try to sum this up, what I try to teach the most is the skill of being a critical thinker: not assuming anything, avoiding early or rushed decisions, questioning everything, writing everything down, and reading A LOT.

The code is the code, the solution is the solution, and Engineers are good at doing that … that isn’t the problem. The problem is not thinking through everything, not knowing all the options, making assumptions, not communicating, not getting feedback, and not finding the hidden corners.

No software project or problem space is free of issues, but what we can do is reduce the friction ahead of time … before we walk down the path of writing code for a certain solution.

We know ahead of time there will be bumps in the road, what we are trying to do is …

  • find the path of least resistance
  • keep everyone informed and involved
  • explore all options
  • create visuals that represent our path and proposed solution
  • explore details ahead of time to discover difficulties and gotchas.

Honestly, the tools we choose … Databricks, Snowflake, BigQuery, Spark, Python, Rust, Polars, AWS … I mean we could go on forever … those are solved problems … most Engineers can make something work. Remember it’s typically not the tools, solutions, or code itself that ends up being the “issue.”

It’s typically …

  • not actually understanding the requirements fully
  • missing important details
  • over-engineering or under-engineering the problem
  • failing to keep the project on track and delivered on time.

These common issues don’t arise from some lack of software engineering excellence or lack of understanding or some bad tooling.

These problems arise from Engineers who think it’s all about the code they write. They think they are smart enough to know the answer right away. They jump in way too fast. They don’t consider reality, only their own egos and the tools/software they want to write.

One day of planning, drawing pictures, reading, and generally pretending you’re back in middle school will save you from failed projects, frustration, and disappointment.

I can hire any engineer and teach them how to code and how to use a tool or platform … that’s a given. What is harder to teach is the ability to think critically, to slow down, to not jump to conclusions, to document ideas and thought processes, and to prove a solution first.

How to solve your next problem.

Here is how to tackle your next problem.

  • Force the requestor to write down their request in a One-Pager
  • Set up a meeting with the project/problem initiator and talk for 30 minutes
  • Create a document for Engineering
  • Write down all your ideas and possible solutions
  • Research each one and document (tons of reading)
  • Pick a few
  • Review with group
  • Start a flowchart that describes and follows the solution end-to-end
  • Write down each step of the process and the solution
  • Review the chart/visual along with the written out solution/flow
  • Review with a larger team and get feedback, iterate
  • Put the final design together … visually and in text (the literal steps)
  • Build a POC if necessary and the problem is large enough
  • Add 20% to your work and time estimates
  • Execute
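
The “add 20%” step in the list above is just arithmetic, but it helps to see it applied per task and to the total. Here is a minimal sketch in Python; the task names and hour figures are made up for illustration, and the padded() helper is hypothetical, not something from a real library.

```python
# Hypothetical task estimates in hours; names and numbers are
# invented purely to illustrate the 20% padding step.
raw_estimates = {
    "ingest source files": 16,
    "transform and validate": 24,
    "load and backfill": 10,
}

BUFFER = 0.20  # the 20% padding suggested in the list above


def padded(hours: float, buffer: float = BUFFER) -> float:
    """Return the estimate with the buffer added on top, rounded to 0.1h."""
    return round(hours * (1 + buffer), 1)


# Pad each task individually, then pad the overall total.
padded_estimates = {task: padded(h) for task, h in raw_estimates.items()}
total_raw = sum(raw_estimates.values())
total_padded = padded(total_raw)

for task, hours in padded_estimates.items():
    print(f"{task}: {raw_estimates[task]}h -> {hours}h")
print(f"total: {total_raw}h -> {total_padded}h")
```

The point is not the code, it’s that the padding gets written down next to the original numbers so everyone can see both.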

Notice that execution, meaning the actual code and solution, comes only at the end, after a lot of upfront work.