Why I Love Rust, but Deploy Python
I’m not sure whether others have this same problem. Maybe they are lucky and get to build in their favorite language 24/7 because it’s their tool of choice. I feel like I have a heavy burden to bear. I love to write Rust … but I deploy Python. Even when I know I could write Rust … Python gets deployed.
Why?
Because there are tools and tools. Part of being a good developer is understanding when and why you should do something, not just blindly doing whatever you want or whatever feels good that day.
This is what separates Senior Engineers from those who might be good programmers but won’t make it to Senior+ … because the job is more than coding. It includes other skills.
If you are interested in what those skills are, you can read more here.
The reality of code in a Data Engineering context.
We need to be context-aware in the school of Data Engineering. We need to have some tact and understand the culture we operate in. No matter how many pundits want it to change, Python is the name of the game in Data Engineering, especially in the ML age we live in.
Data Engineering is very similar to Software Engineering, and they cross paths in many places … but they have different titles for a reason: they are not the same thing.
We think about different things and we solve problems differently, hence they are separate in most places. Don’t get me wrong, I think most Data Engineers severely lack good Software Engineering skills and would benefit greatly from taking Software Engineering best practices back to their home towns, if you will.
This is why I love Rust but deploy Python.
I’m going to give you some bullet points, and you should meditate upon them.
- Python is high-level.
- Rust is low-level.
- You can build the best tooling with Rust.
- You can build things quickly with Python.
- Python is ubiquitous in Data Engineering.
- Rust is more specialized.
- Everything has Python SDKs and APIs.
- Not everything has Rust SDKs and APIs.
The truth of the matter is that you need to fully understand and appreciate the problem you are trying to solve before you reach for a tool, say Rust vs Python.
Would I prefer to build every single thing in Rust? Why yes I would, thank you very much.
Should I write everything in Rust in Data Engineering? Why no, I should not.
Here is a perfect example to make it clear, something that most Data Engineers will end up doing over a million times in their illustrious careers.
Paginate S3 buckets and work with those files, maybe looking for a certain file (or files) matching a string.
I need some code that any average Data Engineer can see and understand, work with, modify, whatever. It’s a simple task that needs a simple solution.
Why would I NOT write it in Python? I mean there are no serious performance implications, I just need something that works.
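For what it’s worth, here is roughly what that Python could look like. Treat it as a minimal sketch rather than the one true way: it assumes boto3 and its `list_objects_v2` paginator, and the function name `find_keys`, the bucket name, and the prefix in the usage comment are made up for illustration.

```python
from typing import Iterator


def find_keys(s3_client, bucket: str, needle: str, prefix: str = "") -> Iterator[str]:
    """Yield every object key in `bucket` (under `prefix`) that contains `needle`."""
    # The paginator follows continuation tokens for us; a single
    # list_objects_v2 call returns at most 1,000 keys.
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # "Contents" is absent from a page when nothing matched the prefix.
        for obj in page.get("Contents", []):
            if needle in obj["Key"]:
                yield obj["Key"]


# Usage, assuming boto3 is installed and AWS credentials are configured
# ("my-bucket" and "raw/" are hypothetical):
#   import boto3
#   for key in find_keys(boto3.client("s3"), "my-bucket", "2023-01", prefix="raw/"):
#       print(key)
```

Passing the client in, instead of creating it inside the function, keeps the thing easy to test with a stub and easy for the next person to read.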
I mean I could do it in Rust … but why?
I guess if I were building some CLI tool to assist other Data Engineers in doing a bunch of common things with cloud files … yes, THEN at that point I would consider using Rust … it’s the perfect language for CLI tools.
If you doubt me, go look at these implementations of S3 pagination in Rust, here and here.
I mean I don’t even know if this code below works, but it would be something along these lines in Rust.
I’m not going to do that to some poor new Data Engineer fresh in from the fields of college. They would run away screaming and never come back.
So, dear friend, I know you might love Scala, Golang, Rust, or C++, but next time you are doing something obvious and simple in a Data Engineering context, reach for that good ole’ Python.
Your bright-eyed young data engineer is going to have to learn to do the hard stuff eventually; while code should be “clean” it also doesn’t mean you need to code for the lowest common denominator fresh out of high school. I agree, though, there’s a right time and place.