Data Warehousing Archives - Page 6 of 6 - Confessions of a Data Guy

Columnstore Indexes – Always Faster Uh?

Columnstore indexes promise to be the savior of every data warehouse. So what are they, when should you use them, and when should you stay away? Columnstore indexes are just what they sound like: data physically stored in a columnar way. This is what makes them so fast when it comes to aggregating large amounts of data. The data is compressed and similar values are stored together, so the database engine can very quickly grab all the values it needs for a SUM, for example, which leads to faster query results.
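To make the columnar idea a bit more concrete, here is a toy Python sketch. It is not how any real database engine implements columnstore indexes; the table, the column names, and the `run_length_encode` helper are all made up for illustration. It only shows why summing one column is cheaper when that column is stored on its own, and why similar values sitting next to each other compress well.

```python
# Toy comparison of row-oriented vs. column-oriented storage.
# The table and column names are hypothetical, for illustration only.

rows = [
    {"order_id": 1, "region": "East", "amount": 100},
    {"order_id": 2, "region": "East", "amount": 250},
    {"order_id": 3, "region": "West", "amount": 75},
    {"order_id": 4, "region": "West", "amount": 125},
]

# Row store: every whole row is visited just to read one field.
row_total = sum(row["amount"] for row in rows)

# Column store: each column lives in its own contiguous list,
# so an aggregate only has to touch the one column it needs.
columns = {name: [row[name] for row in rows] for name in rows[0]}
column_total = sum(columns["amount"])


def run_length_encode(values):
    """Collapse runs of repeated values into [value, count] pairs."""
    encoded = []
    for value in values:
        if encoded and encoded[-1][0] == value:
            encoded[-1][1] += 1
        else:
            encoded.append([value, 1])
    return encoded


# Similar values stored together compress into very few entries.
region_runs = run_length_encode(columns["region"])

print(row_total, column_total, region_runs)
```

Both totals come out the same, of course. The difference is how much data had to be read to get there, which is where the speedup on large tables would come from.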

March 10, 2019

Data, Data Engineering, Data Warehousing, Python

Python and Apache Parquet. Yes Please.

Update: Check out my new Parquet post.
Recently, while delving into and burying myself alive in AWS Glue and PySpark, I ran across a new-to-me file format: Apache Parquet.

It promised to be the unicorn of data formats. I’ve not been disappointed yet.

September 29, 2018

Data Warehousing

Where Good Data Warehousing Goes Wrong.

It wouldn’t be the first time.

The story is usually the same: lots of people, contractors, software installation, months of ETL work, months of database work, testing, testing, and more testing. And then it arrives, a beautiful, spiffy Enterprise Data Warehouse with all its facts and dimensions in all their Kimball glory.

April 18, 2018

Data, Data Warehousing, SQL

T-SQL Basics : Running Totals

Some of the most underused yet powerful functions in T-SQL are window functions. These functions are powerful because they allow calculations over a window of the data that you specify, even while the calculation scrolls through your data row by row.
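A running total is the classic example of that. The sketch below is an assumption-laden stand-in rather than code from the post: it uses Python's built-in `sqlite3` module (which needs SQLite 3.25 or newer for window functions) instead of SQL Server, and the `sales` table and its columns are invented for the demo. The `SUM(...) OVER (ORDER BY ... ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)` clause itself is also valid T-SQL on SQL Server 2012 and later.

```python
import sqlite3

# In-memory database with a hypothetical sales table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_date TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("2018-03-01", 10), ("2018-03-02", 20), ("2018-03-03", 5)],
)

# The window is "every row from the start up to the current row",
# ordered by date, so SUM produces a running total.
running = conn.execute(
    """
    SELECT sale_date,
           amount,
           SUM(amount) OVER (
               ORDER BY sale_date
               ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
           ) AS running_total
    FROM sales
    ORDER BY sale_date
    """
).fetchall()

for row in running:
    print(row)

conn.close()
```

Each output row keeps its own `amount` and gains a `running_total` column, with no self-join or cursor needed.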

March 13, 2018

Data, Data Warehousing

It’s Called a Non-Lookup Dude

Seriously… it’s called a non-lookup, dude. Probably one of the most annoying situations I’ve come across when working on Enterprise Data Warehouse (EDW) teams and projects is the non-lookup problem.

February 22, 2018
