Data Warehouse

Data Warehousing in the 21st Century


To be a successful, relevant company in the 21st century, analytics is no longer an option but a requirement. Yet with all of the conflicting information out there, how do you know where to start, or what to believe?

This article gives you a real-world road map for running a successful business intelligence/analytics team.

There are 1001 different tactics that relate to this subject and probably 10 times that many products and vendors. It can be confusing, especially if you start with methods. However, starting with general concepts may yield more consistent results.

So, to keep it simple, I’ve opted to boil everything down to the 5 big questions: Who, What, When, Where, and Why. Enjoy.


Why

I’ve put this question first for one particularly good reason: this is THE question to start with. I can give you a million-foot-view answer to “why” you should build your warehouse: to answer enterprise-wide questions, to understand the trends and directions your company has followed, and to thoroughly comprehend the path you are on. A strong BI/analytics team with a comprehensive data warehouse also allows you to gauge the success of the campaigns you have put in place to change that path.

As for the micro “why”, this is a question you must answer for yourself. Make no mistake about it: if you want your efforts to be successful, you absolutely MUST answer it. Failing to ask “why” about everything you build usually ends in disaster, and a long, drawn-out disaster at that.

You should ask “why” not only for your implementation as a whole, but also for every project and report. This may seem exhausting, but you know what’s more exhausting? Spending 2 months and 3 resources to build a dashboard and a set of reports that no one needs or wants and, as a result, no one uses. All because no one asked, “Why are we doing this?”


What

I know what you are thinking: “BI Pharaoh, this is a no-brainer. Of course I know what we are building.” My questions would be:

1.) Do you really?

2.) Does everyone know?

Although you may assume that this is a no-brainer, I would invite you to conduct a small experiment. Hop on SurveyMonkey and send out a quick, simple survey to everyone involved in your project who should know its “what”. In other words, the people who should be able to answer the question, “WHAT are we doing?”

You may be surprised at the answers you receive to the “what are you building or doing” question. The answers will reveal the commitment level of your team and let you know how the vision has been received. Take, for example, three construction workers working on the exact same building project. You pull each one aside privately and ask, “Hey, what are you doing and why?” Their replies are:

Worker 1: I’m working to get a paycheck.

Worker 2: We are building a building, trying to be on time so I can get a bonus.

Worker 3: We are building a children’s hospital where new-age research will take place, cancer will be healed, and parents are never charged a dime.

Which one of these workers truly knows what they are building? Make sure you have cast a similar vision when it comes to the things you are building on your BI team.

To put together a truly successful data warehouse, it is also important that you think outside of “one box.” Simply stated, don’t point your data warehouse implementation at just one portion of the business. Think beyond sales. Include finance, HR, operations, customer service, contracts, and possibly even janitorial. You may think these things are not related, but keep in mind that your entire business is interrelated in some way, so your data warehouse should be too.


Where

In the 21st century, you have a veritable plethora of options when it comes to where the warehouse will reside. It could be on premises in your own data center, in a private cloud, outsourced to a local data center, in AWS, in Azure, in Google Cloud, or possibly all of the above. For possibly the first time in IT history, the question of “where” isn’t a barrier. However, that does not mean it doesn’t require planning. On the contrary, the virtually limitless configuration options call for much more planning, as there will be more moving pieces and possibly a distributed model, with pieces of your implementation residing in different architectures.

Who and “Where the Who”

Okay, so that sounds confusing. Allow me to break it down. Let’s start with the “who”. Don’t underestimate this question; it may be a little deeper and more extensive than its three-letter simplicity suggests. Begin with who the solution is for. Don’t confuse this with who is going to use it; the person using it may not be the person the solution is intended for. An example is an ordering system used by agents, where ultimately it’s the customer on the phone that the solution is for. In short, you need to consider and include those the system is for, those who will use it, those who will support it, and those who are building it. Do your best to blur the lines between these different functions. As for “where the who”: consider where these different groups sit and work to bring them together, that is, if you want someone to use the solution and use it properly.

When (Real Time vs Scheduled)

The question of when deals with the “freshness” of data. With all the talk of real time, it is easy to fall into the trap of believing that everything needs to be available right away. Consider the data itself, however. There is no need for real-time data functions on information that only changes once a month, once a quarter, or even once a year! Be smart about the processes you put in place to move data. Don’t let the lure of real time woo you into over-engineering a solution and wasting precious time and resources that could have been used to solve another business problem.
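As a rough illustration of matching the load process to how often the data actually changes, here is a minimal Python sketch. The dataset names, the thresholds, and the strategy labels are all assumptions made up for this example, not recommendations for any particular warehouse; adjust them to your own sources.

```python
from datetime import timedelta

def load_strategy(change_interval: timedelta) -> str:
    """Pick a load cadence no fresher than the source data really is.

    The cut-off values below are illustrative assumptions only.
    """
    if change_interval < timedelta(minutes=5):
        return "streaming"        # data changes constantly: real time may pay off
    if change_interval < timedelta(days=1):
        return "hourly batch"     # changes several times a day
    if change_interval < timedelta(days=28):
        return "nightly batch"    # changes daily to weekly
    return "monthly batch"        # monthly, quarterly, or yearly data

# Hypothetical datasets and how often each one typically changes.
datasets = {
    "web_clickstream": timedelta(seconds=1),
    "open_orders": timedelta(hours=2),
    "hr_headcount": timedelta(days=7),
    "quarterly_budget": timedelta(days=90),
}

plan = {name: load_strategy(interval) for name, interval in datasets.items()}
for name, strategy in plan.items():
    print(f"{name}: {strategy}")
```

Only the first dataset ends up on a streaming path here; everything else is served by a scheduled batch, which is usually cheaper to build and support.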

On the other hand, there are times when real time is needed. In this case, you should explore streaming technologies such as Amazon Kinesis (from AWS) or Apache Kafka. Exploring these types of technologies could take your real-time data access to another level.
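To give a feel for what processing data as it arrives looks like, here is a small Python sketch that keeps a running total up to date after every event instead of waiting for a scheduled batch. It deliberately uses a plain in-memory list as a stand-in for a real stream; the event shape and the function name are assumptions for illustration, and nothing here is the Kinesis or Kafka API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple

# Hypothetical event shape: (region, order_amount).
Event = Tuple[str, float]

def running_totals(events: Iterable[Event]) -> Iterator[Dict[str, float]]:
    """Yield an updated snapshot of sales per region after every event."""
    totals: Dict[str, float] = defaultdict(float)
    for region, amount in events:
        totals[region] += amount
        yield dict(totals)  # copy, so earlier snapshots are not mutated later

# Stand-in for a stream; in practice events would arrive from a streaming service.
sample_stream = [("east", 20.0), ("west", 5.0), ("east", 7.5)]

snapshots = list(running_totals(sample_stream))
print(snapshots[-1])  # {'east': 27.5, 'west': 5.0}
```

The point of the sketch is the shape of the work: each event updates the answer immediately, so a dashboard reading the latest snapshot is never waiting on the next scheduled load.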


This article was meant to bring a high-level understanding to your data warehousing and data solution needs. I will go into depth on some of these topics later. However, it is imperative to have the basic concepts under your belt before moving to full implementation.