As data analysis matures, more and more enterprises are adopting business intelligence and analytics tools. Despite the variety of sectors involved, these companies share a common goal: to use business intelligence software to transform data into actionable insights and competitive initiatives.
As a primary source of competitive advantage, data analysis should deliver a better understanding of the factors that shape markets and influence businesses, and it should help companies act on that knowledge. Ultimately, the hope is to outmaneuver and outsell competitors while proactively addressing customer needs.
With business intelligence and data analytics becoming the go-to approaches for enterprises, many companies are choosing to invest in developing their own data warehouse, sometimes loosely referred to as a data lake even though the two are not the same thing. To generate valuable insights from deep data analysis, enterprises need a reliable data warehouse as the foundation.
Today, a data warehouse is expected to do more than just integrate data from multiple sources for better, more accurate analysis. It must also be reliable, traceable, secure, and efficient at the same time. These qualities are what differentiate it from a simple database, especially in business intelligence.
As part of our real estate data and analytics platform, Cherre enables customers to build a massive data warehouse from their disparate real estate data. We pull and collate information from public, paid, and internal sources, so our experience with big data analytics is extensive. This article outlines the lessons we have learned from our own experience implementing data best practices.
Valuable insights can only be generated when the data warehouse successfully integrates data from multiple sources for reporting and analysis. This is where good data warehouse governance becomes very important. There are several enterprise data warehouse best practices and governance tips to keep in mind, along with key principles to implement.
Good data warehouse governance starts with three crucial components. The first is leadership from team champions in data governance. This is a DevOps concept: best practices are rolled out through advocates, strong and respected leaders at any position or level in the company. Such champions should coach the rest of the team to understand and implement the values of excellent data governance. Good governance needs this kind of support and endorsement from invested stakeholders to improve adoption across the whole enterprise.
The more stakeholders are involved and reinforce cohesion, the easier it will be to achieve good data warehouse governance. A unified data initiative needs to be financed, enforced, and prioritized across the entire enterprise. Champions can help teach the team about the importance of consistency, which saves time and ensures the work becomes a long-term best practice rather than merely a short-term project.
The second component is organization. For good data warehouse governance to take hold, best practices and data management policies need to be applied correctly and, above all, consistently. Strict control and constant oversight are very important for maintaining the quality of the insights generated by data analysis.
Organization matters for another reason: the insights generated by a company’s data warehouse affect the entire enterprise. Understanding the importance of safeguarding the data warehouse through sound policies and good governance provides the right foundation.
The third component is process. Good data warehouse governance is never a one-time thing. It is a continuous process of review, fine-tuning, and enforcement of data governance policies. For the data warehouse to remain effective in serving its purpose, the data warehouse governance process must also be effective.
The easy parts of the equation are financing the governance process and creating data management policies. The real challenge comes from enforcing those policies, especially when the enterprise is relatively complex in structure. Even simple tasks such as defining and monitoring key performance metrics can become complex when the processes are not well-defined.
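To make "defining and monitoring key performance metrics" a little more concrete, here is a minimal sketch in Python. Everything in it is an illustrative assumption rather than a description of Cherre's platform: the two metrics (completeness and freshness), the thresholds, and the field names `parcel_id`, `price`, and `loaded_at` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class MetricResult:
    name: str
    value: float      # observed score between 0.0 and 1.0
    threshold: float  # minimum acceptable score (assumed policy value)

    @property
    def passed(self) -> bool:
        return self.value >= self.threshold


def completeness(rows, required_fields):
    """Fraction of rows in which every required field is present and non-empty."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(f) not in (None, "") for f in required_fields) for row in rows
    )
    return complete / len(rows)


def freshness(rows, now, max_age):
    """Fraction of rows that were loaded no longer than max_age before now."""
    if not rows:
        return 0.0
    fresh = sum(now - row["loaded_at"] <= max_age for row in rows)
    return fresh / len(rows)


def evaluate(rows, now):
    """Apply the (hypothetical) governance policy and return one result per metric."""
    return [
        MetricResult("completeness", completeness(rows, ["parcel_id", "price"]), 0.95),
        MetricResult("freshness", freshness(rows, now, timedelta(days=7)), 0.90),
    ]


# Hypothetical sample data: one row is missing a price, one row is 30 days old.
now = datetime(2024, 1, 31)
rows = [
    {"parcel_id": "A1", "price": 100, "loaded_at": now - timedelta(days=1)},
    {"parcel_id": "A2", "price": None, "loaded_at": now - timedelta(days=2)},
    {"parcel_id": "A3", "price": 300, "loaded_at": now - timedelta(days=30)},
    {"parcel_id": "A4", "price": 400, "loaded_at": now - timedelta(days=3)},
]
results = evaluate(rows, now)
for r in results:
    status = "ok" if r.passed else "ALERT"
    print(f"{r.name}: {r.value:.2f} (threshold {r.threshold:.2f}) -> {status}")
```

With this sample data both metrics score 0.75 and fall below their thresholds. The point of the sketch is that a written policy only becomes enforceable once it is expressed as a check that can run on every load.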
A closer look at how data warehouse governance must be approached, based on the three components discussed above, makes it clear that this is a process of continuous improvement driven by strong advocacy from the data governance champions. The primary objective of good data warehouse governance is to keep the data warehouse relevant and effective, and that often means adjusting to business objectives when needed.
It is up to strong leadership to lead the way in identifying the necessary changes. Changes in demand are the most common ones to deal with, and responding to them depends on developer buy-in and support from senior management.
Different decisions require different sets of information and insights. This gives the data warehouse champions a unique vantage point in seeing how the data warehouse must meet demand for information.
The same is true for changes to input. As demand for insights and information shifts, the data warehouse must be able to integrate data from new sources. At the same time, it needs to remain effective enough to eliminate unnecessary data sources from its catalogue. These changes must happen quickly to avoid bias and deviations.
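The idea of a catalogue that gains new sources and sheds unnecessary ones can be sketched as follows. This is a minimal illustration under stated assumptions: the `DataSource` and `SourceCatalogue` classes, the "idle for more than N days" retirement rule, and the source names are all hypothetical, and a real catalogue would track far more metadata.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DataSource:
    name: str
    kind: str           # e.g. "public", "paid", or "internal"
    last_queried: date  # last time any report or model read from this source


@dataclass
class SourceCatalogue:
    sources: dict = field(default_factory=dict)

    def register(self, source: DataSource) -> None:
        """Add a new source; refuse silent overwrites so changes stay traceable."""
        if source.name in self.sources:
            raise ValueError(f"source already registered: {source.name}")
        self.sources[source.name] = source

    def retire_unused(self, today: date, max_idle_days: int) -> list:
        """Remove sources idle for more than max_idle_days; return their names."""
        stale = [
            name
            for name, src in self.sources.items()
            if (today - src.last_queried).days > max_idle_days
        ]
        for name in stale:
            del self.sources[name]
        return sorted(stale)


# Hypothetical usage: three sources, one of which has not been read in months.
catalogue = SourceCatalogue()
catalogue.register(DataSource("county_records", "public", date(2024, 5, 20)))
catalogue.register(DataSource("legacy_feed", "paid", date(2023, 11, 1)))
catalogue.register(DataSource("crm_export", "internal", date(2024, 4, 15)))

retired = catalogue.retire_unused(today=date(2024, 6, 1), max_idle_days=90)
print("retired:", retired)
print("remaining:", sorted(catalogue.sources))
```

Keeping the retirement rule in code rather than in someone's head is one way to make such changes happen quickly and consistently.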
The infrastructure supporting a company’s data warehouse is also constantly changing. At the very least, more storage is required as more data sources are added. Advanced processing such as machine learning-based big data analysis requires more computing power and memory. There are even changes in networking configuration to anticipate.
Maintenance is the last component in the equation. It involves assessing potential risks, handling operation issues, and making sure that the data warehouse remains operational at an efficient level. That last part is a big challenge for bigger enterprises, since they tend to overstress resources sooner than needed in order to fill performance gaps.
It’s important when setting up a data warehouse to make sure that data governance policies and enforcement efforts remain true to the business objectives of the company. However, an improved organizational element is not the only component needed in order to achieve good data warehouse governance.
As well as encouraging successful adoption through strong leadership, it is also a good time to look at the implementation teams. These are the teams that handle the day-to-day tasks of ensuring compliance, maintaining the data warehouse itself, and making sure that the data warehouse (and the data streams it integrates) continues to perform optimally.
As a best practice, the implementation team should include the data governance champions. As suggested earlier, they can come from any role on the team, such as network administrators, systems administrators, data engineers, data modelers, and, when UI or other components are needed, front-end developers.
Leadership needs to consistently support all members of the implementation team and help translate enterprise data analysis needs into data warehouse policies. It is also up to leadership to set the direction of activities such as development and utilization, and to make sure that feedback from the implementation teams and other stakeholders flows well.
There are multiple reasons why good data warehouse governance is a must, and they go beyond the need for better data collection and management. Yes, having an efficient system for collecting and processing data allows the enterprise to benefit from lower data management costs in the long run, but the business implications are no less significant.
For starters, data can be fully integrated and processed holistically. Data relating to the financial activities of the company, for instance, becomes more valuable when combined with external data about market growth, competitors’ actions, and industry average values. Sales data, in particular, can be analyzed more deeply within the context of market performance and changes.
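As a small illustration of reading sales data in the context of market performance, the sketch below compares a company's period-over-period sales growth with the market's growth for the same periods. The figures are invented and the function names are hypothetical; it only shows the kind of calculation an integrated warehouse makes easy.

```python
def growth_rates(values_by_period):
    """Period-over-period growth for a mapping of period -> value.

    Relies on dict insertion order, so periods must be inserted chronologically.
    """
    periods = list(values_by_period)
    return {
        cur: (values_by_period[cur] - values_by_period[prev]) / values_by_period[prev]
        for prev, cur in zip(periods, periods[1:])
    }


def relative_performance(sales_by_period, market_growth):
    """Company growth minus market growth, for periods present in both inputs."""
    company = growth_rates(sales_by_period)
    return {p: company[p] - market_growth[p] for p in company if p in market_growth}


# Invented figures: internal sales totals and an external market growth series.
sales = {"2021": 100.0, "2022": 110.0, "2023": 115.5}
market = {"2022": 0.05, "2023": 0.08}

for period, gap in relative_performance(sales, market).items():
    print(f"{period}: {gap:+.1%} versus the market")
```

On these invented numbers, sales grew in both years, but the comparison shows the company beating the market by roughly five points in 2022 and trailing it by roughly three points in 2023, which is a different story from the raw sales figures alone.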
The result is a healthier data-driven decision-making process, and one that encourages collaboration between departments. When a thorough analysis is performed, multiple aspects can be taken into account from the start. When deciding to expand the manufacturing line, for instance, market insights can be just as valuable as data from the sales and marketing teams.
Good data warehouse governance can also improve the bottom line, both by reducing CAPEX and OPEX and by increasing revenue through the discovery of new opportunities. These are objectives that can be achieved through better data management and more accurate decision-making processes.
A data warehouse doesn’t have to be complex. Through good data warehouse governance and the implementation of data management best practices, everyone in the enterprise can play an active role in maximizing the business benefits of a data warehouse.
▻ VP of Engineering @ Cherre ▻ Cloud Solutions Architect ▻ DevOps Evangelist
Stefan is an IT professional with 20+ years of management and hands-on experience providing technical and DevOps solutions to support strategic business objectives.