Data storage fads have come and gone over the past decade as the industry shifted from on-premises data storage to the cloud. Remember how “Big Data” was going to change everything? Then came the Data Lake, which held great promise but still left many questions unanswered. The governance and data quality controls needed to provide dimensional constraints just weren’t there. It seemed that every vendor began to promote its own version of “The Modern Data Warehouse”. However, most first-generation, cloud-based, file-based data products don’t naturally blend with analytic reporting platforms like Power BI.
When I first started attending conference and user group sessions about Lakehouse architecture, I didn’t get it, but I do now, and it checks all the boxes. As a Consulting Services Director in a practice with over 200 BI developers and data warehouse engineers, I see first-hand how our customers, large and small, are adopting the Lakehouse for BI, data science and operational reporting.
What is so attractive about this Lakehouse thing, anyway? Although SQL Server and Oracle will be around for a long time, many now consider them to be “legacy” databases used to manage line-of-business data. ETL tools like SSIS and Informatica are now has-beens. Students are coming out of universities coding in Python. Data professionals are accustomed to using notebooks to transform sets of data. Data engineers use pipelines rather than packages. File-based data is easier and less expensive to manage. The Parquet format provides far more flexibility than text files, but we can still use JSON, XML and CSV files for portability. Delta change tracking is magical. The Spark engine is the industry standard, and it is universal for fast queries and on-demand processing. Scale-out clustering allows you to pay only for what you use, when you need it.
The Lakehouse is the evolution of the earlier cloud data platform, which came in many pieces with “some assembly required”. All of those components are modern, mature and capable, but they are complicated and require specialized skills. Imagine that the new version is easier to assemble, with instructions that are only a few pages of stick figures, and it comes with an Allen wrench.
We’re seeing consulting customers put Lakehouse and BI solutions online in just a few weeks. Then they iterate to scale up their modern data warehouse/BI platform, training their Center of Excellence champions and Data Governance organizations as they progress. In short, the Lakehouse paradigm is working!