As I visit businesses, consulting clients and training classes to teach data reporting and analytics, there is a recurring conversation. It is one that I have had for eighteen years. The audiences change and the technology implementations are a little different over time, but the essential conversation is still the same.
This happened again last week as I was trying to explain the unique characteristics of Multidimensional and Tabular storage to a client. I’m developing a training course where I need to explain the concepts once again, and then it hit me: these unique data storage and analytic technologies do what we dream about in science fiction stories and achieve capabilities we imagine existing in the distant future. Channel surfing on television this week, I watched an episode of Timeless, a Star Trek rerun and a Star Wars movie, where time travel, space-warp travel and teleportation were commonplace realities. Fictional as they were in those portrayals, I think these concepts are very real in our technology landscape. Please indulge me as I explain.
We live in a linear world. We reside in a place, work in a place, store and move stuff from place to place. Centuries ago, if man needed to move something (perhaps just himself) to a distant place, he would walk, ride a horse or take a sailboat. In weeks or months, he would arrive in another place. Today, we get in a car, navigate the streets, perhaps highways and Interstates, and then arrive in a different place within minutes or hours. For longer trips, we board a large metal tube and sit very close to several complete strangers as we climb up into the sky; some period of time goes by and then we somehow arrive in a distant place along with our stuff. At 35,000 feet, where the air is very thin, a jet airplane can travel far faster than anything on the ground, leaving the dense air where drag restricts speed, only to re-enter that atmosphere as it arrives. To someone from the distant past, this experience would seem fictional.
On a daily basis, I sit or stand in front of my computer monitors, and see and speak with people in distant places. We have live conversations about the weather in their time zone or hemisphere. Through some strange but now common process, technology converts our speech, images, directions, documents, effort and thoughts into network packets and electrons that travel almost instantly through space, sometimes leaving the earth for a short period, to be transported and reassembled somewhere else.
Years ago, when I wanted new stuff, I would drive to the store. If they didn’t have what I wanted or if it was too expensive, I would spend the day driving to different stores. Today, I “go” to the store on my computer and the very next day, my stuff arrives on my doorstep. We are continually inventing ways to bend space and teleport things within the changing confines of reality.
Data storage is much like the real world. We store terabytes and petabytes (numbers we can’t even explain) in a small space. But, to navigate through the linear storage structure of relational databases and files, the data might as well be thousands of miles or kilometers apart. In order to perform time-variance and statistical calculations, program code must access a group of records in one location to aggregate them, and then find a related group of records, perhaps millions of rows apart, to perform another aggregation. The query might need to perform this operation dozens, hundreds or thousands of times before it can calculate the final results. One of the best examples is a time-series calculation where a query must “visit” records for each time period to perform an aggregation. The effort is compounded when the goal is to compare aggregate values and variances for parallel and relative periods (e.g. “one year ago”, “three years ago”, etc.) and then perform forecasting or regressions.
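To make the “visiting” concrete, here is a minimal sketch in Python of a year-over-year comparison against a flat list of rows. Everything in it is an assumption for illustration: the `Sale` record, the sample data and the function names are hypothetical, and a real relational engine would use indexes and an optimizer rather than a naive loop. The point is only that each period in the comparison costs another pass through the rows.

```python
from collections import namedtuple

# Hypothetical record layout; not taken from any real schema.
Sale = namedtuple("Sale", ["year", "month", "amount"])

# Hypothetical sample rows, kept in arrival order as a flat table would hold them.
SALES = [
    Sale(2022, 1, 100.0),
    Sale(2023, 1, 120.0),
    Sale(2022, 2, 80.0),
    Sale(2023, 2, 60.0),
    Sale(2022, 1, 50.0),
    Sale(2023, 1, 30.0),
]


def total_for_period(rows, year, month):
    """Scan every row and sum the ones that belong to the requested period."""
    return sum(r.amount for r in rows if r.year == year and r.month == month)


def year_over_year(rows, year, month):
    """Compare one period with the same month a year earlier (two full scans)."""
    current = total_for_period(rows, year, month)
    previous = total_for_period(rows, year - 1, month)
    return current, previous, current - previous


if __name__ == "__main__":
    for month in (1, 2):
        print(month, year_over_year(SALES, 2023, month))
```

With six rows this is trivial, but the shape of the work is what matters: comparing twelve months against the prior year with this approach means twenty-four trips across the whole table.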
In relational storage, the query engine must “drive” from place to place, navigating the streets (file groups, files, pages and rows) to find the records. In an analytic structure (like an Analysis Services cube, an SSAS Tabular model or a Power Pivot/Power BI in-memory model), the query engine performs a calculation operation and then teleports to a new location to find a different group of related records.
In a multidimensional cube, the storage processing code quite literally folds space to create the cube structure and pre-calculate some of the aggregate values. Rather than leaving all the data in a linear table, it actually moves data segments close together, reshaping the data structure (thus, the term “cube”). This approach required a great deal of complexity in both the software and the resulting data structure. Advances in computer hardware and software techniques helped simplify the approach. The in-memory tabular model exists in linear yet compressed fragments of space, where the actual distance between adjacent records is much smaller than it appears to be. When the query process is done looking up a group of records, it warps to the next group of related records using a memory address. Instead of traveling through space, it steps out of it, like the jet airplane leaving the atmosphere, to re-enter in a different location. It takes much less time to run queries because it is not moving through all the rows of linear tables.
image: microsoftpressstore.com, Ferrari, Russo
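The in-memory idea described above can be sketched as a toy in Python. To be clear, this is not how Analysis Services or any Microsoft engine is actually implemented; it is a simplified illustration that assumes two general techniques, dictionary encoding of a column and a lookup from each encoded value to the row positions that hold it. All names and sample data here are hypothetical.

```python
from collections import defaultdict


def build_column_store(rows):
    """Split (period, amount) rows into columns and index row positions by period."""
    dictionary = {}                # distinct period value -> small integer code
    period_codes = []              # one code per row instead of the repeated value
    amounts = []                   # the measure column, stored contiguously
    positions = defaultdict(list)  # code -> row positions holding that period

    for position, (period, amount) in enumerate(rows):
        code = dictionary.setdefault(period, len(dictionary))
        period_codes.append(code)
        amounts.append(amount)
        positions[code].append(position)

    return dictionary, period_codes, amounts, positions


def total_for_period(store, period):
    """Jump straight to the rows for one period instead of scanning the table."""
    dictionary, _codes, amounts, positions = store
    code = dictionary.get(period)
    if code is None:
        return 0.0  # period never seen, so there is nothing to add up
    return sum(amounts[i] for i in positions[code])


# Hypothetical sample data in arrival order.
ROWS = [
    ("2022-01", 100.0),
    ("2023-01", 120.0),
    ("2022-02", 80.0),
    ("2023-02", 60.0),
    ("2022-01", 50.0),
    ("2023-01", 30.0),
]
STORE = build_column_store(ROWS)

if __name__ == "__main__":
    print(total_for_period(STORE, "2023-01"), total_for_period(STORE, "2022-01"))
```

The repeated period strings collapse into small integers (the “compressed space”), and the positions lookup plays the role of the memory address: the query touches only the rows it needs rather than walking past all of the others.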
By using unique and innovative data storage methods (to reduce storage space and access data faster without moving through data), we can, in effect, perform time travel, space warping and teleportation.