sharing my experiences with the Microsoft data platform, SQL Server BI, Data Modeling, SSAS Design, Power Pivot, Power BI, SSRS Advanced Design, Dashboards & Visualization since 2009
The message from the Microsoft Business Applications Summit this week was very clear: Power BI is growing up.
We have known for a while that Power BI is a great front-end tool for enterprise-scale tabular and multidimensional models. Power BI Premium will now allow almost limitless data scale using extremely large datasets by leveraging composite models that employ both DirectQuery and VertiPaq cached-mode tables. This feature shatters the previous barrier where a data model could support either DirectQuery or cached mode (imported data), but not both. Additionally, Microsoft publicly unveiled plans to add IT-developer-centric features to Power BI Desktop for managing large models, version control and the application development lifecycle.
These announcements leave many open questions about how Microsoft will continue to support self-service and enterprise customers with one tool, and about the long-term future of SSAS and Visual Studio-based SSDT. At this point, none of these tools are on the chopping block, but it is clear that James Phillips and his team have significant momentum and are not slowing down to wait for other products to catch up.
In this interview, Christian Wade describes the benefits of this new capability. Christian is a Group Program Manager on the Power BI and Analysis Services product team, with a focus on data model development and scalability features.
Composite Models & Aggregations
In the opening keynote and again in his sessions, Christian demonstrated Power BI reports on a taxi driver activity database with over a trillion rows of raw data. The larger dataset was in a Spark cluster, accessed using DirectQuery. Aggregated tables were stored in the in-memory model using the new composite model feature. As the data was explored in report visuals, the Power BI engine would seamlessly switch from tabular in-memory aggregate tables to the DirectQuery source data in order to return low-level details. Composite models will allow mashing up imported database and file-based data with DirectQuery sources.
There are limits and complexities with these new features. You cannot mash up imported tables in a Power BI model that is based on a live connection to SSAS, but the enterprise-scale features coming to Power BI arguably give a solution architect fewer reasons to select SSAS over Power BI for serious data modeling. With incremental data refresh, large model support, row-level security and many other “big kid” features, Power BI might be a preferable choice. I’m not ready to rule out Analysis Services as the better option for most enterprise solutions – at least not in the near future – but Power BI is definitely heading in that direction.
Power BI Desktop Features for Developers
News Flash: Power BI Desktop will be the primary tool for serious, enterprise data model development. A bunch of features are in the works and will arrive soon in preview. The obvious question is what will happen to SSDT for Visual Studio, and the answer is that it’s not going away, but the product team is putting their energy into enhancing Power BI Desktop for now. It will likely have some kind of “Beginner/Advanced” switch to simplify things for the novice user and isolate advanced or complicated, developer-centric features.
Here are a few screen captures of the forthcoming features:
Schema Compare & Model Integration
Similar to the BISM Normalizer tool created by Christian Wade, a fully-supported compare-and-merge tool will be baked into Desktop. The early build is branded with components from MAQ Software.
Relationship Diagram Designer, Properties Pane and Application Lifecycle Management Features
A new-and-improved diagram designer is imminent. It will allow large models to be more effectively managed and displayed. You will have the ability to save individual views of the diagram with selected, subject-area tables.
Like in Visual Studio, properties can be set and managed in a Properties sheet window. Support for standard SSAS features like display folders is coming and can be managed here.
Business Applications Release Notes
With the velocity of features and updates, a comprehensive set of release notes is available here. This document will be updated frequently with the revised roadmap, feature history and announcements.
APIs and PowerShell Cmdlets for Power BI Administration
At the MVP Summit in March, Kay Unkroth promised API endpoints and PowerShell cmdlets for managing all aspects of Power BI. He even did a dance and sang a song to make the announcement memorable. This week, the product team delivered what they promised, with the official announcement here.
The following features and capabilities are on the short list for delivery in the next two to three months:
Power BI Service
The new “modern workspace” will eventually replace the current workspaces that are integrated with Office 365 Groups. The new app workspace will no longer be bound to an Office 365 Group. The feature will be optional at first, but will replace current workspaces soon.
Users can now be assigned the Member, Admin or Contributor role
Groups can be added to a workspace
Premium capacity tenants can have nodes in different regions
Admins will have visibility and access to all objects and resources
Microsoft Business Applications Summit
Keynotes and Featured Sessions from the Microsoft Business Applications Summit are here.
Since starting the Guy in a Cube series over three years ago, Adam Saxton has become the front man for Microsoft Business Intelligence feature announcements and Power BI how-to tutorials. Joined by Patrick LeBlanc last year, the Guy in a Cube series features over 200 short video tutorials, with at least two new clips added every week. The Guy in a Cube YouTube channel currently has over 13,000 subscribers.
I always look forward to spending time with both of these guys at community events. I first met Adam several years ago when he was a Senior Support Escalation Engineer for Microsoft who helped with a number of tough BI server setups. Patrick did senior-level consulting work for customers in the field before joining Microsoft as a Data Platform Solutions Architect. Adam and Patrick are natural-born entertainers. With their deep knowledge of database technology, security and the entire BI, analytics and reporting stack, they offer one of the top online resources for learning and keeping up with the rapidly expanding Microsoft BI platform.
At conferences and events, I rarely see Adam without a camera in his hand, so it was a real treat to get them in front of my camera for this interview. We chatted about how they got started, how they continue to use their series to highlight new and important features, and how they provide guidance to support the Microsoft BI community.
In this interview during the 2017 PASS Summit, we continue to explore the topic of this series of blog posts on Microsoft BI solutions for the enterprise. Patrick and Adam share their guidance about managing large-scale solutions, version control and multi-developer projects using Power BI, Analysis Services and SQL Server.
Between sessions at the PASS Summit, I had the privilege of interviewing Alberto Ferrari and Marco Russo, SSAS Maestros and BI industry thought leaders. Alberto and Marco are revered as expert trainers and practitioners of SSAS and Power BI solutions. They are also well known for evangelizing best practices using the DAX calculation and query language. We spoke about some of the challenges and lessons they have learned from years of large-scale project implementations.
SSAS Tabular and Power BI Desktop share the characteristic that the semantic data model definition is described in a single file. Compared with older multidimensional model projects, the single-file approach simplifies much of the development effort, but it can make it challenging to coordinate projects with multiple developers. They shared tool recommendations to assist with project integration, version management and development. Marco and Alberto recommend a number of community-supported add-ins and other freely available software to enhance Visual Studio and Power BI Desktop.
I sincerely appreciate these two gentlemen taking the time to share their insights in this short interview, and their many contributions over the years that continue to improve Microsoft’s industry-leading Business Intelligence platform and the community that supports it. A comprehensive library of resources, training, tools and articles is available at SQLBI.com.
During the interview we talked about several different tools that they adamantly endorse for any SSAS developer to have at their disposal. From my own experience, in rough order of priority, I recommend:
BIDS Helper / BI Developer Extensions – This Visual Studio add-in is an important tool for anyone developing Business Intelligence solutions in any version of Visual Studio or SQL Server. It includes several essential features and capabilities for SSIS, SSRS and SSAS projects. It has been a community-developed and community-supported tool for many years, and is considered by most experienced BI developers to be a core extension of Visual Studio SSDT (formerly called “BIDS”).
Community-developed tools like this help to spur future enhancements to the officially-supported capabilities in later Microsoft development tooling. I suggest that you carefully read the documentation for SSAS Tabular features in this tool because some of the options only apply to certain versions of SSAS. Some of the BIDS Helper features for SSAS 2012 and 2014 have since been added to the officially-supported designer for SSAS 2016 and 2017. BIDS Helper was initially maintained on the Microsoft CodePlex site (here: http://bidshelper.codeplex.com/) and has since been moved to GitHub here: https://bideveloperextensions.github.io/. It can now be installed from the Visual Studio Tools > Extensions and Updates menu by searching for “BI Developer Extensions” in the Online section.
DAX Studio – This is a stand-alone application and the most comprehensive DAX, SSAS and Power BI calculation, query editing and performance-analysis tool available. It is an open source project managed and maintained by a collaboration of several community MVPs and industry leaders (coordinators are Darren Gosbell & Marco Russo). The latest version and codebase have recently been moved from CodePlex to http://daxstudio.org.
Tabular Editor – An open source, stand-alone SSAS Tabular model design tool that runs outside of Visual Studio. It is distributed through a GitHub repo at: https://tabulareditor.github.io. The source code is maintained at: https://github.com/otykier/TabularEditor. It is described as a lightweight editor for SSAS Tabular Models built in .NET/WinForms and utilizes the Analysis Management Objects (AMO) library.
DAX Editor – an extension to SSDT that makes it easier to edit DAX measures in a Tabular project. It includes DAX syntax highlighting, DAX formatting, a textual representation of measures defined in a Tabular project, and extracting/importing measures from/to a Tabular model file (.BIM). This add-in is developed and maintained by SQLBI. It can be found in the Visual Studio Tools > Extensions and Updates menu or downloaded from SQLBI at: https://www.sqlbi.com/tools/dax-editor.
BISM Normalizer – An add-in for Visual Studio/SSDT that can be installed from the Visual Studio Tools > Extensions and Updates menu by searching for “BISM Normalizer”. This is an essential tool if you need to compare multiple SSAS Tabular projects and databases (including workspace databases) and then manage the steps to selectively merge additions and differences between objects. I have found this useful for integrating parallel development work on the same SSAS Tabular project, but I will warn you that when used routinely to merge multiple changes into a single project, it can be a complex undertaking and not something I recommend on a daily basis.
BISM Normalizer is developed and maintained by Christian Wade, formerly an independent consultant and now a Senior Program Manager on the SSAS product team. Although Christian works for Microsoft, this is still considered a community-supported, third-party tool not officially supported by Microsoft. It is, however, the only tool designed specifically to perform schema comparisons and object merging in SSAS.
In our discussion, we also refer to these tools and resources:
VertiPaq Analyzer – An older tool (actually an Excel Power Pivot workbook) created by Kasper de Jonge, a Program Manager on the Power BI/SSAS product team, that imports SSAS DMV query results and reports object size, object data compression rates and optimization statistics. It has since been updated and enhanced. This is a convenient way to get information quickly about your SSAS or Power BI model. The same information can be obtained, with a little more effort, by running DMV queries in DAX Studio or SSMS; a sample query appears just after this list. Links to the original and updated versions are available on the SQLBI site at: https://www.sqlbi.com/articles/data-model-size-with-vertipaq-analyzer/.
DAX Formatter – An online tool for optimizing and reformatting DAX expressions, available from SQLBI, at: http://www.daxformatter.com/
Tabular Modeling in Microsoft SQL Server Analysis Services – a Microsoft Press book written by Alberto and Marco that applies mainly to SSAS Tabular 2016 and 2017 projects. This is an absolutely invaluable reference and tutorial on SSAS design best practices and design techniques. Previous books from our Italian friends include DAX Patterns and The Definitive Guide to DAX, all excellent learning resources that I have in my book library.
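As a footnote to the VertiPaq Analyzer entry above, these two DMV queries (real schema rowsets) can be run in DAX Studio or SSMS against a connected tabular model to return the raw statistics that the workbook summarizes. This is only a minimal sketch of the approach:

-- Memory used by every object on the instance
SELECT * FROM $SYSTEM.DISCOVER_OBJECT_MEMORY_USAGE

-- Per-column storage details (dictionary size, encoding, etc.) for the connected database
SELECT * FROM $SYSTEM.DISCOVER_STORAGE_TABLE_COLUMNS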
Shortly after this conversation, Marco and Alberto delivered two sessions at the PASS Summit conference about DAX best practices and performance optimizations. A little later, Marco sat down with Guy in a Cube hosts Adam Saxton and Patrick LeBlanc to chat about these sessions. You can find that interview here. I also had a chance to catch up with Adam and Patrick in an earlier interview during the conference, which I will share in a later blog post.
The best method to implement row-level security in a published Power BI model or SSAS Tabular model consumed from the Power BI service will depend to some degree on your data structure and requirements. The method I demonstrate here is one of the most flexible approaches and one that I commonly use in my projects. Continue reading →
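Before the full walkthrough, here is the core idea of the dynamic approach as a minimal sketch (the table and column names are hypothetical). A security role applies a DAX filter expression to a dimension table, comparing a column of user logins to the identity of the connected user:

-- Role filter expression on the hypothetical Regions table;
-- in the Power BI service, USERPRINCIPALNAME() returns the signed-in user's UPN
Regions[ManagerEmail] = USERPRINCIPALNAME()

Because the Regions table filters the related fact table through the model relationship, each user sees only the fact rows for their own regions. Against an on-premises SSAS model using Windows authentication, USERNAME() (which returns DOMAIN\user) is commonly used instead.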
NEWS FLASH: Power BI reports can be deployed to the SQL Server Reporting Services web portal. The production-ready release is targeted for mid-2017. This is much sooner than most folks in the community were anticipating. An installable technical preview is targeted for January of 2017. This announcement was just made on the SQL Server Reporting Services Team Blog.
From the announcement:
Which Power BI capabilities do you plan to add to SSRS?
We’re focusing our efforts on adding Power BI reports to SSRS and on supporting the features Power BI Desktop offers for use within these reports, including a variety of data connectors and visualizations. Beyond the current Technical Preview, we plan to add support for
Custom visuals
Additional data connectors (besides Analysis Services), cached data, and scheduled data refresh
Power BI mobile apps (viewing Power BI reports stored in SSRS)
Given our focus on Power BI reports, we have no current plans to add other Power BI features (such as “dashboards,” Q&A, Quick Insights, and others) to SSRS.
What can we expect in the next Technical Preview of Power BI reports in SSRS?
With the current Technical Preview, we used a pre-configured Azure VM to offer you a preview that’s quick and easy to try. Our focus for the next Technical Preview is on a version you can download and install on your own VM or server, a necessary next step toward a production-ready version. Aside from this aspect, the functionality will be similar to the current Technical Preview’s.
When will we have this next Technical Preview?
We’re targeting January 2017 to release this next Technical Preview.
What’s the release vehicle for a production-ready version?
We plan to release the production-ready version in the next SQL Server release wave. We won’t be releasing it in a Service Pack, Cumulative Update, or other form of update for SSRS 2016.
When will we have a production-ready version?
We’re targeting availability in mid-2017.
Can I deploy SSRS 2016 today and migrate to SSRS with Power BI reports when it’s available?
Yes, we aim to make it easy to migrate to SSRS with Power BI reports from SSRS 2016 and previous versions.
I just finished posting a 4-part series demonstrating how to use Power BI with on-premises data. The running time for all four parts is about 44 minutes. I cover using the Power BI Enterprise Gateway to connect published Power BI reports to on-premises data sources:
SQL Server database connected with DirectQuery
Large SQL Server database with clustered column store index connected with DirectQuery
SSAS multidimensional/cube on-prem with direct connect
SSAS Tabular with large data volume & multiple partitions on-prem with direct connect
The four-part series is on my YouTube channel using the following links:
Part 1
Part 2
Part 3
Part 4
Adam Saxton also gives a nice walkthrough of using SQL Server DirectQuery with Power BI that may provide an additional perspective and information about securing and sharing the solution.
I’d also like to give a plug for a terrific Power BI architecture diagram poster created by Dustin Ryan. I love the simplicity of this comprehensive graphic. You can download the PDF file from his blog.
This post is really a collection of field notes and some lessons learned from recent project experience. I’ve done plenty of SSAS Tabular projects over the past few years – usually starting with a Visual Studio project rather than Power Pivot. I’ve also done a bit of Power Pivot work for clients. These projects were delivered either to an analyst using Excel on their desktop or to business users through SharePoint. But authoring in Excel Power Pivot and deploying to a server-hosted Tabular model has been mainly theoretical up to this point, so I thought I’d share my experience. Continue reading →
I’ve just finished a series of four articles for SQL Server Pro Magazine, along with sample projects and hands-on exercises. The series will take you through SSAS Tabular model design from start to finish, using the Adventure Works sample data in SQL Server 2012 or 2014. Here are links to all four articles followed by an excerpt from each.
Starting Your Modeling Career with Analysis Services Tabular Models Part 1
Four-part series
This is the first of a four-part series about getting started with Tabular model design using SQL Server Analysis Services in SQL Server 2012 and 2014. You will learn the basics from start to finish and build a complete solution. A sample project is provided for each stage of the solution so you can follow along with each article. Continue reading →
It was a great honor to be asked to join my associates from SolidQ at the Microsoft Virtual Academy Studios in Redmond to talk about how to upgrade to SQL Server 2012 and 2014. These recordings, which are also on my YouTube channel, include the material I covered in these sessions. The entire series of studio presentations is hosted on Channel 9 and here at the MVA, with accompanying assessment surveys and resources.
In these studio sessions, I am joined by my fellow authors of the 429-page SQL Server 2014 Upgrade Technical Guide: Richard Waymire, Ron Talmage and Jim Miller from SolidQ. Jim and I were responsible for the Business Intelligence content. In our sessions, Jim covered SSIS and SSAS Multidimensional, and I covered SSAS Tabular, BI tools and SSRS. In this edited portion of the SSAS session, Jim begins with a brief summary of multidimensional upgrade options and I continue with a discussion of opportunities to implement SSAS Tabular solutions. The BI topics apply equally to SQL Server 2012 and 2014 upgrades. Continue reading →
Please join my associates and me for an all-day SQL Server upgrade workshop on November 3rd.
If you are planning an upgrade to a newer version of SQL Server, you won’t want to miss this all-day, preconference workshop.
John Martin, Sr. Technology Evangelist from Microsoft, will spend the morning discussing migration planning from SQL Server 2005.
In the afternoon, SolidQ mentors Ron Talmage, Richard Waymire, Jim Miller and Paul Turley will talk about and demonstrate upgrading specific workloads (projects and features) from older product versions to newer product versions. We will introduce the comprehensive SQL Server 2014 Upgrade Whitepaper that we recently wrote for publication by Microsoft.
In addition to upgrading specifically from SQL Server 2005 to SQL Server 2014, we will also talk about upgrading from SQL Server 2008 and 2008 R2 to SQL Server 2012 and 2014.
An upgrade and/or migration is far more than just a case of moving a database or installing a new version of SQL Server. There have been so many changes since SQL Server 2005 arrived that we need to do a lot of tasks to ensure we have a successful upgrade project.
This session will guide you through the process, looking at not only the technology but also the methodology, processes and tools that we have at our disposal to make sure that when we do move from SQL Server 2005 to 2012/2014, or to the cloud with SQL Server in an Azure VM or Azure SQL Database, we do so in a way that we can be confident of success. We will take a special look at workload-specific upgrade needs for OLTP, HA, SSAS/SSRS and SSIS environments.
In my afternoon section, I will demonstrate the capabilities of SSAS Tabular models and discuss upgrading and migrating Reporting Services from 2005, 2008 and 2008 R2 to 2012 and 2014.
I’ve created a simple query performance logging tool for Analysis Services, called the SSAS Performance Logger. The tool allows you to navigate through the metadata for a tabular model and select from measures and table attributes to build a simple query which is executed and timed. The query can be executed many times to get an accurate sampling of test variations and to test performance while various loads are placed on the server.
I literally created this over the weekend and plan to add additional capabilities as time allows – so please check back. To provide feedback, add comments to this post.
My initial objective was to choose two different measures that were alternate methods of calculating the same value, run them side by side and then see which performed best. Then it occurred to me that we can run any number of measure performance tests in iterative loops and compare the results by playing back the captured logs. Since the results are captured in a log database, the test results for each measure or query can easily be compared and analyzed using tools like Excel and Reporting Services.
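To illustrate, this is roughly the shape of the query that the tool builds and times. It is a hand-written sketch against a hypothetical model, evaluating two alternative implementations of the same measure side by side:

-- Returns both measure versions for each product category; the tool
-- executes a query like this in a loop and logs each duration
EVALUATE
ADDCOLUMNS (
    VALUES ( Product[Category] ),
    "Sales Amount v1", [Sales Amount v1],
    "Sales Amount v2", [Sales Amount v2]
)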
One of my main objectives for a future version of the tool is to add logging for server performance counters, like memory usage, thread counts and CPU utilization, while these queries are running.
What you will need:
I’ve developed SSAS Performance Logger for use with SSAS Tabular in SQL Server 2012. To manage the logging database, it uses the OLEDB data provider for SQL Server 2012, which may be backward compatible as far back as SQL Server 2008. It also uses the ADOMD data provider for Analysis Services. When MDX support is added in a future version, it should support SSAS sources as far back as SSAS 2008.
System requirements:
Microsoft .NET Framework 4.5
ADOMD 6.0 (can be installed with the SQL Server 2012 SP1 Feature Pack)
An instance of SQL Server 2008 or better with administrative/create database rights for the logging database
With these components installed, you should be able to copy the executable to a folder and run it.
To install:
Download the zip file and extract it to a folder on your system
Verify that the required dependent components are installed
Run SSAS Perf Tester.exe
The first time it runs, the application will check for the logging database and prompt you to create it
On the SSAS Query page, enter the Analysis Services server or instance name and press Enter
Select a model or perspective, measure and table/attribute from the drop-down list boxes to build the query
Click the Start button to run the query once and see how long it took to run
On the Options page, type or select the number of times to repeat the query
Click the Start button to run the query in a loop. The results are displayed in a line chart showing the duration in milliseconds for each execution
Every query execution is logged for later analysis. Here is a view of the logItem table in the SSASPerformanceTest database:
A Getting-Started and Survival Guide for planning, designing and building Tabular Semantic Models with Microsoft SQL Server 2012 Analysis Services.
by Paul Turley
This post will be unique in that it will be a living document, updated and expanded over time. I will also post as I go on the site about other things, but this particular post will live for a while. I have a lot of good intentions – I know that about myself, and I also know that the best way to get something done is to get it started – especially if I’m too busy with work and projects. If it’s important, the “completing” part can happen later. In the case of this post, I’ll take care of building it as I go, topic by topic. Heck, maybe it will never be “finished”, but then, are we ever really done with IT business solutions? I have been intending to get started on this topic for quite some time but, with my very busy project schedule lately, didn’t have a concise message for a post. I do have a lot to say about creating and using tabular models.
I’ve added some placeholder topic headers for things that are on my mind. This list is inspired by the questions my consulting customers, students, IT staff members and business users ask me on a regular basis. It will motivate me to come back and finish these topics, and you to come back and read them. I hope that you will post comments about your burning questions, issues and ideas for related topics to cover in this living post about tabular model design practices and recommendations.
Why Tabular?
SQL Server Analysis Services is a solid and mature platform that now serves as the foundation for two different implementations. Multidimensional models are especially suited for large volumes of dimensionally-structured data that have additive measure values that sum up along related dimensional attributes & hierarchies.
By design, tabular architecture is more flexible than multidimensional in a number of scenarios. Tabular works well with dimensional data structures but also works well in cases where the structure of the data doesn’t resemble a traditional star or snowflake of fact and dimension tables. When I started using PowerPivot and tabular SSAS projects, I insisted on transforming data into star schemas like I’ve always done before building a cube. In many cases, I still do, because it’s easier to design a predictable model that performs well and is easy for users to navigate. A dimensional model has order and discipline; however, the data is not always shaped this way, and it can take a lot of effort to force it into that structure.
Tabular is fast not only for additive, hierarchically structured data; in many cases it also works well with normalized and flattened data, as long as all the data fits into memory and the model is designed to support simple relationships and calculations that take advantage of the formula engine and the VertiPaq compression and query engine. It’s actually pretty easy to make tabular do silly, inefficient things, but it’s also not very hard to make it work really well.
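As a hypothetical illustration of that contrast (the table and column names are made up), compare two measures that return the same value. The first forces the formula engine to iterate the whole table row by row, while the second lets the VertiPaq storage engine resolve the filter as a fast scan over compressed columns:

-- Slower: FILTER materializes and iterates every row of the Sales table
Big Sales (slow) := SUMX ( FILTER ( Sales, Sales[Quantity] > 10 ), Sales[Amount] )

-- Faster: a simple column predicate is pushed down to the storage engine
Big Sales (fast) := CALCULATE ( SUM ( Sales[Amount] ), Sales[Quantity] > 10 )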
James Serra has done a nice job of summarizing the differences between the two choices and highlighting the strengths and comparative weaknesses of each in his April 4 blog post titled SQL Server 2012: Multidimensional vs Tabular. James points out that tabular models can be faster and easier to design and deploy, and that they consistently perform well without a lot of extra attention for tuning and optimization. Honestly, there isn’t that much to maintain, and a lot of the tricks we use to make cubes perform better (like measure group partitioning, aggregation design, strategic aggregation storage, usage-based optimization, proactive caching and cache-warming queries) are simply unnecessary. Most of these options don’t really exist in the tabular world. We do have partitions in tabular models, but they’re really just for ease of design.
What About Multidimensional – Will Tabular Replace It?
The fact is that multidimensional databases (which most casual SSAS users refer to as “cubes”) will be supported for years to come. The base architecture for SSAS OLAP/UDM/Multidimensional is about 13 years old; Microsoft originally acquired the product code base from Panorama and then enhanced, and eventually rewrote, the engine over the years as it matured. In the view of many industry professionals, this is still the more complete and feature-rich product.
Both multidimensional and tabular have some strengths and weaknesses today, and one is not clearly superior to the other. In many cases tabular performs better and its models are simpler to design and use, but the platform lacks equivalent commands and advanced capabilities. In the near future, the tabular product may inherit all of the features of its predecessor and the choice may become clearer; or perhaps a hybrid product will emerge.
Isn’t a Tabular Model Just Another Name for a Cube?
No. …um, yes. …well, sort of. Here’s the thing: the term “cube” has become a de facto term used by many to describe the general concept of a semantic model. Technically, a “cube” is a multidimensional structure that stores data in hierarchies of multi-level attributes, with pre-calculated aggregate measure values at the intersect points between all those dimensions and at strategic points between many of the level members in between. It’s a cool concept and an even cooler technology, but most people who aren’t close to this product don’t understand all that. Users just know that it works somehow, but they’re often confused by some of the fine points… like the difference between hierarchies and levels. One has an All member and one doesn’t, but they both have all the other members. It makes sense when you understand the architecture, but it’s just weird behavior for those who don’t.
Since the tabular semantic model is actually Analysis Services with a single definition of object metadata, certain client tools will continue to treat the model as a cube, even though it technically isn’t one. A tabular Analysis Services database contains some tables that serve the same purpose as measure groups in multidimensional semantic models. The rest of the tables are exposed as dimensions, in the same way that cube dimensions exist in multidimensional. If a table in a tabular model includes both measures and attribute fields, it will show up twice in certain client tools like Excel: once as a measure group table and once as a dimension table.
(more to come)
Tabular Model Design: The Good, the Bad, the Ugly & the Beautiful
I’ve taught a few PowerPivot training sessions to groups of business users (now, remember that Tabular SSAS is really just the scaled-up version of PowerPivot). Admittedly, I’m more accustomed to working with IT professionals, and when I teach or work with users, I have to throttle my tendency to go deep and talk about technical concepts. In these classes, I find myself restating the same things I’ve heard in conference presentations and marketing demos about PowerPivot data sources, like “you can import just about anything into PowerPivot”. As I read the bullet points and articulate the points on the presentation slides to these users, I have a nagging voice in the back of my mind. I’ve spent many years of my career unraveling the monstrosities that users have created in Access, Excel & Visual Basic.
Whether stated or implied, there is a common belief that a PowerPivot solution doesn’t require the same level of effort to transform, prepare and cleanse data before it gets imported into a data model. For many years, we’ve been telling these users that it will take a serious effort, at significant cost, to prepare and transform data before we can put it into a data mart or cube for their consumption. In a typical BI solution, we usually burn 70-80% of our resource hours and budget on the ETL portion of the project. Now, using the same data sources, users are being told that they can do the same thing themselves using PowerPivot!
Data Modeling 101 for Tabular Models
One of the things that I really enjoy about building tabular models is that I can have my data in multiple structures and it still works. If the data is in a traditional BI “Kimball-style” Star schema, it works really well. If the data is normalized as it would be in a typical transactional-style database, it still works. Even if I have tables that are of a hybrid design; with some characteristics of both normalized and dimensional models, it all works beautifully.
Here’s the catch: one of the reasons we build dimensional data models is that they are simple and predictable. It’s really easy to get lost in a complex data structure, and when you start combining data from multiple source systems, that’s where you’re likely to end up. Getting business data into a structure that is intuitive, that behaves correctly and that gives reliable results can be a lot of work, so be cautious. Just because a tabular model can work with different data structures doesn’t mean that you don’t need to prepare your data, clean it up and organize it before building the semantic model.
The classic star schema is one of the most effective ways to organize data for analysis. Rather than organizing all data elements into separate tables according to the rules of normal form, we consolidate all the measures that are related to common dimensional attributes, and that share a common grain (or aggregation level), into a fact table. The dimensional attributes are stored in separate dimension tables – one table per unique business entity, along with related attributes. Any group of measures not related to the same set of dimensions at the same level would be stored in its own fact table. In the example, invoice measures that are related to stores and customers and recorded every quarter are in one fact table. The sales debit records for customers and stores that are recorded daily go in a different fact table. The account adjustments don’t record the store key, but they are uniquely related to accounting ledger entries stored in the ledger table. Note the direction of the arrows showing that facts are related to lookup values in the dimension tables.
Exhibit 1 – A Fully Conformed Star Schema
If you can pound your data into the shape of a star schema and this meets your requirements, this is what I usually recommend. It’s a simple and predictable method to organize data in a well-defined structure. Now, let’s look at a variation of this approach that has characteristics of both the star schema and normalized form. We’ll call this a “hybrid” model.
The following hybrid schema contains two fact tables in a master/detail relationship. The cardinality of the Invoice and LineItem tables is one-to-many, where one invoice can have multiple line items. This would be considered a normalized relationship, with the InvoiceID primary key related to an InvoiceID foreign key in the LineItem table.
The Invoice table contains a numeric measure called Invoice Amount that can be aggregated by different dimensional attributes, such as Store Name, Customer Name or any of the calendar date units in the Dates table, which are organized into a natural hierarchy (with Year, Month and Date levels). To facilitate this, the Invoice table is related to three different dimension tables: Stores, Customers and Dates. Each of the dimension tables has a primary key related to a corresponding foreign key in the fact table. The LineItem table also contains numeric measures and is related to the Products table, also a dimension table.
Exhibit 2 – A Hybrid Star / Master-Detail Schema
This semantic model supports two levels of aggregation with respect to the Invoice and LineItem records. If I were to browse this model in an Excel pivot table and put all the stores on rows, I could aggregate the Invoice Amount and see the sum of all Invoice Amount values for each store.
<< need pivot table graphic here >>
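To make those two aggregation levels concrete, here is a minimal sketch of the measures involved (the LineItem column name is hypothetical). Filters on Stores, Customers or Dates reach the Invoice table directly through its relationships, and reach LineItem through the Invoice master/detail relationship:

-- Invoice-grain measure: the value the pivot table sums for each store
Total Invoice Amount := SUM ( Invoice[Invoice Amount] )

-- LineItem-grain measure: rolls up through the Invoice relationship
Total Line Amount := SUM ( LineItem[Line Amount] )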
Are There Rules for Tabular Model Design?
Oh, absolutely. Tabular SSAS and PowerPivot allow you to work with data in a variety of formats – structured & unstructured, dimensional & normalized. You have a lot of flexibility, but there are rules that govern the behavior and characteristics of data. If you don’t follow the rules, your data may not meet your requirements in the most cost-effective way.
This reminds me of an experience when I started high school.
Rule #1: Model the data source
Rule #2: Cleanse data at the source
Rule #3:
Tabular Model Design Checklist
What’s the Difference Between Calculated Columns & Measures?
What are the Naming Conventions for Tabular Model Objects?
What’s the Difference Between PowerPivot and Tabular Models?
How to Promote a Business-created PowerPivot Model to an IT-managed SSAS Tabular Model
Getting Started with DAX Calculations
DAX: Essential Concepts
DAX: Some of the Most Useful Functions
DAX: Some of the Most Interesting Functions
Using DAX to Solve real-World Business Scenarios
Do I Write MDX or DAX Queries to Report on Tabular Data?
Can I Use Reporting Services with Tabular & PowerPivot Models?
Do We Need to Have SharePoint to Use Tabular Models?
What Do You Teach Non-technical Business Users About PowerPivot and Tabular Models?
What’s the Best IT Tool for Reporting on Tabular Models?
What’s the Best Business User Tool for Browsing & Analyzing Business Data with Tabular Models?
Survival Tips for Using the Tabular Model Design Environment
How Do You Design a Tabular Model for a Large Volume of Data?
How Do You Secure a Tabular Model?
How to Deploy and Manage a Tabular Model SSAS Database
Tabular Model Common Errors and Remedies
Tabular Model, Workspace and Database Recovery Techniques
Scripting Tabular Model Measures
Simplifying and Automating Tabular Model Design Tasks
Tuning and Optimizing a Tabular Model
How do you tune a tabular model? You don’t.
You can prevent performance and usability problems through proper design.
In March, I posted about the public preview of “Microsoft SQL Server 2012 With Power View For Multidimensional Models”. Well, the official release is now available for those currently using SQL Server 2012. When the preview became available a couple of months ago, I was very excited to test it out, so I downloaded it, quickly scanned the release notes and then proceeded to upgrade an existing SQL Server 2012 SP1 development server. What I missed in the release notes was the requirement to uninstall several existing components and then install them from scratch. That wasn’t as easy as I had hoped, but it’s pretty typical for prerelease software not to include upgrade support. After all, the product teams are focused on finishing features and debugging, not on getting all the upgrade and installation details sorted out. Those steps usually happen last in the release cycle.
Not to worry; this new capability is now part of Cumulative Update 4 for SQL Server 2012. This means that it will be fully supported as an upgrade to an existing SQL Server 2012 installation. This is very exciting news. If you have seen Power View demonstrated with new SSAS tabular models and PowerPivot models in Excel and SharePoint, you know what a simple and powerful data browsing and visualization tool it is. Some people have been a little disappointed that Power View initially only worked with the new xVelocity-based tabular models and not the multidimensional cubes built with SQL Server Analysis Services that have become common in many Microsoft-centered IT shops throughout the industry.
The Microsoft product teams have recently shared a lot of good news like this about BI innovations, with Power View in Excel 2013 and GeoFlow recently released, and they are likely to share even more in the weeks and months ahead. It’s an exciting time to see some very impressive, powerful, fun-to-develop and fun-to-use BI business and IT tools all coming together to solve very real business problems.
I don’t know about you but I’m going to get this baby installed and working right away. I have clients who have been waiting patiently (and some not so patiently) to use Power View with their existing cubes. I love to be the bearer of good news.
As is usual when something noteworthy happens in the Microsoft BI community, Chris Webb has blogged eloquently on the topic and in significant detail. Read today’s post on his blog here.
This is a very significant day in the progression of the Microsoft Business Intelligence platform. Earlier this year, Power View was released with SQL Server 2012, a remarkable step forward in data visualization. It wasn’t all bad news that Power View only worked with the new tabular semantic models and PowerPivot worksheets published in SharePoint, but it did limit our options. We’ve quietly lived with the fact that this great new reporting tool couldn’t be used with Microsoft’s flagship analytical data (OLAP) engine without building new semantic models. Well, now it can, and that’s very good news.
There are many free eBooks & resources available from Microsoft and members of the MVP community. This is a collection of several very useful free publications:
Power View Infrastructure Configuration and Installation Link
Introducing Microsoft SQL Server 2012 PDF | EPUB | MOBI
Introducing Microsoft SQL Server 2008 R2 PDF | XPS