Sharing my experiences with the Microsoft data platform, SQL Server BI, Data Modeling, SSAS Design, Power Pivot, Power BI, SSRS Advanced Design, Dashboards & Visualization since 2009
Power BI: a new suite of Business Intelligence tools
Over the past few months, teams at Microsoft have made several new Business Intelligence tools available for preview; some only privately and some to the public. The entire suite will soon be available for either public preview or release under the new name: “Power BI”. All of the components of Power BI are listed below, but the big news is a new hosted offering called “Power BI for Office 365” and “Power BI Sites”. The announcement was made at the Worldwide Partner Conference this week. Users can sign up to be notified when the new offerings reach general availability, apparently in the very near future. I’ve had an opportunity to work with early, pre-released versions and it has been interesting to see the gaps being filled a little at a time. On the heels of the new suite, some of the names of existing products are also being changed. It’s hard to have a conversation about the collection of Microsoft’s “Power”/”Pivot”/”Point”-named tools without getting tongue-tied, but these changes bring more consistency.
Bottom line: this is good news and a promising step forward – especially for smaller businesses. Larger, enterprise customers should know that this move is consistent with Microsoft’s “cloud first” philosophy and that these capabilities are being introduced through the Office 365/Azure platform, with connectivity required. Read the commentary on community leaders’ sites below. I have no doubt that there will be a lot of discussion on this in the weeks to come, with more announcements from Microsoft in the near future.
When Power View was released with SQL Server 2012 Enterprise and Business Intelligence Editions, it was available only when integrated with SharePoint 2010 Enterprise Edition. This was a good solution for enterprise customers but complex and expensive for others just getting started. Power View was also offered only as a Silverlight application that wouldn’t work on many mobile devices and web browsers. For these reasons, Power View has really been viewed as a “Microsoft only” tool, suited to big companies with deep pockets and very capable IT support groups. Even the new Power View add-in for Excel 2013 ProPlus Edition requires Silverlight, which is not a show-stopper for most folks but is a hindrance for multi-platform and tablet users. This all changes with the new offering: the Power View visualization tool in the hosted product comes in three new flavors: a native Windows 8 app (runs on the desktop, Surface RT & Pro), a native iOS app (targeting the iPad) and HTML5 (works on practically any newer device). This means that when you open a Power View report on your Surface or iPad, it can run as an installed app with all the cool pinch-zoom and gestures you’ve come to expect on a tablet device. For now, this is good news only for the cloud user, as no on-premises option is currently available. An interesting new addition will be the introduction of a semantic translation engine for natural language queries, initially for English.
Power Query
Formerly known as “Data Explorer”, this add-in for Excel 2013 allows you to discover and integrate data into Excel. Think of it as intelligent, personal ETL with specialized tools to pivot, transform and cleanse data obtained from web-based HTML tables and data feeds.
Power Map
This Excel 2013 ProPlus add-in, previously known as “GeoFlow”, uses advanced 3-D imaging to plot data points on a global rendering of Bing Maps. Each data point can be visualized as a column, stacked column or heat-map point positioned using latitude & longitude, a named map location or an address, just as you would in a Bing Maps search. You can plot literally thousands of points and then tour the map with the keyboard, mouse or touch gestures to zoom and navigate the globe. A tour can be created, recorded and then played back. Aside from the immediate cool factor of this imagery, the tool has many practical applications.
Power Pivot
The big reveal is that “PowerPivot” shall now be known as “Power Pivot” – note the added space, which makes the name consistent with the other applications. We all know and love this tool: an add-in for Excel 2010 and Excel 2013 ProPlus (two different versions with some different features) that allows large volumes of related, multi-table data sources to be imported into an in-memory semantic model with sophisticated calculations. On a well-equipped computer, this means that a model could contain tens of millions of rows that get neatly compressed into memory and can be scanned, queried and aggregated very quickly. Power Pivot models (stored as an Excel .xlsx file) can be uploaded to a SharePoint library, where they become a server-managed resource. A Power Pivot model can also be promoted to a server-hosted SSAS Tabular model, where the data is not only managed and queried on an enterprise server but also takes on many of the features and capabilities of a classic SSAS multidimensional database. Whether a Power Pivot model is published to a SharePoint library or promoted to a full-fledged SSAS Tabular model, the data can be queried by any client tool as if it were an Analysis Services cube.
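To give a sense of the kind of “sophisticated calculations” these models can hold, here is a minimal sketch of two DAX measures. The Sales and Dates table and column names are hypothetical, not from any particular model:

Total Sales := SUM ( Sales[SalesAmount] )
Sales Prior Year := CALCULATE ( [Total Sales], SAMEPERIODLASTYEAR ( Dates[Date] ) )

Because measures travel with the model, the same logic works whether the workbook lives on a desktop, in a SharePoint library or as a promoted SSAS Tabular database.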
Power View
For now, Power View in Excel 2013 ProPlus and Power View in SharePoint 2010 Enterprise and SharePoint 2013 Enterprise remain the same – the Silverlight-based drag-and-drop visual analytic tool. With the addition of SQL Server 2012 CU4, Power View in SharePoint can be used with SharePoint-published Power Pivot models, SSAS Tabular models and SSAS Multidimensional “cube” models. There has been no news yet about a non-Silverlight replacement for the on-premises version of Power View. The Microsoft teams and leadership have heard the requests and feedback, loud and clear, from the community, and we can only guess that more is in the works, but I make no forecast or assumptions about the eventual availability of an on-premises offering similar to Power BI for Office 365.
I’m very excited to see my first feature article published in SQL Server Pro Magazine titled Custom Programming to Enhance SSRS Reports; How to write a custom assembly and use it to build a dynamic report dataset. The article was posted online in April and featured in the July printed and electronic edition. SQL Server Pro (formerly known as “SQL Server Magazine” or “SQLMag”) is published by Penton Media and is the largest publication for the SQL Server community. Please read the article, download the code, work through the exercise and let me know if you have comments or questions.
I posted an early draft of this article on my blog last year, titled Using Custom Assemblies in Reports to Generate Query Logic (parts 1 and 2). The code was cleaned up and tech edited in the new article, which I recommend as the most reliable source (not that I write bad code, mind you, but it never hurts to have a formal tech review).
A Getting-Started and Survival Guide for planning, designing and building Tabular Semantic Models with Microsoft SQL Server 2012 Analysis Services.
by Paul Turley
This post will be unique in that it will be a living document, updated and expanded over time. I will also post-as-I-go on the site about other things, but this particular post will live for a while. I have a lot of good intentions – I know that about myself, and I also know that the best way to get something done is to get it started, especially if I’m too busy with work and projects. If it’s important, the “completing” part can happen later. In the case of this post, I’ll take care of building it as I go, topic by topic. Heck, maybe it will never be “finished” – but then, are we ever really done with IT business solutions? I have been intending to get started on this topic for quite some time but, with my very busy project schedule lately, didn’t have a concise message for a post. I do, however, have a lot to say about creating and using tabular models.
I’ve added some placeholder topic headers for things that are on my mind. This list is inspired by the questions my consulting customers, students, IT staff members and business users ask me on a regular basis. It will motivate me to come back and finish these topics, and you to come back and read them. I hope that you will post comments about your burning questions, issues and ideas for related topics to cover in this living post about tabular model design practices and recommendations.
Why Tabular?
SQL Server Analysis Services is a solid and mature platform that now serves as the foundation for two different implementations. Multidimensional models are especially suited for large volumes of dimensionally-structured data whose additive measure values sum up along related dimension attributes and hierarchies.
By design, the tabular architecture is more flexible than multidimensional in a number of scenarios. Tabular works well with dimensional data structures but also in cases where the structure of the data doesn’t resemble a traditional star or snowflake of fact and dimension tables. When I started using PowerPivot and tabular SSAS projects, I insisted on transforming data into star schemas, as I had always done before building a cube. In many cases I still do, because it’s easier to design a predictable model that performs well and is easy for users to navigate. A dimensional model has order and discipline; however, the data is not always shaped this way, and it can take a lot of effort to force it into that structure.
Tabular is fast not only for additive, hierarchically structured data but, in many cases, also for normalized and flattened data – as long as all the data fits into memory and the model is designed to support simple relationships and calculations that take advantage of the formula engine and the VertiPaq compression and query engine. It’s actually pretty easy to make tabular do silly, inefficient things, but it’s not very hard to make it work really well, either.
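To illustrate that point, here is a hedged sketch of two DAX measures that return the same number from a hypothetical Sales table. The first forces the formula engine to materialize and iterate every row; the second expresses the filter in a form the VertiPaq engine can satisfy with a fast columnar scan:

-- Slower: FILTER iterates the entire Sales table row by row
Red Sales Slow := SUMX ( FILTER ( Sales, Sales[Color] = "Red" ), Sales[SalesAmount] )

-- Faster: a simple Boolean filter argument that touches only the Color column
Red Sales := CALCULATE ( SUM ( Sales[SalesAmount] ), Sales[Color] = "Red" )

Both definitions are valid; on a small table you’d never notice the difference, but at tens of millions of rows the first pattern can be dramatically slower.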
James Serra has done a nice job of summarizing the differences between the two choices, highlighting the strengths and comparative weaknesses of each, in his April 4 blog post titled SQL Server 2012: Multidimensional vs Tabular. James points out that tabular models can be faster and easier to design and deploy, and that they consistently perform well without a lot of extra attention to tuning and optimization. Honestly, there isn’t that much to maintain, and a lot of the tricks we use to make cubes perform better (like measure group partitioning, aggregation design, strategic aggregation storage, usage-based optimization, proactive caching and cache-warming queries) are simply unnecessary. Most of these options don’t really exist in the tabular world. We do have partitions in tabular models, but they’re really just for ease of design.
What About Multidimensional – Will Tabular Replace It?
The fact is that multidimensional databases (which most casual SSAS users refer to as “cubes”) will be supported for years to come. The base architecture for SSAS OLAP/UDM/Multidimensional is about 13 years old; Microsoft originally acquired the product code base from Panorama and then enhanced, and eventually rewrote, the engine over the years as it matured. In the view of many industry professionals, this is still the more complete and feature-rich product.
Both multidimensional and tabular have strengths and weaknesses today, and one is not clearly superior to the other. In many cases tabular performs better and its models are simpler to design and use, but the platform still lacks equivalent commands and advanced capabilities. In the near future, the tabular product may inherit all of the features of its predecessor and the choice may become clearer; or perhaps a hybrid product will emerge.
Isn’t a Tabular Model Just Another Name for a Cube?
No. …um, yes. …well, sort of. Here’s the thing: “cube” has become a de facto term used by many to describe the general concept of a semantic model. Technically, a cube is a multidimensional structure that stores data in hierarchies of multi-level attributes, with pre-calculated aggregate measure values at the intersection points between all those dimensions and at strategic points between many of the level members in between. It’s a cool concept and an even cooler technology, but most people who aren’t close to this product don’t understand all that. Users just know that it works somehow, but they’re often confused by some of the fine points… like the difference between hierarchies and levels. One has an All member and one doesn’t, but they both have all the other members. It makes sense when you understand the architecture, but it’s just weird behavior for those who don’t.
Since the tabular semantic model is actually Analysis Services with a single definition of object metadata, certain client tools will continue to treat the model as a cube, even though it technically isn’t one. A tabular Analysis Services database contains some tables that serve the same purpose as measure groups in multidimensional semantic models. The rest of the tables are exposed as dimensions, in the same way that cube dimensions exist in multidimensional. If a table in a tabular model includes both measures and attribute fields, then in certain client tools, like Excel, it will show up twice in the model: once as a measure group table and once as a dimension table.
(more to come)
Tabular Model Design: The Good, the Bad, the Ugly & the Beautiful
I’ve taught a few PowerPivot training sessions to groups of business users (remember that Tabular SSAS is really just the scaled-up version of PowerPivot). Admittedly, I’m more accustomed to working with IT professionals, and when I teach or work with users I have to throttle my tendency to go deep and talk about technical concepts. In these classes, I find myself restating the same things I’ve heard in conference presentations and marketing demos about PowerPivot data sources, like “you can import just about anything into PowerPivot”. As I read the bullet points and articulate the points on the presentation slides to these users, I have a nagging voice in the back of my mind. I’ve spent many years of my career unraveling the monstrosities that users have created in Access, Excel & Visual Basic.
Whether stated or implied, there is a common belief that a PowerPivot solution doesn’t require the same level of effort to transform, prepare and cleanse data before it gets imported into a data model. For many years, we’ve been telling these users that it will take a serious effort, at significant cost, to prepare and transform data before we can put it into a data mart or cube for their consumption. In a typical BI solution, we usually burn 70-80% of our resource hours and budget on the ETL portion of the project. Now, using the same data sources, users are being told that they can do the same thing themselves using PowerPivot!
Data Modeling 101 for Tabular Models
One of the things that I really enjoy about building tabular models is that I can have my data in multiple structures and it still works. If the data is in a traditional, Kimball-style BI star schema, it works really well. If the data is normalized, as it would be in a typical transactional-style database, it still works. Even if I have tables of a hybrid design, with some characteristics of both normalized and dimensional models, it all works beautifully.
Here’s the catch: one of the reasons we build dimensional data models is that they are simple and predictable. It’s really easy to get lost in a complex data structure, and when you start combining data from multiple source systems, that’s where you’re likely to end up. Getting business data into a structure that is intuitive, behaves correctly and gives reliable results can be a lot of work, so be cautious. Just because a tabular model can work with different data structures doesn’t mean that you don’t need to prepare your data, clean it up and organize it before building the semantic model.
The classic star schema is one of the most effective ways to organize data for analysis. Rather than organizing all data elements into separate tables according to the rules of normal form, we consolidate all the measures that are related to common dimensional attributes, and that share a common grain (or aggregation level), into a fact table. The dimensional attributes are stored in separate dimension tables – one table per unique business entity, along with its related attributes. Any group of measures not related to the same set of dimensions at the same level is stored in its own fact table. In the example, Invoice measures that are related to stores and customers and recorded every quarter are in one fact table. The sales debit records for customers and stores, recorded daily, go in a different fact table. The account adjustments don’t record the store key, but they are uniquely related to accounting ledger entries stored in the ledger table. Note the direction of the arrows showing that facts are related to lookup values in the dimension tables.
Exhibit 1 – A Fully Conformed Star Schema
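In a tabular model built on this kind of schema, each fact table simply gets its own measures at its own grain. As a hedged sketch in DAX – the table and column names below are assumptions based on the description above, not taken from an actual model:

Invoice Total := SUM ( Invoice[InvoiceAmount] )
Debit Total := SUM ( SalesDebit[DebitAmount] )
Adjustment Total := SUM ( AccountAdjustment[AdjustmentAmount] )

A filter applied to a dimension table (a store, a customer, a date) flows down the relationship arrows to whichever fact tables share that dimension, so each measure aggregates at its own grain without any extra work.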
If you can pound your data into the shape of a star schema and this meets your requirements, this is what I usually recommend. It’s a simple and predictable way to organize data in a well-defined structure. Now let’s look at a variation of this approach that has characteristics of both the star schema and normalized form. We’ll call this a “hybrid” model.
The following hybrid schema contains two fact tables in a master/detail relationship. The cardinality of the Invoice and LineItem tables is one-to-many: one invoice can have multiple line items. This would be considered a normalized relationship, with the InvoiceID primary key related to an InvoiceID foreign key in the LineItem table.
The Invoice table contains a numeric measure called Invoice Amount that can be aggregated by different dimensional attributes, such as Store Name, Customer Name or any of the calendar date units in the Dates table, which are organized into a natural hierarchy (with Year, Month and Date levels). To facilitate this, the Invoice table is related to three different dimension tables: Stores, Customers and Dates. Each of the dimension tables has a primary key related to a corresponding foreign key in the fact table. The LineItem table also contains numeric measures and is related to the Products table, another dimension table.
Exhibit 2 – A Hybrid Star / Master-Detail Schema
This semantic model supports two levels of aggregation with respect to the Invoice and LineItem records. If I were to browse this model in an Excel PivotTable and put all the stores on rows, I could aggregate the Invoice Amount and see the sum of all Invoice Amount values for each store.
<< need pivot table graphic here >>
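As a hedged sketch (the column names are my assumptions), the two grains translate into two DAX measures, and the one-to-many relationships take care of rolling both up by store, customer or date:

Invoice Amount Total := SUM ( Invoice[InvoiceAmount] )
-- Line grain: extended price computed row by row, then summed
Line Amount Total := SUMX ( LineItem, LineItem[Quantity] * LineItem[UnitPrice] )

A filter on Stores reaches the LineItem table through its relationship to Invoice, so both measures aggregate correctly at either level of the master/detail pair.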
Are There Rules for Tabular Model Design?
Oh, absolutely. Tabular SSAS and PowerPivot allow you to work with data in a variety of formats – structured & unstructured, dimensional & normalized. You have a lot of flexibility, but there are rules that govern the behavior and characteristics of data. If you don’t follow the rules, your model may not meet your requirements in the most cost-effective way.
This reminds me of an experience when I started high school.
Rule #1: Model the data source
Rule #2: Cleanse data at the source
Rule #3:
Tabular Model Design Checklist
What’s the Difference Between Calculated Columns & Measures?
What are the Naming Conventions for Tabular Model Objects?
What’s the Difference Between PowerPivot and Tabular Models?
How to Promote a Business-created PowerPivot Model to an IT-managed SSAS Tabular Model
Getting Started with DAX Calculations
DAX: Essential Concepts
DAX: Some of the Most Useful Functions
DAX: Some of the Most Interesting Functions
Using DAX to Solve Real-World Business Scenarios
Do I Write MDX or DAX Queries to Report on Tabular Data?
Can I Use Reporting Services with Tabular & PowerPivot Models?
Do We Need to Have SharePoint to Use Tabular Models?
What Do You Teach Non-technical Business Users About PowerPivot and Tabular Models?
What’s the Best IT Tool for Reporting on Tabular Models?
What’s the Best Business User Tool for Browsing & Analyzing Business Data with Tabular Models?
Survival Tips for Using the Tabular Model Design Environment
How Do You Design a Tabular Model for a Large Volume of Data?
How Do You Secure a Tabular Model?
How to Deploy and Manage a Tabular Model SSAS Database
Tabular Model Common Errors and Remedies
Tabular Model, Workspace and Database Recovery Techniques
Scripting Tabular Model Measures
Simplifying and Automating Tabular Model Design Tasks
Tuning and Optimizing a Tabular Model
How do you tune a tabular model? You don’t.
You can prevent performance and usability problems through proper design.
When I started playing with GeoFlow, I just had to find a way to do something interesting with it. This is a short world tour, with visits to a few SQL Server community members and leaders around the globe, talking about their SQL Server communities. GeoFlow is a new BI tool currently in development at Microsoft. It’s an add-in for Excel 2013, built on the SQL Server Analysis Services in-memory tabular engine, that plots and visualizes data points on the globe using imagery from Bing Maps. After visualizing data on a map, you can create a 3D tour with full zoom and navigation in 3D geographical space.
Thanks to my friends and associates who have contributed to this effort thus far. This is the first draft, so please check back; I’ll add more stops on the tour along with more sophisticated data points. The dataset in my current GeoFlow workbook is very light, and with more contributions I can add creative ways to visualize the global SQL Server community. Enjoy!
GeoFlow is still in development and not yet generally available. It’s too soon to foresee when it will be complete enough for a public preview and then for release but I’ll let you know what I know when I know it.
Today in Vancouver, British Columbia, at SQL Saturday #198, I presented a session titled “Data Visualization Choices”. As promised, my slide deck is available for download here.
This is the first draft of the session I’m preparing for the PASS Business Analytics Conference coming up on April 11-12 in Chicago. I’ll have another update for that conference.
The email message earlier today said “Please note that the official schedule will be released tomorrow, February 6th, so please do not share your schedule until that time.” Well, what does that mean, exactly? It means that it’s midnight right now and time to tell the world…. We are going public, people! The PASS Business Analytics 2013 session schedule has been completed, and the speakers are working on their presentations: feverishly practicing their demo material and rehearsing in the mirror, to no one in the car, to the kids and to the dog who already knew you were crazy before all this madness began. Yep, we take this stuff pretty seriously.
There is little doubt that the movement toward self-service BI is going to change the climate for some IT professionals who specialize in BI – but the question is “how?” and “how much?”
One of the characteristics of a really good, classic movie is that it has a lot of memorable dialog. I could go on for hours quoting one-liners from The Blues Brothers or The Princess Bride. Likewise, I think a good book leaves the reader with gems to ponder and to stimulate ideas. Such has been my recent experience reading Rob Collie’s “DAX Formulas for PowerPivot, The Excel Pro’s Guide to Mastering DAX”.
Literally minutes after I began posting my running notes from the keynote presentations and the first session I attended, I received a request to fill a last-minute opening on the schedule and prepare a second session. I’m working on a new version of “Visual Report Design – Bringing Sexy Back” and will be presenting that session tomorrow.
Well, here I am, sitting in a hotel room in Seattle on Sunday night, November 4th, the week of the PASS Global Summit. This is my favorite week of the year in the SQL Server community. If you’ve arrived ahead of the conference and have some time, please reach out so we can connect about BI, reporting, architectural design or life in general. My Twitter handle is @paul_turley.
So here’s the deal… If you can make it out to any of these events, more power to you – do it – be there and support the cause of truth and justice for BI. If not, watch my blog and I will eventually get all of the session content here for your edification.
Fall is a busy time for speaking engagements which include user groups, SQL Saturdays and the PASS Global Summit. I’m speaking at a number of events and as the schedules are confirmed, I’ll update this list and upload or provide links to presentations and demo content. I hope you can attend some of these events and if not, please check back here to view the recordings and content.
The schedules for some of these events have not yet been set so this list will be updated as this information becomes available. Return to this page where links will be added for session decks, Prezi links and downloadable sample code.
These additional session abstracts were submitted and may be selected for the events that have not yet been confirmed. If you are interested in having me speak at an event, please consider these topics.
There are many free eBooks & resources available from Microsoft and members of the MVP community. This is a collection of several very useful free publications:
Power View Infrastructure Configuration and Installation Link
Introducing Microsoft SQL Server 2012 PDF | EPUB | MOBI
Introducing Microsoft SQL Server 2008 R2 PDF | XPS
I just received an email message from the PASS Global Summit 2012 organizers after submitting session abstracts a few weeks ago. They told me not to make any public announcements about the information they sent me but they also gave me this cool button graphic. Hmmm… I wonder what this means!
More information to come when the time is right.
PASS is the Professional Association for SQL Server
The global summit is the largest gathering of the SQL Server community and the premier event for learning and knowledge sharing from the amazing SQL Server community, Microsoft SQL Server product teams, MVPs, book authors and partners. This is an awesome event and a very special community of professionals and industry experts.
Just go. If you do, you’ll be glad you did; if you don’t, you’ll wish you had. Trust me.
Just stepping out the door on my way to the DevTeach / SQL Teach conference in Vancouver, BC, Canada. The conference begins tomorrow at the conference center and the Hilton Vancouver Metrotown in Burnaby. I’m looking forward to meeting up with other MVPs and associates from SolidQ.
I’ll be delivering two sessions about Microsoft SQL Server Business Intelligence, on Data Visualization Choices and Dashboard Design using Reporting Services. Return to this post after May 30 to download copies of my slide decks and demonstrations.
Thanks to the organizers and volunteers for making SQL Saturday #108 an overwhelming success. I’m just wrapping up the day following the Redmond SQL Saturday held in the Commons on the Redmond Microsoft campus. Greg Larsen and all of the organizers did a fantastic job to make this a successful event for well over 300 attendees.
With his usual flair and wit, Buck Woody gave the keynote address, talking about cloud services, service-oriented architectures and how to make yourself valuable and indispensable in the IT database marketplace.
Thank you to all those who attended my session on dashboard design with SQL Server Reporting Services.
I enjoyed Mark Tabladillo’s insightful presentation about SQL Server data mining and particularly how to use PowerShell to create and manage mining models.
Kevin Kline from Quest Software educated and entertained us with SQL Server trivia questions. I spent the afternoon talking shop with members of the SQL Server Customer Advisory Team (SQLCAT) and caught up with several of the other MVPs and speakers, including Aaron Nelson, Jes Borland, Hugo Kornelis, Mark Simms, David Eichner, Wes Brown, Stacia Misner, Buck Woody and Kevin Kline. Nice job, everyone. You just can’t compare SQL Saturday to any other event: headliner speakers from the industry conferences and authors of the leading industry books, speaking for free to support the community. If you have not attended a SQL Saturday close to you, plan to attend one at your next opportunity.