sharing my experiences with the Microsoft data platform, SQL Server BI, Data Modeling, SSAS Design, Power Pivot, Power BI, SSRS Advanced Design, Dashboards & Visualization since 2009
Last year Dun and Bradstreet, the world’s leading business data authority, recruited leading SQL Server MVPs to help their customers integrate business solutions using D&B services, and conducted a set of video interviews with 18 members of the MVP program community at the PASS Global Summit in Seattle. Details about the program are on the D&B MVP Program web site here. Many of these fine people are trusted friends, associates, and long-time members of the outstanding SQL Server community. Following is an excerpt of the announcement on the D&B MVP Program web site:
Each year the Professional Association for SQL Server (PASS) holds its annual PASS Summit conference. This year, pre-conference sessions were held Nov. 3rd and 4th, and the conference officially ran from Nov. 5th through the 7th in Seattle, WA.
We hired a professional Hollywood videographer to tape interviews with as many of our D&B MVPs as we could, so you could "get to know them better," both in their areas of expertise and in their role as a D&B MVP. We taped 18 of our current 42 D&B MVPs.
I truly hope you watch this video series, even one per night (each is between 5 and 9 minutes long). It will show you the collective intelligence of our MVPs and their fun, personable personalities, and it will tell you how you can contact them online.
If you are looking for the industry's top experts to help with data management, Business Intelligence, problem-solving or how to use any of the business data integration services from Dun & Bradstreet, these are the people to go to.
Here are the D&B MVPs (and two others) that were interviewed that day, in the order of the video series:
This week Microsoft announced the availability of Power BI Dashboards and the browser-based dashboard designer. What is it and why is it important? The most significant thing is that report and dashboard users no longer need Excel 2013 ProPlus edition or an Office 365 subscription to use Power BI. This is very good news because it opens these amazing capabilities up to a much wider audience: those who work for companies that don't have Office 2013 ProPlus or who are not using Office 365 ProPlus. Power BI updates and new features are being released at a very fast pace and there is much to consider. The definition of "Power BI" and Microsoft's larger Business Intelligence and data analytics offering continues to evolve.
First, exactly what’s new and recently available?
Web-based Power BI dashboard designer
iOS Power BI app
New REST APIs
In August, I posted this about what Power BI was and how to start using it. At that time and up until this week, Power BI required an Office 365 Enterprise (E3) account with an additional subscription option. The new dashboard designer preview is currently free and doesn’t require an Office 365 subscription. It’s essentially Power View, Q&A, a new tiles interface with mini Power View visuals and connectivity management options. Data sources can include a variety of online sources in addition to your own data in a Power Pivot model you author using Excel and then upload to a OneDrive folder or an Office 365 tenant. You can also connect Power View reports and dashboards to an on-premises SSAS tabular model. You heard me… your data doesn’t have to be hosted in the cloud. The dashboards and reports are still published to the cloud but your data can stay on-prem. This is a very significant move toward Power BI becoming an enterprise-ready technology.
The iOS app for iPad is now in the Apple App Store and is also currently free. It allows users to view and interact with published dashboards in a native app on their iPad. An iPhone version of the app will be available soon. This app is similar to, but not quite the same thing as, Power View; it's more of a dashboard view-only app with touch-friendly interactions. For Power View users, it will be familiar, and it's very intuitive and easy to use. This is another big step forward for Microsoft BI. These are some camera shots from my iPad. Even if you don't have your own data and models to browse, the included samples are great. Check out the details here and then go to the iTunes or App Store and search for Power BI to install it yourself.
The new REST APIs for Power BI are still very early and in preview, but they show that we will be able to integrate Power BI capabilities into custom business solutions. This is a long-awaited capability that holds a lot of promise for serious developers and system integrators. Get the preview here.
Power BI – and everything related to it – is Microsoft's big bet for BI and this is where their resources and innovations are being invested. Given the dependence on the cloud, I don't see all of our enterprise consulting clients jumping on the bandwagon just yet, but I do see many more businesses beginning to take this option more seriously, especially with the promise of integration APIs and on-premises data support.
Here is a collection of recent blog posts and announcements:
Requirement: Allow users to enter comments after they review results from a Power Pivot model and then show the comment text in the Pivot Table report results. Here's a quick example of the final result. My example uses Flintstones characters in place of employee or customer names and a simple measure value. After reviewing the results, the user moves to another worksheet and enters a comment for the date and person, with an expiration date for the comment. After refreshing the report, each cell with a comment is indicated with a little "dog ear" and the comment is displayed by clicking the cell.
In practice, the data model could be in a server-hosted SSAS Tabular model, Office 365/Power BI or SharePoint-hosted Power Pivot workbook, or a desktop Excel file. In one real world application, if entries don’t meet company standards, the person reviewing the data entry can enter comments about these late entries and other policy violations. A comment has an expiration date and in the production solution, the comment expiration is also dependent on a calculated “severity level” which can be baked into the expiration logic.
My first thought when considering the need to incorporate text strings into a semantic model was that it might not be a good idea to store large text strings in a Power Pivot or Tabular model. After all, the tabular model technology is really optimized to handle numeric and redundant data. Upon carefully talking through the requirements, we're pretty comfortable that the number of columns will be fairly low and the length of the comment strings should be conservative. The VertiPaq/VertiScan engine is actually pretty forgiving about having "stuff" added to a model that it will ignore except when it's being used in a calculation or query. So, that's good – my comment rows shouldn't get in the way of core features like measure aggregation and dimension navigation. I'm really curious to find out how viable this is in a larger-scale application, so if you end up doing this with a lot of data or have the opportunity to test the limits of this technique, please share your results.
With that in mind, I took the plunge and built a simple proof-of-concept solution, which is what I'm demonstrating here. If the storage and query engines are OK coexisting with these new column fields, the big question that remains is how the query engine will react when we start to send it some unconventional long string results. Will these long values just get crunched through the query engine expressions along with the usual measures without making a ruckus?
Now remember that this is an über-simplified model and that the real one contains about 30 tables, but the major components are the same. The main fact table (tbl_fact) contains keys related to some outside tables and a measure called "Sum of SomeValue" which aggregates the SomeValue column for every Person in a pivot table report. So, here's the solution:
The Comments table (tblComments, which I'll call "Comments") is essentially a fact table related to Dates and People:
The source for the Comments table is a worksheet/table. Here are some of the comments:
In the production solution, we want the comment to show up until the expiration date. Rather than waiting for the system date to change, I’ve added a table of EffectiveDate values to use as a slicer for prototyping. The DAX measure expressions are added to the calculation area in the Comment table.
I'll start with a simple version of the Current Comment measure before adding the expiration logic:
Current Comment:=IF(HASONEVALUE(tbl_Comments[Comment]), VALUES(tbl_Comments[Comment]))
It's important to test the comment for only one value using the HASONEVALUE function to avoid an error. The VALUES function just returns the column value as a string. Now, I'll add the expiration logic, which uses the first measure:
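A minimal sketch of that measure follows, assuming the comments table stores the cutoff in a column named tbl_Comments[ExpirationDate] (the actual column name in your model may differ); the comment is returned only while the selected effective date is on or before the expiration date:

Comment Until Expiration:=IF( FIRSTDATE( tblEffectivedate[EffectiveDate] ) <= MAX( tbl_Comments[ExpirationDate] ), [Current Comment] )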
Once this is all tested and working, we can just substitute "TODAY()" in place of "FIRSTDATE( tblEffectivedate[EffectiveDate] )" to use the system date.
Now, to add the Excel comments. This is the fun part (as if it hasn’t been exciting enough thus far!)
Add a pivot table to a new sheet named "Report". Add Dates and People on rows and the "Sum of SomeValue" measure on columns (which actually adds it to VALUES). Add the "Comment Until Expiration" measure to columns as well. Select the last column added to the pivot table and hide it. In my example, this is column D.
You’ll need to have macros enabled and trust VBA in Options > Macro Settings.
Use Alt+F11 to open the Visual Basic for Applications editor and enter the following code into a code module (usually Modules > Module1):
'***********************************************************
' Paul Turley, 9-22-14
' Dynamically add comments to Pivot Table value cells based
' on an adjacent hidden column.
'***********************************************************
Dim ws As Worksheet
Dim pt As PivotTable
Const iPTRowOffSet As Integer = 3 ' Pivot Table starts on this row
Const iPTHeaderRows As Integer = 2 ' Number of Pivot Table header rows
Const iPTClearSafetyRows As Integer = 100 ' Number of rows beyond the last PT row that get cleared (in case the filtered table shrinks by a large # of rows)
Dim iRows As Integer
Dim iRow As Integer
Dim CommentRng As Range
Dim ValueRange As Range
Sub SetObjects()
Set ws = Worksheets("Report")
Set pt = ws.PivotTables(1)
End Sub
Sub btnCreateComments()
SetObjects
ClearAllComments
CreateComments
End Sub
Sub CreateComments()
For iRow = (iPTRowOffSet + iPTHeaderRows) To pt.RowRange.Rows.Count + iPTRowOffSet
Set CommentRng = pt.ColumnRange(iRow, 2)
If CommentRng.Cells(0).Value <> "" And iRow >= iPTRowOffSet Then
Set ValueRange = Worksheets("Report").Cells(iRow + 1, 3)
If ValueRange.Comment Is Nothing Then
ValueRange.AddComment (CommentRng.Cells(0).Value)
End If
ValueRange.Comment.Visible = False
End If
Next
End Sub
Sub ClearAllComments()
SetObjects
For iRow = (iPTRowOffSet + iPTHeaderRows) To (pt.RowRange.Rows.Count + iPTRowOffSet + iPTClearSafetyRows)
Set ValueRange = Worksheets("Report").Cells(iRow + 1, 3)
If Not ValueRange.Comment Is Nothing Then ValueRange.Comment.Delete
Next
End Sub
Note that there are some assumptions made in this code. I don’t think the pivot table needs to start at cell B3 but that’s where mine is. Note the constants at the top that are used to skip the header cells. These values will need to be adjusted if you make changes.
Next, open the code module for the workbook (named ThisWorkbook) and add the following code for the SheetCalculate event (three lines added between the generated Private Sub & End Sub lines):
Private Sub Workbook_SheetCalculate(ByVal Sh As Object)
SetObjects
ClearAllComments
CreateComments
End Sub
Now, let the magic happen. Every time the sheet is calculated (this event fires whenever the connection is refreshed, a filter or slicer is changed, or the pivot table is otherwise updated), the code iterates through the measure column cells and removes all the Excel comments, then iterates through again and adds new Excel comments using the values in the adjacent hidden column. The hidden column contains the calculated "Comment Until Expiration" measure text values.
You can add or remove comment text in the Comments sheet, move to the report sheet and use Data > Refresh to update the model. Use slicers to change the EffectiveDate and filters to add or remove row values. With each action, you should see comments come and go. To view a comment, click the cell and hover the mouse pointer over the dog-ear in the upper-right corner of the cell. There are more comment options on the Review ribbon. Make sure you save this as a macro-enabled workbook (.xlsm) file.
As always, I’m interested in feedback, thoughts and suggestions.
Thank you to the Microsoft MVP Program for awarding me MVP for six years running! I love this program and look forward to another year of great community events and opportunities to advise and mentor.
This year I had the privilege to speak at numerous SQL Saturdays, conferences and user groups, help develop a certification program for the University of Washington, contribute to books and papers, moderate forums, test new products and advise Microsoft product teams, direct the Oregon SQL PASS Chapter, and work with consulting clients to develop database and BI solutions using cutting-edge technologies. I’m planning the Oregon SQL Saturday this year and will be co-presenting a pre-conference at the PASS Global Summit.
I’m always looking for new opportunities to support the SQL Server and Microsoft Business Intelligence community. If you’re looking for a speaker, writer or need some help, please contact me.
I just received some bad news and found out that a friend passed away today. Reed Jacobson was one of the early leaders in the Microsoft Business Intelligence community. Somewhere around 2004-2008 I worked with Reed at Hitachi Consulting in Seattle, where we travelled, taught classes and worked on client projects together. He was one of the earliest adopters and gurus in the BI community, working for a small start-up called OLAP Train that helped bring SSAS (aka "OLAP Services") to the masses. He spoke at many conferences and taught a lot of classes.
Not only was he one of the smartest people I've known when it came to solving business problems with Analysis Services & MDX, but he was a kind, friendly man who really enjoyed helping and connecting with people. He wrote many books about data analysis with Excel & SSAS and he wrote the MDX training curriculum for Hitachi Consulting. He was my mentor. We co-taught some of those early classes and I spent some time with him on the road. He bent my brain intellectually and we talked about work, family, politics and religion. He was a man of principle who asked questions and gave a lot of thought to his actions.
Reed left a lot more than a collection of old technical books behind. He had many good friends, customers and connections who appreciate his leadership, guidance and friendship. I am privileged to have worked with Reed.
A memorial will be held this Sunday, July 20 at 7:00 PM
Dr. Jose Rizal Park; 1008 12th Ave S, Seattle, Washington 98134
Someone recently asked me through LinkedIn how much it would cost to implement a BI project with Microsoft tools. This is a big question that can't easily be answered without understanding the variables of scale, complexity, user and business requirements. My knee-jerk reaction to a question like this is to stand by the "it depends" answer, but this is a very good question. In this new age of self-service BI, can BI work really be more affordable? At the PASS Business Analytics Conference in San Jose last week I spoke with data analysts (not just IT developers) who were actually building BI solutions by themselves for groups of business professionals to run their companies using tools like Power Pivot and Excel. The monthly subscription cost for SharePoint Online and Power BI is $40 per person. Considering the power of these tools, if this meets their needs, that's pretty cheap.
For substantive BI projects, some cost factors include the number of users, volumes of data, number of reports and the specific components of the solution. Honestly, a small to moderate BI solution can realistically cost somewhere between $1,000 and $1,000,000 depending on many factors. If the source data quality is very good and data is conformed across the business, costs will be lower; if not, estimating is more difficult. I nearly always advise an assessment project followed by a proof of concept.
It is true that a single developer could use Power BI & Excel to deliver a small solution for a few hundred dollars in subscription fees that could be usable in a couple of weeks. A larger, more formal project employing 5 full-time developers in 6 months using the full suite of SQL Server BI tools might cost a million dollars to deliver.
I like to respond to questions with real answers and so I thought about the cost of some of the smaller BI projects I’ve worked on. From that experience, here is a realistic example for a small, 2-3 month project that might serve a hundred or so users:
Server hardware: $50,000
Software licensing (SQL Server & SharePoint): $100,000
External resource labor: $100,000
Internal resource reassigned labor: $100,000
Training & other misc. costs: $50,000
This estimate would be for a team that has done this kind of work successfully a few times and understands the cadence of BI project work. If you have not gone through this process, you could multiply the estimate by 2-3x. Every one of these cost components depends on factors like scale and complexity, changing requirements and data quality issues. For example, SharePoint server licensing cost increases when you add users, so a solution for 100 users might cost $400,000 and a solution for 2,000 users might cost $2 million.
As a consultant, I’m naturally inclined to recommend hiring a seasoned expert to manage things and steer the team through the project maze. The honest fact is that BI work is something we get good at only with a lot of experience. It takes time and energy to fully understand the dynamics of data, development teams and the unique challenges involved in connecting business, technology and information as we do with BI.
There are really no close parallels to BI projects, so until you have a chance to go through the process, it's difficult to anticipate costly elements like data quality problems that don't show up until late in the project, learning to use semantic modeling tools and query languages like MDX and DAX, error handling and configuration management in ETL packages, and reporting tools that seem easy to use until you take on complex visualization challenges. Compared with other types of software projects, in BI the easy stuff is easier and the hard stuff can be a lot harder. This makes it increasingly difficult to estimate the time and cost to complete a project, so the next time a BI consultant answers your request for an estimate with an answer like "it depends" and "oh, somewhere between a thousand and a million dollars," please understand that we really mean it. Sometimes we laugh a little because we know how hard it is to take this seriously, but this is the truth.
Estimating this type of work is inherently hard. Several cost-estimating formulas have been devised in the industry but, for the most part, they're all B.S. and exist simply because bean counters and executives demand hard numbers. An estimating formula taught in prominent project management curricula is to add an arbitrary best-case estimate, a most-likely estimate times four, and a worst-case estimate, and then divide that result by six. That's it. It's a little like the secret recipe for Kentucky Fried Chicken used by hundreds of professionally-trained project estimators who know the inside secret. Again, the super-secret formula is:
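Estimate = (Best Case + (4 × Most Likely) + Worst Case) ÷ 6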
Don't get me wrong, I realize that it's not acceptable in business to say "I don't know what it's going to cost. Just give me an open budget and we'll do our best to deliver some results." I once worked for a small consulting shop that serviced some big, well-known clients with very strict requirements for their vendors. We put a guy in charge of our estimating process. He analyzed dozens of projects and sought the advice of many experienced project managers and developers. He reviewed the final cost of several finished projects and came up with formulas to estimate the cost of application modules, database objects and individual lines of code. In the end, the best formula was to ask the developer to estimate their time to write and debug their work, apply a standard hourly bill rate, double that and add a third. I'm serious… that was our method – and it worked!
I worked for a larger, more formal consulting firm and they had an estimating model that was meticulously engineered down to every single item and unit of work, recorded using piles of forms and templates. It took more effort to estimate the project than to actually build the solution. We had big clients with deep pockets who were accustomed to paying high hourly bill rates and covering even higher rates for Vice President and Director level people to come in for meetings and dinner shindigs. Nearly every project finished on time and within budget! The funny thing is that the reason they were so good at estimating project cost so accurately was that the estimate became a self-fulfilling prophecy. They built all this overhead into the estimate and then used it up as they were flying all the consultants across the country every week to stay in fancy hotels and spend per diems. The work was executed as planned, with plenty of management and administrative staff to track every single hour and deliverable. The best BI expert consultants I know are pretty responsible with their clients' travel budgets and expenses. Their hourly bill rate is on the high side but they are worth every penny when you consider the return on investment for a successful project. In the end, the business gains value and thrives because business leaders can make informed decisions.
BI work is as much art as it is science. The science part provides methods and disciplines based on standards and industry best practices but the artistic element of data modeling and visualization is a path to both innovation and distraction. Self-service BI provides interesting opportunities to explore data and discover patterns but new ideas can also take a project off path. The great challenge is to envision a solution and deliver it within the defined scope and estimated budget. After delivering what was promised, we can build on the success of the project and be more creative after building on the foundation of a well-crafted BI solution.
This series of video sessions will demonstrate how to create a business intelligence solution with sets of business data from multiple sources. In the first session, I import business records from 5 different D&B data feeds using the Windows Azure Marketplace with desktop BI tools in Excel 2013 ProPlus. In future sessions, I will combine public data sets, use features of Power BI and migrate to an enterprise scale BI solution with SQL Server, SSAS tabular and SharePoint.
D&B data feeds in the Windows Azure Marketplace can be used to get detailed, current demographic and financial information about specific businesses or companies in different regions and categories. These business records can be used for things like market analysis, sales, verification and research.
Session 1 – Getting started (48:00) Analyzing Dun & Bradstreet business records data from the Windows Azure Marketplace using Excel 2013 Power Query, Power Pivot and Power View to create a basic dashboard
To watch this video with more playback controls, you can also view it on ScreenCast here.
Session 2 – Enhancing the Solution (32:00) Combining Dun & Bradstreet customer records with public & enterprise data, geographic visualization with Power Map, publishing to Power BI and using Q&A to search a model with natural language
Good technology books usually come in one of two forms. Some of the books on my shelves go deep technically, contain useful tidbits of code but read like the phonebook. Just a few are interesting and insightful. This book is a rare gem that does both. Paul te Braak is well-known in the Business Intelligence community for his expertise and contributions and he delivers a unique guide that starts with the basics and proceeds to cover all of the essentials with depth and practical examples to solve some challenging business problems.
You might expect a book that introduces fundamental concepts to gloss-over advanced topics and avoid complex applications. This book covers the essentials of data modeling and analysis with Excel and Power Pivot in language that is plain and easy to understand but it doesn’t stop with the basics. It covers practical and useful applications without insulting the reader’s intelligence. As an experienced data modeler, I found several useful techniques and new methods to use the tools and language. Paul’s coverage of practical techniques spans the spectrum of business applications and product features. This is a rare book that is not only a good tutorial with many hands-on examples that can be repeated by the reader but it’s also a great reference of useful techniques and code samples.
Highlights include:
The integration of Excel features such as ranges, tables, pivot tables and pivot charts with the additional features of Power Pivot and Power View that extend and enhance these capabilities
Examples and instructions are directed at Excel 2010 users, and the author compares some of the different features in Excel 2010 and Excel 2013.
Fundamentals of the DAX calculation language
Importing data as text, different date formats and implied data type columns
Beyond the basics, a schema.ini file is used to define column data types
Importing data from a database, working with database tables, views and queries, managing connections and challenges encountered running the table import wizard multiple times
Data feeds using OData and using a Reporting Services report to provide a data feed
Decisions a designer makes to enable the user's experience when browsing a model, including sorting values and navigating hierarchies that enable drill-down interaction.
DAX “X” functions (SUMX, MINX, etc.) to perform row-level aggregation
Working with parent-child hierarchies using specialized DAX path functions.
Advanced browsing features, adjusting pivot table options to optimize the user experience
Building and using KPIs and using alternate table relationships
Time calculations and date functions. This chapter covers running totals and totals to date
Date part aggregate functions (MTD, YTD, etc.),
Essential data math and comparisons
LastYear and PriorPeriod functions, TotalYTD
Manufacturing calendar, working with “445” dates
Creating a dynamic relative time measure, using a shell dimension table
Using DatesBetween to show the average value for the past 10 days
Apply advanced modeling techniques to bin, sort and rank values for reporting
Expand concepts introduced in chapter 3, using the DAX “X” functions to perform row iteration in advanced financial applications
Defining and working with many-to-many relationships. It is often no trivial task to completely understand many-to-many relationship requirements and to apply a working solution that provides the intended results
Addressing inventory and stock-keeping challenges
Conditional aggregation at different levels
Budgeting and forecasting vs actuals
Programming Excel to enhance the user experience
Excel VBA event programming to respond to slicers
Using cube functions
Interacting with charts and slicers
Building solutions for the enterprise
Using the SSDT Tabular designer
Migrating Power Pivot models to Tabular server solutions
What I’m going to show you is not rocket science. This one gets filed under the category of “well, duh. That was easy …now that I see how to do it.”
Power View sets the maximum value for a chart series based on the maximum unfiltered axis value; pretty simple but not too flexible. Let’s say that I want to show the progress of project completion. If I add a percentage type measure to the Value of a column chart with no fields on the Axis, it will plot a single column “gauge” chart with the maximum value (in this case 65%) somewhere near the top of the chart. I suppose that’s what we asked it to do but there is no way to manually set the maximum range of the axis scale. Oh, what to do?
Like most beyond-the-basics capabilities in Power View, the answer is to enhance the semantic model – and this one's easy. In the PowerPivot or SSAS model, create another measure. I'll do it right next to the existing percentage calculation named "Project Pct Complete". The objective is to get the gauge visual (our no-axis column chart) scale to add up to 100%, so what we need is another measure that, when added to the first one, adds up to 100%. I'll call that calculation "Remainder Pct Complete", using the DAX expression:
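A minimal version of that expression, assuming the existing measure is referenced as [Project Pct Complete], is simply its complement:

Remainder Pct Complete:=1 - [Project Pct Complete]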
By subtracting the % Complete value from 1, the result will always be the remainder of 100%. I'll format the new measure as a percentage.
Now I add both measures to the chart/gauge like so:
I haven't quite got the legend to work the way I want it to. In this case, I'd probably just hide it and maybe show data labels for the two series values. That's all up to you, of course. For extra credit, let's add another progress gauge that shows the overall progress even when a slicer is used to filter the data. First of all, test this. I've added a Priority slicer and selecting one or a combination of items slices the project rows and changes the progress gauge value.
Back in the model, I add two more measures, starting with the existing expressions. You can just copy one of the measure cells and paste it into an empty cell. This must be done one at a time. The expression is enhanced using the CALCULATE and ALL functions, like this:
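Here's a sketch of the two enhanced measures, assuming the project table is named Projects as referenced below (the measure names here are just illustrative):

Overall Pct Complete:=CALCULATE( [Project Pct Complete], ALL( Projects ) )

Overall Remainder Pct Complete:=1 - [Overall Pct Complete]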
This says "apply the Project Pct Complete measure calculation to whatever filter context is provided in the CALCULATE function." In this case, we are saying "ignore the natural row or filter context of the query and always apply the calculation to all rows in the Projects table."
Now, when a slicer or filter is used on the report, the Overall gauge continues to show the same percentage while the rest of the report, including the Progress gauge, shows only filtered values.
Power BI: a new suite of Business Intelligence tools
Over the past few months, teams at Microsoft have made several new Business Intelligence tools available for preview; some only privately and some to the public. The entire suite will soon be available for either public preview or release under the new name: "Power BI". All of the components of Power BI are listed below, but the big news is a new hosted offering called "Power BI for Office 365" and "Power BI Sites". The announcement was made at the Worldwide Partner Conference this week. Users can sign up to be notified when the new offerings reach general availability, apparently in the very near future. I've had an opportunity to work with early, pre-released versions and it has been interesting to see the gaps being filled a little at a time. On the heels of the new suite, some of the names of existing products are also being changed. It's hard to have a conversation about the collection of Microsoft's "Power"/"Pivot"/"Point"-named tools and not get tongue-twisted, but these changes bring more consistency.
Bottom line: this is good news and a promising step forward – especially for smaller businesses. Larger, enterprise customers should know that this move is consistent with Microsoft’s “cloud first” philosophy and these capabilities are being introduced through Office365/Azure platform with required connectivity. Read the commentary on community leaders’ sites below. I have no doubt that there will be a lot of discussion on this in the weeks to come with more announcements from Microsoft in the near future.
When Power View was released with SQL Server 2012 Enterprise and Business Intelligence Editions, it was available only when integrated with SharePoint 2010 Enterprise Edition. This is a good solution for enterprise customers but it was complex and expensive for some to get started. Power View was also offered only as a Silverlight application that wouldn't work on many mobile devices and web browsers. For this reason, Power View has really been viewed as a "Microsoft only" tool and only for big companies with deep pockets and very capable IT support groups. Even the new Power View add-in for Excel 2013 ProPlus Edition requires Silverlight, which is not a show-stopper for most folks but a hindrance for multi-platform and tablet users. This all changes with the new offering, as the Power View visualization tool in the hosted product comes in three new flavors: a native Windows 8 app (runs on desktop, Surface RT & Pro), native iOS (targeting the iPad) and HTML5 (works on practically any newer device). This means that when you open a Power View report on your Surface or iPad, it can run as an installed app with all the cool pinch-zoom and gestures you've come to expect on a tablet device. For now, this is good news for the cloud user as no on-premises option is currently available. An interesting new addition will be the introduction of a semantic translation engine for natural language queries, initially for English.
Formerly known as “Data Explorer”, this add-in for Excel 2013 allows you to discover and integrate data into Excel. Think of it as intelligent, personal ETL with specialized tools to pivot, transform and cleanse data obtained from web-based HTML tables and data feeds.
This Excel 2013 ProPlus add-in, which was previously known as “GeoFlow”, uses advanced 3-D imaging to plot data points on a global rendering of Bing Maps. Each data point can be visualized as a column, stacked column or heat map point positioned using latitude & longitude, named map location or address just like you would in a Bing Maps search. You can plot literally thousands of points and then tour the map with the keyboard, mouse or touch gestures to zoom and navigate the globe. A tour can be created, recorded and then played back. Aside from the immediate cool factor of this imagery, this tool has many practical applications.
Power Pivot
The big reveal is that "PowerPivot" shall now be known as "Power Pivot". Note the space, added so that the name is consistent with the other applications. We all know and love this tool, an add-in for Excel 2010 and Excel 2013 ProPlus (two different versions with some different features) that allows large volumes of related, multi-table data sources to be imported into an in-memory semantic model with sophisticated calculations. On a well-equipped computer, this means that a model could contain tens of millions of rows that get neatly compressed into memory and can be scanned, queried and aggregated very quickly. Power Pivot models (stored as an Excel .xlsx file) can be uploaded to a SharePoint library where they become a server-managed resource. A Power Pivot model can also be promoted to a server-hosted SSAS Tabular model where data is not only managed and queried on an enterprise server but also takes on many of the features and capabilities of a classic SSAS multidimensional database. Whether a Power Pivot model is published to a SharePoint library or promoted to a full-fledged SSAS Tabular model, the data can be queried by any client tool as if it were an Analysis Services cube.
Power View
For now, Power View in Excel 2013 ProPlus and Power View in SharePoint 2010 Enterprise and SharePoint 2013 Enterprise remain the same – the Silverlight-based drag-and-drop visual analytic tool. With the addition of SQL Server 2012 CU4, Power View in SharePoint can be used with SharePoint-published Power Pivot models, SSAS Tabular models and SSAS Multidimensional "cube" models. There has been no news yet about a non-Silverlight replacement for the on-premises version of Power View. The Microsoft teams and leadership have heard the requests and feedback, loud and clear, from the community, and we can only guess that more is in the works, but I make no forecast or assumptions about the eventual availability of an on-premises offering similar to Power BI for Office 365.
I’m very excited to see my first feature article published in SQL Server Pro Magazine titled Custom Programming to Enhance SSRS Reports; How to write a custom assembly and use it to build a dynamic report dataset. The article was posted online in April and featured in the July printed and electronic edition. SQL Server Pro (formerly known as “SQL Server Magazine” or “SQLMag”) is published by Penton Media and is the largest publication for the SQL Server community. Please read the article, download the code, work through the exercise and let me know if you have comments or questions.
I posted an early draft of this article in my blog last year titled Using Custom Assemblies in Reports to Generate Query Logic (parts 1 and 2). The code was cleaned-up and tech edited in the new article which I recommend as the most reliable source (not that I write bad code, mind you, but it never hurts to have a formal tech review.)
I recently had the opportunity to review a pre-published copy of the “Instant” series book “Creating Data Models with PowerPivot How-to” by Leo Taehyung Lee, published by Packt Publishing.
This is a short, concise book; only 46 pages long. Quite honestly, if I had picked up this book for myself, I wouldn't have gotten much out of it. But I'm an IT professional with a few years' worth of experience with PowerPivot and I need to go beyond the basics. Last week, I taught a PowerPivot/Excel 2010 class to a group of financial business users that covered getting started, the basics and some intermediate use scenarios, and I will recommend this book to every student in that class. It's a perfect fit.
Readers who are using Excel 2010 will find this book a good fit. It was written using examples in Excel 2010 and mentions that PowerPivot is also available for Excel 2013, but it doesn't explain the differences, which are many (a newer version of PowerPivot is part of the Professional Plus edition of Excel 2013). Excel 2013 ProPlus users may be able to apply a good portion of the material but will encounter features on different ribbons and menus and other subtle differences between the two products that are not noted in the book.
Just like the title states, "Creating Data Models with PowerPivot How-to" is a short, fast and focused guide that will get a new PowerPivot user started and using the Excel add-in to answer real business questions and get good use out of this product. It's a well-written guide, using clear and easy-to-understand language. The basics are covered sufficiently without getting into the weeds and exploring complex features. Topics are effectively illustrated and easy to find. If you're a non-technical business user and need a quick, simple guide to get you started, this book is a good place to begin.
More information is available from the publisher here.
It’s that time of year again… end of the school year, graduation, Fathers’ Day and of course; time to take the annual BI Survey! The survey is an annual report conducted by Business Application Research Center (BARC), an independent software industry analyst, comparing the industry BI tools rated by thousands of industry professionals. Participants will have access to a free version of the results that they can use to compare and rank each vendor tool on a variety of criteria. BARC also sells a report of detailed survey results with exhaustive product and vendor background information.
A Getting-Started and Survival Guide for planning, designing and building Tabular Semantic Models with Microsoft SQL Server 2012 Analysis Services.
by Paul Turley
This post will be unique in that it will be a living document that is updated and expanded over time. I will also post as I go on the site about other things, but this particular post will live for a while. I have a lot of good intentions – I know that about myself and I also know that the best way to get something done is to get it started – especially if I'm too busy with work and projects. If it's important, the "completing" part can happen later. In the case of this post, I'll take care of building it as I go, topic by topic. Heck, maybe it will never be "finished", but then are we ever really done with IT business solutions? I have been intending to get started on this topic for quite some time but, in my very busy project schedule lately, didn't have a concise message for a post – but I do have a lot to say about creating and using tabular models.
I’ve added some place-holder topic headers for some things that are on my mind. This list is inspired by a lot of the questions my consulting customers, students, IT staff members and business users have asked me on a regular basis. This will motivate me to come back and finish them and for you to come back and read them. I hope that you will post comments about your burning questions, issues and ideas for related topics to cover in this living post about tabular model design practices and recommendations.
Why Tabular?
SQL Server Analysis Services is a solid and mature platform that now serves as the foundation for two different implementations. Multidimensional models are especially suited for large volumes of dimensionally-structured data that have additive measure values that sum up along related dimensional attributes & hierarchies.
By design, tabular architecture is more flexible than multidimensional in a number of scenarios. Tabular works well with dimensional data structures, but it also works in cases where the structure of the data doesn't resemble a traditional star or snowflake of fact and dimension tables. When I started using PowerPivot and tabular SSAS projects, I insisted on transforming data into star schemas like I've always done before building a cube. In many cases, I still do because it's easier to design a predictable model that performs well and is easy for users to navigate. A dimensional model has order and discipline; however, the data is not always shaped this way and it can take a lot of effort to force it into that structure.
Tabular is fast not only for additive, hierarchically structured data; in many cases it also works well with normalized and flattened data, as long as all the data fits into memory and the model is designed to support simple relationships and calculations that take advantage of the function engine and the VertiPaq compression and query engine. It's actually pretty easy to make tabular do silly, inefficient things, but it's also not very hard to make it work really well.
James Serra has done a nice job of summarizing the differences between the two choices and highlighting the strengths and comparative weaknesses of each in his April 4 blog post titled SQL Server 2012: Multidimensional vs Tabular. James points out that tabular models can be faster and easier to design and deploy, and that they consistently perform well without a lot of extra attention for tuning and optimization. Honestly, there isn't that much to maintain, and a lot of the tricks we use to make cubes perform better (like measure group partitioning, aggregation design, strategic aggregation storage, usage-based optimization, proactive caching and cache-warming queries) are simply unnecessary. Most of these options don't really exist in the tabular world. We do have partitions in tabular models but they're really just for ease of design.
What About Multidimensional – Will Tabular Replace It?
The fact is that multidimensional databases (which most casual SSAS users refer to as "cubes") will be supported for years to come. The base architecture for SSAS OLAP/UDM/Multidimensional is about 13 years old; Microsoft originally acquired the product code base from Panorama and then went on to enhance and eventually rewrite the engine over the years as it matured. In the view of many industry professionals, this is still the more complete and feature-rich product.
Both multidimensional and tabular have some strengths and weaknesses today and one is not clearly superior to the other. In many cases, tabular performs better and models are simpler to design and use, but the platform is lacking equivalent commands and advanced capabilities. In the near future, the tabular product may inherit all of the features of its predecessor and the choice may become more clear; or perhaps a hybrid product will emerge.
Isn’t a Tabular Model Just Another Name for a Cube?
No. …um, yes. …well, sort of. Here's the thing: the term "cube" has become a de facto term used by many to describe the general concept of a semantic model. Technically, the term "cube" defines a multidimensional structure that stores data in hierarchies of multi-level attributes and pre-calculated aggregate measure values at the intersect points between all those dimensions and at strategic points between many of the level members in between. It's a cool concept and an even cooler technology, but most people who aren't close to this product don't understand all that. Users just know that it works somehow, but they're often confused by some of the fine points… like the difference between hierarchies and levels. One has an All member and one doesn't, but they both have all the other members. It makes sense when you understand the architecture but it's just weird behavior for those who don't.
Since the tabular semantic model is actually Analysis Services with a single definition of object metadata, certain client tools will continue to treat the model as a cube, even though it technically isn't. A tabular Analysis Services database contains some tables that serve the same purpose as measure groups in multidimensional semantic models. The rest of the tables are exposed as dimensions in the same way that cube dimensions exist in multidimensional. If a table in a tabular model includes both measures and attribute fields, in certain client tools like Excel it will show up twice in the model; once as a measure group table and once as a dimension table.
(more to come)
Tabular Model Design: The Good, the Bad, the Ugly & the Beautiful
I’ve taught a few PowerPivot training sessions to groups of business users (now, remember that Tabular SSAS is really just the scaled-up version of PowerPivot.) Admittedly I’m more accustomed to working with IT professionals and when I teach or work with users, I have to throttle my tendency to go deep and talk about technical concepts. In these classes, I find myself restating the same things I’ve heard in conference presentations and marketing demos about PowerPivot data sources, like “you can import just about anything into PowerPivot”. As I read the bullet points and articulate the points on the presentation slides to these users, I have this nagging voice in the back of my mind. I’ve spent many years of my career unraveling the monstrosities that users have created in Access, Excel & Visual Basic.
Whether stated or implied, there is a common belief that a PowerPivot solution doesn’t require the same level of effort to transform, prepare and cleanse data before it gets imported into a data model. For many years, we’ve been telling these users that it will take a serious effort, at significant cost, to prepare and transform data before we can put it into a data mart or cube for their consumption. In a typical BI solution, we usually burn 70-80% of our resource hours and budget on the ETL portion of the project. Now, using the same data sources, users are being told that they can do the same thing themselves using PowerPivot!
Data Modeling 101 for Tabular Models
One of the things that I really enjoy about building tabular models is that I can have my data in multiple structures and it still works. If the data is in a traditional BI “Kimball-style” Star schema, it works really well. If the data is normalized as it would be in a typical transactional-style database, it still works. Even if I have tables that are of a hybrid design; with some characteristics of both normalized and dimensional models, it all works beautifully.
Here's the catch: one of the reasons we build dimensional data models is that they are simple and predictable. It's really easy to get lost in a complex data structure, and when you start combining data from multiple source systems, that's where you're likely to end up. Getting business data into a structure that is intuitive, that behaves correctly and gives reliable results can be a lot of work, so be cautious. Just because a tabular model can work with different data structures doesn't mean that you don't need to prepare your data, clean it up and organize it before building the semantic model.
The classic star schema is one of the most effective ways to organize data for analysis. Rather than organizing all data elements into separate tables according to the rules of normal form, we consolidate all the measures that are related to common dimensional attributes, and with a common grain (or aggregation level), into a fact table. The dimensional attributes are stored in separate dimension tables – one table per unique business entity, along with related attributes. Any group of measures not related to the same set of dimensions at the same level would be stored in its own fact table. In the example, Invoice measures that are related to stores and customers and recorded every quarter are in one fact table. The sales debit records for customers and stores that are recorded daily go in a different fact table. The account adjustments don't record the store key but they are uniquely related to accounting ledger entries stored in the ledger table. Note the direction of the arrows showing that facts are related to lookup values in the dimension tables.
Exhibit 1 – A Fully conformed Star Schema
If you can pound your data into the shape of a star schema and this meets your requirements, this is what I usually recommend. It's a simple and predictable method to organize data in a well-defined structure. Now, let's look at a variation of this approach that has characteristics of both the star schema and normalized form. We'll call this a "hybrid" model.
The following hybrid schema contains two fact tables in a master/detail relationship. The cardinality of the Invoice and LineItem tables is one-to-many, where one invoice can have multiple line items. This would be considered a normalized relationship, with the InvoiceID primary key related to an InvoiceID foreign key in the LineItem table.
The Invoice table contains a numeric measure called Invoice Amount that can be aggregated by different dimensional attributes, such as Store Name, Customer Name or any of the calendar date units in the Dates table, which are organized into a natural hierarchy (with Year, Month and Date levels). To facilitate this, the Invoice table is related to three different dimension tables: Stores, Customers and Dates. Each of the dimension tables has a primary key related to a corresponding foreign key in the fact table. The LineItem table also contains numeric measures and is related to the Products table, another dimension table.
Exhibit 2 – A Hybrid Star / Master-Detail Schema
This semantic model supports two levels of aggregation with respect to the Invoice and LineItem records. If I were to browse this model in an Excel Pivot Table and put all the stores on rows, I could aggregate the Invoice Amount and see the sum of all Invoice Amount values for each store.
<< need pivot table graphic here >>
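For illustration, the measure browsed in that pivot table could be as simple as a sum over the fact table (assuming the table and column are named Invoice and Invoice Amount as in Exhibit 2; your names may differ):

Total Invoice Amount:=SUM( Invoice[Invoice Amount] )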
Are There Rules for Tabular Model Design?
Oh, absolutely. Tabular SSAS and PowerPivot allow you to work with data in a variety of formats – structured & unstructured, dimensional & normalized. You have a lot of flexibility, but there are rules that govern the behavior and characteristics of data. If you don't follow the rules, your data may not meet your requirements in the most cost-effective way.
This reminds me of an experience when I started high school.
Rule #1: Model the data source
Rule #2: Cleanse data at the source
Rule #3:
Tabular Model Design Checklist
What’s the Difference Between Calculated Columns & Measures?
What are the Naming Conventions for Tabular Model Objects?
What’s the Difference Between PowerPivot and Tabular Models?
How to Promote a Business-created PowerPivot Model to an IT-managed SSAS Tabular Model
Getting Started with DAX Calculations
DAX: Essential Concepts
DAX: Some of the Most Useful Functions
DAX: Some of the Most Interesting Functions
Using DAX to Solve Real-World Business Scenarios
Do I Write MDX or DAX Queries to Report on Tabular Data?
Can I Use Reporting Services with Tabular & PowerPivot Models?
Do We Need to Have SharePoint to Use Tabular Models?
What Do You Teach Non-technical Business Users About PowerPivot and Tabular Models?
What’s the Best IT Tool for Reporting on Tabular Models?
What’s the Best Business User Tool for Browsing & Analyzing Business Data with Tabular Models?
Survival Tips for Using the Tabular Model Design Environment
How Do You Design a Tabular Model for a Large Volume of Data?
How Do You Secure a Tabular Model?
How to Deploy and Manage a Tabular Model SSAS Database
Tabular Model Common Errors and Remedies
Tabular Model, Workspace and Database Recovery Techniques
Scripting Tabular Model Measures
Simplifying and Automating Tabular Model Design Tasks
Tuning and Optimizing a Tabular Model
How do you tune a tabular model? You don’t.
You can prevent performance and usability problems through proper design.
In March, I posted about the public preview of the "Microsoft SQL Server 2012 With Power View For Multidimensional Models". Well, the official release is now available for those currently using SQL Server 2012. When the preview became available a couple of months ago, I was very excited to test it out, so I downloaded it, quickly scanned the release notes and then proceeded to upgrade an existing SQL Server 2012 SP1 development server. What I missed in the release notes was the requirement to uninstall several existing components and then install them from scratch. That wasn't as easy as I had hoped, but it's pretty typical for pre-release software not to include upgrade support. After all, the product teams are focused on finishing features and debugging, not on getting all the upgrade and installation details sorted out. Those steps usually happen last in the release cycle.
Not to worry, this new capability is now part of Cumulative Update 4 for SQL Server 2012. This means that it will be fully supported as an upgrade to an existing SQL Server 2012 installation. This is very exciting news. If you have seen Power View demonstrated with new SSAS tabular models and PowerPivot models in Excel and SharePoint, you know what a simple and powerful data browsing and visualization tool it is. Some people have been a little disappointed that Power View initially only worked with the new xVelocity-based tabular models and not the multidimensional cubes built with SQL Server Analysis Services that have become common in many Microsoft-centered IT shops throughout the industry.
The Microsoft product teams have shared a lot of good news, like this, recently about BI innovations – with Power View in Excel 2013 and GeoFlow recently released. They are likely to share even more good news in the weeks and months ahead. It’s an exciting time to see some very impressive, powerful, fun to develop and fun to use BI business and IT tools all coming together to meet very real business problems.
I don’t know about you but I’m going to get this baby installed and working right away. I have clients who have been waiting patiently (and some not so patiently) to use Power View with their existing cubes. I love to be the bearer of good news.
As is usual when something noteworthy happens in the Microsoft BI community, Chris Webb has blogged eloquently on the topic and with significant detail. Read today's post on his blog here.