sharing my experiences with the Microsoft data platform, SQL Server BI, Data Modeling, SSAS Design, Power Pivot, Power BI, SSRS Advanced Design, Dashboards & Visualization since 2009
The message from the Microsoft Business Applications Summit this week was very clear: Power BI is growing up.
We have known for a while that Power BI is a great front-end tool for enterprise-scale tabular and multidimensional models. Power BI Premium will now allow almost limitless data scale using extremely large datasets by leveraging composite models that employ both DirectQuery and VertiPaq cached mode tables. This feature shatters the previous barriers where data models could only support either DirectQuery or cached mode (imported data). Additionally, Microsoft publicly unveiled plans to add IT developer centric features to Power BI Desktop for managing large models, version control and application development lifecycle.
These announcements leave many open questions about how Microsoft will continue to support self-service and enterprise customers with one tool, and about the long-term future of SSAS and Visual Studio-based SSDT. At this point, none of these tools are on the chopping block, but it is clear that James Phillips and his team have significant momentum and are not slowing down to wait for other products to catch up.
In this interview with Christian Wade, he describes the benefits of this new capability. Christian is Group Program Manager on the Power BI and Analysis Services product team with focus on data model development and scalability features.
Composite Models & Aggregations
In the opening keynote and again in his sessions, Christian demonstrated Power BI reports on the taxi driver activity database with over a trillion rows of raw data. The larger dataset was in a Spark cluster, accessed using DirectQuery. Aggregated tables were stored in the in-memory model using the new composite model feature. As the data was explored in report visuals, the Power BI engine would seamlessly switch from tabular in-memory aggregate tables to the DirectQuery source data in order to return low-level details. Composite models will allow mashing up imported database and file-based data with DirectQuery sources.
There are limits and complexities with these new features. You cannot mash up imported tables in a Power BI model that is based on a direct connection to SSAS, but with the new enterprise-scale features, a solution architect arguably may no longer be steered toward SSAS over Power BI for serious data modeling. With incremental data refresh, large model support, row-level security and many other “big kid” features, Power BI might be the preferable choice. I’m not ready to rule out Analysis Services as the better option for most enterprise solutions – at least not in the near future – but Power BI is definitely heading in that direction.
Power BI Desktop Features for Developers
News Flash: Power BI Desktop will be the primary tool for serious, enterprise data model development. A bunch of features are in the works and will arrive soon for preview. The obvious question is what will happen to SSDT for Visual Studio and the answer is that it’s not going away but the product team is putting their energy into enhancing Power BI Desktop for now. It will likely have some kind of “Beginner/Advanced” switch to simplify things for the novice user and isolate advanced or complicated, developer-centric features.
Here are a few screen captures of the forthcoming features:
Schema Compare & Model Integration
Like the BISM Normalizer tool created by Christian Wade, a fully-supported compare and merge tool will be baked into Desktop. The early build is branded with components from MAQ Software.
Relationship Diagram Designer, Properties Pane and Application Lifecycle Management Features
A new-and-improved diagram designer is imminent. It will allow large models to be more effectively managed and displayed. You will have the ability to save individual views of the diagram with selected, subject-area tables.
Like in Visual Studio, properties can be set and managed in a Properties sheet window. Support for standard SSAS features like display folders is coming and can be managed here.
Business Applications Release Notes
With the velocity of features and updates, a comprehensive set of release notes is available here. This document will be updated frequently with the revised roadmap, feature history and announcements.
APIs and PowerShell Cmdlets for Power BI Administration
At the MVP Summit in March, Kay Unkroth promised API endpoints and PowerShell cmdlets for managing all aspects of Power BI. He even did a dance and sang a song to make the announcement memorable. This week, the product team delivered what they promised, with the official announcement here.
The following features and capabilities are on the delivery short list coming in the next 2-3 months:
Power BI Service
The new “modern workspace” will eventually replace the current workspaces that are integrated with Office 365 Groups. The new app workspace will no longer be bound to an Office 365 Group. The feature will be optional at first, but it will replace current workspaces soon.
Now, users can be: Member, Admin, Contributor
Can add groups to workspace
Premium capacity tenants can have nodes in different regions
Admins will have visibility and access to all objects and resources
Microsoft Business Applications Summit
Keynotes and Featured Sessions from the Microsoft Business Applications Summit are here.
Yesterday a friend asked for a little help getting started with Power BI. He’s a DBA and system administrator and wanted to cut his teeth on Power BI with a really simple dashboard-style scorecard report. Using a list of database servers with license expiration dates, he thought it would be a simple matter to calculate and show the expiration status for each server using a simple traffic light indicator. The envisioned server list might look something like this:
Makes perfect sense, right? This is a basic use case and a good application for simple KPIs; with the one minor caveat that POWER BI DOESN’T SUPPORT THIS!
This topic has become a bit of a soapbox topic for me because it’s a capability that, in my opinion, is a very obvious gap in the Power BI feature set. After unleashing my rant, I’ll demonstrate a solution in this post.
<BEGIN RANT>
The most interesting thing about this missing feature is that it has existed for many years in the products that evolved into the current Power BI product. Key Performance Indicators (KPIs) are defined as scriptable objects in SQL Server Analysis Services (SSAS) with tremendous flexibility. KPIs are simple… the STATE element of a KPI (often representing “Bad”, “OK” or “Good” status) is translated into a visual indicator, usually an icon (commonly “Red”, “Yellow” or “Green”, respectively). There are variations on this theme, but it’s a very simple concept and a good solution has existed for many years. In SSAS Tabular, the State logic was dumbed down to a slider control that eliminated some of the flexibility we had in the earlier multidimensional project designer, but it still works. The slider UX expects that the state applies when a value is equal to or greater than the thresholds for yellow and green, and less than the threshold value for red. Queries returned from SSAS include metadata that tells Excel, Power BI visuals or a variety of other client tools: “The KPI state is 1 (meaning ‘good’), so display a square green icon for this item”. If you have the luxury of building your data model in Analysis Services using the SQL Server Data Tools (SSDT) designer for tabular models – or in Power Pivot for Excel – you would define a KPI using this dialog:
The actual return value for a KPI designed this way is really just -1, 0 or 1, which typically represent “Bad”, “OK” and “Good” states, respectively. As I said, you have other options like switching the red/green position or using 5 states rather than 3. The multidimensional KPI designer even gives you more flexibility by allowing you to write a formula to return the VALUE, STATE and TREND element values for a KPI separately. It would be wonderful to have the same capability in Power BI. It would be marvelous if we could use the slider UI like this and then have an Advanced button to override the default logic and define more complex rules in DAX! The SSAS architecture already supports this capability, so it just needs to be added to the UI.
If you design your data model using SSAS multidimensional or tabular, or using Power Pivot for Excel (which was the first iteration of Power BI), KPIs are just magically rendered in native Power BI visuals like a Table or Matrix. But alas, Power BI Desktop does not have this well-established feature that could easily be ported from Power Pivot or the SSAS Tabular model designer.
</END RANT>
…back to my friend’s simple scorecard report.
Using out-of-the-box features, the best we could do was this…
Create a calculated column in the table that returns -1 when the expiration date has passed, 0 if it is today and 1 if the expiration date is in the future. Here’s the DAX script for the column definition:
Expiration Status Val =
IF ( [EndofLifeDate] < TODAY (), -1,       -- expiration date has passed
    IF ( [EndofLifeDate] > TODAY (), 1,    -- expiration date is in the future
        0                                  -- expires today
    )
)
Next, add some fields and the new column to a table visual and use the Conditional Formatting setting in the table properties to set rules for the Back Color property of the calculated column, like this:
Here’s the table with the conditionally-formatted column:
Why Not Use the KPI Visuals?
The standard KPI visual in Power BI is designed to visualize only one value rather than one for each row in a table. If KPIs were defined in a Power Pivot or SSAS cube or model, a Power BI Table will simply visualize them – just as an Excel PivotTable does – but the Power BI model designer doesn’t yet offer the ability to create KPI objects.
Several community developers have tried to fill the feature gap with custom visuals but every one of them seems to address different and specific use cases, such as time-series trending or comparing multiple measure values. I have yet to use one of the available KPI visuals that just simply allows you to visualize the KPI status for each row in a table, without having to customize or shape the data in unique and complicated ways.
How to Design Status KPIs With Indicators
Here’s the fun part: Using the Expiration Status column values (-1, 0 or 1), we can dynamically switch out the image information in another calculated column. Power BI has no provision for embedding images into a report in a way that they can be used dynamically. You can add an image, like a logo, to a report page and you can reference image files using a URL, but you cannot embed them into a table or use conditional expressions.
Using this trick, you can conditionally associate images with each row of a table. This is a technique I learned from Jason Thomas, whose blog link is below. Using a Base64 encoder, I encoded three state KPI indicator images as text which I then copied and pasted into the following calculated column formula DAX script:
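The original formula is far too long to reproduce in full because of the encoded image text, but the general pattern is a simple SWITCH on the status value. Here is a minimal sketch – the table name ‘Servers’ and the truncated Base64 placeholders are assumptions, not the actual encoded strings:

KPI Status Image =
SWITCH (
    Servers[Expiration Status Val],
    -1, "data:image/png;base64,<paste red indicator Base64 text here>",    -- expired
     0, "data:image/png;base64,<paste yellow indicator Base64 text here>", -- expires today
     1, "data:image/png;base64,<paste green indicator Base64 text here>"   -- current
)

The "data:image/png;base64," prefix is what allows the Image URL data category (set in a later step) to treat the encoded text as a renderable image.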
The encoded binary strings correspond to these three images, in this order:
To reuse this, you should be able to simply copy and paste this code from here into a new calculated column. You no longer need the image files because that binary content is now stored in the table column. It really doesn’t matter what labels you use for the status key values as long as they correspond to the keys used in the preceding code. I’m using the conventional -1, 0 and 1 because that’s the way SSAS KPIs work.
On the Modeling ribbon, set the Data Category for the new column to “Image URL”:
That’s it! Just add any of these columns to a Table visual and WHAM, KPI indicators!
*Incidentally, since adopting Jason’s technique, Gerhard Brueckl came up with a method utilizing Power Query to manage and import image files that I will use in the future. Prior to that, I used the site Jason recommended in his post. My thought is that if a separate table stored only three rows (one for each KPI status), the status key value could be used to relate the tables. It would be interesting to see whether using a related table reduces the PBIX file size, or whether VertiPaq can effectively compress the repeating values of the image column. May be a good topic for a later post.
Since starting the Guy In A Cube series over three years ago, Adam Saxton has become the front man for Microsoft Business Intelligence feature announcements and Power BI how-to tutorials. Joined by Patrick Leblanc last year, the Guy In a Cube series features over 200 short video tutorials with at least two new clips added every week. The Guy In A Cube YouTube channel currently has over 13,000 followers.
I always look forward to spending time with both of these guys at community events. I first met Adam several years ago when he was a Senior Support Escalation Engineer for Microsoft who helped with a number of tough BI server setups. Patrick did senior-level consulting work for customers in the field before joining Microsoft as a Data Platform Solutions Architect. Adam and Patrick are natural-born entertainers. With their deep knowledge and expertise with database technology, security and the entire BI, analytics and reporting stack, they offer one of the top online resources for learning and keeping up with the rapidly-expanding Microsoft BI platform.
At conferences and events, I rarely see Adam without a camera in his hand, so it was a real treat to get them in front of my camera for this interview. We chatted about how they got started and how they continue to use their series to highlight new and important features, and to provide guidance to support the Microsoft BI community.
In this interview during the 2017 PASS Summit, we continue to explore the topic of this series of blog posts on Microsoft BI solutions for the enterprise. Patrick and Adam share their guidance about managing large-scale solutions, version control and multi-developer projects using Power BI, Analysis Services and SQL Server.
Between sessions at the PASS Summit, I had the privilege of interviewing Alberto Ferrari and Marco Russo; SSAS Maestros and BI industry thought leaders. Alberto and Marco are revered as expert trainers and practitioners of SSAS and Power BI solutions. They are also well known for evangelizing best practices using the DAX calculation and query language. We spoke about some of the challenges and learnings they have gained from years of large-scale project implementations.
Both SSAS Tabular and Power BI Desktop share the same characteristic in that the semantic data model definition is described in a single file. Compared with older multidimensional model projects, the single file approach simplifies much of the development effort but it can be challenging to coordinate projects with multiple developers. They shared tool recommendations to assist with project integration, version management and development. Marco and Alberto recommend a number of community supported add-ins and other freely available software to enhance Visual Studio and Power BI Desktop.
I sincerely appreciate these two gentlemen taking the time to share their insights in this short interview, and their many contributions over the years that continue to improve Microsoft’s industry-leading Business Intelligence platform and the community that supports it. A comprehensive library of resources, training, tools and articles is available at: SQLBI.com.
During the interview we talked about several different tools that they adamantly endorse for any SSAS developer to have at their disposal. From my own experience, in rough order of priority, I recommend:
BIDS Helper / BI Developer Extensions – This Visual Studio add-in is an important tool for anyone developing Business Intelligence solutions in any version of Visual Studio or SQL Server. It includes several essential features and capabilities for SSIS, SSRS and SSAS projects. It has been a community-developed and community-supported tool for many years and is considered by most experienced BI developers to be a core extension of the Visual Studio SSDT (formerly called “BIDS”) add-in.
Community-developed tools like this help to spur future enhancements to the officially-supported capabilities in later Microsoft development tooling. I suggest that you carefully read the documentation for SSAS Tabular features in this tool because some of the options only apply to certain versions of SSAS. Some of the BIDS Helper features for SSAS 2012 and 2014 have since been added to the officially-supported designer for SSAS 2016 and 2017. BIDS Helper was initially maintained on the Microsoft CodePlex site (here: http://bidshelper.codeplex.com/) and has since been moved to GitHub here: https://bideveloperextensions.github.io/. It can now be installed from the Visual Studio Tools > Extensions and Updates menu by searching for “BI Developer Extensions” in the Online section.
DAX Studio – This is a stand-alone application and the most comprehensive DAX, SSAS and Power BI calculation script and query editing and performance tool available. It is an open source project managed and maintained by a collaboration of several community MVPs and industry leaders (coordinators are Darren Gosbell & Marco Russo). The latest version and codebase have recently been moved from CodePlex to http://daxstudio.org.
Tabular Editor – An open source, stand-alone SSAS Tabular model design tool that runs outside of Visual Studio. It is distributed through a GitHub repo at: https://tabulareditor.github.io. The source code is maintained at: https://github.com/otykier/TabularEditor. It is described as a lightweight editor for SSAS Tabular Models built in .NET/WinForms and utilizes the Analysis Management Objects (AMO) library.
DAX Editor – An extension to SSDT that makes it easier to edit DAX measures in a Tabular project. It includes DAX syntax highlighting, DAX formatting, a textual representation of measures defined in a Tabular project, and extracting/importing measures from/to a Tabular model file (.BIM). This add-in is developed and maintained by SQLBI. It can be found in the Visual Studio > Tools > Extensions and Updates menu or downloaded from SQLBI at: https://www.sqlbi.com/tools/dax-editor.
BISM Normalizer – An add-in for Visual Studio/SSDT that can be installed from the Visual Studio Tools > Extensions and Updates menu by searching for “BISM Normalizer”. This is an essential tool if you need to compare multiple SSAS Tabular projects and databases (including workspace databases) and then manage the steps to selectively merge additions and differences between objects. I have found this useful for integrating parallel development work on the same SSAS Tabular project, but I will warn you that if used routinely to merge multiple changes to a single project, it can be a complex undertaking and not something I recommend on a daily basis.
BISM Normalizer is developed and maintained by Christian Wade, formerly an independent consultant and now a Senior Program Manager on the SSAS product team. Although Christian works for Microsoft, this is still considered a community-supported, third-party tool not officially supported by Microsoft. It is, however, the only tool designed specifically to perform schema comparisons and object merging in SSAS.
In our discussion, we also refer to these tools and resources:
VertiPaq Analyzer – An older tool (actually an Excel Power Pivot workbook) created by Kasper de Jonge, Program Manager from the Power BI/SSAS product team, that imports SSAS DMV query results and reports object size, object data compression rates and optimization statistics. It has since been updated and enhanced. This is a convenient way to get information quickly about your SSAS or Power BI model. The same information can be obtained, with a little more effort, by running DMV queries in DAX Studio or SSMS. Links to the original and updated versions are available on the SQL BI site at: https://www.sqlbi.com/articles/data-model-size-with-vertipaq-analyzer/.
DAX Formatter – An online tool for optimizing and reformatting DAX expressions, available from SQLBI, at: http://www.daxformatter.com/
Tabular Modeling in Microsoft SQL Server Analysis Services – A Microsoft Press book written by Alberto and Marco that applies mainly to SSAS Tabular 2016 and 2017 projects. This is an absolutely invaluable reference and tutorial on SSAS design best practices and design techniques. Previous books from our Italian friends include DAX Patterns and The Definitive Guide to DAX; all excellent learning resources that I have in my book library.
Shortly after this conversation, Marco and Alberto delivered two sessions at the PASS Summit conference about DAX best practices and performance optimizations. A little later, Marco sat with Guy In A Cube hosts Adam Saxton and Patrick Leblanc to chat about these sessions. You can find that interview here. I also had a chance to catch-up with Adam and Patrick in an earlier interview during the conference, which I will share in a later blog post.
The best method to implement row-level security in a published Power BI model or SSAS Tabular model consumed from the Power BI service will depend to some degree on your data structure and requirements. The method I demonstrate here is one of the most flexible approaches and one that I commonly use in my projects.
Please join me and other 2017 PASS Summit speakers for 24 Hours of PASS: Summit Preview on July 19th and 20th. 24HOP is a series of 60-minute online sessions presented back-to-back for 24 hours, from the same professionals who will deliver preconference and main conference sessions during the Summit the first week of November this year. These free educational webinars are normally attended by thousands of individuals looking to gain insight and knowledge about the topics. Topics covered in this edition include Performance Tuning, SQL Server 2017, Linux, DevOps, Azure, PowerShell, SSRS, Power BI and much more. Browse all sessions. These webinars provide a sneak peek at some of the best practices, expert tips and demos you’ll find at this year’s PASS Summit in Seattle.
As I visit businesses, consulting clients and training classes to teach data reporting and analytics, there is a recurring conversation. It is one that I have had for eighteen years. The audiences change and the technology implementations are a little different over time, but the essential conversation is still the same.
This happened again last week as I was trying to explain the unique characteristics of Multidimensional and Tabular storage to a client. I’m developing a training course where I needed to explain the concepts once again – and then it hit me! …these unique data storage and analytic technologies do what we dream about in science fiction stories and achieve capabilities we imagine existing in the distant future. Channel surfing on television this week, I watched an episode of Timeless, a Star Trek rerun and a Star Wars movie – where time travel, space-warp travel and teleportation were commonplace realities. Although fictional as portrayed, I think these concepts are very real in our technology landscape. Please indulge me as I explain.
We live in a Linear world. We reside in a place, work in a place, store and move stuff from place to place. Centuries ago, if man needed to move something (perhaps just himself) to a distant place, he would walk, ride a horse or take a sailboat. In weeks or months, he would arrive in another place. Today, we get in a car, navigate the streets, perhaps highways and Interstates and then arrive in a different place within minutes or hours. For longer trips, we board a large metal tube, sit very close to several complete strangers as we climb up into the sky; some period of time goes by and then we somehow arrive in a distant place along with our stuff. At 35,000 feet where the air is very thin, a jet airplane can travel many times faster; leaving an altitude where certain physical laws restrict speed, only to re-enter that atmosphere after it has arrived. To someone from the distant past, this experience would seem fictional.
image: nasa.gov
On a daily basis, I sit or stand in front of my computer monitors, and see and speak with people in distant places. We have live conversations about the weather in their time zone or hemisphere. Through some strange but now common process, technology converts our speech, images, directions, documents, effort and thoughts into 8-bit network packets and electrons that travel almost instantly through space; leaving the earth for a short period, to be transported and reassembled somewhere else.
Years ago, when I wanted new stuff, I would drive to the store. If they didn’t have what I wanted or if it was too expensive, I would spend the day driving to different stores. Today, I “go” to the store on my computer and the very next day, my stuff arrives on my doorstep. We are continually inventing ways to bend space and teleport things within the changing confines of reality.
Data storage is much like the real world. We store terabytes and petabytes (numbers we can’t even explain) in a small space. But, to navigate through the linear storage structure of relational databases and files, the data might as well be thousands of miles or kilometers apart. In order to perform time-variance and statistical calculations, program code must access a group of records in one location to aggregate them, and then find a related group of records – perhaps millions of rows apart – to perform another aggregation. The query might need to perform this operation dozens, hundreds or thousands of times; before it can calculate the final results. One of the best examples is a time-series calculation where a query must “visit” records for each time period to perform an aggregation. The effort is compounded when the goal is to compare aggregate values and variances for parallel and relative periods (e.g. “one year ago”, “three years ago”, etc.) and then perform forecasting or regressions.
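To make that concrete, here is a small DAX sketch of the kind of time-series comparison described above, assuming a hypothetical Sales table and a marked Date table (the table and column names are placeholders):

Total Sales := SUM ( Sales[SalesAmount] )
Sales One Year Ago := CALCULATE ( [Total Sales], SAMEPERIODLASTYEAR ( 'Date'[Date] ) )
Sales Variance vs Prior Year := [Total Sales] - [Sales One Year Ago]

Every cell in a report that uses these measures forces the engine to gather and aggregate a different group of rows for a different slice of time.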
image: ctdn.com
In relational storage, the query engine must “drive” from place to place, navigating the streets (file groups, files, pages and rows) to find the records. In an analytic structure (like an Analysis Services cube, or SSAS Tabular model or Power Pivot/Power BI in-memory model), the query engine performs a calculation operation and then teleports to a new location to find a different group of related records.
image: space.com
In a multidimensional cube, the storage processing code quite literally folds space to create the cube structure and pre-calculate some of the aggregate values. Rather than leaving all the data in a linear table, it actually moves data segments close together, reshaping the data structure (thus, the term “cube”). This approach required a great deal of complexity in both the software and the resulting data structure. Advances in computer hardware and software techniques helped simplify the approach. The in-memory tabular model exists in linear, yet compressed and fragmented space – where the actual distance between adjacent records is much closer than it appears to be. When the query process is done looking up a group of records, it warps to the next group of related records using a memory address. Instead of traveling through space, it steps out of it, like the jet airplane leaving the atmosphere, to re-enter in a different location. It takes much less time to run queries because the engine is not moving through all the rows of linear tables.
By using unique and innovative data storage methods (to reduce storage space and access data faster without moving through data), we can, in effect; perform time-travel, warp space and teleportation.
I just finished posting a 4-part series demonstrating how to use Power BI with on-premises data. The running time for all four parts is about 44 minutes. I cover using the Power BI Enterprise Gateway to connect published Power BI reports to on-premises data sources:
SQL Server database connected with DirectQuery
Large SQL Server database with clustered column store index connected with DirectQuery
SSAS multidimensional/cube on-prem with direct connect
SSAS Tabular with large data volume & multiple partitions on-prem with direct connect
The four-part series is on my YouTube channel using the following links:
Part 1
Part 2
Part 3
Part 4
Adam Saxton also gives a nice walkthrough of using SQL Server DirectQuery with Power BI that may provide an additional perspective and information about securing and sharing the solution.
I’d also like to give a plug for a terrific Power BI architecture diagram poster created by Dustin Ryan. I love the simplicity of this comprehensive graphic. You can download the PDF file from his blog.
A new ground-breaking feature in SQL Server 2016 allows report user context to be passed through shared data sources, enabling role-based, row-level security. This feature doesn’t require complex administration configurations, service principal delegation, Kerberos, Claims/Windows Token conversion or impersonation. You may not be as excited as I am about this unless you’ve also lived through the tedious configuration process we’ve had to endure for many years until recently.
A little over a week ago I heard from my friend and past co-author, Robert Bruckner – a Senior Architect and Development Lead on the Reporting Services product team. He told me I needed to check out a new capability they had just enabled in SQL Server 2016 that would make it much easier than before to implement row-level security for SQL Server Analysis Services. I had to check it out for myself and was pleased to discover that this technique works with Reporting Services mobile and paginated reports. It also works in a variety of server architectures. I tested it in a domain environment and on a simple workgroup server. It will work in scaled-out scenarios, and with or without SharePoint in the mix.
Until now, there were two major blocking points to authenticate report users on the SSAS server so data could be restricted or filtered for each user. Kay Unkroth talks about the prior authentication landscape in his post to the SQL Server Reporting Services Team Blog titled Passing user names to Analysis Services with personalized connection strings in SQL Server 2016 Reporting Services. He describes the challenges in more detail and I’ll summarize from my experience.
1. The first challenge is that user context gets lost as database connections are passed from one service to another. Before we get into the weeds with this story, let’s just categorize this as complicated and confusing. It raises high-visibility questions and concerns that necessitate handing-off to support and administrative folks to take care of settings and components outside the environments you manage – namely the database and report servers.
Not long ago, when we tried to use SharePoint as the platform for all BI and reporting systems, getting connection and user information to pass through all the layers in the server architecture was more like running a relay race… a user logged into SharePoint and navigated to a report. At that point, the credential object for the user was exchanged for a Claims token. The user who chose to run the report and the Claims token had to be delegated to a Windows service principal name (SPN), which, in turn, was delegated to the service account running a specific service on the Windows server. It was then up to that service to interface with your database in Analysis Services. Without getting into more details, the service account was not the user who had access to certain important data. For the service account to connect to the database at all, it needed admin privileges – a potential security risk – now handled by your system or database administrators (who probably know very little about your reporting solution, your users’ needs or business objectives; they just want to make sure it has no security holes and will lock down anything suspicious or unfamiliar).
2. The second challenge is that Reporting Services requires shared data sources with stored user credentials in order to use a lot of its advanced features, like shared datasets (under some conditions), data-driven subscriptions, caching, history or snapshots. In every case, the credentials of the user actually running the report get exchanged for the service proxy account and the original user context is lost. Analysis Services sees the same user connect for every report request. A security genius might conclude that these are all issues that can be handled through some security configuration mechanism – and they would be right. The problem is that they often don’t get handled, because of the complexity or the simple fact that management falls within another team’s realm of responsibility.
Over the many years that Reporting Services has been a leading reporting tool in the industry, practitioners have asked if they could somehow add an expression to a shared data source definition that would inject the user name into the connection string so it could be deciphered on the server. It was an interesting idea, with a lot of reasons it wouldn’t happen – sort of.
Using conventional approaches, principal delegation had to be configured and used correctly – which may or may not have involved the Kerberos protocol and NTLM authentication.
The New And Improved Technique: Using the CUSTOMDATA Property
Drumroll, please… Here’s the big change that allows new, cool things to happen.
In every data source connection string, you can add a simple expression that maps the current Windows user name to the CUSTOMDATA property of the data source provider. This works in SSRS embedded data sources, shared data sources, in a SharePoint Office Data Connection (ODC) file and in a SharePoint BISM connection file. In each case, the syntax should be similar. Here is my shared data source on the SSRS 2016 report server:
The connection string modification is simple. Just add this text to the end, following a semicolon:
;CUSTOMDATA = {{ USERID }}
This value travels through the connection stack whenever a connection is opened on the database server. Since the CUSTOMDATA connection property value is accessible and exposed by query and expression languages and APIs on the server, code can be written to respond to the user. In this case, we will filter queries so they return only the records and data that the current user should see.
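For example, a complete connection string for an SSAS data source might look something like this (the server, instance and database names are placeholders based on the environment described later in this post):

Data Source=SVR201R21\SSAS_Tabular;Initial Catalog=Adventure Works Tabular 2016;CUSTOMDATA = {{ USERID }}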
Windows and Report Users
Windows users – whether added to Active Directory on domain-joined computers or created locally on a simple workgroup server – are standard Windows users and need not be added to the Local or Domain Administrators group. Any SSAS users with restricted data access should also not be added to any Analysis Services administrator role.
As you can see, Paul and Fred are standard Windows users with limited access to elevated resources.
Analysis Services Roles and Membership
I’ve tested this technique with both SSAS multidimensional and Tabular, with the same results. I’ll show you the setup on a named instance on my server, SVR201R21, named SSAS_Tabular. The SSAS Security properties page for the instance shows that only the Administrator local admin user and the SSAS service account are members. This is important to check because if you accidentally add a user as an administrator, row filtering won’t work. Testing and debugging solutions on a single development machine requires a lot of logging out and logging back in to test under different user conditions.
My SSAS Tabular database is named Adventure Works Tabular 2016. Under the database Roles, in the properties for the Users role, you see that there are three users: Fred, Paul and ReportUser. ReportUser is our proxy user. When we need to set up a shared data source with saved credentials, I supply the user name and password for the ReportUser account.
It’s important to note that the ReportUser must have Read access to the database in order to connect and execute queries. But, since that same account may be used to connect on behalf of other users, the ReportUser account will have restricted access to data through the security bridge tables that are implemented by the role’s row-level filtering expressions.
Row-level Filtering Bridge Table
The core approach for row-level, user-specific filtering is the same as always, but the programmatic implementation is only slightly different. This is the essential foundation of the feature. The following table demonstrates two things. One is that the table itself can be completely generated using DAX – this is a calculated table (one of many new cool features added in SQL Server 2016). I’ve just used inline ROW functions to add each row to the UNION function that defines the entire table.
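A minimal sketch of such a calculated table looks like the following; the Windows account names and country values here are placeholders rather than the actual mappings in my model:

Security Bridge Country =
UNION (
    ROW ( "UserName", "SVR201R21\Administrator", "Country", "Australia" ),
    ROW ( "UserName", "SVR201R21\Paul", "Country", "United Kingdom" ),
    ROW ( "UserName", "SVR201R21\Fred", "Country", "France" )
)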
In production, you may want to store this lookup data in a SQL Server table for easy maintenance, and then process it into the model as you would all your tables.
For each row, the UserName is an exact match for the value passed in from the {{ USERID }} token in the connection string, which is surfaced by the CUSTOMDATA DAX function. Do you see how this will work? For testing and debugging, you can add a couple of measures that are useful for knowing who the current user is and what the CUSTOMDATA function returns.
User Name:=USERNAME()
Connected User:=CUSTOMDATA()
In a Reporting Services report, you can add these measures to your query and then add the dataset fields to textboxes in your report.
Implementing a Role-based Row-level Filter
Start by opening the Roles window…
Checking the user membership of the Users role: ReportUser must be a member if the report server or client tool connects as ReportUser. Even though no records are mapped to this user in the security bridge table, it has served its purpose by getting us connected to the database.
Paul and Fred are in the role in case they connect on their own, like from Excel or from SSRS using a data source without stored credentials.
This demonstration scenario calls for filtering all the data related to the Sales Territories for users who have permission to see it, according to the Country/User mapping in the bridge table. If data related to other business entities should have user-specific filtering, you would follow the same pattern and multiple filters will be in effect.
The filter on the Sales Territory table is a DAX expression. You can use other functions to provide additional logic but this is a good example that demonstrates the essential pattern. As each row is scanned during a query, the expression will determine if there is a match. If so, the row is returned. If there is no match, the row is ignored and not returned in the query results.
Here’s the filter expression with a brief explanation:
(again, each row in the table is considered in turn)…
Does the current row’s [Sales Territory Country] field value match the value of…
the result of the LOOKUPVALUE function, which looks in the Security Bridge Country table for a Country matching the current row’s Sales Territory Country, and where the UserName field value on the same row matches the value returned by the CUSTOMDATA function.
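Reconstructed as DAX, the row filter on the Sales Territory table looks essentially like this (the exact table and column names are assumptions – adjust them to match your own model):

'Sales Territory'[Sales Territory Country]
    = LOOKUPVALUE (
        'Security Bridge Country'[Country],
        'Security Bridge Country'[Country], 'Sales Territory'[Sales Territory Country],
        'Security Bridge Country'[UserName], CUSTOMDATA ()
    )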
Let’s try it out…
My Reporting Services report uses a data source where I have added the text ;CUSTOMDATA = {{ USERID }} to the end of the connection string.
The dataset is a simple MDX query that returns the Sales Territory Country, the Connected User test measure and the Reseller Total Sales measure. Notice that when the query runs locally in the designer, the CUSTOMDATA function returned the literal text in the parameter token. This is expected.
Here’s the report design. I’ve added the Connected User measure so you can see what the CUSTOMDATA function returns, along with the filtered results.
Run as the Administrator user, only the Country rows that correspond to that user in the Security Bridge table are returned.
Let’s test the report as a real user.
I’ll sign-out as Administrator and then sign-in as Paul…
Running the deployed report in the SSRS web portal this time, only the Country records that were mapped to my user name in the Security Bridge Country table are returned. Furthermore, if I were to run a query that didn’t explicitly use the Sales Territory Country, only the records related to the countries in my filter scope would contribute to any totals, due to the relationships in the model.
As I mentioned earlier, I also tested this in a domain environment and it also worked exactly as expected.
This means that the next time I’m called on to implement a row-level security solution, I won’t have to keep the system administrator up all night setting up complicated service principal names and configuring authentication delegation. If I have rights to set up the SSAS server and the data sources on the report server, I can now do all the configuration work myself – and that is pretty exciting.
Thanks again to Robert Bruckner for bringing this to my attention, and to Kay Unkroth for laying out the basics in his post. And a big, big thanks to the Reporting Services product team for making this a reality and saving the next generation of BI solution practitioners from reliving the challenges of the past.
Self-service data modeling tools like Power BI Desktop, Power Pivot and SSAS Tabular promote the advantage of using data from multiple data sources. I’m including SSAS Tabular on the “self-service” list because it is derived from Power Pivot and often used for “quicker” IT-managed BI projects. The alleged advantage of sourcing data directly from within the modeling tool might seem to suggest that a formal ETL process and dimensional modeling exercise is unnecessary. The more experience I gain with these newer tools in business settings, the more I think about best practices and the right approach for scaling BI solutions of all sizes.
The bottom line, before getting into the weeds on this topic, is that there are cases where I think it makes perfect sense to skip the usual steps of staging, transforming and modeling data into a relational data mart. Consequently, by using this approach, you are also removing the safety net and limiting options to scale and expand the solution. I love tools like Power Query and Power BI Desktop to develop simple BI solutions quickly. Perhaps a related topic for another time but always consider options to incorporate enterprise scale with modern analytics. Solutions always grow.
The Self-Service Modeling Paradigm
More and more of the projects I inherit from consulting customers and previous developers resemble the following diagram:
In less-formal solutions, the “ODS” layer in this diagram may be little more than a small database on an existing server with lookup tables populated by a weekly scheduled SSIS package or stored procedure. A lot of these projects are either departmental or the result of a self-service project handed-off to the IT organization to be “right-sized” and transitioned from the desktop to a server-hosted solution.
Power Pivot, Power BI Desktop or SSAS Tabular are often used to mash up data from multiple data sources. Some of the data may be staged and transformed to simplify a complicated query or to work around security restrictions in the original data source. The data feeding multiple tables in the model arrives in separate streams, where the keys then need to be manipulated in order to match up fact table keys to identifying keys in the lookup/dimension tables.
What’s the Problem With This Approach?
In some solutions with a manageable scale and a reasonable tolerance for a certain amount of data loss and inconsistency, this approach may be just fine. There are very good reasons for inconsistencies between sets of data that come from different sources. When 20 million fact rows are obtained from an online ordering system and 0.1% don’t have matching customer records that come from the system used to manage the master customer list, it may simply be a question of timing. Or, maybe the definition of “customer” is slightly different in the two systems. Chances are that there are also legitimate data quality issues affecting a small percentage of these records.
Whatever the case may be, a data conformity or potential data quality issue in this design scenario falls on the model designer to either fix or ignore. Again, this may or may not be a concern but it is a decision point and might raise the question of ownership when questions about data quality are raised.
Before I continue, let’s revisit how this common issue is handled in more conventional Business Intelligence solutions…
Managing Data Conformity in Traditional BI
Here’s a quick overview of the conventional Business Intelligence approach. For the past twenty years or so, we’ve promoted strict dimensional modeling principles that can be summarized in the following diagram. This diagram is my interpretation of The Kimball Method, applied to my experience and tools available from Microsoft. I learned to orchestrate BI solutions using this approach; however tedious, time-consuming and expensive. This is my frame of reference for any BI project. Many variables can influence adaptations from this model such as product choices, big data sources, real-time analysis, data volume, budget or the stakeholder’s realm of influence or ownership – but this is my starting point for a BI project done “right”.
If we take this general approach and scale it down to a departmental or business unit level, where an enterprise data warehouse doesn’t exist to satisfy analytic reporting requirements, we commonly end up with a solution that looks more like the following diagram. Data is still procured from multiple sources through extract and transform processes, loaded into a staging database and/or directly into a dimensional data mart. Even when using an Analysis Services Tabular project to build the semantic model, the model only has one data source, with all the fact and dimension tables populated from the relational data mart. The dimensionalized data is then surfaced through a semantic model and then presented through ad-hoc analysis, analytic and operational reports, and executive dashboards.
Different Solution Architecture – Same Problem
The most significant difference between the previous two diagrams and the one I started with is that the solution developer has control over the data before it enters the semantic model. Data conformity and quality can be managed in different places which include:
At the source
In the ETL process after source data is staged, and before it enters the data warehouse/data mart
Within the semantic model
In theory, fixing “bad” or non-conforming data at the source seems to be ideal for everyone. In a utopian setting, the business can incorporate a master data management solution to let data stewards correct erroneous or unmatched records and then feed them back into source systems. Operational reports produced by business applications should match analytical reports and dashboards from the data warehouse/data marts and the models they feed. Please don’t misunderstand – many businesses incorporate MDM solutions and they work well. The reality in most scenarios is that this is a difficult undertaking. It requires business process changes and an enormous level of cooperation and coordination across business units, leadership and interdependent teams.
Weeding out bad records and making corrections as data flows downstream to the data warehouse is often a practical trade-off that has many advantages. Reports produced from the data warehouse or data marts should match up to analytic reports and dashboards fed by semantic models which use congruent data.
Any provisions to compensate for data anomalies within the data model will produce some degree of variation between reports produced upstream and analytic reports produced from the semantic model. This may be acceptable as long as users understand and expect certain discrepancies. Reconciling financial results and auditing to a narrow margin can be a difficult and time-consuming exercise. Oftentimes, user confidence can be maintained when the analytic model can produce the same bottom-line figure and discrepancies are evident only in different grouping buckets or categories – like a “catch-all” or “Unknown” member for records that don’t match.
How Analysis Services Multidimensional and Tabular Deal With Data Non-Conformity
SQL Server Analysis Services (which we now call “Multidimensional SSAS”) has handled mismatched keys the same way for several years. When fact rows are processed with key values that don’t match a related dimension, errors are reported and the model processing fails. The developer must make a deliberate exception for these records to match a specially-generated “Unknown” member of the dimension. Under the default behavior, a solution developer following The Kimball Method would have defined Unknown member records in every dimension table and then mapped the unmatched keys in the ETL process.
For the purpose of this discussion and as far as we’re concerned, SSAS Tabular, Power Pivot and Power BI Desktop are the same. These tools, which are derived from different versions of the same code base, all deal with mismatched keys the same way.
SSAS Tabular is significantly different from multidimensional in that unmatched fact rows simply flow through model processing and a blank member is generated in each affected dimension table. Visually, this shows up as an aggregated value for an unlabeled heading. Casual users will miss this and won’t notice that a certain number of records have fallen through the cracks. There are currently no properties or settings to change this behavior. In fact, a Connect Item filed back in Feb 2014 was dismissed as a low priority suggestion. I’m not debating the issue but just informing you that this is how it works. I’d personally like to at least have the option to label the blank dimension record as “Unknown”. I’ve found some creative ways to work-around this limitation. The best documented approach is this one that Javier Guillen posted in 2012.
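One simple variation on that idea – not Javier’s exact technique, just a minimal sketch assuming a hypothetical Sales fact table related to a Customer dimension – is to expose a fact-side calculated column that substitutes a label when the dimension lookup comes back blank:

Customer Name (Conformed) =
IF (
    ISBLANK ( RELATED ( Customer[Customer Name] ) ),   -- no matching dimension row
    "Unknown",
    RELATED ( Customer[Customer Name] )
)

Grouping a report visual by this column puts the unmatched rows under an explicit “Unknown” heading instead of a blank one.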
One of the reasons I like SSAS Tabular is that it’s simple. But “simple” means that it has less “stuff”, less “clutter” and fewer features than its more complicated ancestor. I suppose that if every SSAS practitioner were able to get the product developers to add their favorite feature, we might end up right where we started – and that’s a dilemma. This is an over-simplification, as there is always room for product improvement.
This is a topic worthy of deeper discussion so it’s on the list of items for me to explore in follow-up posts. Please let me know what you think, post your questions and provide some feedback.
The DAX function documentation in MSDN can be found on separate pages, organized by function categories and then details for every function. This is useful but you can’t view all of the functions in one searchable page. If you don’t know exactly what you’re looking for, it can be a bit of a hunting expedition. I find that sometimes it is convenient to have all of the documentation in one list, making it easier to find the function I’m looking for.
I’ve organized the complete set of 2016 DAX functions for SSAS Tabular, Power Pivot and Power BI in two different views; grouped by function category and in one single list. Using Excel Online, the lists can be sorted, searched and filtered. I hope this is useful.
This post is a teaser for an article I just published in SQL Server Pro Magazine. This is the third, but not the final, article about using Datazen to build mobile dashboards and business intelligence solutions. The entire article is available here.
(excerpt)
A few months ago, shortly after Microsoft announced their acquisition of the Datazen mobile dashboard product, I wrote two articles titled Getting Started with Datazen and Setting Up a Datazen Server. Datazen is a simple dashboard tool with native apps for all the major mobile device platforms. I mentioned that the tool had shortcomings with Analysis Services as a data source. When writing MDX queries, it was necessary to create calculated members for all of the row members and place them on the columns axis. An update released in August 2015 improved the MDX query functionality so that this work-around is no longer necessary. In this article, I’ll demonstrate a pair of drill-through dashboards using MDX queries and navigation that passes parameters to an MDX query. This is an overview rather than a tutorial. My focus is on queries, parameters and drill-through commands, and not so much on the dashboard layout. In the next article, I’ll show you how to go one step further and drill through to an operational report created with Reporting Services.
A Quick Tour
To set the stage, my dataset is the airline on-time performance database from the Bureau of Transportation Statistics…
I’ve just finished a series of four articles for SQL Server Pro Magazine, along with sample projects and hands-on exercises. The series will take you through SSAS Tabular model design from start to finish, using the Adventure Works sample data in SQL Server 2012 or 2014. Here are links to all four articles followed by an excerpt from each.
Starting Your Modeling Career with Analysis Services Tabular Models Part 1
Four-part series
This is the first of a four-part series about getting started with Tabular model design using SQL Server Analysis Services in SQL Server 2012 and 2014. You will learn the basics from start to finish and build a complete solution. A sample project is provided for each stage of the solution so you can follow along with each article.
It was a great honor to be asked to join my associates from SolidQ at the Microsoft Virtual Academy Studios in Redmond and talk about how to upgrade to SQL Server 2012 and 2014. These recordings, which are also on my YouTube channel, include the material I covered in these sessions. The entire series of studio presentations is hosted on Channel 9 and here at the MVA with accompanying assessment surveys and resources.
In these studio sessions, I am joined by my fellow authors of the 429-page SQL Server 2014 Upgrade Technical Guide: Richard Waymire, Ron Talmage and Jim Miller from SolidQ. Jim and I were responsible for the Business Intelligence content. In our sessions Jim covered SSIS and SSAS Multidimensional, and I covered SSAS Tabular, BI tools and SSRS. In this edited portion of the SSAS session, Jim begins with a brief summary of multidimensional upgrade options and I continue to discuss opportunities to implement SSAS Tabular solutions. BI topics apply equally to SQL Server 2012 and 2014 upgrades.
This post actually applies to both multidimensional (cube) projects and Tabular projects which manage data source connectivity in a similar fashion. The multidimensional design experience is different but the mechanics of the SSAS engine are essentially the same.
When loading data from a SQL Server data source into a Tabular model project, chances are very good that you’ll see an error on the first attempt. I’ve been creating SSAS projects for over 15 years and, even though this is a relatively simple process, I still usually work through the initial configuration in trial-and-error mode. So, let’s call this a normal part of the process.
A Word of Advice
In Tabular model design, a common practice is to select all the tables and columns, rename objects and apply filters before finalizing the import step. If something goes wrong and you can’t connect to the data source, you’ll lose all this work and will need to start over. Test connectivity with one table before going through this aggravating cycle.
Here’s an example of an “ouch” moment after taking a half hour to make all the table/column selections. Note the lack of a “Back” button!
Getting Connected
When connecting to a SQL Server data source in the SSAS project designer, there are two different identities used at different times. When you initially create and test the connection in the designer, you will authenticate as yourself – the user logged into the local computer. However, when you actually load data into the table or process the model, even in the designer, Analysis Services authenticates using the identity of the SSAS service. If the SSAS instance is installed on your local development machine and you just used the default settings during setup, this is an account generated by SQL Server setup. If you’re using a production-like SSAS server, your system administrator may have configured the service to run under a specifically-defined domain principal. If the latter is the case, you’ll need to get your DBA or sys admin involved. If you have an isolated development environment, with everything installed on one computer, the following instructions will take care of it.
I’m running all my SQL Server services on my local development machine and created a Tabular project that will use the AdventureWorksDW2012 sample database. The SSAS instance I plan to use for my workspace database is the local instance. First, let’s get the service account for the SSAS instance. The easiest way to view local services is to right-click “My Computer” or “This PC” in Windows Explorer and select “Manage”. Navigate to the Services node like so:
I have two SSAS instances installed and the default instance is called MSSQLSERVER. Double-click to view the Properties dialog and select the Log On page:
As you can see, the default service account is “NT Service\MSSQLServerOLAPService”. I’ve selected this text and copied it to the clipboard. I’m done with the Services and Properties windows.
Now, I open SSMS and view the Logins for the SQL Server instance my SSAS project uses for the data source. After connecting to the relational instance, expand the object explorer, Security and Logins. Two logins are needed; one for myself and one for the SSAS service account. Since this is my development machine and I installed SQL Server, I already have a login for my Windows user account and I’m a member of the sysadmin server role, which grants me access to every database on the server. If you’re not a privileged user on this SQL Server instance, you will need a login created for your Windows credential and will need to be at least in the db_datareader role for the target database (example below).
To create the service account login, right-click Logins and choose New Login… Paste the service account name here and move to the User Mapping page. Select the target database and then check the box next to the db_datareader database role.
Click OK to save changes and you should be good to go.
Back in the Tabular model designer, I’m going to import only one small table to check connectivity. Once verified, I’ll click the Existing Connections toolbar icon, select all my tables, select and de-select columns, apply filters and then load the data with successful results. As soon as I see record counts, I know I’m in the clear…
Just One More Thing
(Columbo reference)
On a different but related topic since you’re getting your development environment ready… If you’re not a local administrator on the SSAS machine you’re using for the Tabular workspace database (typically a development machine), you need to make sure your Windows account is in the Server Administrators role for the SSAS instance. If this isn’t the case, you’ll get an error when you process the model. To do this, connect to the SSAS instance in SSMS, right-click the server and open Properties. On the Security page, use the Add button to add your Windows account.