sharing my experiences with the Microsoft data platform, SQL Server BI, Data Modeling, SSAS Design, Power Pivot, Power BI, SSRS Advanced Design, Dashboards & Visualization since 2009
This just in from the Reporting Services product team:
“Power BI reports in SQL Server Reporting Services: January 2017 Technical Preview now available” This feature addition will allow Power BI reports to be published to a local SQL Server Reporting Services server, entirely on-premises, without using the Power BI cloud service.
As I visit businesses, consulting clients and training classes to teach data reporting and analytics, there is a recurring conversation. It is one that I have had for eighteen years. The audiences change and the technology implementations are a little different over time, but the essential conversation is still the same.
This happened again last week as I was trying to explain the unique characteristics of Multidimensional and Tabular storage to a client. I’m developing a training course where I needed to explain the concepts once again – and then it hit me! These unique data storage and analytic technologies do what we dream about in science fiction stories and achieve capabilities we imagine existing in the distant future. Channel surfing on television this week, I watched an episode of Timeless, a Star Trek rerun and a Star Wars movie – where time travel, space-warp travel and teleportation were commonplace realities. Although fictional as portrayed, I think these concepts are very real in our technology landscape. Please indulge me as I explain.
We live in a Linear world. We reside in a place, work in a place, store and move stuff from place to place. Centuries ago, if man needed to move something (perhaps just himself) to a distant place, he would walk, ride a horse or take a sailboat. In weeks or months, he would arrive in another place. Today, we get in a car, navigate the streets, perhaps highways and Interstates and then arrive in a different place within minutes or hours. For longer trips, we board a large metal tube, sit very close to several complete strangers as we climb up into the sky; some period of time goes by and then we somehow arrive in a distant place along with our stuff. At 35,000 feet where the air is very thin, a jet airplane can travel many times faster; leaving an altitude where certain physical laws restrict speed, only to re-enter that atmosphere after it has arrived. To someone from the distant past, this experience would seem fictional.
image: nasa.gov
On a daily basis, I sit or stand in front of my computer monitors, and see and speak with people in distant places. We have live conversations about the weather in their time zone or hemisphere. Through some strange but now common process, technology converts our speech, images, directions, documents, effort and thoughts into 8-bit network packets and electrons that travel almost instantly through space; leaving the earth for a short period, to be transported and reassembled somewhere else.
Years ago, when I wanted new stuff, I would drive to the store. If they didn’t have what I wanted or if it was too expensive, I would spend the day driving to different stores. Today, I “go” to the store on my computer and the very next day, my stuff arrives on my doorstep. We are continually inventing ways to bend space and teleport things within the changing confines of reality.
Data storage is much like the real world. We store terabytes and petabytes (numbers we can’t even explain) in a small space. But, to navigate through the linear storage structure of relational databases and files, the data might as well be thousands of miles or kilometers apart. In order to perform time-variance and statistical calculations, program code must access a group of records in one location to aggregate them, and then find a related group of records – perhaps millions of rows apart – to perform another aggregation. The query might need to perform this operation dozens, hundreds or thousands of times; before it can calculate the final results. One of the best examples is a time-series calculation where a query must “visit” records for each time period to perform an aggregation. The effort is compounded when the goal is to compare aggregate values and variances for parallel and relative periods (e.g. “one year ago”, “three years ago”, etc.) and then perform forecasting or regressions.
image: ctdn.com
In relational storage, the query engine must “drive” from place to place, navigating the streets (file groups, files, pages and rows) to find the records. In an analytic structure (like an Analysis Services cube, an SSAS Tabular model or a Power Pivot/Power BI in-memory model), the query engine performs a calculation operation and then teleports to a new location to find a different group of related records.
image: space.com
In a multidimensional cube, the storage processing code quite literally folds space to create the cube structure and pre-calculate some of the aggregate values. Rather than leaving all the data in a linear table, it actually moves data segments close together, reshaping the data structure (thus, the term “cube”). This approach required a great deal of complexity in both the software and the resulting data structure. Advances in computer hardware and software techniques helped simplify the approach. The in-memory tabular model exists in linear but compressed and fragmented space – where the actual distance between adjacent records is much closer than it appears to be. When the query process is done looking up a group of records, it warps to the next group of related records using a memory address. Instead of traveling through space, it steps out of it, like the jet airplane leaving the atmosphere, to re-enter in a different location. It takes much less time to run queries because it is not moving through all the rows of linear tables.
By using unique and innovative data storage methods (to reduce storage space and access data faster without moving through the data), we can, in effect, perform time travel, space warping and teleportation.
I recently needed to replicate all of the reports on one report server to a different server. I started downloading each report one at a time to a file system folder, moving them to the other machine and then uploading each RDL file. Needless to say, that’s a slow and painful process. Since a new set of PowerShell commandlets (cmdlets) was introduced for Reporting Services just a couple of weeks ago, I decided to give them a try and share the results.
Aaron Nelson, long-time Microsoft Data Platform MVP and PowerShell fanatic, has literally been hounding me for the past few weeks to take some time and look at a new PowerShell library and scripts originally developed by Parth Shah from the SSRS product team. Jaime Tarquino also played a big role in automating the build process. It’s the personal commitment of folks at Microsoft like Parth and Jaime that really makes a product like SSRS shine.
Aaron can be pretty persistent, and I admit it’s ironic that I’ve been too busy finishing up the final editing work on the Wrox SSRS 2016 book and the new edX SSRS course to look at these RS PowerShell cmdlets, but I’m glad that I did because they’re an absolute gold mine.
Back to my project… I have developed several demo and lab exercise reports for the new edX SSRS course: Analyzing and Visualizing Data with SQL Server Reporting Services. The development environment is an Azure virtual machine that I recently couldn’t access while travelling, so I needed to move all the reports to a local VM to continue my work. I needed to make sure that current versions of these dozens of reports were on both servers. Using a couple of new PowerShell commands, this was quite easy. Here’s how it worked:
Step 1: I open the PowerShell ISE console on the server where I need to archive the reports (shown with the script I’ll run in the next few steps)
Step 2: run a command to download and install the rstools library (line 1 of the script shown above):
Step 3: line 5 of the script downloads all the reports in the “DAT214x Course” folder on my Azure VM report server to a local file system folder:
Out-RsFolderContent …
Step 4: On the destination server, I also run the command to install the rstools library…
Step 5: After copying the files to a local folder (or you could use a UNC path to access a shared folder), run Write-RsFolderContent to upload all the reports to the same destination folder:
Progress is displayed while the files are uploaded:
With shared data sources in the same places, all of the files are uploaded and working:
This saved me hours of time and provided peace of mind, knowing that I could easily automate large migrations in the future.
Here is the complete list of commands. You can get help, execution details and optional parameter information by prefixing the command with the word “help “:
You need to be running PowerShell as Administrator in order to do this.
My favorite command among them is this one, which will upload an entire folder’s worth of reports & datasets to an SSRS server for you. Here’s an example using Jamie Thomson’s SSIS Reporting Pack (attached):
A new ground-breaking feature in SQL Server 2016 allows report user context to be passed through shared data sources, enabling role-based, row-level security. This feature doesn’t require complex administration configurations, service principal delegation, Kerberos, Claims/Windows Token conversion or impersonation. You may not be as excited as I am about this unless you’ve also lived through the tedious configuration process we’ve had to endure for many years until recently.
A little over a week ago I heard from my friend and past co-author, Robert Bruckner – a Senior Architect and Development Lead on the Reporting Services product team. He told me I needed to check out a new capability they had just enabled in SQL Server 2016 that would make it much easier than before to implement row-level security for SQL Server Analysis Services. I had to check it out for myself and was pleased to discover that this technique works with Reporting Services mobile and paginated reports. It also works in a variety of server architectures. I tested it in a domain environment and on a simple workgroup server. It will work in scaled-out scenarios, and with or without SharePoint in the mix.
Until now, there were two major blocking points to authenticate report users on the SSAS server so data could be restricted or filtered for each user. Kay Unkroth talks about the prior authentication landscape in his post to the SQL Server Reporting Services Team Blog titled Passing user names to Analysis Services with personalized connection strings in SQL Server 2016 Reporting Services. He describes the challenges in more detail and I’ll summarize from my experience.
1. The first challenge is that user context gets lost as database connections are passed from one service to another. Before we get into the weeds with this story, let’s just categorize this as complicated and confusing. It raises high-visibility questions and concerns that necessitate handing-off to support and administrative folks to take care of settings and components outside the environments you manage – namely the database and report servers.
Not long ago, when we tried to use SharePoint as the platform for all BI and reporting systems, getting connection and user information to pass through all the layers in the server architecture was more like running a relay race… a user logged into SharePoint and navigated to a report. At that point, the credential object for the user was exchanged for a Claims token. The Claims token for the user who chose to run the report had to be delegated to a Windows service principal name (SPN), which, in turn, was delegated to the service account running a specific service on the Windows server. It was then up to that service to interface with your database in Analysis Services. Without getting into more detail, the service account was not the user who had access to certain important data. For the service account to connect to the database at all, it needed admin privileges – a potential security risk – and that configuration is handled by your system or database administrators (who probably know very little about your reporting solution, your users’ needs or business objectives; they just want to make sure it has no security holes and will lock down anything suspicious or unfamiliar).
2. The second challenge is that Reporting Services requires shared data sources with stored user credentials in order to use a lot of its advanced features like shared datasets (under some conditions), data-driven subscriptions, caching, history or snapshots. In every case, the credentials of the user actually running the report get exchanged for the service proxy account and the original user context is lost. Analysis Services sees the same user connect for every report request. A security genius might conclude that these are all issues that can be handled through some security configuration mechanism – and they would be right. The problem is that they often don’t get handled because of the complexity or the simple fact that management falls within another team’s realm of responsibility.
Over the many years that Reporting Services has been a leading reporting tool in the industry, practitioners have asked if they could somehow add an expression to a shared data source definition that would inject the username into the connection string so it could be deciphered on the server. An interesting idea, with a lot of reasons that it wouldn’t happen – sort of.
Using conventional approaches, principal delegation had to be configured and used correctly – which may or may not have involved the Kerberos protocol and NTLM authentication.
The New And Improved Technique: Using the CUSTOMDATA Property
Drumroll, please… Here’s the big change that allows new, cool things to happen.
In every data source connection string, you can add a simple expression that maps the current Windows username to the CUSTOMDATA property of the data source provider. This works in SSRS embedded data sources, shared data sources, in a SharePoint Office Data Connection (ODC) file and in a SharePoint BISM connection file. In each case, the syntax should be similar. Here is my shared data source on the SSRS 2016 report server:
The connection string modification is simple. Just add this text to the end, following a semicolon:
;CUSTOMDATA = {{ USERID }}
This value travels through the connection stack whenever a connection is opened on the database server. Since the CUSTOMDATA connection property value is accessible and exposed by query and expression languages and APIs on the server, code can be written to respond to the user. In this case, we will filter queries to return only the records and data that the current user should see.
Windows and Report Users
Windows users, whether added to Active Directory on domain-joined computers or defined locally on a simple workgroup server, are standard Windows users and need not be added to the Local or Domain Administrators group. Any SSAS users with restricted data access should also not be added to any Analysis Services administrator role.
As you can see, Paul and Fred are standard Windows users with limited access to elevated resources.
Analysis Services Roles and Membership
I’ve tested this technique with both SSAS multidimensional and Tabular, with the same results. I’ll show you the setup on a named instance on my server, SVR201R21, named SSAS_Tabular. The SSAS Security properties page for the instance shows only the Administrator local admin user and the SSAS service accounts as members. This is important to check because if you accidentally add a user as an administrator, row filtering won’t work. Testing and debugging solutions on a single development machine requires a lot of logging out and logging back in to test under different user conditions.
My SSAS Tabular database is named Adventure Works Tabular 2016. Under the database Roles, in the properties for the Users role, you see that there are three users: Fred, Paul and ReportUser. The ReportUser is our proxy user. When we need to set up a shared data source with saved credentials, I supply the user name and password for the ReportUser account.
It’s important to note that the ReportUser must have Read access to the database in order to connect and execute queries. But, since that same account may be used to connect on behalf of other users, the ReportUser account will have restricted access to data through the security bridge tables that are implemented by the role’s row-level filtering expressions.
Row-level Filtering Bridge Table
The core approach for row-level, user-specific filtering is the same as always, but the programmatic implementation is only slightly different. This is the essential foundation of the feature. The following table image demonstrates two things. One is that the table itself can be completely generated using DAX; this is a calculated table (one of many cool new features added in SQL Server 2016). I’ve just used inline ROW functions to add each row to the UNION function that defines the entire table.
In production, you may want to store this lookup data in a SQL Server table for easy maintenance, and then process it into the model as you would all your tables.
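To make that concrete, here is a minimal sketch of such a calculated table. The user accounts and countries below are hypothetical placeholders; the UserName values must exactly match whatever the {{ USERID }} token passes in (typically MACHINE\user or DOMAIN\user):

Security Bridge Country =
UNION (
    ROW ( "UserName", "MYSERVER\Administrator", "Country", "United States" ),  -- each ROW() returns a one-row table
    ROW ( "UserName", "MYSERVER\Paul", "Country", "United Kingdom" ),
    ROW ( "UserName", "MYSERVER\Fred", "Country", "France" )
)  -- UNION stacks the rows into the bridge table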
For each row, the UserName is an exact match for the value passed in from the {{ USERID }} token in the connection string, which is returned by the CUSTOMDATA DAX function. Do you see how this will work? For testing and debugging, you can add a couple of measures that are useful for knowing who the current user is and what the CUSTOMDATA function returns.
User Name:=USERNAME()
Connected User:=CUSTOMDATA()
In a Reporting Services report, you can add these measures to your query and then add the dataset fields to textboxes in your report.
Implementing a Role-based Row-level Filter
Start by opening the Roles window…
Check the user membership of the Users role; ReportUser must be a member because the report server or client tool connects as ReportUser. Even though no records are mapped to this user in the security bridge table, it has served its purpose by getting us connected to the database.
Paul and Fred are in the role in case they connect on their own, like from Excel or from SSRS using a data source without stored credentials.
This demonstration scenario calls for filtering all the data related to the Sales Territories for users who have permission to see it, according to the Country/User mapping in the bridge table. If the data related to other business entities should have user-specific filtering, you would follow the same pattern and multiple filters would be in effect.
The filter on the Sales Territory table is a DAX expression. You can use other functions to provide additional logic but this is a good example that demonstrates the essential pattern. As each row is scanned during a query, the expression will determine if there is a match. If so, the row is returned. If there is no match, the row is ignored and not returned in the query results.
Here’s the filter expression with a brief explanation:
(again, each row in the table is considered in turn)…
Does the current row’s [Sales Territory Country] field value match the value of…
the result of the LOOKUPVALUE function, which looks in the Security Bridge Country table for a Country matching the current Sales Territory country AND where the UserName field value on the same row matches the value returned by the CUSTOMDATA function?
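Expressed in DAX, the row filter on the Sales Territory table looks something like this sketch (the Security Bridge Country table and its UserName/Country column names are the assumed names from the example above; adjust them to match your model):

='Sales Territory'[Sales Territory Country]
    = LOOKUPVALUE (
        'Security Bridge Country'[Country],                                              -- value to return
        'Security Bridge Country'[Country], 'Sales Territory'[Sales Territory Country],  -- match the country on the current row
        'Security Bridge Country'[UserName], CUSTOMDATA ()                               -- and the user passed in on the connection string
    )

If LOOKUPVALUE finds no bridge row for the current country and connected user, it returns BLANK, the comparison fails and the row is filtered out of the query results.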
Let’s try it out…
My Reporting Services report uses a data source where I have added the text ;CUSTOMDATA = {{ USERID }} to the end of the connection string.
The dataset is a simple MDX query that returns the Sales Territory Country, the Connected User test measure and the Reseller Total Sales measure. Notice that when the query runs locally in the designer, the CUSTOMDATA function returned the literal text in the parameter token. This is expected.
Here’s the report design. I’ve added the Connected User measure so you can see what the CUSTOMDATA function returns, along with the filtered results.
Run as the Administrator user, only the Country row that corresponds to that user in the Security Bridge table is returned.
Let’s test the report as a real user.
I’ll sign-out as Administrator and then sign-in as Paul…
Running the deployed report in the SSRS web portal this time, only the Country records that were mapped to my user name in the Security Bridge Country table are returned. Furthermore, if I were to run a query that didn’t explicitly use the Sales Territory Country, only the records related to the countries in my filter scope would contribute to any totals, due to the relationships in the model.
As I mentioned earlier, I also tested this in a domain environment and it also worked exactly as expected.
This means that the next time I’m called on to implement a row-level security solution, I won’t have to keep the system administrator up all night setting up complicated service principal names and configuring authentication delegation. If I have rights to set up the SSAS server and the data sources on the report server, I can now do all the configuration work myself – and that is pretty exciting.
Thanks again to Robert Bruckner for bringing this to my attention, and to Kay Unkroth for laying out the basics in his post. And a big, big thanks to the Reporting Services product team for making this a reality and saving the next generation of BI solution practitioners from reliving the challenges of the past.
Color coordination is important in report and dashboard design. We’ve seen theming and color palette options in the prior iteration of Power BI, but this is not yet a feature in the new version. No doubt, the product will continue to move forward and we’re likely to see some kind of “theming” feature in a future release. You can, however, use individual color selections to create coordinated and appealing reports. In this post, Ron Ellis Gaut from CSG Pro shares a selection of 12 report color themes which you can use as examples:
Color choices matter when building Power BI solutions. The aesthetics are important, but colors also help communicate. The color choices you make up front are cumbersome to change later on. We look forward to the day when we can easily apply color themes in Power BI, but we are not there yet. Therefore, we are left to apply custom colors manually. This can be done for visuals by setting the Data Colors from the Format options on the Visualizations panel.
Given that many developers (like me) are not professional UI designers, how can we easily choose a color scheme? How can we easily present options to our clients and get their input?
To this end, I have put some color schemes together and published these on my PBI site. Check it out HERE
I maintain all the HEX values for the color palette in Excel. Let me know if you want a copy of the Excel file and I’m happy to share it with you.
I’m working on a series of articles about creating enterprise solutions with Power BI and the first article was published this past week in SQL Server Pro Magazine. I will republish the completed series but until then, I’ll post short excerpts from each article. In future posts, I’ll show examples and demonstrations.
Power BI is a bona fide sensation. It’s only been about seven months since the rebooted version 2 release and the transition from Office 365 to PowerBI.com and Power BI Desktop. Since that time, Gartner has moved Microsoft to a leading position on the Magic Quadrant for Business Intelligence and Analytics Platforms. Like a young, prized race horse sprinting to the lead from the center of the pack, Power BI is being taken seriously and bigger players are considering the right timing to place bets.
Microsoft is rated by Gartner as having the most complete vision, while Tableau and Qlik rate slightly higher in their ability to implement. This makes perfect sense to me, as “Microsoft BI” is not one simple product choice. This is my opinion: the Microsoft BI platform has many components and offers many choices, but it is certainly comprehensive and scalable. Microsoft has always erred on the side of offering choices where some competitors have a monolithic product. Power BI is a great single-product solution for simple projects, with room to grow as needed. At scale, Power BI is delivering “the last mile” of data visualization beautifully on top of the robust foundation of SQL Server, Analysis Services, Integration Services, HDInsight, Stream Analytics, machine learning and all the other Azure services.
Three months ago I attended the first meeting of a new Power BI MeetUp group in Portland, Oregon. Last week we had a full house. Power BI groups are popping up all over the world as users, businesses and BI practitioners embrace this product. The Power BI Community site currently lists 59 user groups! Capabilities added to the product each month are numerous and the pace of development is increasing. The first wave mainly focused on features for self-service users and small group scenarios. The next wave will address larger-scale scenarios, security, administration and developer tools for integration. The SQL Server and BI product teams have made their intentions clear in recent announcements. Jen Underwood’s article last month, “Top Takeaways from Microsoft’s Reporting Roadmap”, underscored that the new BI platform will address big business solutions in the cloud and on premises. We’ve seen tremendous momentum but there is more progress to be made.
Self-service data modeling tools like Power BI Desktop, Power Pivot and SSAS Tabular promote the advantage of using data from multiple data sources. I’m including SSAS Tabular on the “self-service” list because it is derived from Power Pivot and often used for “quicker” IT-managed BI projects. The alleged advantage of sourcing data directly from within the modeling tool might seem to suggest that a formal ETL process and dimensional modeling exercise is unnecessary. The more experience I gain with these newer tools in business settings, the more I think about using best practices and the right approach for scaling BI solutions of all sizes.
The bottom line, before getting into the weeds on this topic, is that there are cases where I think it makes perfect sense to skip the usual steps of staging, transforming and modeling data into a relational data mart. However, by using this approach, you are also removing the safety net and limiting options to scale and expand the solution. I love tools like Power Query and Power BI Desktop for developing simple BI solutions quickly. Perhaps a related topic for another time, but always consider options to incorporate enterprise scale with modern analytics. Solutions always grow.
The Self-Service Modeling Paradigm
More and more of the projects I inherit from consulting customers and previous developers resemble the following diagram:
In less-formal solutions, the “ODS” layer in this diagram may be little more than a small database on an existing server with lookup tables populated by a weekly scheduled SSIS package or stored procedure. A lot of these projects are either departmental or the result of a self-service project handed-off to the IT organization to be “right-sized” and transitioned from the desktop to a server-hosted solution.
Power Pivot, Power BI Desktop or SSAS Tabular are often used to mash up data from multiple data sources. Some of the data may be staged and transformed to simplify a complicated query or to work around security restrictions from the original data source. The data feeding multiple tables in the model arrives in separate streams, where the keys then need to be manipulated in order to match up fact table keys to identifying keys in the lookup/dimension tables.
What’s the Problem With This Approach?
In some solutions with a manageable scale and a reasonable tolerance for a certain amount of data loss and inconsistency, this approach may be just fine. There are very good reasons for inconsistencies between sets of data that come from different sources. When 20 million fact rows are obtained from an online ordering system and 0.1% don’t have matching customer records that come from the system used to manage the master customer list, it may simply be a question of timing. Or, maybe the definition of “customer” is slightly different in the two systems. Chances are that there are also legitimate data quality issues affecting a small percentage of these records.
Whatever the case may be, a data conformity or potential data quality issue in this design scenario falls on the model designer to either fix or ignore. Again, this may or may not be a concern, but it is a decision point and it raises the question of ownership when data quality problems come to light.
Before I continue, let’s revisit how this common issue is handled in more conventional Business Intelligence solutions…
Managing Data Conformity in Traditional BI
Here’s a quick overview of the conventional Business Intelligence approach. For the past twenty years or so, we’ve promoted strict dimensional modeling principles that can be summarized in the following diagram. This diagram is my interpretation of The Kimball Method, applied to my experience and the tools available from Microsoft. I learned to orchestrate BI solutions using this approach, however tedious, time-consuming and expensive it may be. This is my frame of reference for any BI project. Many variables can influence adaptations from this model such as product choices, big data sources, real-time analysis, data volume, budget or the stakeholder’s realm of influence or ownership – but this is my starting point for a BI project done “right”.
If we take this general approach and scale it down to a departmental or business unit level, where an enterprise data warehouse doesn’t exist to satisfy analytic reporting requirements, we commonly end up with a solution that looks more like the following diagram. Data is still procured from multiple sources through extract and transform processes, loaded into a staging database and/or directly into a dimensional data mart. Even when using an Analysis Services Tabular project to build the semantic model, the model only has one data source, with all the fact and dimension tables populated from the relational data mart. The dimensionalized data is then surfaced through a semantic model and presented through ad-hoc analysis, analytic and operational reports, and executive dashboards.
Different Solution Architecture – Same Problem
The most significant difference between the previous two diagrams and the one I started with is that the solution developer has control over the data before it enters the semantic model. Data conformity and quality can be managed in different places which include:
At the source
In the ETL process after source data is staged, and before it enters the data warehouse/data mart
Within the semantic model
In theory, fixing “bad” or non-conforming data at the source seems to be ideal for everyone. In a utopian setting, the business can incorporate a master data management solution to let data stewards correct erroneous or unmatched records and then feed them back into source systems. Operational reports produced by business applications should match analytical reports and dashboards from the data warehouse/data marts and the models they feed. Please don’t misunderstand – many businesses incorporate MDM solutions and they work well. The reality in most scenarios is that this is a difficult undertaking. It requires business process changes and an enormous level of cooperation and coordination across business units, leadership and interdependent teams.
Weeding out bad records and making corrections as data flows downstream to the data warehouse is often a practical trade-off that has many advantages. Reports produced from the data warehouse or data marts should match-up to analytic reports and dashboards fed by semantic models which uses congruent data.
Any provisions to compensate for data anomalies within the data model will produce some degree of variation between reports produced upstream and analytic reports produced from the semantic model. This may be acceptable as long as users understand and expect certain discrepancies. Reconciling financial results and auditing to a narrow margin can be a difficult and time-consuming exercise. Oftentimes, user confidence can be maintained when the analytic model can produce the same bottom-line figure and discrepancies are evident only in different grouping buckets or categories – like a “catch-all” or “Unknown” member for records that don’t match.
How Analysis Services Multidimensional and Tabular Deal With Data Non-Conformity
SQL Server Analysis Services (which we now call “Multidimensional SSAS”) has handled mismatched keys the same way for several years. When fact rows are processed with key values that don’t match a related dimension, errors are reported and the model processing fails. The developer must make a deliberate exception for these records to match a specially-generated “Unknown” member of the dimension. Under the default behavior, a solution developer following The Kimball Method would have defined Unknown member records in every dimension table and then mapped the unmatched keys in the ETL process.
For the purpose of this discussion and as far as we’re concerned, SSAS Tabular, Power Pivot and Power BI Desktop are the same. These tools, which are derived from different versions of the same code base, all deal with mismatched keys the same way.
SSAS Tabular is significantly different from multidimensional in that unmatched fact rows simply flow through model processing and a blank member is generated in each affected dimension table. Visually, this shows up as an aggregated value for an unlabeled heading. Casual users will miss this and won’t notice that a certain number of records have fallen through the cracks. There are currently no properties or settings to change this behavior. In fact, a Connect Item filed back in Feb 2014 was dismissed as a low priority suggestion. I’m not debating the issue but just informing you that this is how it works. I’d personally like to at least have the option to label the blank dimension record as “Unknown”. I’ve found some creative ways to work-around this limitation. The best documented approach is this one that Javier Guillen posted in 2012.
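As a rough, hypothetical illustration (not Javier’s technique; the table and column names here are invented), one simple compensating step is a measure that quantifies how much value lands on the auto-generated blank member, so it can be shown explicitly instead of hiding under an unlabeled heading:

Unmatched Sales :=
CALCULATE (
    SUM ( 'Internet Sales'[Sales Amount] ),                                -- hypothetical fact table and column
    FILTER ( ALL ( 'Customer' ), ISBLANK ( 'Customer'[Customer Key] ) )    -- only the blank row generated for orphaned fact rows
)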
One of the reasons I like SSAS Tabular is that it’s simple. But, “simple” means that it has less “stuff”, less “clutter” and fewer features than its more complicated ancestor. I suppose that if every SSAS practitioner were able to get the product developers to add their favorite feature, we might end up right where we started – and that’s a dilemma. This is an over-simplification, as there is always room for product improvement.
This is a topic worthy of deeper discussion so it’s on the list of items for me to explore in follow-up posts. Please let me know what you think, post your questions and provide some feedback.
The DAX function documentation in MSDN can be found on separate pages, organized by function categories and then details for every function. This is useful but you can’t view all of the functions in one searchable page. If you don’t know exactly what you’re looking for, it can be a bit of a hunting expedition. I find that sometimes it is convenient to have all of the documentation in one list, making it easier to find the function I’m looking for.
I’ve organized the complete set of 2016 DAX functions for SSAS Tabular, Power Pivot and Power BI in two different views; grouped by function category and in one single list. Using Excel Online, the lists can be sorted, searched and filtered. I hope this is useful.
Spending the past two weeks at the annual PASS Global Summit and the Microsoft MVP Summit, I’ve consumed a literal firehose of information about the Microsoft BI platform. I’ve participated in the PASS Summit for twelve years and the MVP Summit for seven years thus far, and in that time, I don’t recall as much innovative change and product momentum as we have seen lately. The pace of significant additions to the business intelligence offering is truly astounding. Attending both of these events, I often learn about product feature investments on different levels, which include those that are publicly announced, those that are part of the roadmap and things that the product teams are working on or seriously considering in the near and longer-term future.
At both of these events, we saw a lot of very exciting functionality and heard some very bold statements about what the BI product teams are working on. Citing the product team’s official blog: “Our goal is simple – we want to put the power of data in the hands of every business and person on the planet. It is our objective to serve over a billion users with the Microsoft business intelligence (BI) platform.” To reach that goal, there are several very specific efforts in motion. The first objective is to harmonize all of the disparate report types that are currently surfaced in different tools like Reporting Services, Excel, Datazen and Power BI, standardizing reporting content types across Microsoft on-premises, cloud and hybrid systems. These include paginated reports, interactive reports, mobile reports and analytical reports & charts. There will be two different places for users to get to all their report content: Power BI dashboards and the new Reporting Services portal. Both of these “portals” will serve up a variety of reports and visual content, whether hosted on-premises or in the cloud. Additionally, users will have access to visual reports on mobile devices for all the platforms. Some details are still in flux but the direction and roadmap are quite clear.
Power BI: Past, Present & Future
Power BI was front and center in many sessions and discussions, and the pace of new feature delivery is dizzying. The online Power BI service is updated weekly and the Power BI Desktop application is updated every month. What was crystal clear is that Power BI is getting tremendous attention from the product teams and leadership as the go-to visualization, modeling and data mash-up tool for many important scenarios. To date, Power BI has been primarily marketed as a self-service analysis tool, in the same way that Power Query, Power Pivot and Power View work in Excel. However, it is quickly becoming an enterprise product with the ability to integrate, automate and use it in a variety of managed business scenarios.
Consider each of these possible options:
Power BI Desktop running on a user’s computer with data connected from many sources; across the Internet and on-premises.
Power BI datasets, reports and dashboards published to the secure cloud subscription services, shared with different permissions for different groups of organization users.
Power BI dashboards containing PBI report visuals, on-prem SSRS report visuals, mobile reports & KPIs, and Excel visuals.
On prem Reporting Services Portal containing Power BI reports & dashboards, SSRS reports, mobile dashboards & KPIs, and Excel visuals.
Power BI reports and dashboards in the cloud connected to live, on-prem SSAS tabular, SSAS multidimensional, or relational data sources using DirectQuery.
Although we can’t say exactly how or when, we will have the ability to publish Power BI Desktop Reports on-premises. This is very welcome news. We will also have the ability to pin on-premises reports and other content to a Power BI dashboard in the cloud. Some of that capability was just added to SQL Server 2016 CTP 3.0.
Custom Visuals – This is one of several examples of how Microsoft is embracing open source code and community development. In addition to the dozens of D3-based visuals that community developers have contributed, the product team is releasing one new visual per week. Any custom visual in the gallery can be downloaded, added to a Power BI Desktop report and published to the service. This is quickly making most common and special-purpose BI visuals available to everyone in one tool.
DirectQuery – connections to live data sources use query folding to optimize performance and real-time results from SQL Server, Azure SQL Database and Azure SQL Data Warehouse. This also includes connectivity to MDX/multidimensional SSAS in SQL Server 2016!
Real-time dashboards & reports – Options for real-time data interaction include DirectQuery and direct connect through the Analysis Services connector. On-prem data can also be managed with the Personal Gateway. According to team leadership at PASS, gateway enhancements for personal and enterprise applications should be announced soon.
Pinning SSRS Reports – A Reporting Services 2016 server can be registered to integrate with a Power BI subscription using the Reporting Services Configuration Manager. This allows SSRS report visuals to be pinned to a Power BI dashboard. I just tried this in the 2016 CTP 3.0 and it works quite well. When viewing a report in Report Manager, a toolbar icon appears. This prompts the report user to select a visual on the SSRS report and then choose a dashboard in their Power BI subscription.
There is SO MUCH more coming!! The product team members are so excited to share what they’re working on, and the lines between what they alluded to in their PASS session presentations and what they told us and demonstrated at the MVP Summit under strict NDA were very fine. Just keep watching and you will see an astounding release velocity of new and impressive features in the next few months.
New Reporting Services
In SQL Server 2016, Reporting Services is getting a significant face lift on several fronts. The HTML renderer has been completely rewritten to emit pure HTML 5 to produce consistent output in every modern browser on every device. This capability is in the current CTP today.
Report Manager is replaced with a modern report portal that will host content from SSRS, mobile dashboards (formerly known as Datazen), Excel and, eventually, Power BI on-premises reports. This raises questions about the availability of Power BI on-prem, and the only comment I can make based on public information is that it is planned for some time in the future.
Report parameters can be moved around and organized with greater precision in the parameter bar at the top of the report.
The printing capability in SSRS will use a “plug-in free” PDF renderer rather than the old ActiveX control, so it will no longer require a plug-in download or special permissions to run.
Two new chart types are added (treemap and sunburst) and changes are planned to modernize the default look of report visuals.
This post is a teaser for an article I just published to SQL Server Pro Magazine. This is the third, but not the final, article about using Datazen to build mobile dashboards and business intelligence solutions. The entire article is available here.
(excerpt)
A few months ago, shortly after Microsoft announced their acquisition of the Datazen mobile dashboard product, I wrote two articles titled Getting Started with Datazen and Setting Up a Datazen Server. Datazen is a simple dashboard tool with native apps for all the major mobile device platforms. I mentioned that the tool had shortcomings with Analysis Services as a data source. When writing MDX queries, it was necessary to create calculated members for all of the row members and place them on the columns axis. An update released in August 2015 improved the MDX query functionality so that the work-around is no longer necessary. In this article, I’ll demonstrate a pair of drill-through dashboards using MDX queries and navigation that passes parameters to an MDX query. This is an overview rather than a tutorial. My focus is on queries, parameters and drill-through commands and not so much the dashboard layout. In the next article, I’ll show you how to go one step further and drill through to an operational report created with Reporting Services.
A Quick Tour
To set the stage, my dataset is the airline on-time performance database from the Bureau of Transportation Statistics…
What! NO? Get with it, my friend. Power BI is for everyone. All the kids are doing it… not to mention the business owners, marketers, sales campaign planners, scientists, IT pros and common folk. Power BI is all the rage and can be used to analyze everything under the sun – oh, and that Pluto mission tracking thing that Microsoft did for NASA. Yeah, pretty much everybody’s using Power BI!
On the 27th of August, I’ll lead an online virtual workshop, “Building Business Intelligence Solutions with Power BI”, that will show you how to put Power BI to work. I’ll demonstrate techniques to use with serious business data, but we’re also going to have some fun with data from the National UFO Reporting Center. This database contains all of the publicly available UFO sighting information, from Thomas Jefferson reporting a flying light in the sky up to the latest flying saucer sighting today. You’ll learn to import and mash up data from different sources, create a data model for analysis and reporting, and then create compelling, interactive, highly visual reports and dashboards. You’ll ask your dashboard questions in plain English and she’ll answer you in magically-crafted prose of wisdom and insightful knowledge, inventing artistic charts and visualizations while you do nothing but talk to your data. Yep, really.
There’s still time to sign up for the training sessions:
Choosing the right visualization control for a set of data can sometimes be challenging. I find some aspects of designing dashboard solutions in Datazen to be very simple on the surface, but it’s easy to get stuck in the details and fine points. The purpose of this guide is to help simplify and categorize visual control choices in mobile dashboard design. Datazen supports five categories of visual controls:
Navigators
Gauges
Charts
Maps
DataGrids
Navigator Controls
These controls are used to filter the data displayed in other controls
Time Navigator displays a range of time/date values
Supports years, quarters, months, days, hours
Auto-generates each date/time level value (doesn’t require a date lookup table)
Supports one or multiple metric fields
Scorecard Grid combines a selection list with multi-field value KPI scorecard
Selection List supports single-select, multi-select and All
Gauge Visuals
This group of controls displays a single numeric field value rather than multiple values
The Number control displays a single value with no target
All other controls display a value compared to a target value, which requires that the dataset has two separate metric fields for this purpose.
Chart Visuals
Time Chart & Comparison Time Chart:
Similar to Time Navigator but also supports auto, decade levels
These controls segment aggregated metrics by a category field:
These controls use multiple metric fields (they don’t segment by category):
Totals Chart, Pie Chart
Map Visuals
Geographic region names map to embedded shape names in the map definition
Shapes use well-known region names, but these are not well documented. All the shape names are exposed in a JSON structure within the dashboard definition file (to view the definition, rename a copy of the .datazen file with a .zip extension.)
Single metric field values:
Map, Bubble Map
Metric & comparison (KPI target) values:
Range Stop Heat Map
DataGrid Visuals
Simple DataGrid:
Based on one table, displays selected fields
Indicator DataGrid:
Based on one table, displays selected fields as either columns or indicators/gauges.
Indicators require 2 fields for metric & target/comparison
On Friday, James Phillips, Microsoft VP of Business Intelligence, announced that the general availability of the new Power BI Business Analytics Service will be July 24th. Also referred to as “Power BI 2.0”, the new service will eventually replace the 1.0 Power BI extension to Office 365. In a nutshell, this means that data transformations, data modeling, dashboard visualization and report design can be performed in the free Power BI Designer stand-alone application rather than requiring you to license Excel 2013 ProPlus and/or Office 365. A Power BI online subscription is free and advanced features are about $10 per month per user. Models and reports created using the Excel add-ins can still be published to the Power BI hosted service, but Excel is no longer a requirement. Read the complete announcement here.
There are many fine points to the Power BI equation that will be important factors for companies considering the adoption of Power BI. Interest in the industry is very high and I’ve had several conversations lately with business leaders from companies who want to learn more and are considering using Power BI to empower analysts and business users instead of relying on traditional IT solution design. To clear the muddy water, here are some popular questions and answers:
To use Power BI, are we required to put our data in the cloud? Technically, yes; but maybe not. To use all of the latest features of Power BI, including Q&A natural language queries, new visuals, and sharing published reports and dashboards, you will need to publish the queries, models & reports you author with the Power BI Designer or Excel 2013 ProPlus add-ins to the cloud-based service at PowerBI.com. The content is secured and encrypted, but this is the biggest blocking point for many large businesses that hesitate, either for regulatory reasons or simply out of fear of losing control of their important data or intellectual property. For other companies who have learned to trust cloud services, this is no big deal. Heck, just about all of us move sensitive data over the Internet every day, but it’s still a hang-up for a lot of big companies.
What options are there to use Power BI features on-premises? Quite a few, actually. Most of the features of Power BI were originally created as add-ins for Excel. This isn’t to say that the features in Excel are the same as the latest Power BI service… they’re not, but most are either the same or very similar. To use the more recent versions of Power Query, Power Pivot and Power View, you’ll need Office 2013 ProPlus and you should be using the 64-bit flavor rather than 32-bit. This is also a bit of a hang-up in big companies who don’t upgrade software very quickly. I’ve found that even in big, stodgy businesses, getting the right version of Office installed isn’t impossible if you can make a case for the business value it brings. If compatibility with old Excel add-ins is a concern, you can use a virtual machine or install the new Office on a second computer.
Once you’re using Excel 2013 to author models and reports, you have a few options to share Power Pivot data models, Power View reports and Excel reports with other team members:
1. Excel, Power Pivot and Power View integrate with SharePoint 2013 Enterprise edition. This isn’t cheap or really easy to set up, but it will allow many users to share and manage content. The latest Power BI cloud features won’t always be supported, but we can likely expect occasional updates.
2. Just stick the Excel file on a shared folder so others running the right Excel version can open and view it. This isn’t exactly a collaborative solution but will work with a small group and zero additional effort.
3. Suck it up and use the cloud. This is where everything is going. Microsoft’s directive for all their products is “cloud-first, mobile-first” and most of their competitors are now doing the same – including the major BI vendors. For most companies, adopting the cloud isn’t really a security or regulatory problem so much as it requires a shift in business culture and accepting new options and different challenges.
4. Wait for the on-prem version. Microsoft has promised that most new capabilities introduced in the cloud will eventually make it into installed editions of their software. We probably won’t see on-prem parity with Power BI anytime soon, but individual features will most likely be migrated when they can be.
What is the real cost of Power BI? When Microsoft says Power BI is “free”, they mean it – really. According to the pricing information here, at no cost, users can manage 1 GB of compressed data. That’s actually a lot. There are other factors and limitations that won’t affect casual data analysts with moderate needs. Having a development and demo sandbox with no expiration date is pretty cool. For about $10/month, users can collaborate with team members, manage and integrate larger sets of data. That’s pretty cheap.
Can we integrate Power BI with on-prem data and IT systems? Yes. The Power BI REST API lets you programmatically push data into a published model at-will rather than scheduling data refresh or requiring users to do it manually.
The SSAS connector allows report and dashboard visuals to query an on-premises instance of SSAS Tabular rather than pushing the model into the cloud.
The Data Management Gateway allows a data model to refresh data from your on-premises SQL Server databases at scheduled intervals.
Data models can be scheduled to refresh once-per-day for users with the free Power BI subscription and up to once-per-hour with the Pro ($10) subscription.
How does Power BI measure up to competing services like Tableau and Qlik?
In my opinion, Microsoft is the underdog with the best integration story, but they have ground to make up. The article from Martin Heller below is a great comparison of features & capabilities. In the past few years, Microsoft has taken its BI and reporting platform in all kinds of crazy, confusing directions and has created several interesting but disconnected products. Under new leadership and vision, they’re moving quickly toward a new goal and I’m encouraged by what I see. Power BI development is moving faster than other products at Microsoft can keep up with (part of the reason they disconnected Power BI from Excel). We’re seeing features added to Power BI just about every month.
As a rule, don’t ever bet on what might be, but do pay attention to how quickly Power BI has progressed in the past few months – and what they’re likely to accomplish in the near-term future with this product.
How can I see what features are coming, what others are asking for and how can I voice my opinion about new Power BI features?
A new Power BI Community site is managed and monitored by the Power BI leadership and development team. Check out http://community.powerbi.com/
Here are a few good articles & posts related to the new Power BI offering:
I’d like to make you aware of a three-hour training event that I will be presenting, hosted by SQL Server Pro Magazine on Thursday, July 30th, beginning at 11 AM Eastern Time. There is a one-hour break between each one-hour session, and registered attendees can watch the sessions again after the event. You can get details about each session and register using the links in this event summary:
DataZen is a new tool from Microsoft, built on a mature and time-tested foundation of enterprise services and visualization components. Analytic dashboards created by data professionals are available to users on desktops and practically all modern mobile device platforms. Enabling a stunning user experience requires BI solution developers to set up servers, configure services and data sources, write queries and design dashboards.
This class will guide students through this complex process and show them how to create a comprehensive dashboard solution; beginning with the essentials and then exploring advanced capabilities to integrate dashboards into an enterprise BI ecosystem.
By the end of the course you will learn:
How DataZen fits into an enterprise BI strategy & works with other Microsoft BI & reporting technologies
How to build and configure a secure DataZen Enterprise Server
Dashboard design basics and best-practices
How to apply advanced features like parameters and expressions to create dashboard navigation & drill-through capabilities
In addition to the technical sessions, Paul will be available during each session for LIVE Q&A. PLUS if you miss a live session or want to review the recordings, you will have access to the complete course recordings and materials for 12 months.
In this introductory session, we tour DataZen mobile dashboard solutions, learn about scenarios where the tool is effectively used to provide business value and enable users to visualize data on various mobile devices. Learn how DataZen fits in the Microsoft BI product spectrum and how it can be used to create an integrated BI and reporting solution.
Get started building a solution by installing and configuring DataZen Enterprise Server. Enable email delivery, dashboard hubs, users, branding and caching options. Prepare for dashboard design by creating data sources and data view queries.
Using the DataZen Publisher and integrated dashboard designer, learn to create functional dashboards for tablet, phone and desktop devices. You will learn to use several visualization controls to create an interactive dashboard experience with navigators and data selectors, gauges, charts, maps and data grids.
This session builds on the foundation of server objects created in the first session, using data sources and data views to drive dashboards and KPIs based on server-hosted data in SQL Server and Analysis Services.
We begin with a tour of sophisticated capabilities used to integrate dashboards into a complete Business Intelligence and enterprise reporting solution. You will learn how parameters and expressions are used to pass selections and context to another dashboard, web site or report. Use expressions and variables to customize the user’s navigation experience, providing drill-through paths to dashboards and detail reports.
This post is really a collection of field notes and some lessons learned from recent project experience. I’ve done plenty of SSAS Tabular projects over the past few years – usually starting with a Visual Studio project rather than Power Pivot. I’ve also done a bit of Power Pivot work for clients. These projects were either delivered to an analyst using Excel on their desktop or to business users through SharePoint. But authoring in Excel Power Pivot and deploying to a server-hosted Tabular model has been mainly theoretical up to this point, so I thought I’d share my experience.