Tag Archives

22 Articles

Master Data Services 2014 Add-in for Excel published

Just a bit over a week ago a new version of the Master Data Services Add-in for Excel was released. It is the 2014 release, which will also work for SQL Server 2012. Main benefit is up to 4x performance improvement without any server configuration changes. If you need extra performance you can get 2x extra by enabling Dynamic Content Compression in IIS on your MDS server.

Here is the download page: http://www.microsoft.com/en-us/download/details.aspx?id=42298.

Power BI Pro Tip: Dealing with errors when reading Excel files

If you use Power Query to read an Excel file that has errors (such as #VALUE, #REF, #N/A, etc) the rows that contain the error will not be loaded into the data model but instead will be flagged as error rows. But what to do if you really want to keep the data rows with the errors with some replacement value for the error instead of skipping the error row altogether?

To demonstrate this I created this simple Excel table that contains only errors (actually all errors that Excel can generate that I am aware of):

How they are generated and what they mean is not the point here, but I am sure most of you have seen some if not all of them before.

On loading this table using Power Query all five rows will be marked as error rows:

As a result, no data is loaded into your data model. This is as expected since Power Query filters out the rows that have an error, which in this case are all of the input rows.

However, in some cases this is not what you want. Suppose this was a 50 column dataset and that in one column sometimes an error occurs. Then do you want to disregard the rows with errors and thus delete all possible valuable info in the rows or do you want to somehow fix the errors and flag them as problematic but still load them? The latter is sometimes the better choice.

This can easily be done by editing the Power Query and using the Replace Errors function (on the Transform tab):

Make sure you have the column with the errors in it selected, enter a valid value for the data type in the column (in my case it was Any since there is really no other data available in the MyCol) and hit OK.

Power Query will replace the error with the value you entered:

Now, on loading the rows with errors will end up in the data model and your data will be flagged with the label you chose.

Hope this helps! Until next time!

Combining text (csv) files using Power Query – follow-up

I have been getting quite a few responses on my original post on how to combine text (csv) files using Power Query. One of the FAQs is how to keep the filename where the data came from in your result set. Said differently: what you want is the contents of all files in a folder plus the filename of the originating files all in one table. I thought it was simply a matter of adding a column, but Nicolas pointed out in a comment that adding a column would create a cross product of all data and all filenames. So, I needed to come up with another solution. The solution I present here might not be the best one, but it works. The magic trick here is knowing that the Csv.Document() function exists. Allow me to explain. First of all, I followed the “normal” approach to list files in the folder using Power Query. What you get is: Now, it might be tempting to expand the ‘Content’ column (as I did in the original post). However, as in this scenario the goal is to get the contents as well as keep the originating filename, we need a different approach. What we need to do is add a custom column that equals the following:

What this is doing is opening the contents in a single column and it expects CSV format. The custom column shows up like this: Next step is to expand the table. The contents will be displayed in your custom column: Then, I did a split on the CSV separator (semi-column in this case), so I ended up with three columns: The only thing left to do is clean up (remove columns) and filter rows (since my CSVs had headers the header from the second CSV is still in my data). The end result is: For your reference, here is my code:

With this I hope Sunflowers and Nicolas are happy J

Power BI Pro Tip: Name your linked tables

Just a quick Power BI Pro Tip this time: if you use linked tables to add data to your data model in Excel, before you press the ‘Add To Data Model’ button be sure to go to the table properties in Excel and give your table a better name (better than Table X). This makes figuring out which data you are looking at so much easier. You will thank me later J

Power BI Pro Tip: Show Top x results with RANKX() function

One of the more frequent scenarios is listing the top X results, such as most profitable products, biggest customers, top 10 best selling stores, etc. Also doing a top X selection helps reduce clutter in charts: a lot of data points can work as noise and obscure the data points that really matter and make the biggest impact.

In this post I describe an approach to implementing these scenarios using Power Pivot’s RANKX() function.

Let’s start with a simple dataset consisting of products (P1…P20 in my sample), Cities, Sales Amount and Number of products sold:

 

After adding this table to the Power Pivot data model, we can use the RANKX() function to get the best selling products / cities etc. I added the following measures to my table:

Sum of Sales Amount:=SUM([Sales Amount])

Sum of Number Sold:=SUM([Number Sold])

Rank of products by sales amount:=RANKX(ALL(Sales[Product]);[Sum of Sales Amount])

Rank of city by number sold:=RANKX(ALL(Sales[City]);[Sum of Number Sold])

 

These measures allow me to determine the top selling products by sales amount and best cities by number of products sold.

Only thing left to do is to use a Pivot Table / Pivot Graph or Power View / Power Map visualization and display the results.

 

If you create a new Pivot Table and add the Product column and the ‘Rank of product by sales amount’ measure you get the following:


 

So how do we get the top 10 selling products by sales amount is a nice ordered fashion? Very easy, just a matter of the right sorting and filtering. Click on the little downwards pointing triangle button at Row Labels and choose ‘More Sort Options’. There I chose Ascending and then selected the rank measure:


 

Now the Pivot Table is sorted by rank with the highest ranking product at the top. Now, to filter out only the top ten, we press the same button again and choose Value Filters and then Top 10. Here I made the following selections:


 

This seems maybe a bit counter intuitive, but what this does is return the lowest ten ranks (which would be 1 to 10 or the highest ranking products). Alternatively I could have used a Lower Than or Equal To Value Filter with these settings to produce the same result:

And here it is: a top 10 of products by sales amount.

 

Of course, you can also use Power View or Power Map to visualize these results. Here is a Power View based on the same information:

The trick here is to create the visualization just as normal (as above). Above displays the sales amount by product and the number sold by city. However, the catch here is that both the graph as well as the map have a filter on them that utilizes the rank measures I created. Here is the filter for the chart. The ‘Rank of products by sales amount’ measure is filtered to showing only values less than or equal to 10, i.e. the top 10.

What’s best about this is that it is very easy to change from top 10 to top 15 to top 5 or anything you desire. Also, the Power View is fully interactive. For example, clicking on one of the cities on the right shows which products are sold in that city. Note that it does not show the top 10 products in that city however.

Hope you liked this Power BI Pro Tip!

Version compatibility between Power Pivot Data Models in Excel 2010 / 2013 and SharePoint 2010 / 2013

I have been getting a lot of questions on compatibility around Excel 2010 / Excel 2013 / SharePoint 2010 / SharePoint 2013. To be honest, I have been confused myself.

I encourage you to check out our Excel help page, which makes it crystal clear:

Note to SharePoint Server 2010 customers: Client applications used to create a Data Model need to align with the server applications that host a Data Model. For SharePoint Server 2010, this means you’ll need to continue to use Excel 2010 and a Power Pivot for Excel add-in to create and maintain a Data Model. Excel 2013 cannot be used to create Data Models that run on SharePoint Server 2010.

This means that you can upload an Excel 2013 file with a Power Pivot data model in it to SharePoint. However, if you interact with it (click refresh, click a slicer, etc) you will get an error message.

It does not get any more specific than that, right?

 

 

Power BI Pro Tip: Role playing dimensions with UseRelationship

One frequent need in BI is something called a role playing dimension (see http://en.wikipedia.org/wiki/Dimension_(data_warehouse)#Role-playing_dimension for a definition). The classic example of this is a Date dimension that is related to a Sales fact table and used either as date of order and date of shipping. One obvious solution would be to add the date table twice to your data model, but that would not only result in a larger data model but also in more work: if you add a calculated column to the date table (such as to use the sort by column trick I explained earlier) you would have to do that on every single date dimension in your model.

Luckily Power Pivot has a handy function called USERELATIONSHIP (explained here: http://technet.microsoft.com/en-us/library/hh230952.aspx)). In this post I will walk you through an example.

Let’s say you have this simple data model again:

This is a Date dimension related to a Sales fact, just like in my last Power BI pro tip.

The relationship between the two tables is on the Order Date:

Now, notice that the Sales table also has a column for Shipping Date. Let’s create a relationship based on that column to the Date table. You can either drag and drop the columns or right click on ShippingDate and choose ‘Create Relationship’.

Here are the settings I made:

 

When you hit create you end up with two relationships between Sales and Date as expected, but one is a dotted line, which means it is inactive. This will be the relationship on shipping date.

No, create a report on this where you display SalesAmount per Month, like this:

The question now is which Date we are looking at. Is it Order Date or Shipping Date? It turns out that the active relationships is used, which is the one on order date.

Now, I would like to add a graph that shows the sales amount by shipping date. For this we will need to add a measure to our data model. Go to Power Pivot and add a measure to your Sales table by selecting a field in the bottom part of your Power Pivot window:

In the formula bar enter the following formula: SalesAmountByShippingDate:=CALCULATE(SUM(Sales[SalesAmount]);USERELATIONSHIP(Sales[ShippingDate];’Date'[Date]))

(Note that you might need to use comma’s instead of semi colons to separate parameters).

Now add another graph to your Power View report where you use this measure and plot that against months:

Now the graph on the right shows the sales amount by month using the shipping date as the relationship, while the graph on the left still shows the sales amount by month using the active relationship, which is order date.

And that’s how you use USERELATIONSHIP to show data when using a role playing dimension in Power BI.

That’s it for this Power BI Pro Tip. Until next time!

Power BI Pro Tip: Sort By Column

Here is an easy solution to a very common problem when making reports: how do I change the sort of some items from alphabetically to something else?

The most apparent sample of this is Months. Let’s say you have to following (very simple) data model in Power Pivot:

That is, you have a Sales table that reports SalesAmount on a Date and related to that Sales table is a Date table (dimension) which stores dates and month name.

Now you create a PivotTable and provide a slicer to filter:

Or you create a Power View report:

What’s wrong here? Your users probably want the months ordered correctly not alphabetically by month name as they are now in both the slicer and the Power View graph.

Of course you can give the months a numeric prefix like ’01 – January’, ’02 – February’, etc. This may work perfectly fine for you but I think this approach is impacting the user interface to much.

There is, naturally, a better way. And it is very easy to implement. All you need is two modifications to the date table.

  1. In Power Pivot go to the date table and add a calculated column with the following formula: =Month([Date]) :
  2. Then click on the column that contains the Monthname (my second column) and choose Sort By Column:
  3. In the dialog choose MonthNumber as your column to sort by and click OK:
  4. Done.

 

Now go back to Excel and look at the slicer and your Power View report. Both of them now sort correctly thanks to the power of Sort by Column. This of course is applicable to anything, not just months or dates.

 

That’s it for this Power BI Pro Tip. Until next time!

CEO Announcement Twitter Analysis

Last week Satya Nadella was announced as the new CEO of Microsoft (see http://www.microsoft.com/ceo). This was met with a lot of reaction in the world and also on the social networks. I decided to analyze the tweets on Twitter using my favourite tool Excel and a special Twitter Analysis add-in that we have made available through http://husting.com/twitter-analytics-for-excel/ . I opened the twitter analytics sheet and did a new query. Here’s what I searched on:

#ceo #nadella #microsoft #satyanadelle @satyanadella

There is really a lot of info that you can get from the Excel sheet. I cannot cover all of it, but I’ll share some screenshots:

 

What I found interesting is the hashtags and mentions. Of course #microsoft, #ceo and #satyanadella are the top three hashtags but #Manual is on fourth place. #cybersecurity and #hacking take sixth and seventh place respectively. Looking at the mentions I noticed @satyanadella, @billgates and @Microsoft, but also @bill_nizzle. Not sure what he is doing there J

In the top screenshot some slicers are shown that enable you to filter the data at will to slice and dice to more insight.

One thing I especially like is the tone map that displays whether a tweet was positive or negative. It also has a map so you can see where the tweet came from and allows you to play back the number of tweets over time:

Only a small amount of the tweets were negative and when I selected negative tweets only I saw the following:

It seems like in Europe we have not been tweeting negatively on this subject (at least in the timeframe I recorded). The US is most negative and also there were some negative reactions from Nadella’s home country India, from Australia and Egypt.

This is a very easy way to do any kind of on the spot Twitter analysis using Excel. The Excel sheet used is freely available via the link above. Enjoy!

Top 3 of what is new in Power Query for Excel

Just recently the December update of Power Query for Excel has been published. See this blog post on the Power BI for the details: http://blogs.msdn.com/b/powerbi/archive/2013/12/13/one-step-closer-to-simplifying-data-analysis-and-visualization-new-features-added-to-power-bi-for-office-365-and-power-query-add-in-for-excel-preview.aspx.

In this post I want to give you my personal top 3:

1. Automatic detection of table relationships. Like PowerPivot if you import multiple tables from the same source Power Query now detects the relationships and brings them over to the Power Pivot Data Model automatically.

2. Connectivity to new data sources. Now supported: Exchange, Dynamics CRM Online, JSON light and Sybase IQ.

3. Fill Down transformation. Fills all empty cells in a column with the value of the first non-empty cell above them. Very handy for those matrix like Excel sheets!

I will come back to some of these in later posts. Happy Power Querying!

%d bloggers like this: