Jeroen ter Heerdt

Using Power BI to visualize your music collection

As some of you might recall, I have been talking about visualizing digital music collections using Microsoft tools. It all started back in the days of Pivot Viewer (At some point I was even part of a lustrous duo called ‘the Pivot Brothers’).
Now, with Power BI being just around the corner I started thinking about taking on this old habit again. Here it goes…

If you are like me you have a digital music collection all nice and tidy complete with ID3 tags. A lot of tools exist that enable you to work with these tags and fix them. I particularly like Mp3tag ( which is free and is extensible. It also comes with an export function to CSV, although it does not export enough to my taste. I created my own export file and I suggest you do too. Keep reading if you want to know how and maybe rip mine.

Of course you can use any other tool that has an export function, as long as you end up with something like a CSV. If you want to follow along in my step-by-step scenario I suggest installing Mp3tag.


Creating the export

Again, please note that you do not have to follow these steps exactly. If you use another tool (not Mp3tag) or already have an export file you would like to use skip to the next step.

To create the export with Mp3tag, open the program and navigate to where you music is stored (my collection sits on \\\Music, which is a folder on my server). Once Mp3tag has listed all your files (might take a while if you have a large collection), select all files (EditàSelect all files or CTRL+A) and then choose File à Export (CTRL+E). You will get the Export dialog. Now, hit the first button on the right (the little page with a star to create a new export configuration and give it a name (I named mine “My CSV Export”) and click OK. Notepad will open and allow you to edit the export configuration.

At this point you can copy and paste the code below into Notepad and hit save.

Then, return to Mp3tag, select your export configuration, set up the export file name and click OK. Wait for a while and then click No to not display the export file.


Loading the file

If you followed along you will have a CSV file that contains the export of your music collection delimited by semicolons. Your delimiter may be different. As a consequence the steps you need to take to get the data loaded may vary.

Open up Excel 2013, click on Power Query in the ribbon and select From File à From CSV. Select the CSV file you just created and click OK. Now Power Query will do its best to give you what you want but as you can see it is not very successful (with the build available at the time of this writing).


All is not lost however, because now we get to work with the magic of Power Query!

First off, we need to get rid of all the errors we get in the data rows. Click open the steps fly-out on the right and click the first step (Source) and click the icon right next to it. Then in the ‘Open file as’ drop down box select ‘Text File’ instead of ‘CSV Document’ and click OK. This already looks a lot better, doesn’t it?

Now we need to split the column in by the semicolon delimiter. Make sure the ‘Source’ step is still selected and then right click on the ‘Column1’ header and choose ‘Split Column’ à ‘By Delimiter…’

Make sure to select the correct delimiter (semicolon in my case) and make sure that ‘At each occurrence of the delimiter’ under ‘Split’ and click OK.

Then you screen should look like this:

Now select the ‘FirstRowAsHeader’ step and check the output.

We now need to delete the last step (we will redo it ourselves later on). To do this mouse over and click on the little cross icon in front ‘ChangedType’.


Transforming the data

  1. Change data types

Now we need to change some data types but right clicking on the header of the column and choosing ‘Change Type’ and choosing the type you want to change to.

Change these columns to the indicated types:

Column Change to type
Last Modified Date
BPM Number (select ‘Using locale’ first and then select ‘Number’ and ‘English (United States)’ if your country does not use a . as decimal separator.
Bitrate Number
Year Number
  1. Fix the Size column

Now, split the Size column by right clicking on the ‘Size’ column header and selecting ‘Split Column’ and then ‘By Delimiter…’. Select ‘Space’ as delimiter and select ‘At the left-most delimiter’ as split option and click OK.

You will now get two size columns: Size.1 which contains the numerical value and Size.2 which contains MB or KB.

First, change Size.1 to a number (right click on header, Change Type à Number). Then insert a new column (right click on a column header, Insert Column à Custom…) and enter the following custom column formula:

Rename the Custom column by right-clicking the header and choosing ‘Rename…’. Enter ‘Size’ as your column name and hit enter. You now have got a single column that reports the size of the song in Kb.

To wrap things up right click ‘Size.1’ and click ‘Remove’. Do the same for ‘Size.2’.

  1. Add cover art

One of the nicest things of albums is the cover art. Luckily Windows stores the cover art in the same directory as your music files as a hidden file named ‘Folder.jpg. Let’s add this by adding another column (right click on a column header, choose Insert Column à Custom…) and entering the following custom formula:

Rename the column to Cover.


Now we’re done transforming the data. The steps you performed form a script, which you can see by clicking on the title scroll icon at the formula bar. Here is my script:

As you can see every little step we took is represented in the script.

Now click ‘Done’ to get the transformed data in Excel, where we will start on the visualization.


Fixing the covers using PowerPivot and adding release date

Now we need to tell Power BI to read the cover column as a reference to an image. This is something we cannot do in Power Query at the moment, but we can in PowerPivot. In Excel, click PowerPivot à Manage to open up the PowerPivot window. In here, select your Cover column, go to Advanced and set the Data Category to be Image URL.

Also, add a new column (by clicking on the empty column at the right of your table) and enter the following formula:



Hit enter and name this column ReleaseDate.

Building the visualization

Back in Excel, click Insert à Power View and start building your visualizations. Here are some examples I built:

  1. Number of tracks per release year

    Count of track by year in a line graph


    Apparently most the tracks in my collection were released around 2000, while the oldest track I have has been released in 1930.



  2. Number of tracks per genre over time

    This clearly shows that in 1978 the rock genre was most popular (at least in the albums in my collection). However, the Disco songs were the longest on average.

    The play axis at the bottom allows me to play through my collection over time.


  3. Number of albums per genre per year

    In 1983 the most popular artist was Kajagoogoo and you also get an idea of what the popular genre was


  4. Albums by genre and artist of time

    This shows the progressive rock genre and when it was popular. Also it shows which artists released albums categorized as progressive rock and which albums I have of that artist (I selected Yes).


That’s it

I will revisit visualizing my music collection when I have access to a BI Site, so I can show the latest visualization live. For now, this it it. Enjoy!

SQL Server Data Tools Business Intelligence Project Templates in Visual Studio 2012

If you installed SQL 2012 you have probably noticed that you development environment for Integration Services, Analysis Services and Reporting Services is still hosted in a Visual Studio 2010 shell (SQL Server Data Tools). However, with a free download you can get the Microsoft Business Intelligence Project Templates for SSIS, SSAS and SSRS in your Visual Studio 2012 installation. All you need to do is to download and install it.

Combining Excel files using Power Query for Excel

For the information in this post I am heavily indebted to Michiel Rozema who originally figured this out.

This post is an extension to my previous post on combining text (CSV) files using Power Query for Excel. In this post we will go through the same steps again, but this time for Excel sheets. Combining Excel sheets is a bit harder than text files however.

Being able to do this is very handy if you have multiple Excel sheets reporting on different periods, regions or products and you want to combine the data from those sheets into one table to use in your reporting.

First we need to make sure that ‘Advanced Query Editing’ is enabled. To do this, open Excel, go to the Power Query tab and choose Options. Then make sure the checkmark for ‘Advanced Query Editing’ is enabled. In order to successfully combine multiple Excel sheets we will need to modify the query’s script.

Just like with combining text files using Power Query, get started by clicking ‘From File’ and then ‘From Folder’.

Specify the folder where your Excel files are located and choose ‘OK’.

As before Power Query returns the list of Excel files in the folder:

Lets click on the first cell of the first row to get the sheets of the Excel workbook:

If you now click on ‘Table’ in ‘Data’ column of the Sheet1 row you will get the contents of the sheet itself:

Let’s first fix the headers. Right-click on the little table icon in the top left and choose ‘Use First Row As Headers’.

Now, in order to load the rest of the Excel files we need to modify the code a bit. To do this click on the little script icon just above the table:



The query editor opens and shows the M-script you have generated in the previous steps. Here is mine:

If we take a closer look at this M-script we see the following structure:

What happens is that we define A and then in steps apply functions on A. In each step we take as input the output of the previous step. In the end we return the result of the last function.

The M-script above is quite static in the sense that it applies only to the Excel sheet we selected (in this case ‘Sales Data 1.xlsx’). What we need to do is find a way to provide a parameter to this code and execute the code for every Excel sheet in the directory.

To make this work we will need to define a function based on the M-script we currently have. To do that we need to change the M-script to:

What we do here is defining a function with two parameters (the path to the files and a filename). Make sure you match the casing of ‘in’ and ‘let’ correctly! Do not use any capitals. Change your M-script accordingly and click ‘Done’.

Your Query Editor should now look like this:

Now, let’s make sure we understand what this query does by changing the name from ‘Query1’ to something like ‘GetExcel’ by double clicking on ‘Query1’ in this screen and entering the new name. We will use this name to invoke the function.

Click ‘Done’. We will invoke this function in a just a little while.

On the Power Query tab click ‘From File’ and choose ‘From Folder’ again. Re-enter the folder where your Excel files are stored and click ‘OK’. Now right-click a colum header and choose ‘Insert Column’ and choose ‘Custom…’. A formula editing screen opens:

Here we will need to invoke the function we defined earlier by entering the following:

Click ‘OK’ and rename the column by right-clicking the column name and choosing ‘Rename’. I renamed the column to ‘FileContents’:

Now click the icon to select the columns to expand from your Excel sheet and click ‘OK’.

Now we have the contents of the files combined with information about the files in one table. To get rid of all columns except for the contents select the three contents columns, right-click and select ‘Remove Other Columns’:

Optionally you can rename the columns. When done click ‘Done’ to get the data in Excel.

The best part is that if you add a sheet to the directory and click ‘Refresh’ in the Query tab of the ribbon, the data will get added to the result set:

So, if you have set this up and new data needs to be added as long as the structure of your Excel sheets does not change you can just click ‘Refresh’ and it works! This is the amazing power of Power Query.





Combining text (csv) files using Power Query for Excel

For the information in this post I am heavily indebted to Michiel Rozema who originally figured this out.

In this post we will have a look at how text (CSV) files in a folder can be combined using Power Query for Excel. In a later post will we extend this and load Excel files. Having this capability is very handy if you have files reporting on the same info (say sales figures) from different regions or multiple periods. You can keep the files as is but load them into Excel / Power Pivot as one table.

To start, load Excel and make sure Power Query is installed. Then on the Power Query tab click From File and then choose From Folder.

Next, specify the path where the text files are stored and choose Ok. What you will get is a list of files in the folder:

In this sample I have two csv files in the folder. The columns contain information about the file, such as date modified and file name. Some columns are special however and the column we are interested in is the first column (Content). Clicking on ‘Binary’ in that column in any row gets you to the content of the file you clicked. That is nice, but what we really want is to load all content of the two files. To do that we need to click the little icon next to the Content column header that looks a bit like two downward pointing arrows: . Click it and the contents of the files is combined together and shown in a new column (Column1):

The information from all files in the folder is now present, but all in one column. We need to split it by right-clicking the column, and choosing ‘Split Column’ and ‘By Delimiter’. I chose ‘Semicolumn’ as delimiter and clicked ‘OK’.

Now we still have the column headers from the CSV files in our set. In order to use the first row as column headers click the little table icon in the top left () and choose ‘Use First Row As Headers’.

Now we need to filter out the column headers from the second CSV file. I did this by filtering the first column just like you would in Excel. The result I ended up with is this:

Click ‘Done’ and your data will load into Excel, ready to be used!

Automatic monitoring of Excel and Access files with Audit and Control Management Server

In my previous post I introduced the Inquire add-in for Excel and discussed it at length. In this blog post will we look at Microsoft Office Audit and Control Management Server 2013 (which I will call ACM from now on), which is in some sense the server equivalent of Inquire for Excel. Where Inquire analyses only Excel sheets and requires you to do so by hand, ACM automates this process by monitoring file shares and SharePoint libraries for changed files, both Excel and Access files.

Setting ACM up is pretty easy and I will walk you through the process in this post. A later post will then discuss how to use ACM.

First off you will need the software and install it. You will need to have .Net Framework 4.0 and Visual C++ 2005 Redistributable Package (x86) installed. Also you will need to have a SQL Server available. You could for example use an existing SQL server and store the database there or use SQL Express on the ACM Server itself.

Last requirement is having IIS configured on your machine. Take care to enable ASP.NET and required role services as well as Windows Authentication and Management Tools and all options under there (including IIS 6 Management Compatibility). Make sure the following gets installed:

  • Web Server
    • Common HTTP Features
      • Static Content
      • Default Document
      • Directory Browsing
      • HTTP Errors
    • Application Development
      • ASP.NET
      • .NET Extensibility
      • ISAPI Extensions
      • ISAPI Filters
    • Health and Diagnostics
      • HTTP Logging
      • Request Monitor
  • Security
    • Windows Authentication
    • Request Filtering
  • Performance
  • Static Content Compression
  • Management Tools
    • IIS Management Console
    • IIS Management Scripts and Tools
    • Management Service
    • IIS Management Compatibility
      • IIS 6 Metabase Compatibility
      • IIS 6 WMI Compatibility
      • IIS 6 Scripting Tools
      • IIS 6 Management Console

When you have successfully installed ACM, you will have two new programs available: Microsoft Office ACM Configuration Utility and Microsoft Office ACM Service Manager.

You will need to setup ACM using the Configuration Utility first, so start it.

Once the Utility opens you will see this screen:

Let’s start by creating a new ACM database. Click ‘Create new ACM database’, enter the database server\instance name and the name of the database to create. Finally, click ‘Create’ and wait.

When done, click ‘Connect to ACM database’ to verify that the connection info for the database has been successfully stored there. If required set up the connection here and hit ‘Save’.

Now that we have configured the database and the connection to the database, we need to setup the ACM Web Server and ACM Application Server.

Click ‘ACM Web Server’ to get started and choose where you will create the Web Site. Since I had an empty IIS install, I chose Default Web Site. I entered a new for the web application and made sure to use correct credentials for the Application Pool Identity. Also, specify an initial Central Administrator account and click ‘Create’.

Then we can continue to the ACM Application Server configuration; click ‘ACM Application Server’ to get started. Here we will need to specify the URL of the Web Server you have just configured (in my case it was just http://localhost/ACM). Optionally you can specify users whose file saves you would like to ignore. When done do not forget to click ‘Save’. Now you can click on ‘Show Service Manager’ to jump to the other tool we will need (Microsoft Office ACM Service Manager).

When the ACM Service Manager opens it should show you four services (see screenshot below).

You will need to configure the services correctly. In my experience it works best if you first stop the service (right-click, Stop) before editing them (right-click, Edit). You will need to specify a logon account and password for each service on the Settings tap of the properties dialog box. Also I chose to change the start-up mode for each service to Automatic. Once done your services should all be started (you can check the status in the service manager). If not you may have to start them by hand.

Note that you do not have to run all services. For example if you do not want to monitor SharePoint libraries, you do not have to configure this service. Also, if you do not want to monitor Access files, you do not have to configure the Access Processor service. The NTFS Event Processor service is required is you want to monitor file shares. The Spreadsheet Processor service is require for monitoring Excel files.

Now open up your browser and navigate to your ACM Web Server (in my case http://localhost/ACM).

You will see a page titled ‘My Files’ and it will be empty. Don’t worry, we will fix that very soon.

Click on the little gear icon right next to your name in the top right and click ‘Site Settings’ to open the settings. You can always go here if you need to troubleshoot. Service Status and Event Log are two very helpful items to check out first if you are having problems.

For now, click on Processing Folder. Here, specify a UNC path where the Excel and Access processor can store files while processing. This folder will be emptied regularly, so don’t worry too much about this. Anyway, I choose to create a file share on my local server for this purpose and entered the share name here. When done, click ‘OK’.

Next up is the File Processor Aliases option. Here you can specify the aliases of processors to use to scan items. The default alias (AppSrvAlias) should already be added. If not, add it and return to the previous screen.

The last item we need to look at is Monitored Folders. Here you can set up the folders to monitor. I used a file share to monitor for files since I do not have SharePoint installed on this machine. You can add multiple folders to monitor and mix both types. When specifying a file share folder you will of course need to enter the UNC path and you could change the file types to monitor (although I recommend leaving it like it is). Also you can specify if you also want to monitor subfolders. The Change Tracking option specifies the tracking level for Excel files, which determines how deeply you want to investigate Excel files. I chose ‘Functional, Formatting and Data Entry’ but ‘Functional’ is recommended, since you will probably not be interested in formatting changes like colors and fonts. However, I found Data Entry to be interesting to track, so that is why I chose that level of tracking.

You can specify if you want to track changes each time the file is saved or once a day at a certain moment.

An important setting is the processing folder, which is used to store versions of monitored files. Enter a UNC path in here which is different from the processing folder and the monitored folders specified earlier. Also bear in mind that users should not be given access to this folder. The number of versions to store per file is configurable here also.

Finally, we need to specify folder manager and viewer permissions and click ‘OK’.

Repeat this process for every folder you want to monitor and you’re done configuring ACM.

If you now create a new file or edit any existing file in one of the monitored locations you should see it showing up on the My Files page. In the case of Excel files you will need to make a change that actually triggers the tracking level you specified. If you went with the ‘Functional’ setting and make a formatting change it will not be tracked.

In my next blog I will show what you can do with monitored files, so stay tuned!


Investigate Excel sheets using Inquire

Excel sheets are notorious for their many formulas, references, cutting, pasting, macro’s, hidden columns and sheets and overall complexity.

My story about Paul is an (very painful) example of what people can build using Excel and how organizations can become dependent on that without even knowing it, ending up in a so-called Excel Hell.

Those “magic Excel sheets” or end-user generated applications contain a lot of information and a lot of business knowledge. They are a bit like spaghetti-code: it is hard to understand how exactly how they are structured and what they do. Something goes in and a result comes out, but what happens in between is in many cases a black box. The reason for this is that they are the product of a long time of development by a non-developer; in the case of Paul this sheet was a product of a career and was built by someone with absolutely no idea about coding standards or even an understanding of what benefit methods in your code could bring.

The problem is however (as per the saying “people trust people, people do not trust data”) that in many cases these exact same magic sheets are used to base decisions on. And I mean not only operational, small-scale decisions but strategic, sometimes life-threatening decisions (heck, even the Nasa is believed to use magic Excel sheets to do some last minute calculations while in mid-flight).

With Excel 2013 Microsoft has taken big steps in turning Excel into a trustworthy BI tool. Not only do we provide great and powerful visualization and analysis capabilities, but also we provide a way of getting data and transforming it that is both powerful and easy to use as well as completely traceable and auditable.

However, it would be silly to believe that from one day to the next everyone would re-build their Excel sheets. They are just to complex to do that or to important or there is simply no time or knowledge available.

In those cases the best we can do is try to understand what happened in those Excel sheets and make it visible in case anything breaks or comes out the wrong way. That’s why Inquire was introduced in Excel 2013. It is an add-in in Excel 2013 which is disabled by default. Once it is activated, an “Inquire” tab is added to the ribbon.

Here we can start our research into understanding the Excel sheet.


Workbook Analysis

When the window opens we get a big list of information about your Excel file and what is in it.

Clicking any header on the left side gives you the exact items and locations in the worksheet. Rather ironically, you can export the results to an Excel file J

Some of the items I find most interesting are:

  • linked workbooks
  • data connections
  • hidden sheets
  • very hidden sheets (you can very hide a sheet by going to Visual Basic for Applications (ALT+F11) and change the property of the sheet to very hidden)
  • Formulas with errors
  • Formulas with numeric / textual constants
  • Formulas without cell references
  • Formulas referencing blank cells
  • Formulas referencing hidden cells
  • Formulas referencing external workbooks
  • Duplicate formulas
  • Inconsistent formulas
  • Unused input cells
  • Blank referenced cells
  • Invisible cells
  • Hidden rows and columns
  • Named items with errors
  • Warnings


Workbook relationship

The workbook relationship function explores dependencies on other workbooks, for example through linked workbooks and data connections.

As you can see this sheet has 9 direct dependencies, 7 of which are Excel files, one HTML file and one Access database. The red indicates that the dependency is broken, i.e. the file is no longer accessible. You can right click and fix the relation or dive deeper into the tree of dependencies as I did in the screenshot above.

Worksheet relationship

This provides a tree of how worksheets depend on each other. Sometimes you will see relationships between worksheets, which means that formulas on one worksheet reference the other. Additionally, dependencies to other worksheets are shown.


Cell relationship

By far the one of the two most powerful functions is the Cell Relationship Diagram, which you open by click ‘Cell relationship’. It may take a while to generate, as it shows all relations the currently selected cell has with other items. In this example I clicked a cell that contained a formula which resulted in a division by zero error. The cell relationship diagram for this cell is humongous:

Not looking too good huh? Most of the lines however, point to one problem cell (D15). So I collapsed that for now and then the diagram looks much better:

Cells that are marked red have errors. If you hover over a cell in this diagram you get the formula in that cell as shown in the screenshot above. Also note that this tree shows another workbook being referenced. By clicking on the little + signs you can explore further down the tree. In this case I would like to understand why B6 is showing an error. Turns out that B6 references B68, which in turn references E68 (F68 is not referenced by B68 but rather references it, so that explains the error there). It seems like if I could figure out why E68 is erroring, I solve the B68 error and thus the error in B6 and F68. Going a little futher I suspect E67 to be the problem (it is used to divide value D68 in cell E68 but it does not return any value). Double clicking on E67 takes me to the cell in the worksheet. Immediately I see that this is an error: the cell is empty, which causes the chain of division by zero. Now the next step would be to actually change the formula definitions.


Compare Files

Compare files is an interesting function in that it compares not only for structure or content, but also for functions, macro’s and even layout. To use it just open two Excel files and click on ‘Compare Files’. After just a while a new window opens in which you can browse the comparison results (And again export them to Excel J).

This window is a great tool to understand just how big the differences between the sheets are and how big the potential problem is. On the bottom right we see that some entered values have been changed between these two versions of the same file, some calculated values have changed but the majority of the changes have been in formatting. Now that does not tell me a lot about a problem scenario (formatting normally does not introduce auditability and trust problems now does it?), so I removed it from the diagram by de-selecting it in the list on the left. Now, the diagram shows much more context of the problem:

We can now see that some formulas have changed and some structural changes have been made (renaming of sheets, deleting rows and columns). Additionally, one change in a macro has been made (among other things).

Clicking on the structural changes takes me to a list of the structural changes, which tells me exactly where what has been changed:

Then for the other problem area: the macro. Selecting the macro change from the list and double clicking on the item brings up another window and here we see the exact line in the macro where the change happened. Apparently somebody changed the 10200 to 10300. This could be a serious mistake!


Clean Excess Cell Formatting

This removes unused cell formatting information. Unused formatting information can dramatically increase the size of your workbook, so removing unused information is always a good idea.

Workbook Passwords

Inquire needs to store workbook passwords if you want to analyze and compare workbooks that are password protected. Using this button you can safely store the workbook passwords so Inquire can open the workbooks on your behalf.


The header says it all: this opens the help file…


In summary: Inquire is a great tool to analyze individual Excel sheets or compare versions of Excel sheets. However, you will need to go through every single Excel sheet one by one. To implement are more automatic check of your Excel (and Access!) files the Audit and Control Management Server is available. ACM Server monitors file shares or SharePoint libraries and automatically tracks versions and changes between versions. I will introduce it and discuss how to set it up and work with it in future blog posts.

More info on Inquire is available at


Taking the new version of Power Map for a spin

One of the most exciting tools in Excel has recently released a new version! Get the updated version of Power Map here: (Excel 2013 only).

This update comes with a ton of new features, including some of the most popular feature requests I have come across.

One of the first things that you will see when you add data to Power Map is that Power Map is now more intelligent and automatically suggests how to interpret columns. In the screenshot below it automatically mapped the ‘Gemeente’ (County / Municipalty) and ‘Provincie’ correctly. This goes to show that this automatic mapping not only works with English terms J


The way you choose between types of column charts has changed. You just choose the ‘Column’ type and then on ‘Category’ you can change the chart sub type (clustered or stacked).

Power Map can now also use calculated columns and supports hidden fields. Annotations can now not only contain custom text or fields but can also display an image.

A new visualization type is ‘Region’, which is actually really cool. It provides shapes as available from Bing to overall items on the map. As far as I can see it not only works for the obvious countries, but also for example for provinces in the Netherlands and even the municipalities.

If you have chosen a category for this new type of visualization you can also change the way regions are shaded, for example by relative values in the same category or across all items.

Then, with a tick of a button you can change from a 3D map to a 2D / flat map (note the cool animation).

Also, it is now possible to save your tour to a video right in the tool. Video types can be optimized to presentations and HD displays, computers and tablets and for mobile devices.

One great new feature is that it is now possible to change the coloring of charts, regions and bubbles:

Here is a sample tour I created in 20 minutes based on population data that I loaded from CBS (Central Bureau of Statistics in the Netherlands) using Power Query (hint: make your selection in CBS’ Stat line and use the “from web” option in Power Query to get the data in Excel).


Generating perfect test data using Visual Studio and custom Data Generators

In many development projects the need arises to fill a database with some random data. With random data I do not mean mere gibberish, but actual data such as an email address, phone number, SSN, numeric value, dates, datetimes, etc.

With Visual Studio 2010 you can do this very easily. All you need is an edition of Visual Studio 2010 that includes the database projects (I think you will need Ultimate, but am not sure). And of course you will need a database, I used SQL Express in this example.

However, turns out that the data generators included in Visual Studio are somewhat limited. Visual Studio can generate data for many datatypes in SQL Server, but for example generating a phone number, SSN or email address is not possible.

However, one can easily create custom data generators, which is exactly what we will do. I will build custom data generators for email addresses and one that generators Lorem Ipsum text.

Building the custom data generators
Start with creating a new class library project. In the solution explorer, right-click the project and click ‘Add Reference’. Click the ‘.NET’ tab and select the following components: Microsoft.Data.Schema, Microsoft.Data.Schema.SQL and Microsoft.Data.Schema.Tools. Click ‘OK’ to add these references.

Open your class (I renamed my class and class file to LoremIpsumGenerator) and add the following using statements:

Now we need to inherit the Generator class by changing our class definition to:

Also add the DatabaseSchemaProviderCompatibilityAttribute to your class declaration as so:

Now we need to add an output property. This will be used to wire up the output of your generator to Visual Studio Data Generation Plans.

Also, if you need to specify some parameters to your data generator we need to create some input properties. I have created three: one for the number of words to generate, one to indicate if the data needs to start with ‘Lorem Ipsum…’ every time and one to specify if just random words can be selected or if we are more interested in sentences. Also note the default settings at the bottom.

Then all we need to do is override the OnGenerateNextValues() method. Since I wanted to Unit test my generator I have just one line in this method, which class a public void method (which my testing command line project can call). In this function (called Generate) I built the output based on Lorem Ipsum and set my Output property.

I will not discuss how exactly I managed to get the Lorem Ipsum words returned. If you want to know you can download the source code. Also, my email address generator follows the same structure of output, inputs (in this case none) and a generate function that creates the email address string and sets the output property.

Registering the data generators
If your data generators build successfully, we need to sing the generators with a strong name. To do this, right-click your project’s name and choose properties. Go to the Signing tab and select the ‘Sign the assembly’ check box and under ‘Choose a strong name key file’ choose ‘New…’. Follow the wizard and enter a password. Your screen now looks like this:

Now save all, and rebuild your solution.

Next we will need to create an XML file to accompany your DLL. Just add a XML file to your project and name it [YourProjectsName].Extensions.xml. My project (and thus my dll) is named MyDataGeneratorsLibrary, so the filename of the XML is: MyDataGeneratorsLibrary.Extensions.xml

Open the XML and replace the contents with this (you will have to fill in your own values).

As you can see, we need an Extension element for each generator. In my library I have created two generators (one for Lorem Ipsum text, and one for email addresses). Also you will have to specify the correct type (which is your classname and namespace) and the assembly (which again is your DLL file / project name). Last item to enter is the public key token. To get it open your Visual Studio Command prompt and use

to get the public key token returned to you. You will need to enter the correct path and file name. Copy paste the public key token into the XML.

One last thing I did is right click the XML and chose Properties. I changed the ‘Build Action’ to ‘Content’ and the ‘Copy to Output Directory’ property to ‘Copy always’. It does not matter in any way but it makes the next step just a bit easier.

Now, build your solution and navigate to your solutions directory. In there go to the bin\Debug folder, and copy your DLL and XML file.

Now it is time to register the data generators with Visual Studio. Be sure your Visual Studio 2010 instance is closed and navigate to the following folder: C:\Program Files (x86)\Microsoft Visual Studio 10.0\VSTSDB\Extensions . It is good practice to keep your own extensions in a separate folder, so I created a folder ‘CustomGenerators’ and I advise you do the same. Next, paste your DLL and XML files into this folder.

Start Visual Studio.

Using the data generators to generate data
In Visual Studio, add a new project to your solution or create a new solution. Select SQL Server Database Project as the project type and enter a fancy name. I assume you already have built your database schema. Right click on your project name and choose ‘Import Objects and Settings’ and follow the wizard to get your database schema into your project. Next, add a new Data Generation Plan and open it. Select the table you want to use and specify the number of rows to insert.

You will see a list of the columns in the table. To the right (under ‘Generator’) you can specify which generator you want to use to generate data with. It should contain your custom generators now. Select it and click the preview button to test your generator: . Here is my screenshot:

If you have any input properties specified you can change the values by selecting the row and opening the properties window, see below screenshot for my Lorem Ipsum generator.

That concludes my introduction into custom data generators. Download the source code (with the Lorem Ipsum and email address generators) on Github.

Creating a SSRS report template

One of the most-heard questions related to SSRS is if it possible to somehow create SSRS report templates so reports get the same look and feel.

SSRS does not work with some kind of stylesheet or CSS sheet to change the layout of new and existing reports. However, with a bit of creativity we can create a template for new reports. Note that this does not solve the implicit requirement of the question: deployed reports will still require rework if the template changes.

To create your own template, simply create a report and save it in a folder in your Visual Studio installation. The folder you will be looking for is C:\Program Files (x86)\Microsoft Visual Studio 11.0\Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject for Visual Studio 2012 and C:\Program Files (x86)\Microsoft Visual Studio 10.0\Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject for Visual Studio 2010.

I created a simple template with a header with a logo and a footer and saved the report in the folder. Now when I add a new item I can choose my template to base my report on:

Again, this is a one-of template in the sense that it is not as flexible as a stylesheet. If you need to change the template after reports have been deployed you will need to re-do the reports.

Multilingual SSRS reports – Scenario 3: Report Definition Customization Extension

This is the third post in my series about multilingual SSRS reports. If you missed the introduction, you can find it here.

I am looking for feedback! Please let me know what you think!

The full code of the solution I describe here is available on Github.

What we will be doing in this post is developing something which called a Report Definition Customization Extension (RDCE), which is available since SQL Server 2008. The MSDN page is here:

Let’s first start with the basics: what is a RDCE and what can it be used for?
Simply put a RDCE transforms an existing report definition (RDL) and modifies it just before the SSRS rendering engine renders the report. In other words building an RDCE allows you to interfere with the report rendering process and make some last minute changes just before rendering starts. In a schema it looks like this:

As this post is about making reports multilingual you might already have guessed that one of the uses of an RDCE is doing translations. However, you can also change the report’s look by hiding elements and even change the data set returned. This is useful if you not only need to translate labels in your report but also want to actually return text strings from your dataset in a different language. Another option would be for example to deal with right-to-left languages by changing your report’s layout.

Developing a RDCE
An RDCE is a .NET class library, where you will have to select .NET framework 3.5 as the target framework. This class library will need a reference to Microsoft.ReportingServices.Interfaces.dll. This contains the interface you will need to implement (see below). So let’s get started.

First off, start Visual Studio and create a new class library project (targeting .NET framework 3.5).

Include a reference to the aforementioned Microsoft.ReportingServices.Interfaces.dll file by right clicking ‘References’ in your Solution Explorer and choosing ‘Add Reference…’

The file is located under your SSRS installation directory. In my case the file was in C:\Program Files\Microsoft SQL Server\MSRS11.MSSQLSERVER\Reporting Services\ReportServer\bin.

Now open your class file (I renamed it to MyRDCE.cs) and add the following line:

Then, implement IReportDefinitionCustomizationExtension by changing your class declaration to:

You can explicitly implement this interface to get an idea of the methods you will need to implement by right-clicking the interface name and choosing ‘Implement Interface Explicitly’.

Your code should now look like this:

Let’s start with the easiest method: IExtension.LocalizedName, which just returns the name for your RDCE. I replaced the line in this method with:

Yes, I know I should not have hardcoded the name here, but for demonstration purposes this will do.

As developing an RDCE is as hard as it is (never mind the debugging, testing, logging and deployment challenges you will have to deal with) I suggest adopting the framework Holger Schmeling introduced in his blog on RDCE’s. This framework allows you to componentize your transformations into smaller transformations instead of one lengthy bit of code. By doing so the RDCE can be more easily managed and understood. Notice that in this blog I will not deal with logging and error handling; however, in production that should be included. To use this framework we add an interface called ITransformation with one method named Transform. Just add a new Interface file to your solution, name it ITransform and copy paste the following code in (adapted from Holger’s blog):

The Transform() method takes a report definition in XML format and returns the modified version. The other two parameters provide the report and user context as returned from SSRS. This is handy to dynamically apply transforms based on these contexts.

Now it is time to add a list to hold all the transformations (Again thanks to Holger’s blog). Add the following code directly under the class statement in your RDCE class:


Also implement the ProcessReportDefinition() method that will be called by reporting services. This is the mother-transformation process. What we will do here is load the report and one-by-one call the registered transformations on the report. Find your ProcessReportDefinition() method and replace it with the following (you will need extra using statements).

This method first retrieves the report definition. Then all transformations are applied, after which the resulting report definition is returned and SSRS is informed about what we changed.

Following Holger’s framework we use the SSRS config file to define transformations. This might not be the ideal solution for your scenario since it involves editing the SSRS config file. However, on the plus side this allows you to register new transformations as required without having to redeploy or change the RDCE. Basically we need to change the rsreportserver.config (mine was in C:\Program Files\Microsoft SQL Server\MSRS11.MSSQLSERVER\Reporting Services\ReportServer) file and add something along these lines (do not forget to make a backup first!) as last entry in the <Extensions> node:

You can specify multiple Transformations by repeating the Transformation node as necessary. Note that Name and Type specify the name of the class and the namespace. Also, each Transformation can have custom properties set (name, value pairs).

Now, we can use the SSRS SetConfiguration() methods to get to the configuration settings in the rsreportserver.config file. This enables us to read the list of transformations registered and fill the list created earlier. We will do this by implementing our final function of the IReportDefinitionCustomizationExtension: SetConfiguration() (you will need another using statement at the top of your class).


Now that we have a functioning RDCE framework it is time to put it to work. In this sample I will build a simple translator transformation which will translate some texts in a report by looking them up in a database table. I added a new class to my solution, named it MyTranslator and put it in the SSRSMultilingualRDCE.Transformations namespace as configured in the reportserver.config file above. The code pretty much speaks for itself. It implements the Transform method from the ITransformation interface, retrieves the user language preference setting and retrieves the translation from the database:



Deploying a RDCE

Now that we have successfully built a translator we’re done with development and are ready for deployment. To deploy your RDCE to the SSRS server follow these steps:

First, copy you RDCE assembly (.dll file, in my case SSRSMultilingualRDCE.dll) into the Reporting Services bin directory (which in my case was C:\Program Files\Microsoft SQL Server\MSRS11.MSSQLSERVER\Reporting Services\ReportServer\bin). If you have referenced any assemblies not in the Global Assembly Cache (maybe you implemented some logging framework) do not forget to copy them here as well.

Next we need to modify the SSRS configuration, starting with the rsreportserver.config file. It is located in C:\Program Files\Microsoft SQL Server\MSRS11.MSSQLSERVER\Reporting Services\ReportServer. Open it and lookup the <Service> element and add the following to that element (I added it to the bottom):





Next configuration change will be done in the rssrvpolicy.config file in the same directory as before. Open the file and add a CodeGroup in the PolicyLevel element for your RDCE assembly, like so:

You will need to specify the correct Name for the CodeGroup (just let it match the nam of your RDCE) and the correct location and name of the DLL you copied.

You may have to set up the permissions in the web.config file as Holger reports in his blog, although I have not had to do that.

To make your reports work with the RDCE there is a little thing we need to do; we need to link our reports with the RDCE deployed. The official way of doing this is using the Reporting Services Management Service. However, to use it you either need to have a deployment tool or be willing to write one. The unsupported way however is directly updating the report properties in the SSRS catalog. Please note that this is unsupported and I give no guarantees whatsoever regarding damage afflicted or the working of this on future versions. If you want to follow along take the following steps:

Open SQL Server Management Studio and execute the following query on your SSRS Service Database (not the TempDB) (you will have to adapt the where clause to match the name of your report):



This returns among others the ItemID, which you will need for the next statement. What we need to do is add <RDCE>Name_Of_Your_RDCE</RDCE> to the properties of the reports that need to use the RDCE.

Your will need to execute a query that looks like mine:

I have simply copied and pasted the ItemID in the where clause and also copied and pasted the value of property from the previous query into this query. I have appended the <RDCE> element just before the closing of the properties element. My report will now use the RDCE.


When rendering a report that uses a RDCE SSRS will display any errors thrown by your code. This helps in debugging. Also note that if you make any changes to the code you will have to recopy your assembly and restart the SSRS service for SSRS to pick up the new version.


Putting it all together

All that remains now is showing that this works. I have created a very simple report for this:

This report contains two textboxes with labels in them. These labels were defined in the database and the database contains a translation for it. As a result my RDCE will translate these labels, depending on the user’s preferred language.

So, when a user with preference for English logs in this will be displayed:

And when a user with preference setting for Dutch opens the report it displays as follows:

My solution works with a configuration database setting for the user’s preference. However, just as easily one could adapt the code to read the setting from the SharePoint profile (if using SSRS in SharePoint integrated mode) or for example from Active Directory.

That concludes my lengthy blog on RDCEs. You can download my solution on Github. It includes a database project that will create the simple database I used and enter some sample data. Also, it includes all code discussed and the sample report I used. The configuration changes are for security reasons not included.

I am looking for feedback! Please let me know what you think!

%d bloggers like this: