Power BI Refresh scenarios against data storage solutions explained

A recurring theme with customers and partners is Power BI data refresh. More specifically, there is some confusion about which refresh scenarios require a Pro license and which can be done with the Free version of Power BI. I made the diagram below to help explain this. It shows the refresh scenarios against data storage solutions such as SQL Azure, SQL Server in a virtual machine in Azure, or SQL Server on premises. I used these as examples; there are other options as well, and I expect the overall theme carries over to other data storage solutions. The diagram shows the refresh scenarios that can be done with a Power BI Free account as orange lines and the scenarios that need Power BI Pro as green lines.

As shown in the diagram, if you want to refresh against on-premises sources or a database running in a VM in Azure, you will need a gateway and Power BI Pro. This applies not only to the creator of the report and the refresh schedule, but also to every consumer. If you use PaaS solutions for data storage in Azure, such as SQL Azure, it becomes a bit more nuanced and really depends on the type of refresh required. If you need a refresh cycle of more than once a day (up to eight times per 24 hours, or live), you will need Power BI Pro. If refreshing a PaaS source such as SQL Azure once a day is enough, you can do that with Power BI Free. Again, the license requirement carries over from author to viewer: if the author of the report requires Pro, then the viewers also need Pro.

Power BI Refresh scenarios against data storage solutions

Hope this helps. If you have any questions or feedback, please comment below!

Roles in a data driven organization

All the talk about Big Data and Advanced Analytics is well and good; in fact, it is what I spend most of my time on. The technology is there and has great potential. The big question now is how to use these technologies to their full extent and maximize their benefits for your organization. The answer lies in becoming a data driven organization.

A data driven organization is an organization that breathes data: not only in the sense of producing data, but also in the sense of analyzing, consuming and really understanding data, both its own and the data others can provide. To become a data driven organization, you will need to change People, Process and Technology. There is enough talk about the Technology in the market already (and on this blog), so I will come back to that later and not go into much detail now. Let’s look at the other two: People and Process. I view Process as very much related to People: bringing in new skills without the proper Process in place for how the new People will work and collaborate will not be very useful.

So, what People do you need? In other words: what roles do you need in a data driven organization? I see four required roles in any organization that wants to be more data driven. This is not to say that these four roles should be four different people; it is very well possible that someone takes on more than one role. I am, however, confident that very few people exist who are able to fill all four roles, since each requires specific skills, focus and passion.

The four roles are: Wrangler, Scientist, Artist and Communicator. Let’s look at the four roles in more detail.

Wrangler

The Wrangler, or data wrangler as others call this role, is responsible for identifying, qualifying and providing access to data sets. In this sense the data is the wild horse that the Wrangler tames. This role is needed so the Scientist can work with qualified, trustworthy and managed data sources. In many situations this looks a lot like the data management roles already present in organizations, and the role lives mostly in IT. Keywords here are databases, connection strings, Hadoop, protocols, file formats, data quality, master data management and data classification.

Scientist

More popularly called the Data Scientist; a lot of people seem to believe that as long as you hire a Data Scientist you are a data driven company. This is much the same as saying that if you have Hadoop you ‘do Big Data’. It is about as smart as saying that if you have your driver’s license you make an excellent Formula 1 driver. It is just not true, sorry. Note that the opposite applies as well: a great Formula 1 driver could be a very bad driver on open roads. Running Hadoop does not mean you ‘do Big Data’, and hiring a Data Scientist does not mean you are a data driven company.

A Scientist is someone who applies maths, a lot of maths, to convert data into information. He or she applies statistical models and techniques like deep learning, data mining and machine learning to make this happen. Scientists are the rock stars of this data-focused world, since they are the ones actually making the magic happen. However, they cannot do it alone: they need good quality, trustworthy data, which is what the Wrangler supplies. Also, Scientists tend to be ill-understood by the rest of the organization. Try this experiment: have your (Data) Scientist stick around the water cooler for 15 minutes every day and let him or her talk to people (I know, for some this is hard already). Then check how quickly the person the Scientist is talking to disengages. My experience is that someone who is not a fellow Scientist or Communicator will not make it to 15 minutes. Just try it, you will see what I mean.

Keywords here are data mining, R, Python, machine learning, statistics, algorithms.

Artist

The Artist converts the information the Scientist brewed up into insight that consumers can understand and use. This role focuses on aesthetics and on finding the data visualization that brings the message across in the best possible way. While the Wrangler is a very IT focused role and the Scientist is very mathematical, the Artist often comes from the creative arts world. The Artist just loves making things understandable and making the world a better place by creating beautiful things, such as great looking reports and dashboards. Artists often employ storytelling and other powerful visual methods such as infographics to convey their message to the consumers.

Keywords: data visualization, dashboard design, signaling colors, storytelling.

Communicator

The last role in a data driven organization is a chameleon. If you look at the types of people in the Wrangler, Scientist and Artist roles, it is clear that these are very different people, with different backgrounds and different passions. Just as some of them find it hard to talk to the rest of the organization, they can find it difficult to talk among themselves and work together. To make sure there is no communication breakdown, many organizations invest in a Communicator: someone who understands the passions of the people in the other roles well enough to level with them, understand their needs and explain the needs of others to them. Subtypes of the Communicator are the Wrangler-Scientist communicator and the Scientist-Artist communicator.

This concludes the roles I see in a data driven organization; of course these roles will need to be supported with the right Processes and Technology. Having a Technology platform instead of disparate tools will help you achieve this and get the most out of the investments you are making in these roles.

Azure Machine Learning pricing explained

Many customers have asked me questions about Azure Machine Learning (Microsoft’s fully managed machine learning and data mining solution) and more specifically about its pricing. In this post I will try to explain how the pricing works and which components you need to be aware of.

Azure Machine Learning is offered in two tiers: Free and Standard. The Free tier is, obviously, free. It is, however, limited compared to Standard, as you would expect. The differences are mostly in performance (multiple nodes for execution in Standard versus just one node in Free) and storage (10 GB in Free, unlimited in Standard). There is no SLA for the Free tier, you cannot set up a production web API to automate experiments, and the staging web API is throttled.

For the standard tier, the following items need to be taken into consideration:

  • Seat: Azure ML has a monthly fee per seat, which translates to a user (mostly your data scientist) using the Azure ML web interface to develop and tune experiments. This price is per seat per month per subscription.
  • Studio usage: this is an hourly price for usage of the Azure ML web interface, so you will need to estimate the number of hours the users will be logged in. If they work 40 hours per week on the project, you would need to account for 40 hours of studio usage time per week per data scientist.
  • API usage: Azure ML allows you to bring an experiment online as a RESTful web service. This means you can automate scoring and training and have applications, websites and so on use the experiment without human interference. With this you could do automated credit scoring, recommendations or churn prediction directly from your app or website. To make this work you will need to create a web service (also called an API) in Azure ML. Azure ML charges per hour of compute for a production API, so that is the fee you pay for every hour the web service is online and usable. On top of that, you pay per 1,000 transactions, where a transaction is an interaction with the API, such as one recommendation, one churn prediction or one credit score. A rough cost sketch follows after this list.
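
To make the arithmetic concrete, here is a minimal PowerShell sketch that adds up the cost components for a hypothetical team. Every rate and usage figure below is a made-up placeholder, not an actual Azure ML price; substitute the numbers from the official pricing page.

```powershell
# Rough monthly cost estimate for the Azure ML Standard tier.
# All rates below are PLACEHOLDERS - take the real prices from the official pricing page.
$seatPricePerMonth  = 9.99     # hypothetical: monthly fee per seat
$studioPricePerHour = 1.00     # hypothetical: studio usage per hour
$apiComputePerHour  = 2.00     # hypothetical: production API compute per hour
$pricePer1000Tx     = 0.50     # hypothetical: per 1,000 API transactions

$dataScientists  = 2           # number of seats
$studioHoursEach = 160         # ~40 hours/week * 4 weeks per data scientist
$apiHoursOnline  = 720         # API online 24x7 for a 30-day month
$transactions    = 500000      # expected monthly API calls

$total = ($dataScientists * $seatPricePerMonth) +
         ($dataScientists * $studioHoursEach * $studioPricePerHour) +
         ($apiHoursOnline * $apiComputePerHour) +
         ([math]::Ceiling($transactions / 1000) * $pricePer1000Tx)

"Estimated monthly cost: {0:N2}" -f $total
```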


Hope this clarifies things a bit. Please refer to the official page linked above for more details on the pricing.


SQL Server on Linux? Yes indeed!

If someone had told me a few years ago that Microsoft would go multi-platform and that SQL Server would even come out on Linux, I would have declared them crazy. But since the arrival of Satya Nadella as CEO and people like Scott Guthrie (CVP for Cloud & Enterprise), a fresh wind is blowing through Microsoft!

As a Microsoft employee I naturally felt this coming, but it is always great to see the news land in the outside world: yesterday Microsoft announced the Linux version of SQL Server in preview. For now this covers only the RDBMS and not the other SQL tools such as SSRS and SSAS, but it is a great start to be able to run the (according to Gartner) leading database on Linux as well!

I would love it if, after testing this version, you would comment on this blog with your first impression of the performance of this preview release.

Scott’s blog post about the announcement: http://blogs.microsoft.com/blog/2016/03/07/announcing-sql-server-on-linux/

SQL greetings,

Harry

Inception style insight: analyzing Gartner Magic Quadrants for BI, Analytics and Databases using Power BI

Disclaimer: yes, I know the Gartner Magic Quadrants normally have to be bought. The information in this post, however, is based on web searches much like this one: http://www.bing.com/images/search?q=gartner+magic+quadrant+bi&FORM=AWIR which happen to return the Gartner Magic Quadrants…

A while ago I analyzed the BI market in the same way, although back then I did that in Excel. Given the recent movement in the Magic Quadrants for databases, analytics and BI, I figured I needed to do a re-run. This also enables me to show off the new Publish to Web feature of Power BI. You can see the result below; I found images of five Gartner Magic Quadrants covering BI, analytics and databases.

Then I followed the same process as in the previous blog post on this subject (that means a lot of manual labor :)) and made a Power BI report on top of it. I also added the logos of the companies involved. The report has two tabs: the first focuses on one Magic Quadrant at a time, while the second allows you to see the position of a certain vendor in all five MQs I analyzed. Both tabs allow you to go back in time and see how the quadrants developed.

Enjoy this bit of Inception-style Insight :). If you have ideas on how this could be improved, let me know! Have a look below or open the report full screen.


Automatically building a Microsoft BI machine using PowerShell – Final post (post #14)

This post is #14 in the series to automatically build a Microsoft BI machine using PowerShell – see the start of series.

In this series:

Start of series – introduction and layout of subjects
Post #2 – Preparation: install files using Azure disk
Post #3 – Preparation: install files using Azure File Service
Post #4 – Preparation: logging infrastructure
Post #5 – Master script
Post #6 – Disabling Internet Explorer Enhanced Security Configuration
Post #7 – Active Directory setup
Post #8 – Configuring Password policy
Post #9 – Installing System Center Endpoint Protection
Post #10 – Installing SQL Server
Post #11 – Installing SharePoint Server
Post #12 – Installing PowerPivot for SharePoint
Post #13 – Configuring PowerPivot for SharePoint

Wow. This has been a long and wild ride. But you and I made it together. We now have the full recipe to automatically configure a Microsoft BI demo machine with PowerShell. Of course there is more to be done, such as configuring other Service Accounts and deploying demo content; this script however saves me a lot of time every time I need to stand up a new demo machine.

You can download the script here. Please note (again) that the code is provided as-is and you should use it at your own risk. It is probably still buggy but should give you a good starting point to adapt it to your needs.

I enjoyed the ride with you; I hope I made your life a bit easier over the course of this series. Enjoy!

Automatically building a Microsoft BI machine using PowerShell – Configuring PowerPivot (post #13)

This post is #13 in the series to automatically build a Microsoft BI machine using PowerShell – see the start of series.

In this series so far:

Start of series – introduction and layout of subjects
Post #2 – Preparation: install files using Azure disk
Post #3 – Preparation: install files using Azure File Service
Post #4 – Preparation: logging infrastructure
Post #5 – Master script
Post #6 – Disabling Internet Explorer Enhanced Security Configuration
Post #7 – Active Directory setup
Post #8 – Configuring Password policy
Post #9 – Installing System Center Endpoint Protection
Post #10 – Installing SQL Server
Post #11 – Installing SharePoint Server
Post #12 – Installing PowerPivot for SharePoint

Now that PowerPivot for SharePoint has been installed, we need to configure it. I split the configuration into two parts since we need a reboot in between. I used MSDN for reference: http://msdn.microsoft.com/en-us/library/hh230903.aspx.

Step A: configuring SharePoint and deploying PowerPivot features

In Post #11 we talked about installing SharePoint, but the actual SharePoint provisioning was not done then. We will do both here in one go: provisioning the farm and deploying the PowerPivot features.
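
In outline it looks like the condensed sketch below, using the standard SharePoint 2013 cmdlets. The database names, passphrase, farm account and the path to the PowerPivot farm solution are example values for a demo machine, not the exact ones from my script.

```powershell
# Sketch: provision the farm, then deploy the PowerPivot farm solution.
# All names, passwords and paths are examples - adjust to your own setup.
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Create the configuration database and Central Administration.
$passphrase = ConvertTo-SecureString "pass@word1" -AsPlainText -Force
New-SPConfigurationDatabase -DatabaseName "SP_Config" -DatabaseServer "localhost" `
    -AdministrationContentDatabaseName "SP_AdminContent" `
    -Passphrase $passphrase -FarmCredentials (Get-Credential "DEMO\sp_farm")
Install-SPHelpCollection -All
Initialize-SPResourceSecurity
Install-SPService
Install-SPFeature -AllExistingFeatures
New-SPCentralAdministration -Port 2013 -WindowsAuthProvider NTLM
Install-SPApplicationContent

# Deploy the PowerPivot farm solution that ships with the add-in (example path).
$wsp = "C:\Program Files\Microsoft SQL Server\120\Tools\PowerPivotTools\SPAddinConfiguration\Resources\PowerPivotFarmSolution.wsp"
$solution = Add-SPSolution -LiteralPath $wsp
Install-SPSolution -Identity $solution -GACDeployment -Force
```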


Step B: updating farm credentials and starting service applications

After the PowerPivot features have been deployed we need to configure Service Applications to get PowerPivot to work.
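
One way this part can look is sketched below, assuming the PowerPivot cmdlets that come with the add-in are available; the account and database names are examples.

```powershell
# Sketch: register the service account, start the service instances and
# create the PowerPivot service application (example names throughout).
Add-PSSnapin Microsoft.SharePoint.PowerShell -ErrorAction SilentlyContinue

# Register the PowerPivot service account as a managed account in the farm.
New-SPManagedAccount -Credential (Get-Credential "DEMO\sp_powerpivot")

# Start the Analysis Services and PowerPivot system service instances.
Get-SPServiceInstance |
    Where-Object { $_.TypeName -like "*Analysis Services*" -or
                   $_.TypeName -like "*PowerPivot*" } |
    Start-SPServiceInstance

# Create the PowerPivot service application in the default proxy group.
New-PowerPivotServiceApplication -ServiceApplicationName "PowerPivot Service" `
    -DatabaseServerName "localhost" -DatabaseName "PowerPivot_ServiceApp" `
    -AddToDefaultProxyGroup
```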


Now we have seen all the steps required to build a Microsoft BI demo machine! The next post will serve as a wrap up and present a download for the full script.

Automatically building a Microsoft BI machine using PowerShell – Installing PowerPivot for SharePoint (post #12)

This post is #12 in the series to automatically build a Microsoft BI machine using PowerShell – see the start of series.

In this series so far:

Start of series – introduction and layout of subjects
Post #2 – Preparation: install files using Azure disk
Post #3 – Preparation: install files using Azure File Service
Post #4 – Preparation: logging infrastructure
Post #5 – Master script
Post #6 – Disabling Internet Explorer Enhanced Security Configuration
Post #7 – Active Directory setup
Post #8 – Configuring Password policy
Post #9 – Installing System Center Endpoint Protection
Post #10 – Installing SQL Server
Post #11 – Installing SharePoint Server

Ok, now that both SQL Server and SharePoint Server are installed, we just need to set up PowerPivot for SharePoint and configure it. Easy huh? Well, it turns out it is pretty difficult to get it right. Installation is not difficult (this post) but the configuration is harder (the next post). Here is how to install PowerPivot. I used MSDN for the info: http://msdn.microsoft.com/en-us/library/ee210645.aspx.

Installing PowerPivot involves mounting the SQL Server Installation Media and calling the setup with the right parameters.
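
In outline it looks like the sketch below; the ISO path, instance name, accounts and password are examples. /ASSERVERMODE=POWERPIVOT is the setup switch that installs Analysis Services in SharePoint (PowerPivot) mode.

```powershell
# Sketch: mount the SQL Server media and install Analysis Services in
# PowerPivot mode (example ISO path, accounts and password).
$iso   = "F:\Install\SQLServer2014-x64-ENU.iso"          # example path
$mount = Mount-DiskImage -ImagePath $iso -PassThru
$drive = ($mount | Get-Volume).DriveLetter

& "$($drive):\setup.exe" /Q /ACTION=Install /IACCEPTSQLSERVERLICENSETERMS `
    /FEATURES=AS /ASSERVERMODE=POWERPIVOT /INSTANCENAME=POWERPIVOT `
    /ASSVCACCOUNT="DEMO\sql_as" /ASSVCPASSWORD="pass@word1" `
    /ASSYSADMINACCOUNTS="DEMO\Administrator"

Dismount-DiskImage -ImagePath $iso
```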

Next time: configuring PowerPivot.

Automatically building a Microsoft BI machine using PowerShell – Installing SharePoint (post #11)

This post is #11 in the series to automatically build a Microsoft BI machine using PowerShell – see the start of series.

In this series so far:

Start of series – introduction and layout of subjects
Post #2 – Preparation: install files using Azure disk
Post #3 – Preparation: install files using Azure File Service
Post #4 – Preparation: logging infrastructure
Post #5 – Master script
Post #6 – Disabling Internet Explorer Enhanced Security Configuration
Post #7 – Active Directory setup
Post #8 – Configuring Password policy
Post #9 – Installing System Center Endpoint Protection
Post #10 – Installing SQL Server

Wow, so the last post was pretty intense, wasn’t it? I think we are ready for the next one: installing SharePoint. To build this script I used the following sources: http://social.technet.microsoft.com/wiki/contents/articles/14582.sharepoint-2013-install-prerequisites-offline-or-manually-on-windows-server-2012-a-comprehensive-guide.aspx#Installing_the_Roles_and_Features_for_SharePoint_2013_on_Windows_Server_2012_Offline_with_PowerShell and http://blogs.msdn.com/b/uksharepoint/archive/2013/03/18/scripted-installation-of-sharepoint-2013-and-office-web-apps-server-from-the-field-part-2.aspx.

Since this is again quite a lengthy script, we will split it up into steps.

Step A: enabling IIS and other features

This step enables a whole load of features on Windows that are required by SharePoint, including IIS. If a restart is required, the script will reboot after the setup of the features. Sometimes your machine might reboot more than once to complete the setup of all these features.
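
A condensed sketch of this step is below; the feature list is a subset of the full list from the TechNet article, and the -Source path for offline installs is an example.

```powershell
# Sketch: enable the Windows features SharePoint needs (subset shown).
Import-Module ServerManager

$features = @(
    "Net-Framework-Features", "Windows-Identity-Foundation",
    "Web-Server", "Web-Mgmt-Tools", "Web-Windows-Auth", "Web-Basic-Auth",
    "Web-Metabase", "Web-Net-Ext45", "Web-Asp-Net45", "Web-ISAPI-Ext",
    "Web-ISAPI-Filter", "NET-HTTP-Activation", "NET-Non-HTTP-Activ",
    "NET-WCF-HTTP-Activation45", "WAS"
)

# -Source points at the Windows installation media for offline installs (example path).
$result = Install-WindowsFeature -Name $features -IncludeManagementTools `
    -Source "D:\sources\sxs"

# Reboot if any feature requires it; the master script resumes afterwards.
if ($result.RestartNeeded -eq "Yes") { Restart-Computer -Force }
```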


Step B: Installing SharePoint Prerequisites

SharePoint itself has a number of prerequisites; in this step we will install them all.
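
In outline, the step runs prerequisiteinstaller.exe from the SharePoint media against pre-downloaded installer files; the ISO path, folder and file names below are examples.

```powershell
# Sketch: run the SharePoint 2013 prerequisite installer unattended.
$iso     = "F:\Install\SharePointServer2013-x64.iso"      # example path
$drive   = (Mount-DiskImage -ImagePath $iso -PassThru | Get-Volume).DriveLetter
$prereqs = "C:\Install\SP2013Prereqs"                     # pre-downloaded installers

& "$($drive):\prerequisiteinstaller.exe" /unattended `
    /SQLNCli:"$prereqs\sqlncli.msi" `
    /IDFX11:"$prereqs\MicrosoftIdentityExtensions-64.msi" `
    /Sync:"$prereqs\Synchronization.msi" `
    /AppFabric:"$prereqs\WindowsServerAppFabricSetup_x64.msi" `
    /KB2671763:"$prereqs\AppFabric1.1-RTM-KB2671763-x64-ENU.exe" `
    /MSIPCClient:"$prereqs\setup_msipc_x64.msi" `
    /WCFDataServices:"$prereqs\WcfDataServices.exe"

# Exit code 3010 means the installer wants a reboot before continuing.
if ($LASTEXITCODE -eq 3010) { Restart-Computer -Force }
```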


Step C: Installing SharePoint

SharePoint is installed in this step from the ISO that the script mounts.
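
A minimal sketch, assuming an unattended-install config file has been prepared (both paths are examples):

```powershell
# Sketch: mount the SharePoint ISO and run an unattended install driven by
# a config.xml holding the product key and silent-install settings.
$iso   = "F:\Install\SharePointServer2013-x64.iso"        # example path
$drive = (Mount-DiskImage -ImagePath $iso -PassThru | Get-Volume).DriveLetter

& "$($drive):\setup.exe" /config "C:\Install\SPConfig.xml"
```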


Step D: Cleaning up

This step simply unmounts the SharePoint installation media.
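
Which is a one-liner along these lines (same example path as above):

```powershell
# Unmount the SharePoint installation media.
Dismount-DiskImage -ImagePath "F:\Install\SharePointServer2013-x64.iso"   # example path
```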


Pff, are we done yet? No! Next up: Installing PowerPivot for SharePoint.

Automatically building a Microsoft BI machine using PowerShell – Installing SQL Server (post #10)

This post is #10 in the series to automatically build a Microsoft BI machine using PowerShell – see the start of series.

In this series so far:

Start of series – introduction and layout of subjects
Post #2 – Preparation: install files using Azure disk
Post #3 – Preparation: install files using Azure File Service
Post #4 – Preparation: logging infrastructure
Post #5 – Master script
Post #6 – Disabling Internet Explorer Enhanced Security Configuration
Post #7 – Active Directory setup
Post #8 – Configuring Password policy
Post #9 – Installing System Center Endpoint Protection

In this tenth post we get to the heart of it: installing SQL Server. After this script completes we will have SQL Server Agent, the Database Engine, Analysis Services (multidimensional and tabular), Integration Services, Data Quality Services, Master Data Services, Full-Text Search, FILESTREAM, the development and management tools, and Reporting Services (both native and SharePoint integrated mode) installed. This script is lengthier than the earlier ones simply because there is a lot more to do. Info I used to create this script: http://msdn.microsoft.com/en-us/library/ms144259.aspx. Here we go.

Step A: creating new service accounts

In this step we first remove any service account that starts with ‘SQL Server’ and then create new service accounts using the configured password.
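
A sketch of that step using the ActiveDirectory module; the account names and the password are examples.

```powershell
# Sketch: recreate the SQL Server service accounts in Active Directory.
Import-Module ActiveDirectory

# Remove any previous accounts whose name starts with 'SQL Server'.
Get-ADUser -Filter "Name -like 'SQL Server*'" | Remove-ADUser -Confirm:$false

# Create fresh service accounts with the configured password (example only).
$password = ConvertTo-SecureString "pass@word1" -AsPlainText -Force
"SQL Server DB Engine", "SQL Server Agent", "SQL Server SSAS", "SQL Server SSRS" |
    ForEach-Object {
        New-ADUser -Name $_ -SamAccountName ($_ -replace " ", "") `
            -AccountPassword $password -PasswordNeverExpires $true -Enabled $true
    }
```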


Step B: making sure required features are installed

In this step we make sure the .NET 3.5 feature is enabled in Windows.
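
For example (the -Source path points at the Windows installation media and is an example):

```powershell
# Enable .NET Framework 3.5, which SQL Server setup requires.
Import-Module ServerManager
Install-WindowsFeature -Name Net-Framework-Core -Source "D:\sources\sxs"   # example path
```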


Step C: Mounting the ISO and set up the parameters

We can now mount the SQL Server installation ISO and set up the parameters for setup to run with. We will do two passes, since we cannot install both SSRS native and SharePoint integrated mode, or both SSAS multidimensional and tabular mode, in one go.
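
In outline (example ISO path and accounts; the service-account switches such as /SQLSVCACCOUNT and /ASSVCACCOUNT would be added to these lists in the same way):

```powershell
# Sketch: mount the media and prepare one argument list per pass.
$iso   = "F:\Install\SQLServer2014-x64-ENU.iso"           # example path
$drive = (Mount-DiskImage -ImagePath $iso -PassThru | Get-Volume).DriveLetter

# Pass 1: default instance with SSAS multidimensional and SSRS native mode.
$pass1Args = @(
    "/Q", "/ACTION=Install", "/IACCEPTSQLSERVERLICENSETERMS",
    "/INSTANCENAME=MSSQLSERVER",
    "/FEATURES=SQLENGINE,FULLTEXT,DQ,AS,RS,IS,MDS,DQC,CONN,SSMS,ADV_SSMS",
    "/ASSERVERMODE=MULTIDIMENSIONAL",
    "/SQLSYSADMINACCOUNTS=DEMO\Administrator",
    "/ASSYSADMINACCOUNTS=DEMO\Administrator"
)

# Pass 2: a named instance with SSAS tabular plus SSRS in SharePoint mode.
$pass2Args = @(
    "/Q", "/ACTION=Install", "/IACCEPTSQLSERVERLICENSETERMS",
    "/INSTANCENAME=TABULAR",
    "/FEATURES=AS,RS_SHP,RS_SHPWFE",
    "/ASSERVERMODE=TABULAR",
    "/ASSYSADMINACCOUNTS=DEMO\Administrator"
)
```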


Step D: do the actual installations

Now we execute SQL Server setup with the right argument list. This configures instance names, service accounts and passwords and the features to install. The install will be silent.
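
Running both passes then looks like this; Start-Process -Wait makes sure pass 2 starts only after pass 1 has finished.

```powershell
# Execute SQL Server setup once per argument list, silently.
$setup = "$($drive):\setup.exe"
Start-Process -FilePath $setup -ArgumentList $pass1Args -Wait -NoNewWindow
Start-Process -FilePath $setup -ArgumentList $pass2Args -Wait -NoNewWindow
```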


Step E: wrapping up

In this step we unmount the SQL Server installation media and write to the log.
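
A sketch, assuming the logging helper from post #4 is a function I will call WriteLog here (that name is my placeholder):

```powershell
# Unmount the SQL Server media and record completion in the log.
Dismount-DiskImage -ImagePath $iso
WriteLog "SQL Server installation completed."   # WriteLog: placeholder name for the post #4 logging helper
```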


Next step: installing SharePoint
