Tag Archives

2 Articles

Generating perfect test data using Visual Studio and custom Data Generators

In many development projects the need arises to fill a database with some random data. With random data I do not mean mere gibberish, but actual data such as an email address, phone number, SSN, numeric value, dates, datetimes, etc.

With Visual Studio 2010 you can do this very easily. All you need is an edition of Visual Studio 2010 that includes the database projects (I think you will need Ultimate, but am not sure). And of course you will need a database, I used SQL Express in this example.

However, turns out that the data generators included in Visual Studio are somewhat limited. Visual Studio can generate data for many datatypes in SQL Server, but for example generating a phone number, SSN or email address is not possible.

However, one can easily create custom data generators, which is exactly what we will do. I will build custom data generators for email addresses and one that generators Lorem Ipsum text.

Building the custom data generators
Start with creating a new class library project. In the solution explorer, right-click the project and click ‘Add Reference’. Click the ‘.NET’ tab and select the following components: Microsoft.Data.Schema, Microsoft.Data.Schema.SQL and Microsoft.Data.Schema.Tools. Click ‘OK’ to add these references.

Open your class (I renamed my class and class file to LoremIpsumGenerator) and add the following using statements:

Now we need to inherit the Generator class by changing our class definition to:

Also add the DatabaseSchemaProviderCompatibilityAttribute to your class declaration as so:

Now we need to add an output property. This will be used to wire up the output of your generator to Visual Studio Data Generation Plans.

Also, if you need to specify some parameters to your data generator we need to create some input properties. I have created three: one for the number of words to generate, one to indicate if the data needs to start with ‘Lorem Ipsum…’ every time and one to specify if just random words can be selected or if we are more interested in sentences. Also note the default settings at the bottom.

Then all we need to do is override the OnGenerateNextValues() method. Since I wanted to Unit test my generator I have just one line in this method, which class a public void method (which my testing command line project can call). In this function (called Generate) I built the output based on Lorem Ipsum and set my Output property.

I will not discuss how exactly I managed to get the Lorem Ipsum words returned. If you want to know you can download the source code. Also, my email address generator follows the same structure of output, inputs (in this case none) and a generate function that creates the email address string and sets the output property.

Registering the data generators
If your data generators build successfully, we need to sing the generators with a strong name. To do this, right-click your project’s name and choose properties. Go to the Signing tab and select the ‘Sign the assembly’ check box and under ‘Choose a strong name key file’ choose ‘New…’. Follow the wizard and enter a password. Your screen now looks like this:

Now save all, and rebuild your solution.

Next we will need to create an XML file to accompany your DLL. Just add a XML file to your project and name it [YourProjectsName].Extensions.xml. My project (and thus my dll) is named MyDataGeneratorsLibrary, so the filename of the XML is: MyDataGeneratorsLibrary.Extensions.xml

Open the XML and replace the contents with this (you will have to fill in your own values).

As you can see, we need an Extension element for each generator. In my library I have created two generators (one for Lorem Ipsum text, and one for email addresses). Also you will have to specify the correct type (which is your classname and namespace) and the assembly (which again is your DLL file / project name). Last item to enter is the public key token. To get it open your Visual Studio Command prompt and use

to get the public key token returned to you. You will need to enter the correct path and file name. Copy paste the public key token into the XML.

One last thing I did is right click the XML and chose Properties. I changed the ‘Build Action’ to ‘Content’ and the ‘Copy to Output Directory’ property to ‘Copy always’. It does not matter in any way but it makes the next step just a bit easier.

Now, build your solution and navigate to your solutions directory. In there go to the bin\Debug folder, and copy your DLL and XML file.

Now it is time to register the data generators with Visual Studio. Be sure your Visual Studio 2010 instance is closed and navigate to the following folder: C:\Program Files (x86)\Microsoft Visual Studio 10.0\VSTSDB\Extensions . It is good practice to keep your own extensions in a separate folder, so I created a folder ‘CustomGenerators’ and I advise you do the same. Next, paste your DLL and XML files into this folder.

Start Visual Studio.

Using the data generators to generate data
In Visual Studio, add a new project to your solution or create a new solution. Select SQL Server Database Project as the project type and enter a fancy name. I assume you already have built your database schema. Right click on your project name and choose ‘Import Objects and Settings’ and follow the wizard to get your database schema into your project. Next, add a new Data Generation Plan and open it. Select the table you want to use and specify the number of rows to insert.

You will see a list of the columns in the table. To the right (under ‘Generator’) you can specify which generator you want to use to generate data with. It should contain your custom generators now. Select it and click the preview button to test your generator: . Here is my screenshot:

If you have any input properties specified you can change the values by selecting the row and opening the properties window, see below screenshot for my Lorem Ipsum generator.

That concludes my introduction into custom data generators. Download the source code (with the Lorem Ipsum and email address generators) on Github.

SQL Server 2012 Unboxing

Just about every new consumer technology device will be greeted with “unboxing” videos on YouTube. A lot of the people I talk to really need to start unboxing SQL Server 2012 and start to understand what is in the box. Most of them already have access to SQL Server 2012 and still think it is just a database. There is so much more! This post is aimed to providing a quick overview of what exactly is in the box with pointers to where you can find documentation.

  1. Database Engine (SSDE)
    First off, let’s start with the product that gave SQL its name: the database. This is without doubt the best known product of the whole SQL suite and also the most used. More often than not this is also the only product people use and know. Find out more here: http://technet.microsoft.com/en-us/library/ms187875.aspx
  2. Data Quality Services (DQS / SSDQS)
    Introduced with SQL Server 2012, DQS is a knowledge-driven data quality solution that works on the premise of specifying what defines data quality in a knowledge base and using to cleanse data automatically during ETL (see SSIS below), Master Data Management (see MDS below) processes or manually.
    See: http://technet.microsoft.com/en-us/library/ff877925.aspx
  3. Analyis Services (SSAS)
    Analysis Services is SQL Server’s analytical database or cube. It features both more traditional cubes and tabular models, provides self-service analysis capabilities and includes data mining. See http://technet.microsoft.com/en-us/library/bb522607.aspx
  4. Integration Services (SSIS)
    Integration Services is a full-blown ETL tool and can be used for all sorts of data integration solution. SSIS features a drag and drop interface to build the solution and provides a lot of components out of the box with connectors to and from just about any database, file storage or file format. If need be, you can also use the power of .NET to build the exact behavior required. SSIS also integrates with DQS to use data quality knowledge bases during ETL processes. For more info visit: http://technet.microsoft.com/en-us/library/ms141026.aspx
  5. Master Data Services (MDS)
    Master Data Services enables users to build a Master Data Management solution on top of SQL Server. MDS integrates with DQS to make data quality aspects a part of the overall MDM solution. See http://technet.microsoft.com/en-us/library/ee633763.aspx
  6. Reporting Services (SSRS)
    Reporting Services is the enterprise reporting solution that delivers web-enabled reports that can get information from a variety sources and be rendered in various formats (including Excel, Word and PDF). Also, reports can be retrieved on demand, on subscription bases or based on a alert. Find out more here: http://technet.microsoft.com/en-us/library/ms159106.aspx
  7. StreamInsight
    StreamInsight is Microsoft’s Complex Event Processor (CEP). CEP technology enables high throughput and real-time (low latency) processing of streams of data (events). Examples include financial trading, Web analytics, sensor data, etc. StreamInsight is provides a familiar development platform based on .NET to quickly start using real-time information. See: http://technet.microsoft.com/en-us/library/ee391416.aspx

That concludes the quick unboxing of SQL Server 2012. Although there is a lot more to say (about features, but also around editions and capabilities) , this should give you a good idea of what is in the box. Bottom line: there is a lot more to SQL Server than just a database!

%d bloggers like this: