Devin Knight, a co-worker of mine at Pragmatic Works, presented last night at our local Jacksonville SQL Server User Group (JSSUG) meeting.
Power Query (formerly known as Data Explorer) is a free add-in for Excel 2010 and up that provides users an easy way to discover, combine, and refine data, all within the familiar Excel interface. With Power Query you can now combine and transform data from a wide variety of data sources inside Excel, a task that would previously have required a complex ETL job.
This was a well-organized presentation with a good introduction using PowerPoint to describe how Power Query fits into the landscape of self-service business intelligence (data extraction, modeling, and presentation). The demos were useful in understanding a good range of options for handling common scenarios.
How comfortable are you with your current process for designing data warehouses? Have you encountered difficulties determining how to better gather, organize, and use requirements to ensure a successful data warehouse design? This session will help by providing an agile data warehouse design process that gets targeted and meaningful requirements. The session will start by reviewing common/current data warehouse design elements and architectures. Next, the session will detail primary agile techniques and terminology. Lastly, the session will detail the use of Modelstorming using BEAM* (Business Event Analysis & Modeling) and the 7 Ws Framework.
I enjoyed presenting this to some really great folks in Indianapolis on Saturday, August 10th. I wish all attendees the best of luck in tackling their future data warehouse and business intelligence projects!
NOTE: A PDF version of the PowerPoint slides is available from the SQLSaturday schedule page at www.sqlsaturday.com/schedule.aspx?eventid=334
New to SQL Server Integration Services 2012 is deployment as a project to the Integration Services catalog (physically storing packages and metadata in the SSISDB database on an instance of SQL Server). This, along with an enhancement for managing connection managers, permits much easier configuration than previous methods; in one case in particular, data sources that require passwords as part of the connection, it gives us new-found ease and power.
SSIS packages deployed and run on a hosted server in a different Windows domain than the primary data source are problematic because domain trust affects the Windows authentication normally used by SSIS connection managers.
Others have solved this scenario by configuring one-way domain trust and continuing to use Windows authentication within their SSIS connection managers. While this method is impressive, it is often not permitted when hosting via third parties, or simply because its security implications cause network administrators to hesitate. This post will demonstrate how we can leverage SQL authentication and new SSIS 2012 features to successfully handle connectivity scenarios where Windows authentication is not available.
- Create a SQL login account on the source and destination SQL Server (using the same password)
- Assign only “public” server role to the new SQL login account
- Assign (minimally) db_datareader database role on source SQL Server databases
- Assign (minimally) db_datawriter database role on target SQL Server databases
Permissions assigned to the databases will vary depending on what the SSIS packages are doing. A common truncate-and-load pattern, for example, also requires the account to have the db_ddladmin role on the database in order to execute the TRUNCATE TABLE statement.
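The account setup above can be sketched in T-SQL. This is a minimal sketch; the login name, password, and database names are placeholders, and the login creation must be run on both servers with the same password:

```sql
-- Run on BOTH the source and destination instances (same password on each).
USE [master];
CREATE LOGIN [SSISLoadAccount] WITH PASSWORD = N'StrongP@ssword1';
GO

-- On the SOURCE server: read-only access to the source database.
USE [SourceDb];
CREATE USER [SSISLoadAccount] FOR LOGIN [SSISLoadAccount];
ALTER ROLE [db_datareader] ADD MEMBER [SSISLoadAccount];
GO

-- On the DESTINATION server: write access, plus db_ddladmin
-- if the packages issue TRUNCATE TABLE.
USE [TargetDb];
CREATE USER [SSISLoadAccount] FOR LOGIN [SSISLoadAccount];
ALTER ROLE [db_datawriter] ADD MEMBER [SSISLoadAccount];
ALTER ROLE [db_ddladmin] ADD MEMBER [SSISLoadAccount];
GO
```

No server-level role beyond the default public membership is granted, matching the minimal-permission approach described above.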
- Implement all (related) connection managers in an SSIS 2012 solution as project type
- Ensure the SSIS 2012 project is configured to use the project deployment model (this approach is not supported under the legacy package deployment model)
- Ensure the SSIS 2012 project security ProtectionLevel is set to DontSaveSensitive (desired, not required)
- Ensure the SSIS 2012 package security ProtectionLevel is set to DontSaveSensitive (desired, not required)
With all the prerequisite configuration completed, let's see how this works when deploying to the Integration Services catalog. Please note that no parameters (new to SSIS 2012), environments (also new to SSIS 2012), or any other special configuration is required. Under the covers, the SSIS 2012 project deployment model treats connection manager properties, parameters, and the logging level as parameters and stores their information in the SSISDB parameter tables (internal.object_parameters, among others). As a result, the SSIS catalog automatically saves our sensitive password encrypted within the catalog, enabling simple deployment and use.
The SSIS project is ready for deployment to the Integration Services catalog. Once deployed we can configure the project by opening the Integration Services Catalogs folder on our SQL Server instance (located below Databases, Security, and all other database objects and above SQL Server Agent) right-clicking on the project and selecting “Configure…”.
Once the Configure dialog is displayed, three things need to be setup:
- Scope of the configuration change (All packages and project, Entry-point packages and project, deployed project, specific package)
- Select the Connection Managers tab and the desired Connection Manager
- Enter a Password by clicking the ellipsis button beside the Password property
Once the passwords for the deployed SSIS project's connection managers have been entered via the Integration Services catalog configuration, the project is ready to be run. An example of running it using T-SQL follows:
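A minimal sketch of running the deployed package from T-SQL via the catalog stored procedures (the folder, project, and package names are placeholders; you can generate an equivalent script by right-clicking the package in the catalog, choosing Execute..., and scripting the action):

```sql
DECLARE @execution_id BIGINT;

-- Create an execution instance for the deployed package.
EXEC [SSISDB].[catalog].[create_execution]
    @folder_name     = N'MyFolder',
    @project_name    = N'MyProject',
    @package_name    = N'LoadStaging.dtsx',
    @use32bitruntime = False,
    @reference_id    = NULL,
    @execution_id    = @execution_id OUTPUT;

-- Optionally run synchronously so failures surface to the caller.
EXEC [SSISDB].[catalog].[set_execution_parameter_value]
    @execution_id,
    @object_type     = 50,
    @parameter_name  = N'SYNCHRONIZED',
    @parameter_value = 1;

EXEC [SSISDB].[catalog].[start_execution] @execution_id;
```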
We can see what the configuration did under the covers by looking at some of SSISDB database tables. Most notably we can view [SSISDB].[internal].[object_parameters] to determine what this configuration stored and how it stored the sensitive password value that was set.
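A query along these lines surfaces the stored connection manager entries (the project name is a placeholder, and the join to internal.projects is one way to scope the results to a single project):

```sql
-- Inspect how the catalog stored the connection manager properties.
SELECT  p.parameter_name,
        p.object_type,
        p.sensitive,
        p.design_default_value,
        p.sensitive_default_value,
        p.value_set
FROM    [SSISDB].[internal].[object_parameters] AS p
JOIN    [SSISDB].[internal].[projects]          AS pr
        ON pr.project_id = p.project_id
WHERE   pr.name = N'MyProject'
  AND   p.parameter_name LIKE N'CM.%';  -- connection manager entries
```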
Immediately we can note that the object_type column value for connection managers is 20, which differs from the normal SSIS parameters that are also stored in this table. The parameter_name describes the SSIS object and the corresponding property in which the configured value is stored. For our connection manager, parameter_name takes the format "CM.{NameOfConnectionManager}.{ConnectionManagerProperty}".
Because we have a password stored, note that the sensitive column is flagged with a 1, design_default_value is NULL (removed at deployment since we set the protection level to DontSaveSensitive), sensitive_default_value is set to an encrypted (unreadable) value, and lastly value_set is 1 (meaning a default value has been configured).
Now we can easily manage these connections by rerunning the configuration or using the [catalog].[set_object_parameter_value] stored procedure located in the SSISDB database.
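For example, a password can be rotated without redeploying by calling the stored procedure directly (folder, project, connection manager, and password values are placeholders; object_type 20 indicates a project-level object):

```sql
-- Update a connection manager password stored in the catalog.
EXEC [SSISDB].[catalog].[set_object_parameter_value]
    @object_type     = 20,
    @folder_name     = N'MyFolder',
    @project_name    = N'MyProject',
    @parameter_name  = N'CM.SourceDb.Password',
    @parameter_value = N'NewStrongP@ssword2',
    @value_type      = N'V';  -- 'V' = literal value (vs. 'R' = referenced)
```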
This blog demonstrated how SSIS 2012 can manage sensitive property values such as connection manager passwords in a manner that is easy to understand, maintain, and use while keeping them encrypted and safe. No special effort other than simple SSIS project deployment and subsequent project configuration from within the Integration Services catalog is needed to ensure this works successfully.
When attempting to reuse the name of a previously deleted Team Project, you will likely get the following error:
“TF218017: A SharePoint site could not be created for use as the team project portal. The following error occurred: TF250038: You specified the following URL: http://ServerName/sites/ProjectCollectionName/NewTeamProjectName. However, a site already exists at that URL. Specify another site.”
The reason for this error is that the Team Project site in SharePoint must be deleted and it must also be removed from the SharePoint Project Collection Site Collection recycle bin.
NOTE: The user trying to perform these steps must be defined as one of the two site collection administrators within SharePoint.
The first step is to delete the Team Project site. In SharePoint this is done by accessing the site and selecting Site Settings. On the “Site Settings” page under the “Site Actions” section choose the option to “Delete this site”.
This step successfully deletes the site, but the site is held in the site collection recycle bin. Next we need to permanently remove the deleted site from the site collection (project collection site) recycle bin. Start by accessing the site collection in which the team project site was contained (the site for the TFS Project Collection). Then access the “Site Settings” and the “Recycle Bin”. Within this recycle bin should be the site that was deleted in the previous step. Select the previously deleted site and choose the “Delete Selection” option.
Now that the previous Team Project site has been permanently deleted from SharePoint, we can proceed to create a new TFS Team Project with the same name as the one just deleted.
Just wanted to share some interesting details regarding configuring Analysis Services (SSAS) security via roles on a local computer. It turns out that the Change User functionality found within both SQL Server Management Studio (SSMS) and Business Intelligence Development Studio (BIDS) seems to rely upon the domain to which the computer running the application is joined and, as a result, does not permit changing to a local user account.
The scenario is that I have set up local accounts such as MachineName\EOne in order to test dynamic dimension security within Analysis Services (SSAS). I launch SSMS normally and it shows me connected using my domain account MyDomain\MyAccount. I then click on a cube and select Browse. From the browse window, I click the Change User tool, select Other User, and provide the MachineName\EOne user account.
The first observation to note is that after browsing for and selecting the account, the Security Context dialog returns the selection but does not prepend the local machine name as the domain.
After accepting the setup I am returned to the cube browser and greeted by the error that “There are currently no logon servers available to service the logon request.”
After changing my Analysis Services (SSAS) server's service account (I tried Local System, Local Service, Network Service, and a custom machine account), I was left with the same problem connecting via the Change User option.
I finally resolved this by connecting to SSMS via the runas command from a command window using the /user: option to change what account should run the application. The specific command used was: runas /user:MachineName\EOne "C:\Program Files (x86)\Microsoft SQL Server\100\Tools\Binn\VSShell\Common7\IDE\Ssms.exe"
The result in SSMS was a successful connection using a local machine user account as shown in the following:
When we have been developing data integration solutions using SQL Server™ Integration Services (SSIS) for a long enough time (or frankly any other type of design or development for that matter), we start to realize that many of our solutions employ the same patterns of handling information or the manipulation of the information.
Case in point, when loading data warehouse solutions we begin to apply patterns to well known data warehouse data integration processes such as extraction of source data to staging, loading of dimension tables from staging, and loading of fact tables from staging. For a very long time I’ve employed a bit of a copy and paste approach to handling SSIS packages designed based on the patterns for each type of data warehouse data integration process.
Disclaimer: At the time of this writing, I am presently employed by Pragmatic Works; but, even though I am paid by the company, I am providing my own professional observations about a wickedly cool product we happen to have created called BI xPress.
I now have the ability to use my company's BI xPress product to really help speed development using patterns for the design of each package. Some really useful features were added in direct response to my own field usage when trying to make templates as powerful and productive as possible. Notably, we added a concept of parameterization that lets me set up token replacement values for often-repeated values that change when generating an SSIS package from the template. Prior to this enhancement, I might have the same value repeated across many different properties in the SSIS package, which required a lot of extra steps to ensure the values were correctly assigned when using the template. Now, with parameterization, these repeated uses are consolidated into a single input value to be changed when using the template.
Let me demonstrate how the use of a BI xPress template with the new parameterization feature helps speed our SSIS package development.
After installing BI xPress, the product configures many different useful Business Intelligence Development Studio (BIDS) add-ins. We can access the Package Builder Wizard many ways, including selecting the Package Builder Wizard item on the BI xPress main menu or, as in the following screenshot, right-clicking on the SSIS Packages folder in the BIDS Solution Explorer pane and selecting the Package Builder Wizard menu item.
For this demonstration I will simply show the design of the template that I created to handle the repetitive pattern design applied to my source data extract SSIS packages. I have the following design requirements for each package:
- Use two tiers of package configuration – the first tier reads a Windows Environment Variable to get the path to an XML dtsconfig file (Indirect XML configuration) which contains the configuration for my SSISConfigSqlDb connection manager and the second tier (applied to all remaining configuration entries) uses the SSISConfigSqlDb connection manager to connect to my SQL Server database’s [dbo].[SSIS Configurations] table to configure other connection managers and variables
- Use four connection managers that must be configured using package configuration (see prior details)
- Use nine variables that are configured either via package configuration or via property expressions; most notably, the SQLQuickBooksSource variable (scoped to the data flow task) will be identical among the packages, with the only exception being the actual table name
- Use a data flow task to move data from the original source system to a staging database
- Within the data flow task use Pragmatic Works Task Factory Upsert Destination to manage Updating and Adding New values to the staging database
The following screenshot shows the overall design of the package.
With these design requirements in hand, we quickly identify that all the packages can use this generic pattern to create SSIS packages and more specifically that only a very few things need to change when generating each new package such as:
- The name of the package (and perhaps the description – you do add descriptions to your packages, tasks, and data flow components, don't you?)
- The SQLQuickBooksSource variable expression for the Value property, which contains our source SQL statement
- The name of the data flow task (I prefer descriptive names to assist with understanding logging and auditing)
After launching the BI xPress Package Builder Wizard we select the option to create a new package from a template and provide the desired package name and folder location as seen in the following screenshot.
Now we select the desired template from the options presented. Note how BI xPress templates permit providing a screenshot and description to help others to understand the usage of the template.
Now we are at the really important and fun part of using these templates, the configuration of values to be used to generate the new package based on the template. The following screenshot shows the various options that were configured during the setup of the template.
Specific to our scenario, the SQLQuickBooksSource variable's value, which contains our SELECT statement, needs to be changed to reference the new table being processed by our new package. Additionally, the data flow task will be renamed to include the new table name. The template uses the new parameters capability to let the user of the template enter the name of the new structure once; this value is then pushed into every configurable property that has the parameter as part of its current value (e.g. “SELECT * FROM <@QuickBooksSourceTableName>”). I have had clients where one parameter ended up being pushed into over twenty different configurable properties, saving them from having to enter the same value consistently twenty times in order to create the package from the template.
As shown in the following screenshot, the Set Parameters button can be used to set the value of the parameter to the desired value or alternatively this will also be prompted for when proceeding to the next step of the wizard when selecting Next.
After the package is generated, in our scenario we will see expected data validation errors because the table used as the source and destination in the template differs from the one referenced by the generated package. We simply open the source and destination components in the data flow to make corrections: in the source, accept the option to have the component fix the errors automatically; in the destination, set the column mapping options in the Restore Invalid Column References Editor to “Delete invalid column reference”, as shown in the following screenshots.
With these steps completed we now have done all that is required to create a new package that meets all of our requirements and permits us to extract another source table. BI xPress templates and the parameterization feature really cut down on the effort needed to develop this package and many others that share common design patterns.
More details about this and the many other features within BI xPress can be found at the product's web site page http://pragmaticworks.com/Products/Business-Intelligence/BIxPress/Default.aspx.
In an effort to help avoid many long wasted hours researching interesting join “features” found within the SQL Server Integration Services (SSIS) Merge Join Data Flow Component, I offer the following discoveries (more embellishments and links to follow).
- Using SQL Server uniqueidentifier (GUID) data type columns as join columns will yield very strange results. The only solid workaround is to cast (or convert, as your T-SQL preference dictates) the uniqueidentifier to NVARCHAR(36), followed by LTRIM, RTRIM, and UPPER to be safe, ensuring successful character-by-character matching. Example: [CaseBusinessKey] = LTRIM(RTRIM(UPPER(CAST([CaseID] AS NVARCHAR(36)))))
- When more than one sorting/matching column is used, you have to be careful which columns you include in the results. Specifically, you may select all of the sorting/matching columns, none of them, or the first several (in sequence, starting from the first). In other words, you may not select the 2nd sorting/matching column without also selecting the first. Should you inadvertently try, the component will produce the error message “The component is not in a valid state. The validation errors are: The column with the SortKeyPosition value of 0 is not valid. It should be 2.”
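The GUID workaround from the first bullet is applied in the source query feeding each Merge Join input; a sketch with illustrative table and column names follows. The same expression must be used on both inputs so that the character-by-character comparison (and the upstream sort) agree:

```sql
-- Normalize the uniqueidentifier join key on BOTH Merge Join inputs.
SELECT  UPPER(LTRIM(RTRIM(CAST([CaseID] AS NVARCHAR(36))))) AS [CaseBusinessKey],
        [CaseTitle]
FROM    [dbo].[Cases]
ORDER BY [CaseBusinessKey];  -- sort must match the Merge Join sort keys
```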