SQL 2008 Backup Compression -- Real World

by Rob 23. May 2010 16:37

One new feature in the SQL 2008 product line is on-the-fly compression of backups as a built-in feature of the SQL Relational database.  I've been excited about this feature, and anecdotally happy with the results.

Today I took a few minutes to do a head-to-head comparison of compressed vs. non-compressed backups.  And just for good measure I added in a test to compare the on-the-fly compression with using a secondary compression (WinRar).  This latter approach is one that we've all used for years, and I was interested to see whether I would be "forever done" with using secondary compression now that compression is "baked in" to the product.

The following tests are against a 10.2GB SQL database running on SQL 2008R2 x64 on an 8-CPU database server.

Here are my results:

  Time to make backup Time for secondary compression Total Time Final size of backup on disk
Uncompressed Backup 5 minutes N/A 5 minutes 5.4GB
Compressed by SQL during Backup 1 minute N/A 1 minute 1.25GB
Make uncompressed backup, then compress with WinRar 5 minutes 13 minutes 18 minutes 0.75GB

Tags: ,

SQL Server 2008

Oracle Driver configuration for SSIS, SSRS and SSAS in a 64-bit Environment

by Rob 28. March 2010 00:54

Today I finished a white paper on configuring Oracle drivers for Microsoft BI applications (like Integration Services, Analysis Services and Reporting Services).  I've configured these environments quite often, but like so many things my brain has a hard time remembering all the nuances every time.

So to help myself--and hopefully some others as well--I put all the basics down on paper.  Beyond being a "white paper", this one has complete configuration instructions (with screen shots) to install, configure and test Oracle drivers in a 64-bit environment.  

Please feel free to download the full document as a PDF.  Hope this helps!

From the white-paper intro:

Summary: This white paper describes the procedures needed to configure Oracle database drivers for use in Microsoft BI services (SSIS, SSRS, SSAS and BIDS) in a 64-bit Windows Server 2008 R2 environment. 

This article covers the selection and configuration of Oracle database drivers, including detailed configuration procedures.

Specific coverage is given to configuration requirements for scenarios where BIDS is installed and used on 64-bit SSIS servers.

This document includes configuration procedures for both the 64-bit and 32-bit versions of the Oracle Provider for OLEDB and the Microsoft Connector for Oracle by Attunity.

While the primary target for this document is SQL Server 2008 R2, the techniques and discussion also apply to SQL Server 2008 and SQL Server 2005 product versions.

 

Tags: , , , ,

Configuration | Operations

PowerPivot Retail Analysis Example

by Rob 8. February 2010 01:29

Today's blog is about using PowerPivot to solve real-world problems.  

While PowerPivot hasn't hit the street as a shipping product yet, I'm intrigued by it, and I think it has a good future.  However, I see its value differently than some.  

I see it as th tool you use to add a degree of flexibility and agility to a larger data warehouse strategy. Sure, I see how it can be used to meet entirely standalone needs, but I don't think that's where its greatest potential lies.

No matter how well we plan, real-world requests and requirements always seem to stay just ahead of us, and there's always a creative power user out there ready to build a data mart with Excel and VBScript to fill those gaps.  

Wouldn't it be great if we could have a flexible tool to fill in the (hopefully) small number of requests that our data warehouse can't yet meet, but do it in a way that preserves the integrity of what's already in production?  I think PowerPivot may be that tool, and I think you should take a look.

The below embedded video walks through using PowerPivot to turn around an analysis request that doesn't quite meet the structure of an in-place data warehouse.

Note: You can view this video full screen by pressing the full screen button on the bottom toolbar. It's the second item from the right-hand side.

Tags: ,

PowerPivot

SSAS -- to Partition or not?

by Rob 20. January 2010 01:16

Recently I had a question from a client whether there was really any performance improvement in SSAS cube processing when using partitioning strategies vs. just using a single partition.

While intuitively I "knew" that partitioning cube fact tables results in better parallelism and thus increased performance (when adequate CPU, memory and I/O resources can support it), I realized I really didn't have any empirical "proof" to backup my professional opinion.  So--I setup a simple test.

The following is a very simple comparison of processing the same data set with a cube design that's identical--except for the introduction of multiple partitions. For this test I used Analysis Services 2008R2 x64 running on a 4-core Xeon server

The data set processed includes 37 million fact rows, and several dimensions--the largest including around 4,000 members.  In the first test, a single partition is used.

In the second, the cube is partitioned along months, resulting in about 1.3 million rows per partition.  This is below the recommended threshold for partition sizing, but provides a decent (if basic) evaluation of the effect of parallelism on cube processing.

Results:

  • Time to process the cube using a single partition: 21 minutes
  • Time to process the cube using 23 partitions: 12.7 minutes
  • Bottom line: by partitioning the processing of the cube, there's a time savings of around 40% in this case
Observation: The server I used is in my lab and has 4 Xeon cores, 4GB RAM, and 1.3TB RAID 1E storage. During the single partition processing, memory demand didn't exceed 3GB, and CPU utilization didn't exceed 60% or so.  With partitioned processing, both CPU and memory were completely saturated.  When I get the chance (and the required loose change) to upgrade the server to 8 cores and more memory, I'd like to re-run this to see whether increased resources would provide an even greater benefit.  However, having at least this much empirical evidence is a good start.



Single partition processing

23 Partitions processing

Tags:

Analysis Services | SQL Server 2008

Publishing PerformancePoint to Extranet Users

by Rob 3. January 2010 00:35

Today's blog is about connecting Internet/Extranet users to a PerformancePoint solution that uses Kerberos delegation to pass end-user credentials through the application layers to back-end databases. This article is about how to do it in a way that doesn't require VPN deployment, is easy to use and convenient for end-users, and adds no additional burden on SharePoint administrators or DBAs.

What? Impossible you say?  Not at all.  In fact it can be relatively easy to implement without the commonly suggested security trap-doors.  The technique has really been around for quite a while, and it's accomplished through the use of a reverse-proxy solution such as Forefront TMG or ISA server (Forefront is the name of the latest version of the product formerly known as ISA Server).

The video below is an overview and demonstration of a working solution combining the following components.

1. Windows Server 2008 R2 x64
2. SharePoint Server 2010 (CTP)
3. PerformancePoint Services (part of SharePoint 2010)
4. SQL Server 2008 R2 (CTP)
5. Forefront TMG 2010

Note: You can view this video full screen by pressing the full screen button on the bottom toolbar. It's the second item from the right-hand side.

Tags: , , ,

Configuration | Security | SharePoint

BI Dashboard for the Transportation Industry - Demo

by Rob 4. August 2009 18:34

At BlueGranite we've been considering the best way to provide better visibility to all the great capabilities BI and Dashboards deliver.

More often than not, when I do an in-person demo of a dashboard solution, it's the first time the audience has seen this kind of solution before--except for maybe a screen print or a sketch.  Personally I don't think an interactive dashboard can be explained or even understood by seeing a screen print! 

 So, today I made a quick (5 minutes) demo and uploaded it to YouTube...kind of a test of whether this is a good format to share quick ideas and demos.  Afterall, there are only so many opportunities to visit in person, and who can be everywhere at once? (hopefully with the Internet, I can?)

The video below demonstrates a BI solution which is a reduced scope demo of a solution our firm implemented in the transportation (trucking) industry.  The technology is primarily SharePoint, PerformancePoint, SQL Server Analysis Services, Silverlight and Bing Maps.

 If you like this way of communicating/blogging for BI demonstration and best practice, let me know.  It takes more time than writing, but if it's useful then in my view it's worth the time.

 --Rob

Tags: , ,

PerformancePoint | Video

TechEd 2009: Enterprise Mashups

by Rob 13. May 2009 11:06

BI Mashups

Tuesday I attended an interesting session presented by J.R. Arredondo and Dave Pae about putting together Enterprise Mashups using SharePoint Designer for WSS or SharePoint 2007. 

Mashups are one of those buzzwords, not unlike Twitter, FaceBook, etc., that sounds like a cool technology my 13-year old would be interested in but which I always try to approach with a bit of skepticism given my focus on business solutions that have ROI requirements.

And so it has been with "mashups" for me...on the radar but a bit unproven in terms of business value.

I think that needle has moved for me as a result of this session.  Of course, my primary focus is whether each new technique or technology is relevant to real-world BI solutions (not just something entertaining during a demo).

So how do I see mashups extending a traditional data-driven BI solutions?  Well, the ideal would be to take (A) traditional, planned data in a database or cube; (B) add in unstructured data (like sharepoint lists), and (C) access information on the web or from LOB systems using web services.

Modern BI solutions like PerformancePoint solve A+B, but C is not usually in the realm of end-users or analysts who assemble BI solutions.

While the Arredondo & Pae session didn't address a BI environment directly, I can see quite well how to adapt their techniques to do some interesting things. Using PerformancePoint we already have the ability to link to non-BI components, and by combining this with SharePoint designer mashup capabilities I can easily see integrating maps, internal web services and public services via various protocols.

If you haven't looked at Microsoft's Mashup page, take a look at it here: Enterprise Mashups.  And if you're not aware that SharePoint designer is now free, download it here and take a look at ways to design rich mashup pages in sharepoint.

 

Tags: , , ,

BI Strategy | SharePoint

TechEd 2009: Microsoft BI - Gemini

by Rob 11. May 2009 23:57

Today I had the pleasure to attend Donald Farmer and Kamal Hathi's session on the Gemini project.  I've seen several of Mr. Farmer's presentations on BI and Data Mining, and I'm never disappointed at his skills and depth of knowledge!

Gemini is an in-memory, Excel-based, analysis services orientated technology that brings the power of dimensional modeling to a primarily Excel skilled audience--with the promise that the output of Gemini can be forward-engineered into managed Analysis Services solutions. 

Donald's quite entertaining slide deck was a major departure from the typical corporate spin that we're accustomed to seeing (and that I'm personally accustomed to producing) when we educate potential technology users.   

The slide deck casts the analyst, boss and IT admin as silent movie characters, where the analyst and IT professional struggle to keep the boss happy within the limitations of their skills and the existing content in the corporate DW.  It's the all-too-familiar "I need that analysis by tomorrow" versus "it takes time to put that data together".  Terribly apt and consistent with what we see in the "BI trenches" every day.

The silent movie theme stresses that this situation hasn't changed that much in many decades (and in any event in the two decades I've been working on BI solutions).  Of course, the slides were endlessly entertaining while driving home these points!

As for the "Gemini" product, it's quite exciting.  BlueGranite has been in the business of helping our clients' analysts pull info together quickly and make it meaningful since the ProClarity days.  One of my lingering concerns has been that--since ProClarity was acquired and the future of its core technology became uncertain--what tools within the Microsoft product suite would serve these "data diggers"? 

Excel pivot tables against SSAS cubes are fantastic--but how to make sure analysts can get data pulled together by the cube designer fast enough to meet business needs?  If the corporate governed DW/Cubes evolve slowly, how can analysts fill in the gaps? 

Today I still recommend ProClarity for the deep data analysts (as a complement to Excel), as it still provides more flexibility and richer visualization than Excel pivot tables for advanced analysts.  But I think Gemini may one day (hopefully soon) deliver even more, and really become the data digger's tool of choice.

Gemini is clearly intended to fill the analyst gap.  And in so many ways I think it will. It's at once a more approachable way for non-OLAP users to build rich dimensional models, a way to make more dynamic data integrations using the familiar Excel environment, and a rich OLAP query tool to be used against these analyst-generated models.

In truth, I can easily see even seasoned SSAS pros (including myself) using Gemini during prototype and early development, in addition its intended less sophisticated audience.

The things I love about Gemini:

1. Users import data from all types of sources:
  
a. Structured relational
  
b.
Existing OLAP Cubes
  
c.
Subscribe to "Service Documents" (a form of RSS feed that contains tabular data)
  
d. Paste in any tabular data (e.g. copied to clipboard from a web page)

2. Fast processing of large data volumes (100M rows demonstrated, 20M demo'd on a netbook)

3.  Ability to add calculated measures at the Excel pivot-table layer (sweet!)

4.  Ability to connect to Gemini models as SSAS data sources (rocks!)

5.  Tight integration with SharePoint as a basic architectural construct

6.  Translation of most OLAP concepts to Excel terminology more familiar to analysts

Some things I would like to see improved, or clarified. 

I would have asked these questions, but Q&A was cut short today due to time constraints...

1. As simple as they seem to MDX people (like me), my sense is the DAX expressions are going to be too complex for many of the analysts I train.  It reminds me of PPS-P PEL, in that the expressions are intended to be simplified from MDX, but they're still complex and require multidimensional thinking--which isn't a gift many of us are born with.   PEL generated lots of push-back from analysts when I demonstrated it to "real customers", and I fear DAX may as well.

2. To address #1, I hope the product team will consider following the model ProClarity set with it's KPI Designer, which allows users to use wizards to build calculated measures such as ranking and bubble-up exceptions.  KPI designer users build complex MDX without knowing that's what they're doing.  I still train new users on these tools, and it's a positive for them to use wizards (rather than purely language constructs, as with DAX).

3. I'd like to see some MDM tie-in to ensure that already accepted calculations, data sources, etc., can be drawn upon and re-used in a Gemini solution.  I can see a BI governance issue (and IT objection!) if many analysts are building silo BI solutions without some centralized baseline to start from.  Gemini doesn't prohibit a DW/MDM baseline, but it doesn't appear to promote one either.

4. Security really wasn't addressed in the session.  If analysts will be pulling 100M rows of fact data into a desktop solution…how does a corporate IT policy ensure that that such huge volumes of data isn't lost in the back of a taxi?  BitLocker would be a convenient answer to this question, but I hope it isn't the only one.

5. Data mining--not sure if it's possible to incorporate DM models into the Gemini models, but the combination of capabilities on the desktop would be really fantastic!

In-memory OLAP is a hot technology, and as illustrated by Donald's slides, the world has been waiting far too long to put such powerful tools in the hands of typical analysts.  I can't wait to see how this possibly disruptive technology impacts the wide swath of users its intended to benefit!

Tags: , , ,

Analysis Services | Analytics

Windows Firewall and SQL Server 2008

by Rob 21. July 2008 00:58

Recently I've been working on deployments of Windows Server 2008 and SQL Server 2008.  I thought I'd start to post some of the nuances of these new product editions. 

One of the first things to encounter is, not surprisingly, security.  Security is always loads of fun for deployments (ha ha), but actually I kind of enjoy the challenge of working within the confines of good security practices.

Windows Server 2008 is based on the Vista core, and inherits a lot from it.  One of these is the Windows Firewall.  I think this is a really good thing...having that extra layer of security is definitely wise, and is a nice blanket I actually miss from my Sun/Linux days--so I'm actually glad to have it there.  But, it also means you have to configure security for just about every new application, port, etc.

For SQL Server, this really isn't too difficult.  However since I do so many deployments, I'm always interested in a shortcut...and I like to document changes, get them approved by client syadmin, then apply them by script whenever possible. 

So, below is a sample script I put together for applying firewall changes needed by SQL Server 2008 when running on Windows 2008. This is a work in progress but so far so good. Note that this script opens just about every port SQL Server might use, so make sure to use only those lines that apply to any given server (e.g. don't open HTTP/80 if you're not running anything reporting services, etc.).

Of course, if you're using named instances for SQL Services, those instances by default will have dynamic (i.e. random) ports.  Dynamic ports don't work that well with a server firewall (and neither do they work well for Kerberos delegation configurations--but that's another topic).  So, a best practice is probably to set static ports for each instance and manage them that way.

One thing to make sure to do if using command lines like this is to specify rule names on the command line, then those names show up in the GUI-based firewall control panel (firewall.cpl)--see the screen grab down below. If you don't--then each rule will simply be named "unspecified"...not a nice thing to leave for the sysadmin to figure out later!

@rem firewallconfig.cmd by Rob Kerr 

@echo =========  SQL Server Ports  ===================
@echo Enabling SQLServer default instance port 1433
netsh firewall set portopening TCP 1433 "SQLServer"

@echo Enabling Dedicated Admin Connection port 1434
netsh firewall set portopening TCP 1434 "SQL Admin Connection"

@echo Enabling conventional SQL Server Service Broker port 4022 
netsh firewall set portopening TCP 4022 "SQL Service Broker"

@echo Enabling Transact-SQL Debugger/RPC port 135
netsh firewall set portopening TCP 135 "SQL Debugger/RPC"

@echo =========  Analysis Services Ports  ==============
@echo Enabling SSAS Default Instance port 2383
netsh firewall set portopening TCP 2383 "Analysis Services"

@echo Enabling SQL Server Browser Service port 2382
netsh firewall set portopening TCP 2382 "SQL Browser"

@echo =========  Misc Applications  ==============
@echo Enabling HTTP port 80
netsh firewall set portopening TCP 80 "HTTP"

@echo Enabling SSL port 443
netsh firewall set portopening TCP 443 "SSL"

@echo Enabling port for SQL Server Browser Service's 'Browse' Button
netsh firewall set portopening UDP 1434 "SQL Browser"

@echo Allowing multicast broadcast response on UDP (Browser Service Enumerations OK)
netsh firewall set multicastbroadcastresponse ENABLE

Tags: , ,

Security | SQL Server | Windows Server

Little a vs. Big A

by Rob 24. May 2008 10:13

When planning a deployment of PerformancePoint Monitoring & Analytics (PPS M&A), an often overlooked component in the business solution is the ProClarity Analytics Server (PAS) application layer.  

PAS is sometimes not deployed by customers, and their analysts aren't trained in the effective use of these analytical tools. In my view, this is often a mistake.  Analytics has at least two constituencies--and all need to be considered when planning a successful BI deployment.

A sales specialist I have a great deal of respect for, Mac Hussey, once explained it this way...this isn't a direct quote but my interpretation of his ideas: PerformancePoint server M&A's native scorecards, analytic grids & views cover "Little a" (lightweight analytics), while PAS covers "Big A" (heavy analytics). 

Two A's...so what's the difference? "Little a" is flashy, cool and as easy to use as a web page--often requiring no training beyond a quick overview.  In most industrial and service industries, perhaps 85% of BI dashboard users are executives, managers and line of business users.  These users don't use analytics in their primary job function, but use analytics to support their "real" jobs. 

Even in the long-run, they may only want/need "Little a". If your organization fits this mold, it may seem that reaping the low-hanging fruit using "Little a" is really where all the value lies--and that's certainly true at first. Eventually, though, the "Little a" users want to know root causes, and analysts really need to dig and develop further insights.  That's when "Big A" is needed (ProClarity Analytics). 

"Little a" is certainly a "low hanging fruit picker" and brings in quick ROI in for any BI dashboard deployment.  Yet time spent deploying and training on "Big A" is where the game-changing ROI comes from over the mid/long-term. Currently this means deploying an additional application (PAS) and training end-users.  

Eventually I'm sure we'll see ProClarity Analytics fully integrated into PerformancePoint, and the "ProClarity" product brand will be just a fond memory. When that happens, every user will have access to both the "Big a" and "Little a", even if--as with Excel's features--they only know how to use the most common analytical capabilities offered by the product.  Until then, Microsoft BI practitioners who address the needs of users in the "here and now" will need to continue to pay close attention to the spectrum of user needs when planning BI deployments.

Tags: , ,

Analytics | BI Strategy | PerformancePoint

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

© Copyright 2010 Rob Kerr's BI Blog