SQL 2008 Backup Compression -- Real World

by Rob 23. May 2010 16:37

One new feature in the SQL 2008 product line is on-the-fly compression of backups as a built-in feature of the SQL Relational database.  I've been excited about this feature, and anecdotally happy with the results.

Today I took a few minutes to do a head-to-head comparison of compressed vs. non-compressed backups.  And just for good measure I added in a test to compare the on-the-fly compression with using a secondary compression (WinRar).  This latter approach is one that we've all used for years, and I was interested to see whether I would be "forever done" with using secondary compression now that compression is "baked in" to the product.

The following tests are against a 10.2GB SQL database running on SQL 2008R2 x64 on an 8-CPU database server.

Here are my results:

  Time to make backup Time for secondary compression Total Time Final size of backup on disk
Uncompressed Backup 5 minutes N/A 5 minutes 5.4GB
Compressed by SQL during Backup 1 minute N/A 1 minute 1.25GB
Make uncompressed backup, then compress with WinRar 5 minutes 13 minutes 18 minutes 0.75GB

Tags: ,

SQL Server 2008

Oracle Driver configuration for SSIS, SSRS and SSAS in a 64-bit Environment

by Rob 28. March 2010 00:54

Today I finished a white paper on configuring Oracle drivers for Microsoft BI applications (like Integration Services, Analysis Services and Reporting Services).  I've configured these environments quite often, but like so many things my brain has a hard time remembering all the nuances every time.

So to help myself--and hopefully some others as well--I put all the basics down on paper.  Beyond being a "white paper", this one has complete configuration instructions (with screen shots) to install, configure and test Oracle drivers in a 64-bit environment.  

Please feel free to download the full document as a PDF.  Hope this helps!

From the white-paper intro:

Summary: This white paper describes the procedures needed to configure Oracle database drivers for use in Microsoft BI services (SSIS, SSRS, SSAS and BIDS) in a 64-bit Windows Server 2008 R2 environment. 

This article covers the selection and configuration of Oracle database drivers, including detailed configuration procedures.

Specific coverage is given to configuration requirements for scenarios where BIDS is installed and used on 64-bit SSIS servers.

This document includes configuration procedures for both the 64-bit and 32-bit versions of the Oracle Provider for OLEDB and the Microsoft Connector for Oracle by Attunity.

While the primary target for this document is SQL Server 2008 R2, the techniques and discussion also apply to SQL Server 2008 and SQL Server 2005 product versions.

 

Tags: , , , ,

Configuration | Operations

PowerPivot Retail Analysis Example

by Rob 8. February 2010 01:29

Today's blog is about using PowerPivot to solve real-world problems.  

While PowerPivot hasn't hit the street as a shipping product yet, I'm intrigued by it, and I think it has a good future.  However, I see its value differently than some.  

I see it as th tool you use to add a degree of flexibility and agility to a larger data warehouse strategy. Sure, I see how it can be used to meet entirely standalone needs, but I don't think that's where its greatest potential lies.

No matter how well we plan, real-world requests and requirements always seem to stay just ahead of us, and there's always a creative power user out there ready to build a data mart with Excel and VBScript to fill those gaps.  

Wouldn't it be great if we could have a flexible tool to fill in the (hopefully) small number of requests that our data warehouse can't yet meet, but do it in a way that preserves the integrity of what's already in production?  I think PowerPivot may be that tool, and I think you should take a look.

The below embedded video walks through using PowerPivot to turn around an analysis request that doesn't quite meet the structure of an in-place data warehouse.

Note: You can view this video full screen by pressing the full screen button on the bottom toolbar. It's the second item from the right-hand side.

Tags: ,

PowerPivot

SSAS -- to Partition or not?

by Rob 20. January 2010 01:16

Recently I had a question from a client whether there was really any performance improvement in SSAS cube processing when using partitioning strategies vs. just using a single partition.

While intuitively I "knew" that partitioning cube fact tables results in better parallelism and thus increased performance (when adequate CPU, memory and I/O resources can support it), I realized I really didn't have any empirical "proof" to backup my professional opinion.  So--I setup a simple test.

The following is a very simple comparison of processing the same data set with a cube design that's identical--except for the introduction of multiple partitions. For this test I used Analysis Services 2008R2 x64 running on a 4-core Xeon server

The data set processed includes 37 million fact rows, and several dimensions--the largest including around 4,000 members.  In the first test, a single partition is used.

In the second, the cube is partitioned along months, resulting in about 1.3 million rows per partition.  This is below the recommended threshold for partition sizing, but provides a decent (if basic) evaluation of the effect of parallelism on cube processing.

Results:

  • Time to process the cube using a single partition: 21 minutes
  • Time to process the cube using 23 partitions: 12.7 minutes
  • Bottom line: by partitioning the processing of the cube, there's a time savings of around 40% in this case
Observation: The server I used is in my lab and has 4 Xeon cores, 4GB RAM, and 1.3TB RAID 1E storage. During the single partition processing, memory demand didn't exceed 3GB, and CPU utilization didn't exceed 60% or so.  With partitioned processing, both CPU and memory were completely saturated.  When I get the chance (and the required loose change) to upgrade the server to 8 cores and more memory, I'd like to re-run this to see whether increased resources would provide an even greater benefit.  However, having at least this much empirical evidence is a good start.



Single partition processing

23 Partitions processing

Tags:

Analysis Services | SQL Server 2008

Publishing PerformancePoint to Extranet Users

by Rob 3. January 2010 00:35

Today's blog is about connecting Internet/Extranet users to a PerformancePoint solution that uses Kerberos delegation to pass end-user credentials through the application layers to back-end databases. This article is about how to do it in a way that doesn't require VPN deployment, is easy to use and convenient for end-users, and adds no additional burden on SharePoint administrators or DBAs.

What? Impossible you say?  Not at all.  In fact it can be relatively easy to implement without the commonly suggested security trap-doors.  The technique has really been around for quite a while, and it's accomplished through the use of a reverse-proxy solution such as Forefront TMG or ISA server (Forefront is the name of the latest version of the product formerly known as ISA Server).

The video below is an overview and demonstration of a working solution combining the following components.

1. Windows Server 2008 R2 x64
2. SharePoint Server 2010 (CTP)
3. PerformancePoint Services (part of SharePoint 2010)
4. SQL Server 2008 R2 (CTP)
5. Forefront TMG 2010

Note: You can view this video full screen by pressing the full screen button on the bottom toolbar. It's the second item from the right-hand side.

Tags: , , ,

Configuration | Security | SharePoint

BI Dashboard for the Transportation Industry - Demo

by Rob 4. August 2009 18:34

At BlueGranite we've been considering the best way to provide better visibility to all the great capabilities BI and Dashboards deliver.

More often than not, when I do an in-person demo of a dashboard solution, it's the first time the audience has seen this kind of solution before--except for maybe a screen print or a sketch.  Personally I don't think an interactive dashboard can be explained or even understood by seeing a screen print! 

 So, today I made a quick (5 minutes) demo and uploaded it to YouTube...kind of a test of whether this is a good format to share quick ideas and demos.  Afterall, there are only so many opportunities to visit in person, and who can be everywhere at once? (hopefully with the Internet, I can?)

The video below demonstrates a BI solution which is a reduced scope demo of a solution our firm implemented in the transportation (trucking) industry.  The technology is primarily SharePoint, PerformancePoint, SQL Server Analysis Services, Silverlight and Bing Maps.

 If you like this way of communicating/blogging for BI demonstration and best practice, let me know.  It takes more time than writing, but if it's useful then in my view it's worth the time.

 --Rob

Tags: , ,

PerformancePoint | Video

TechEd 2009: Enterprise Mashups

by Rob 13. May 2009 11:06

BI Mashups

Tuesday I attended an interesting session presented by J.R. Arredondo and Dave Pae about putting together Enterprise Mashups using SharePoint Designer for WSS or SharePoint 2007. 

Mashups are one of those buzzwords, not unlike Twitter, FaceBook, etc., that sounds like a cool technology my 13-year old would be interested in but which I always try to approach with a bit of skepticism given my focus on business solutions that have ROI requirements.

And so it has been with "mashups" for me...on the radar but a bit unproven in terms of business value.

I think that needle has moved for me as a result of this session.  Of course, my primary focus is whether each new technique or technology is relevant to real-world BI solutions (not just something entertaining during a demo).

So how do I see mashups extending a traditional data-driven BI solutions?  Well, the ideal would be to take (A) traditional, planned data in a database or cube; (B) add in unstructured data (like sharepoint lists), and (C) access information on the web or from LOB systems using web services.

Modern BI solutions like PerformancePoint solve A+B, but C is not usually in the realm of end-users or analysts who assemble BI solutions.

While the Arredondo & Pae session didn't address a BI environment directly, I can see quite well how to adapt their techniques to do some interesting things. Using PerformancePoint we already have the ability to link to non-BI components, and by combining this with SharePoint designer mashup capabilities I can easily see integrating maps, internal web services and public services via various protocols.

If you haven't looked at Microsoft's Mashup page, take a look at it here: Enterprise Mashups.  And if you're not aware that SharePoint designer is now free, download it here and take a look at ways to design rich mashup pages in sharepoint.

 

Tags: , , ,

BI Strategy | SharePoint

TechEd 2009: Microsoft BI - Gemini

by Rob 11. May 2009 23:57

Today I had the pleasure to attend Donald Farmer and Kamal Hathi's session on the Gemini project.  I've seen several of Mr. Farmer's presentations on BI and Data Mining, and I'm never disappointed at his skills and depth of knowledge!

Gemini is an in-memory, Excel-based, analysis services orientated technology that brings the power of dimensional modeling to a primarily Excel skilled audience--with the promise that the output of Gemini can be forward-engineered into managed Analysis Services solutions. 

Donald's quite entertaining slide deck was a major departure from the typical corporate spin that we're accustomed to seeing (and that I'm personally accustomed to producing) when we educate potential technology users.   

The slide deck casts the analyst, boss and IT admin as silent movie characters, where the analyst and IT professional struggle to keep the boss happy within the limitations of their skills and the existing content in the corporate DW.  It's the all-too-familiar "I need that analysis by tomorrow" versus "it takes time to put that data together".  Terribly apt and consistent with what we see in the "BI trenches" every day.

The silent movie theme stresses that this situation hasn't changed that much in many decades (and in any event in the two decades I've been working on BI solutions).  Of course, the slides were endlessly entertaining while driving home these points!

As for the "Gemini" product, it's quite exciting.  BlueGranite has been in the business of helping our clients' analysts pull info together quickly and make it meaningful since the ProClarity days.  One of my lingering concerns has been that--since ProClarity was acquired and the future of its core technology became uncertain--what tools within the Microsoft product suite would serve these "data diggers"? 

Excel pivot tables against SSAS cubes are fantastic--but how to make sure analysts can get data pulled together by the cube designer fast enough to meet business needs?  If the corporate governed DW/Cubes evolve slowly, how can analysts fill in the gaps? 

Today I still recommend ProClarity for the deep data analysts (as a complement to Excel), as it still provides more flexibility and richer visualization than Excel pivot tables for advanced analysts.  But I think Gemini may one day (hopefully soon) deliver even more, and really become the data digger's tool of choice.

Gemini is clearly intended to fill the analyst gap.  And in so many ways I think it will. It's at once a more approachable way for non-OLAP users to build rich dimensional models, a way to make more dynamic data integrations using the familiar Excel environment, and a rich OLAP query tool to be used against these analyst-generated models.

In truth, I can easily see even seasoned SSAS pros (including myself) using Gemini during prototype and early development, in addition its intended less sophisticated audience.

The things I love about Gemini:

1. Users import data from all types of sources:
  
a. Structured relational
  
b.
Existing OLAP Cubes
  
c.
Subscribe to "Service Documents" (a form of RSS feed that contains tabular data)
  
d. Paste in any tabular data (e.g. copied to clipboard from a web page)

2. Fast processing of large data volumes (100M rows demonstrated, 20M demo'd on a netbook)

3.  Ability to add calculated measures at the Excel pivot-table layer (sweet!)

4.  Ability to connect to Gemini models as SSAS data sources (rocks!)

5.  Tight integration with SharePoint as a basic architectural construct

6.  Translation of most OLAP concepts to Excel terminology more familiar to analysts

Some things I would like to see improved, or clarified. 

I would have asked these questions, but Q&A was cut short today due to time constraints...

1. As simple as they seem to MDX people (like me), my sense is the DAX expressions are going to be too complex for many of the analysts I train.  It reminds me of PPS-P PEL, in that the expressions are intended to be simplified from MDX, but they're still complex and require multidimensional thinking--which isn't a gift many of us are born with.   PEL generated lots of push-back from analysts when I demonstrated it to "real customers", and I fear DAX may as well.

2. To address #1, I hope the product team will consider following the model ProClarity set with it's KPI Designer, which allows users to use wizards to build calculated measures such as ranking and bubble-up exceptions.  KPI designer users build complex MDX without knowing that's what they're doing.  I still train new users on these tools, and it's a positive for them to use wizards (rather than purely language constructs, as with DAX).

3. I'd like to see some MDM tie-in to ensure that already accepted calculations, data sources, etc., can be drawn upon and re-used in a Gemini solution.  I can see a BI governance issue (and IT objection!) if many analysts are building silo BI solutions without some centralized baseline to start from.  Gemini doesn't prohibit a DW/MDM baseline, but it doesn't appear to promote one either.

4. Security really wasn't addressed in the session.  If analysts will be pulling 100M rows of fact data into a desktop solution…how does a corporate IT policy ensure that that such huge volumes of data isn't lost in the back of a taxi?  BitLocker would be a convenient answer to this question, but I hope it isn't the only one.

5. Data mining--not sure if it's possible to incorporate DM models into the Gemini models, but the combination of capabilities on the desktop would be really fantastic!

In-memory OLAP is a hot technology, and as illustrated by Donald's slides, the world has been waiting far too long to put such powerful tools in the hands of typical analysts.  I can't wait to see how this possibly disruptive technology impacts the wide swath of users its intended to benefit!

Tags: , , ,

Analysis Services | Analytics

SSAS 2008 Deployment: The connection either timed out or was lost

by keruibo 16. April 2009 01:32

The following issue is a problem in SSAS that you might run into either in test or deployment environments (see references). 

In a nutshell, when a client session running Vista or Windows 2008 talks over the network to an SSAS server running on Windows 2008, and they use Kerberos for authentication, there likely will at some point be connectivity problems that resemble timeout or firewall blocking problems. 

Errors might look like one of the following:

The connection either timed out or was lost.
Unable to read data from the transport connection: An existing connection was forcibly closed by the remote host.
An existing connection was forcibly closed by the remote host

The two places we’ve observed these problems are:

1.       MDX queries submitted to the server that are reasonably large, such as those PPS submits, or perhaps those coded by hand that become large.  “Large” might be something like 1000 bytes or so.

2.       Deploying databases over the network from a Vista/2008 client to a 2008 server.

This problem still exists as of SQL 2008 SP1, so applying that update isn’t a resolution. This problem definitely exists when client & server are on different machines; may not be a problem when client/server are on the same computer (e.g. VPC or BIDS on a remote desktop session). 

From what I understand, MS is aware of this and we should expect a post-SP1 CU soon.  Whether this is to be resolved as a Windows patch or a SSAS patch is unclear to me. 

The following are work-arounds in the mean-time:

1.       If either the server or client is running XP or Windows 2003, the problem should not occur in any case.

2.       If you can edit the connect string (e.g. SSRS/Excel), specify ;SSPI=NTLM as a parameter.    Be advised that user security delegation will not succeed if this workaround is used.

3.       Connect with the IP address instead of machine name or FQDN.  Since Kerberos is unsupported with IP address connections, this also forces a fallback to NTLM.  Kerberos delegation also will not succeed when this workaround is used.

 Of particular note is that if SSAS is to be deployed on a Windows 2008 server, all available workarounds currently imply kerberos delegation will be impossible (a short-term driver for SSAS deployment on Server 2003?). 

 References:

http://blogs.msdn.com/psssql/archive/2009/04/03/errors-may-occur-after-configuring-analysis-services-to-use-kerberos-authentication-on-advanced-encryption-standard-aware-operating-systems.aspx

http://denglishbi.spaces.live.com/Blog/cns!CD3E77E793DF6178!1214.entry

http://sqlblogcasts.com/blogs/drjohn/archive/2009/03/28/kerberos-kills-large-mdx-queries-on-windows-server-2008-ssas.aspx

Tags: , , ,

Analysis Services | Configuration | Security

PerformancePoint Now included with SharePoint eCal

by keruibo 7. April 2009 14:05

I've been heavily involved in PerformancePoint implementations since the beginning of that product's lifecycle...our company has done many deployments for clients.  It's a great dashboarding technology, and I'm continually amazed how quickly we can put together compelling solutions with it.  The efficiency and elegance of the underlying design strikes a great balance between power and simplicity of use. 

 Recently Microsoft folded this product into SharePoint as part of the Enterprise CAL.  The net effect for eCal customers is a "free" enterprise-class BI platform; and for customers that would have invested in the BI platform anyway--all the benefits of eCal, including Business Data Catalog, Excel Services and so on.

Here's a Silverlight video going into some additional detail of the benefits of combining PerformancePoint's monitoring & analytics with SharePoint:

 

 

 

Tags: , , , ,

PerformancePoint

Disclaimer
The opinions expressed herein are my own personal opinions and do not represent my employer's view in anyway.

© Copyright 2010 Rob Kerr's BI Blog