SSAS Named Sets – Static Vs Dynamic

Named sets were there in MS SQL Server since beginning. They were only static in SQL Server 2005, but with SQL Server 2008, we have dynamic sets as well.

Also, this is one of the popular questions which are usually asked in BI interviews. The intent in asking this question is not to ascertain that candidate knows the difference between static and dynamic named sets, but to find out whether the candidate knows the intricacies involved in using named sets. What business problems it does solves for us. What are the guidelines to create named sets?

So before we delve into named sets, let’s see how the different parts in an MDX query are resolved:

Here if you notice that sets are evaluated after “FROM” and “WHERE” clauses are evaluated. Thus the concept of STATIC and DYNAMIC named sets shows up.

So to summarize:

STATIC Sets: The MDX query execution context is not passed from “FROM” and “WHERE” clauses to “WITH SET”. They are evaluated during CREATE SET statement execution or at query run time if defined with WITH SET clause. If static named sets are defined in the cube, they are evaluated at the time of cube processing.

DYNAMIC Sets: The MDX query execution context is passed from “FROM” and “WHERE” clauses to “WITH SET”. They are evaluated every time the query is run. It really doesn’t matter where they are defined.

Let’s see above in working, before we delve into remaining intricacies:

STATIC SET

DYNAMIC SET

CREATE
SET [Adventure Works].[TOP 5 PRODUCTS BY SALES] AS
TOPCOUNT(
[Product].[Product Model Categories].[Model Name].MEMBERS

,5

,[Measures].[Internet Sales Amount])

;

SELECT [Measures].[Internet Sales Amount] ON
COLUMNS,

[TOP 5 PRODUCTS BY SALES] ON
ROWS

FROM [Adventure Works];


SELECT

[Measures].[Internet Sales Amount] ON
COLUMNS,

[TOP 5 PRODUCTS BY SALES] ON
ROWS

FROM [Adventure Works]

WHERE

([Date].[Calendar].[Calendar Semester].&[2003]&[2]);


SELECT [Measures].[Internet Sales Amount] ON
COLUMNS,

[TOP 5 PRODUCTS BY SALES] ON
ROWS

FROM

(select [Date].[Calendar].[Calendar Semester].&[2003]&[2] on 0

FROM [Adventure Works]);


STATIC sets did not consider sub cubes

CREATE DYNAMIC
SET [Adventure Works].[TOP 5 PRODUCTS BY SALES] AS
TOPCOUNT(
[Product].[Product Model Categories].[Model Name].MEMBERS

,5

,[Measures].[Internet Sales Amount])

;

SELECT [Measures].[Internet Sales Amount] ON
COLUMNS,

[TOP 5 PRODUCTS BY SALES] ON
ROWS

FROM [Adventure Works];


SELECT

[Measures].[Internet Sales Amount] ON
COLUMNS,

[TOP 5 PRODUCTS BY SALES] ON
ROWS

FROM [Adventure Works]

WHERE

([Date].[Calendar].[Calendar Semester].&[2003]&[2]);


SELECT [Measures].[Internet Sales Amount] ON
COLUMNS,

[TOP 5 PRODUCTS BY SALES] ON
ROWS

FROM

(select [Date].[Calendar].[Calendar Semester].&[2003]&[2] on 0

FROM [Adventure Works]);


 Dynamic sets, did consider the sub cubes.

 So we saw that context set by WHERE clause is considered in evaluating DYNAMIC Sets. Also, when we specified, sub cubes, the context is appropriately set in dynamic set. Find out more about sub cubes versus where clause: here

So far so good. But the story doesn’t end here. Read on…

Before we get into more details let me summarize a simple term MULTI-SELECT. It means, more than one member is specified either in Sub Cube or in WHERE clause. We already saw that sub cube context is not passed on to the static sets. To pass the context, we need to use Dynamic sets.

So, where is the problem? The problem lies in the MDX that is sent from client apps to AS Server. Let us see couple of examples:

The cube browser or OWC component implicitly converts sub cube MDX to WHERE clause. And to pass the context set in WHERE clause to the calculated members, we need to use EXISTING clause. Refer to blog SSAS Sub Select vs Where Clause here.

Excel, and Report Builder uses the sub cubes, so to get correct numbers, we can use dynamic sets. But it may not return the correct results. This is very much relevant in implementing drill through actions.

SSRS generates the MDX (if using the designer) which will have both sub cubes and where clauses. So again we need to be careful in using dynamic sets.

So, I would suggest, named sets must be used with caution.

But all is not bad with named sets. Since AS caches the tuples while evaluating the named set expression, they provide better performance.

Will be back with more blogs, when I get time….

Please provide your feedback. I can be reached at jagdish_malani@hotmail.com

Advertisements

SSAS Sub Select Vs Where Clause

I have created various reports with MDX, and mostly I have used sub selects. Recently, in one of the report, for some reason, I swapped “Sub Select” with “Where Clause” and post the testing I sent the report to production. I felt about blogging on it, and here it goes….

Please note that, I will be using Sub Select and Sub Cube interchangeably.

Let’s see a simple MDX for each:

MDX with Sub-Select:

SELECT {[Measures].[Order Count]} on 0 ,[Product].[Category].Members
on 1

FROM     (select

{[Product].[Product Categories].[Subcategory].&[26]

,[Product].[Product Categories].[Subcategory].&[22]} on 0

from [Adventure Works] );

If you notice the numbers, the total for accessories and clothing does not match the number shown for “All Products”. It is not a rocket science and I leave it to readers to find that out. Also, here another concept of Attribute Relationship comes into play, which I am going to write a separate blog.

Sub Cube how it works:

  • All members from all dimensions are eligible for selection.
  • Cube context is defined by the default members of each dimension. However, you can refer to members not within that Data Space.
  • In case of sub selects, the cube is sliced based on the dimensions specified in sub selects, which actually limits the cube space. Thus the members on axis by default contain only related data that is now specified by the sub select. Thus the Categories belonging to sub categories in sub select are shown. The order count, also is for those sub categories.
  • Sub selects are intended to provide “visual totals” type functionality, where the value of the ‘All Member’ on a hierarchy changes to reflect the members actually selected in a query. What that means is the totals would be the totals of the numbers in the cube space restricted by the sub select.
  • Another point of interest is that calculated members have the ability to look outside the subcube. For example:

 with member measures.test as

[Product].[Category].currentmember.prevmember.name

select [Measures].test on 0, [Product].[Category].[Category].members on 1

from ( select ( {[Product].[Category].[Category].&[4],

[Product].[Category].[Category].&[3]} ) onfrom [Adventure Works])

  • Sub cubes cannot contain with clause, non empty clause and a having clause.
  • Sub cubes do not restrict levels as seen below:

 select

descendants([TIME].[yqmd].year.members, time.yqmd.month,self_and_before) on 0

from ( select [TIME].[YQMD].[Quarter].&[2001-10-01T00:00:00] on 0

from [Sales])

  • Tuple specifications for sub cube: When you specify the tuples in the sub cube, the data cells not part of the sub cube will not contribute to aggregates. Contrast this with autoexists.
SELECT

{[Date].[Calendar].[Calendar Quarter].members, [Date].[Calendar].[Calendar year].members} on 0

,[Product].[Category].Members on 1

FROM (select

{([Product].[Product Categories].[Subcategory].&[26]

,[Date].[Calendar].[Calendar Quarter].&[2004]&[1])

,([Product].[Product Categories].[Subcategory].&[22],[Date].[Calendar].[Calendar Quarter].&[2004]&[2])}

on 0

from [Adventure Works] )

where [Measures].[Internet Sales Amount];

  • Where clause specified in the sub cubes is not passed on to outer MDX query:

Let’s find out the total sales for Australia for year 2002:

select [Customer].[Customer Geography].[Country].&[Australia] on 0,

[Measures].[Internet Sales Amount] on 1

from [Adventure Works]

where ([Date].[Calendar].[Calendar Year].&[2002]);

Now let’s use this in the sub cube clause:

select [Customer].[Customer Geography].[Country].&[Australia] on 0,

[Measures].[Internet Sales Amount] on 1

from

( select [Customer].[Customer Geography].[Country].&[Australia] on 0,

[Measures].[Internet Sales Amount] on 1

from [Adventure Works]

where ([Date].[Calendar].[Calendar Year].&[2002]))

Notice that for Australia in 2002, the sales was 2.15 million. But, when the same query was used as sub cube, the slicer in the inner query was ignored by the outer query. It did not restrict the time dimension in any way. That implies that the where clause specified in the sub cubes are not passed on to outer query context.

  • Aggregate function does not restrict its values to the sub cube:
select [Customer].[Customer Geography].[Country].&[Australia] on 0,

[Measures].[Internet Sales Amount] on 1

from [Adventure Works]

where ([Date].[Calendar].[Calendar Year].&[2002]);

The Australia sales is:

The total sales for all countries is:

select

[Customer].[Customer Geography].[All Customers] on 0,

[Measures].[Internet Sales Amount] on 1

from [Adventure Works]

where ([Date].[Calendar].[Calendar Year].&[2002]);

Now when we use aggregate function:

with
member measures.m as

aggregate([Customer].[Customer Geography].[Country].members, [Measures].[Internet Sales Amount])

select [Customer].[Customer Geography].[Country].&[Australia] on 0,

measures.m on 1

from [Adventure Works] where ([Date].[Calendar].[Calendar Year].&[2002]);

  • Performance: When a query contains a sub select it means that the results of any calculations can only be cached for the lifetime of the query. When you get to multiple nested sub selects they do have a slight performance advantage.

MDX with Where Clause:

SELECT {[Measures].[Order Count]} on 0,

[Product].[Category].Members on 1

FROM [Adventure Works]

WHERE ({[Product].[Product Categories].[Subcategory].&[26],

[Product].[Product Categories].[Subcategory].&[22]} );

Before we go further, notice that the order count returned is for the product categories which contain the sub categories specified in the where clause. In other words, the MDX states that give me the sum of order count for categories that contain specified sub categories.

To validate it, see the following MDX:

SELECT {[Measures].[Order Count]}on 0

,{[Product].[Product Categories].[Subcategory].&[26]

,[Product].[Product Categories].[Subcategory].&[22]}

on 1 FROM [Adventure Works]

Compare these numbers with the MDX using Sub-Select above.

Where Clause how it works:

  • All members from all dimensions except the dimensions whose members are specified in the where clause are eligible for selection.
  • You cannot refer to members outside the Data Space hence you cannot use the <where member>’s dimension in the <columns set> or <rows set>.
  • No Visual total: It is important to note that the total (or ALL) member is the total sum of the whole dimension to which the <where member> belongs. Notice the effect of having a where clause here. Categories are pulled based on the members specified in the where clause. The measures are still rolled up by members selected on the axis. In other words, order count is for the product categories (including all sub categories).
  • Now to get the measures rolled up by the members specified in the where clause, we need to use the “existing” condition.

Before we look at EXISTING keyword, let’s first look at the plain MDX:

WITH MEMBER Measures.OrderCount as

SUM( {[Product].[Product Categories].[Subcategory].&[26]

    ,[Product].[Product Categories].[Subcategory].&[22]}

, [Measures].[Order Count])

SELECT {Measures.[OrderCount]}        on 0

        ,[Product].[Category].Members    on 1

FROM [Adventure Works]

WHERE{[Product].[Product Categories].[Subcategory].&[26]

,[Product].[Product Categories].[Subcategory].&[22]} );


Notice that 2573 is the sum of these two categories accessories and clothing as we saw in the previous MDX. We do get the same number for “All Products” and these two categories. This is because the context of where clause is not being used in evaluating calculated member. Also, while evaluating row axis, for each product category, sum returned same value.

Now, let’s put EXISTING keyword, and notice that the correct context is passed as is seen in the MDX below:

WITH MEMBER Measures.OrderCount as

SUM(existing {[Product].[Product Categories].[Subcategory].&[26]

            ,[Product].[Product Categories].[Subcategory].&[22]}

, [Measures].[Order Count])

SELECT {Measures.[OrderCount]}        on 0

        ,[Product].[Category].Members    on 1

FROM [Adventure Works]

WHERE ( {[Product].[Product Categories].[Subcategory].&[26]

,[Product].[Product Categories].[Subcategory].&[22]} );


  • If the WHERE clause does not contain one member but a <where set>, the “CurrentMember” will not work properly. You will need the help of the keyword “EXISTING” on the dimension/set used by the <where set> to ensure you stay within the context of the members defined by this <where set>.
  • May produce better performance
  • It is more readable
  • You might want to use the WHERE clause because it sets the context that is, it defines the “CurrentMember” and some specific MDX functions rely on this capacity (YTD, PP…).

Note:
When there are multiple members are specified in the where clause, it presents a popular MULTI-SELECT issue. I am going to blog about it very soon. Stay tuned…

Wrapping this up with a quick summary here:

SUB SELECT

WHERE CLAUSE

All members from all dimensions are eligible for selection.

<where member> context hence that specified member plus all dimensions EXCEPT the one to which the <where member> belongs are eligible for selection.

Limits the cube space

It does not limits the cube space.

This context is defined by the default members of each dimension.However, you can refer to member not within that Data Space with calculated members.

You cannot refer to members outside the Data Space hence you cannot use the <where member>’s dimension in the <columns set> or <rows set>.

Visual total: It is important to note that the total (or ALL) member will present you a total sum of the values for the members defined by the <subselect set>. This is a “Visual Total”.

No Visual total: It is important to note that the total (or ALL) member will present you the total sum of the whole dimension to which the <where member> belongs.

Aggregate function does not restricts its values to the subcube.

Now to get the measures rolled up by the members specified in the where clause, we need to use the “existing” condition.

When you get to multiple nested subselects they do have a slight performance advantage. Which is why you will see SSRS use both sub-selects and the WHERE clause. This way they get both the performance gain and get the context set, but it makes the query a real mess aesthetically.

You might want to use the WHERE clause because it sets the context that is, it defines the “CurrentMember” and some specific MDX functions rely on this capacity (YTD, PP…).

Sub cubes cannot contain with clauses, non empty clause or having clause.

If you are dealing with “raw” measures, they can often be used interchangeably

If you are dealing with “raw” measures, they can often be used interchangeably

Data cells not part of the subcube will not contribute to aggregates

More readable

Sub cubes do not restrict levels.
Where clause context specified in subcube is not passed on to outer query.
Comparing performance with these two is not considered here.

Please provide your feedback, if you have any. I can be reached at jagdish_malani@hotmail.com