Wednesday, March 28, 2012

New to MDX - Querying Dimensions

This seems like a basic need that I don't understand.

I would like to populate cells with "aggregated" attributes.

For example, I have a Store with measures such as Sales and Expenses. I'd like to include in the report the Location and Size, both of which are dimensions. I want one record per store. The size of the store may change, so if I add Size to the axis, then I may get duplicate rows. Instead, I want the most recent store size to populate the cell.

I would like the query to return Store, Sales, Expenses, Location and Size; the latter two being attributes I don't want to slice my cube with.

What is the best way to accomplish this in MDX?

Thanks.

So, Store is a Type 2 dimension. Do location and size describe the store? If so, these are attributes within your Store dimension, right?

If these are all part of the same dimension, then you need some property from which you can determine which is the most recent version of a store. In your cube, how would you determine that two stores reference the same object and that one is a newer version of the other?

Bryan

|||

If location and size are completely separate dimensions then you are going to have trouble. The only way that AS can figure out that a particular store has a particular size is to look for Sales or Expense records at the intersection of those two dimensions. I would think that Size and Location should be modelled as attributes of the store dimension (as type 2 changing attributes) and it is probably going to make things easier if you create a "Current Size" attribute.

|||

Thanks for the replies.

Actually Stores is not a slowly changing dimension. I avoided a Snowflake Schema to keep things simple/fast.

Size and location are materialized dimensions.

I don't mind revising the Schema if necessary, but I want to make sure that if I make such a low-level change to the design that it's the right choice.

I want to get the size of the store based on the latest transaction date (it's the logic for the report....). With my limited knowledge of MDX, this is what I came up with:

DistinctCount(Descendants([Size],1))

This gives me the count of the number of sizes for the store along the time dimension in my query for the given period. The problem is that Count (as opposed to DistinctCount) returns the total number of unique sizes. Only distinct count returns the correct number of intersecting sizes. I can only get the count. I can't get the members. The following variations don't work:

Count(Distinct(Descendants([Size],1)))

Count(NonEmpty(Descendants([Size],1)))

Count(Descendants(NonEmpty([Size]),1))

Note that I'm using Store, Size, and Location as an analogy. I can't give too many actual details about my schema as a matter of confidentiality.

Any ideas on how to accomplish my goal?

Thanks.

|||

It can be done with independant dimensions, but the perfoermance might not be that good. Basically, to find what you are after SSAS would have to look at all the possible combinations of Store, Size and Date and then find the last non-empty one.

Off the top of my head I think something like the following might work.

Generate([Store].[Store].[Store]

,TAIL(

NONEMPTY(

CROSSJOIN(

[Size].[Size].[Size]

,[Date].[Date].[Date]

)

,[Measures].[Sales]

)

,1

)

)

You don't necessarily need a snowflake to implement a slowly changing dimension. You should also note that snowflake and star don't really affect the performance for MOLAP based cubes. They do affect the processing time and in SSAS 2005, snowflake will usually be faster. It's hard to say if separate dimensions or attributes of one dimension will be faster without knowing more specifics, but you will often see performance improvments by reducing the number of distinct dimensions. In the Store analogy I would say that the sizes would not change all that often, so I would expect that implementing it as an SCD would result in a performance increase.

No comments:

Post a Comment