Friday, March 30, 2012

Prediction Join to MDX with nested table

If your prediction join is to a SQL datasource, you can easily write a SQL query which returns a nested table like:

SELECT
Predict([Subcategories],2) as [Subcategories]
FROM
[SubcategoryAssociations]
NATURAL PREDICTION JOIN
(SELECT
(SELECT 'Road Bikes' AS Subcategory
UNION SELECT 'Jerseys' AS Subcategory
) AS Subcategories
) AS t

What about if your datasource is a cube? Is there some special MDX syntax similar to the SQL syntax above? Or do you have to utilize the SHAPE/APPEND syntax as follows?

SELECT t.*, $Cluster as ClusterName
FROM [MyModel]
PREDICTION JOIN
SHAPE {
select [Measures].[My Measure] on 0,
[My Dimension].[My Attribute].[My Attribute].Members on 1
from MyCube
}
APPEND (
{
select [Measures].[Another Measure] on 0,
NON EMPTY [My Dimension].[My Attribute].[My Attribute].Members
*[Product].[Product].[Product].Members on 1
from MyCube
}
RELATE [[My Dimension]].[My Attribute]].[My Attribute]].[MEMBER_CAPTION]]]
TO [[My Dimension]].[My Attribute]].[My Attribute]].[MEMBER_CAPTION]]]
)
AS [My Nested Table] AS t
ON [MyModel].[Product].[Product] = t.[My Nested Table].[[Product]].[Product]].[Product]].[MEMBER_CAPTION]]]

Typically, for building models on top of cubes, it is much easier to use the tools (BI Dev Studio). This way you can define your model directly on top of the cube and lots of optimizations occur. With such models, you can even use the MDXPredict function to get prediction results inside MDX queries over the source cube.

The DMX SELECT statement supports as input rowset-returning Analysis Services statements (MDX or DMX). That means that dataset-returning statements are not supported. But many MDX queries can be flattened. Have you tried something like SELECT FLATTENED in the MDX query?

|||

Bogdan-

Thanks for the reply. Yes, BIDS worked great for building the model. I've got it trained. Now I want to do a prediction based upon data from a cube. From what I can tell, you can't do prediction queries off a cube using BIDS because it only lets you predict off a relational table source. Right?

I've been researching the MDX function "Predict" which you mentioned. But I'm having terrible trouble finding example queries using that function...

Here's what I'm looking for... we've built a clustering model to cluster our stores. Some of the attributes are just Store dimension attributes... some are from a nested table (stats about the sales volume from each product category). We trained the model with all the stores. Now we want to extract the cluster name for each store and save that to a table. So is there a straight MDX query using the Predict MDX function which will get me the cluster name for every store? I was having trouble seeing how the Predict MDX function was able to know how to do a prediction join to the Store dimension.

As a side note, we could almost do a natural prediction join back to (select * from Model.CASES) except that we don't want the Store Key to influence the clustering model so we didn't add that as an input to the model. (And marking Store Key as Ignore excludes it from the Model.CASES resultset.)

By the way, we're only talking about a couple hundred rows, so the performance of the SHAPE/APPEND syntax below is fine for my purposes... just seeing if there's a more elegant way to do it.

Thanks!

|||

Oh... and to answer your other question about trying "SELECT FLATTENED"...

It's my understanding that "SELECT FLATTENED" is DMX. I'm not sure how to write an MDX statement that starts with "SELECT FLATTENED". And I'm struggling to see how using the DMX "SELECT FLATTENED" would help me. The output of DMX prediction query I used in the examples at the beginning of the thread work fine. I suppose I could flatten the output, but that wouldn't help me much. It's the input to the prediction join that I'm concerned with.

Or did you mean that you can use an MDX query which is written to be flat and use that as input to a prediction query which expects nested tables? I just tried that but may not have been using the right syntax cause I couldn't get it to work. Suggestions?

|||

You kind of need to do it brute force -we use the flattening semantics of MDX when executing the query, so you have to reshape using SHAPE.

There is a little trick to help you out in building the queries. You can use DMX to examine the flattened structure of the MDX query. Just issue a query like this:

SELECT t.* FROM AnyModel NATURAL PREDICTION JOIN <My MDX Query> AS t

then you will be able to see how the DMX processor sees your MDX results.

|||

Jamie-

That trick is helpful for seeing how it refers to the results of an MDX query.

But how do I take a flat MDX query and shape it so it can be consumed by a prediction join which expects a nested table. See the MDX example at the top of this post. Is that the only way (tying two separate MDX queries together with SHAPE/APPEND)?

|||

Yes your original SHAPE/APPEND would be the way to go.

The implementation of SHAPE in the AS engine will cause the MDX query results to be automatically returned in a flattened manner without requiring any explicit flattening syntax in the query itself (in fact, there is no such syntax - flattening is requested as either a command property in XMLA or by requesting a rowset interface in OLE DB)..

No comments:

Post a Comment