A few days ago I published an article detailing a method to create regression line calculations in an Analysis Services OLAP cube using MDX. The article can be found by clicking here. It is the starting point for this article. My instructional video detailing the entire regression line and forecasting process can also be found at this link.
Regression lines are a great way to visualize trends over time. My previous article details how to view a regression line over time periods that have already happened. Regression lines are also frequently used to forecast values for future time periods. Basically, extending a regression line into the future is one method of forecasting. Below I will walk through one method to extend a linear regression line to future values using MDX in an SSAS OLAP cube.
I should qualify this article before walking through the methodology. First, the ideal way to do linear regressions and forecasting in Analysis Services is with Data Mining Models. Read more about Data Mining Models at this link. I do not know the full implications upon resource requirements for the MDX method I have outlined below, but I do know that it has processed and displayed on graphs and charts without any hitches during my own tests. In the real world SSAS Data Mining skill sets are far more scarce than SSAS MDX skill sets, so this method may be helpful for when a linear regression forecasting solution is desired without having to implement SSAS Data Mining Models.
As stated at the beginning of this article, I’ll pick up where I left off in my previous article about MDX linear regressions.
My first step to extend the linear regression for forecasting is to create a Sum measure on the Date table which I call [Measures].[Completed Days]. Basically, days that are in the past have a value of “1″ and days in the future have a value of “0.” Summing up the rows will give you the total number of completed days in the data set.
Next, I created a calculated measure using the MDX function LinRegSlope. The function calculates the slope of the regression line for a given data set. We’ll use the following for the calculation:
- [Measures].[Sales Amount] – A core Sum measure that calculates total sales for a data set
- [Measures].[Rank 12 Months] – A numeric value to assign intervals to months for the x-axis (see my previous article)
- [Rolling 12 Months] – A static set for the rolling 12 completed months (see my previous article)
The LinRegSlope MDX function is then used as follows to get the slope of the linear regression line for existing data:
CREATE MEMBER CURRENTCUBE.[Measures].[12 Month Slope] AS
LinRegSlope([Rolling 12 Months]
,[Measures].[Rank 12 Months]
FORMAT_STRING = “Standard”, VISIBLE= 1 , ASSOCIATED_MEASURE_GROUP = ‘Fact Internet Sales’;
Next is a calculated measures named [Measures].[Projection 12 Month Count]. When evaluating a future month, it will be used to determine how many other future months come before it. So if January is the last month with data, February will calculate to “1″ and March will be “2.” Since the Count also includes the last month with data, one unit will be subtracted in the expression:
CREATE MEMBER CURRENTCUBE.[Measures].[Projection 12 Month Count] AS
COUNT(TAIL([Rolling 12 Months],1).item(0):[Dim Date].[Month Year].CurrentMember) – 1
FORMAT_STRING = “0″,VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = ‘Fact Internet Sales’ ;
Next, the values for the future months can be evaluated by using the slope of the line and extending the regression line value from the last month with data. Notice that I added an IIF statement to make sure that the line only displays on months that don’t yet have 27 days worth of data:
CREATE MEMBER CURRENTCUBE.[Measures].[Proj Sales 12mo] AS
IIF([Measures].[Completed Days] >= 27
,(TAIL([Rolling 12 Months],1).item(0),[Measures].[12 Month Trend]) + ([Measures].[12 Month Slope] * [Measures].[Projection 12 Month Count])
FORMAT_STRING = “Currency”, VISIBLE = 1 , ASSOCIATED_MEASURE_GROUP = ‘Fact Internet Sales’ ;
Finally, a few other simple additions to the MDX. I created a static set for the report named [Rolling 12 plus Next 6], which is the rolling completed 12 months plus the next 6 months, and I also wrapped [Measures].[12 Month Trend] (from my previous regression article) in an IIF statement to make it NULL when not displaying completed months.
Now, the calculations can be used to display actual Sales Amounts, 12 Month Trend (regression line for existing data), and Proj Sales 12mo (Projected Sales based upon the last 12 months with data):
The calculations are all enabled for slice-and-dice capabilities and for multi-select with members that hook up to the Sales fact table. Notice that the “Mountain 200″ Sales Amount ($708,537 in June-2008) is greater than the first predicted month of data ($697,386 in July of 2008). Why is that? June-2008 showed growth that was much higher than historical averages, and the linear regression forecast uses 12 months of data. The predictions won’t be perfect, but can be helpful if historical trends hold true in the future.
So, what if a prediction line based upon the last 6 months of data was wanted along with values based upon the last 12 months of data? Well, following the same logic as the calculations for the 12 month linear regression, a 6 month linear regression forecast can also be created and displayed on the same chart or grid:
The method above can be used to create slice-and-dice linear regression forecast calculations based upon any duration of time from an SSAS OLAP cube using simple MDX.