This example illustrates how to use Analytic Solver Data Science's Exponential Smoothing technique to uncover trends in a time series. On the Data Science ribbon, from the Applying Your Model tab, select Help - Examples, then select Forecasting/Data Mining Examples, and open the example data set, Airpass.xlsx. This data set contains the monthly totals of international airline passengers from 1949-1960. After the example data set opens, click a cell in the data set, then on the Data Science ribbon, from the Time Series tab, select Partition to open the Time Series Partition Data dialog.

At Time Variable, select Month, and in the Variables in the Partition Data list, select Passengers. Click OK to partition the data into Training and Validation sets. (Partitioning is optional. Smoothing techniques may be run on full, unpartitioned data sets.)

Time Series Partition Data Dialog 
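
As a rough illustration of what a time series partition does, the following Python sketch splits an ordered series into a leading training block and a trailing validation block, without shuffling. The function name partition_time_series, the 60% training share, and the sample values are assumptions made for illustration; they are not taken from Analytic Solver Data Science.

```python
def partition_time_series(values, train_fraction=0.6):
    """Split an ordered series into training and validation parts.

    The split is sequential (no shuffling), so every validation
    observation comes later in time than every training observation.
    The default train_fraction is an assumed value.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    n_train = int(round(len(values) * train_fraction))
    return values[:n_train], values[n_train:]


# Made-up values, used only to show the call.
series = [10, 12, 13, 15, 14, 16, 18, 17, 19, 21]
train, valid = partition_time_series(series, train_fraction=0.6)
print(len(train), len(valid))
```

Keeping the observations in time order matters here, because a smoothing model is fitted on the earlier observations and then judged on the later ones.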

 

Click the Data_PartitionTS worksheet, then on the Data Science ribbon, from the Time Series tab, select Smoothing - Exponential to open the Exponential Smoothing dialog.

Month has already been selected as the Time variable. Select Passengers as the Selected variable, and under Output Options, select Produce forecast on validation.

Exponential Smoothing Dialog 

Click OK to apply the smoothing technique. The worksheets ExponentialOutput and Exponential_Stored are inserted immediately to the right of the Data_PartitionTS worksheet. For more information about the Exponential_Stored worksheet, see the Tools - Scoring New Data section.

Click the ExponentialOutput worksheet. The Time Plot of Actual Vs. Forecast (Training Data) chart shows that the Exponential Smoothing technique does not result in a good fit, because the model does not capture the seasonality in the data set. As a result, the forecasts for the summer months, when the number of airline passengers is typically high, are too low (i.e., under-forecasted), and the forecasts for months with low passenger numbers are too high. Consequently, an exponential smoothing forecast should never be used when the data set includes seasonality. An alternative would be to fit a regression model to the data, and then apply this technique to the residuals.

  Time Plot of Actual Vs Forecast (Training Data) 
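
The under- and over-forecasting pattern can be reproduced on synthetic data. The following Python sketch applies a textbook simple exponential smoothing recursion to a made-up series with a 12-period cycle. The function name, the choice to initialize with the first observation, and the series itself are assumptions for illustration; the sketch does not reproduce Analytic Solver Data Science's internal calculations.

```python
import math


def exponential_smoothing(values, alpha):
    """Return one-step-ahead forecasts; forecasts[t] uses only values[:t].

    Starting from the first observation is an assumed initialization.
    """
    forecasts = [values[0]]
    for actual in values[:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


# Synthetic series: constant level of 100 plus a 12-period seasonal swing.
series = [100 + 20 * math.sin(2 * math.pi * t / 12) for t in range(48)]
forecasts = exponential_smoothing(series, alpha=0.2)

# Actual minus forecast at the seasonal peaks and troughs (first cycle skipped).
peak_errors = [series[t] - forecasts[t] for t in range(12, 48) if t % 12 == 3]
trough_errors = [series[t] - forecasts[t] for t in range(12, 48) if t % 12 == 9]
print(peak_errors)
print(trough_errors)
```

Because each forecast is a weighted average of earlier values, it cannot reach the seasonal peaks or troughs, so the peak errors are positive (forecast too low) and the trough errors are negative (forecast too high).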

The following example does not include seasonality.

On the Data Science ribbon, from the Applying Your Model tab, select Help - Examples, then select Forecasting/Data Mining Examples, and open the example data set Income.xlsx. This data set contains the average income of taxpayers by state. First, partition the data set into Training and Validation Sets, using Year as the Time Variable and CA as the variable in the Variables in the Partition Data list.

Click OK to accept the partitioning defaults and create the Training and Validation Sets. 

Time Series Partition Data Dialog 

Select the Data_PartitionTS worksheet, then on the Data Science ribbon, from the Time Series tab, select Smoothing - Exponential to open the Exponential Smoothing dialog.

Year has automatically been selected as the Time Variable. Select CA as the Selected variable, and under Output Options, select Produce forecast on validation.

The smoothing parameter (Alpha) determines the magnitude of the weights assigned to the observations. A value close to 1 assigns the largest weights to the most recent observations and the smallest weights to the earliest observations, so the forecast reacts quickly to recent changes. A value close to 0 makes the weights decay slowly, so the earlier observations (and the starting level) retain most of the influence, and the latest observations are assigned small weights. As a result, the choice of Alpha depends on how much influence the most recent observations should have on the model.
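
The effect of Alpha on the weights can be made concrete with the textbook recursion for simple exponential smoothing, in which the observation k periods back receives weight Alpha x (1 - Alpha)^k and the starting level keeps the remainder. The Python sketch below assumes that textbook form; the function name smoothing_weights and the Alpha values shown are chosen for illustration and are not taken from Analytic Solver Data Science.

```python
def smoothing_weights(alpha, n_obs):
    """Weights implied by simple exponential smoothing after n_obs observations.

    Returns (weights, initial_weight): weights[k] applies to the observation
    k periods back (k = 0 is the most recent), and initial_weight is the
    share that remains on the starting level.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    weights = [alpha * (1 - alpha) ** k for k in range(n_obs)]
    initial_weight = (1 - alpha) ** n_obs
    return weights, initial_weight


for alpha in (0.9, 0.2, 0.05):
    weights, initial_weight = smoothing_weights(alpha, n_obs=5)
    print(alpha, [round(w, 4) for w in weights], round(initial_weight, 4))
```

With an Alpha near 1, almost all of the weight sits on the latest observation. With an Alpha near 0, each individual observation receives little weight and most of the weight stays on the starting level, that is, on the earliest information.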

Analytic Solver Data Science includes the Optimize feature to choose the Alpha parameter value that results in the minimum residual mean squared error. It is recommended that this feature be used carefully, as it can often lead to a model that is over-fitted to the Training Set. An overfit model rarely exhibits high predictive accuracy in the Validation Set.
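
The idea of choosing Alpha by minimizing the training mean squared error can be sketched as follows. The routine that Analytic Solver Data Science actually uses is not described here, so this Python sketch swaps in a plain grid search over candidate values; the function names, the candidate grid, and the synthetic trending series are all assumptions for illustration.

```python
def one_step_forecasts(values, alpha):
    """One-step-ahead forecasts, initialized with the first observation (assumed)."""
    forecasts = [values[0]]
    for actual in values[:-1]:
        forecasts.append(alpha * actual + (1 - alpha) * forecasts[-1])
    return forecasts


def training_mse(values, alpha):
    """Mean squared one-step-ahead error over the training series."""
    forecasts = one_step_forecasts(values, alpha)
    return sum((a - f) ** 2 for a, f in zip(values, forecasts)) / len(values)


def optimize_alpha(values, candidates):
    """Grid search: return the candidate Alpha with the smallest training MSE."""
    return min(candidates, key=lambda alpha: training_mse(values, alpha))


# Synthetic, steadily rising series (made-up values).
series = [100.0 + 5.0 * t for t in range(20)]
candidates = [i / 100 for i in range(1, 100)]
best = optimize_alpha(series, candidates)
print(best, training_mse(series, best), training_mse(series, 0.2))
```

On a steadily trending series such as this one, the search settles on the largest candidate, because a high Alpha lets the forecast follow the trend closely. Since the selection looks only at training error, it is worth checking the chosen value against the Validation Set as well.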

Exponential Smoothing Dialog 

Click OK to accept the default Alpha value of 0.2. The worksheets ExponentialOutput and Exponential_Stored are inserted to the right of the Data_PartitionTS worksheet. For more information on the Exponential_Stored worksheet, see the Applying Your Model - Scoring New Data section.

Click the ExponentialOutput worksheet and scroll down. The Training Error Measures and Validation Error Measures tables show a fitted model with an MSE of 166,936.72 for the Training Set, and an MSE of 9,182,228.9 for the Validation Set. These large values indicate that the model does not fit the data well.

  Training Error Measures

  Validation Error Measures
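
To show how an error measure such as MSE is calculated, and why the validation figure can be much larger than the training figure on a trending series, the Python sketch below computes both on made-up values. It assumes the textbook behavior in which the validation forecast is held flat at the last smoothed level; whether Analytic Solver Data Science computes its validation forecast in exactly this way is not stated here, and all names and numbers are illustrative.

```python
def mean_squared_error(actuals, forecasts):
    """Average of the squared differences between actuals and forecasts."""
    if len(actuals) != len(forecasts) or not actuals:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((a - f) ** 2 for a, f in zip(actuals, forecasts)) / len(actuals)


def final_level(training, alpha):
    """Smoothed level after the last training observation.

    Starting from the first observation is an assumed initialization.
    """
    level = training[0]
    for actual in training[1:]:
        level = alpha * actual + (1 - alpha) * level
    return level


# Made-up rising series, split in time order.
training = [50.0, 52.0, 54.0, 56.0, 58.0, 60.0]
validation = [62.0, 64.0, 66.0]

level = final_level(training, alpha=0.2)
flat_forecasts = [level] * len(validation)  # assumed flat validation forecast
validation_mse = mean_squared_error(validation, flat_forecasts)
print(level, validation_mse)
```

With a low Alpha, the final level lags behind the last training value, and a flat forecast falls further behind as the series keeps rising, which inflates the validation MSE.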

Click back to the Data_PartitionTS worksheet, then on the Data Science ribbon, from the Time Series tab, select Smoothing - Exponential to run the technique a second time. Again, select CA as the Selected variable, and under Output Options, select Produce forecast on validation. Under Parameters - Weights, select Optimize, then click OK.

Exponential Smoothing Dialog 

Click the ExponentialOutput1 worksheet. Analytic Solver Data Science used an Alpha of 0.9976, which resulted in an MSE of 0.12655 for the Training Set and an MSE of 2735.153 for the Validation Set. Both values are much smaller than those obtained with an Alpha of 0.2. (In this example, using the Optimize feature results in a much better model.)

  Inputs

 

  Time Plot of Actual Vs Forecast (Training Data)

  Training Error Measures   

  Time Plot of Actual Vs Forecast (Validation Data)

  Validation Error Measures