This example uses a partitioned data set to illustrate the use of the Manual Network Architecture selection. XLMiner provides four options when creating a Neural Network classifier: Boosting, Bagging (ensemble methods), Automatic, and Manual.
Select a cell on the Data_Partition worksheet, then on the XLMiner ribbon, from the Data Mining tab, select Classify - Neural Network - Manual Network to open the Neural Network Classification (Manual Arch.) - Step 1 of 3 dialog.
At Output Variable, select Type, and from the Selected Variables list, select all remaining variables. Since the Output Variable contains three classes (A, B, and C), the options for Classes in the Output Variable are disabled. The options under Classes in the Output Variable are only enabled when the number of classes is equal to 2.
Click Next to advance to the Step 2 of 3 dialog.
The option Normalize input data is selected by default. Normalizing the data (subtracting the mean and dividing by the standard deviation) is important to ensure that the distance measure accords equal weight to each variable -- without normalization, the variable with the largest scale would dominate the measure.
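This normalization can be sketched as follows (the values are illustrative, not from the example data set): each variable is standardized to mean 0 and standard deviation 1.

```python
import statistics

# Hypothetical values for one input variable (not from the example data set).
values = [1.0, 2.0, 3.0, 4.0]

mean = statistics.mean(values)
std = statistics.pstdev(values)  # population standard deviation

# Subtract the mean and divide by the standard deviation.
normalized = [(v - mean) / std for v in values]
# The normalized variable has mean 0 and standard deviation 1, so no
# variable dominates a distance measure by virtue of its scale alone.
```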
If an integer value appears for Neuron weight initialization seed, XLMiner uses this value to set the neuron weight random number seed. Setting the random number seed to a nonzero value ensures that the same sequence of random numbers is used each time the neuron weights are calculated. The default value is 12345. If left blank, the random number generator is initialized from the system clock, so the sequence of random numbers is different in each calculation. If you need the results from successive runs of the algorithm to be strictly comparable, enter a value for the seed.
At # Hidden Layers, keep the default setting of 1. Up to four hidden layers can be specified for this option. At # Nodes Per Layer, keep the default setting of 12. (Since # hidden layers (max 4) is set to 1, only the first text box is enabled.)
At # Epochs, keep the default setting of 30. An epoch is one sweep through all records in the Training Set.
For Gradient Descent Step Size, keep the default setting of 0.1. This is the multiplying factor for the error correction during backpropagation; it is roughly equivalent to the learning rate for the neural network. A low value produces slow learning; a high value produces rapid but possibly erratic learning. Values for the step size typically range from 0.1 to 0.9.
For Weight Change Momentum, keep the default setting of 0.6. In each new round of error correction, some memory of the prior correction is retained so that an outlier does not spoil accumulated learning.
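The interplay of step size and momentum can be sketched as a single-weight update (the names and the exact update rule here are illustrative, not XLMiner's internal implementation):

```python
STEP_SIZE = 0.1  # Gradient Descent Step Size (learning rate)
MOMENTUM = 0.6   # Weight Change Momentum

def update_weight(w, grad, prev_delta):
    """One error-correction step: the new change blends the current
    gradient with a fraction of the previous change, so a single
    outlier record does not spoil accumulated learning."""
    delta = -STEP_SIZE * grad + MOMENTUM * prev_delta
    return w + delta, delta

w, delta = 0.5, 0.0
w, delta = update_weight(w, grad=0.2, prev_delta=delta)  # w is now 0.48
```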
For Error tolerance, keep the default setting of 0.01. The error in a particular iteration is backpropagated only if it is greater than the error tolerance. Typically, error tolerance is a small value in the range from 0 to 1.
For Weight Decay, keep the default setting of 0. To prevent over-fitting of the network on the Training Set, specify a weight decay to penalize the weights in each iteration. Each calculated weight is multiplied by (1 - decay).
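The decay can be sketched like this (illustrative code, using a nonzero decay for demonstration; the example itself keeps the default of 0):

```python
STEP_SIZE = 0.1
DECAY = 0.01  # nonzero only for illustration; the example uses 0

def decayed_update(w, grad):
    # After the usual error correction, the weight is multiplied by
    # (1 - decay), shrinking it toward zero to discourage over-fitting.
    return (w - STEP_SIZE * grad) * (1 - DECAY)
```

With a decay of 0 this reduces to the plain gradient-descent update.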
Nodes in the hidden layer receive input from the input layer. The output of each hidden node is a weighted sum of the input values, computed with weights that are initially set at random values; as the network learns, these weights are adjusted. The weighted sum is then passed through a transfer function, or activation function, to produce the hidden node's output. Select Standard (default) to use a logistic function for the transfer function, with a range of 0 to 1. This function has a squashing effect on very small or very large values, but is almost linear in the range where the value of the function is between 0.1 and 0.9. Select Symmetric to use the tanh (hyperbolic tangent) function for the transfer function, with a range of -1 to 1. If more than one hidden layer exists, this function is used for all layers. Keep the default selection, Standard.
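The two transfer functions, and their use on a hidden node's weighted sum, can be sketched as follows (the helper names are illustrative):

```python
import math

def standard(x):
    # Logistic function: output in (0, 1), nearly linear for mid-range x.
    return 1.0 / (1.0 + math.exp(-x))

def symmetric(x):
    # Hyperbolic tangent: output in (-1, 1).
    return math.tanh(x)

def hidden_output(weights, inputs, bias, transfer=standard):
    # A hidden node applies the transfer function to the weighted sum
    # of its inputs.
    return transfer(sum(w * v for w, v in zip(weights, inputs)) + bias)
```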
The output layer is computed with a transfer function in the same way as the hidden layer. Select Standard (default) to use a logistic function for the transfer function, with a range of 0 to 1. Select Symmetric to use the tanh (hyperbolic tangent) function, with a range of -1 to 1. Select Softmax to use a generalization of the logistic function to more than two classes; the outputs of the output-layer nodes then sum to 1 and can be interpreted as class probabilities.
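The softmax generalization can be sketched as follows (illustrative code): each output node's value is exponentiated and divided by the sum over all output nodes, so the results behave like class probabilities.

```python
import math

def softmax(scores):
    # Subtract the max before exponentiating for numerical stability;
    # this leaves the result unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Three output nodes (e.g., classes A, B, and C): the outputs sum to 1.
probs = softmax([1.0, 2.0, 3.0])
```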
XLMiner V2015 provides the ability to partition a data set from within a classification or prediction method by selecting Partitioning Options on the Step 2 of 3 dialog. If this option is selected, XLMiner partitions the data set immediately before running the method. If partitioning has already occurred on the data set, this option is disabled. For more information on partitioning, see the Data Mining Partition section.
Click Next to advance to the Step 3 of 3 dialog.
Under Score Training Data and Score Validation Data, Summary report is selected by default. Under both Score Training Data and Score Validation Data, select Detailed Report. Lift Charts are disabled when the number of classes is greater than two.
Since a test partition was not created, the Score Test Data options are disabled. For information on how to create a test partition, see the Data Mining Partition section. For more information on the Score New Data options, see the Score New Data section.
Click Finish to produce the output.
Click the NNC_Output worksheet to view the Output Navigator.
Scroll down to the Classification Matrices to see how the Neural Network Classification algorithm performed. The algorithm finished with 0 errors in the Training Set, for an overall error of 0%, and with 1 error in the Validation Set, for an overall error of 2.78%.
On the Output Navigator, click the Training Log link to display the Neural Network Training Log. This log displays the Sum of Squared errors and Misclassification errors for each epoch or iteration of the Neural Network. Thirty epochs (iterations) were performed.
During an epoch, each training record is fed forward through the network and classified. The error is calculated and backpropagated to correct the weights, which are continuously adjusted during the epoch. The misclassification error is computed as the records pass through the network; this table does not report the misclassification error after the final weight adjustment. Scoring of the training data is performed using the final weights, so the training classification error may not exactly match the last epoch error in the Epoch log.
XLMiner provides intermediate information produced during the last pass through the network. Scroll down on the NNC_Output worksheet to the Inter-Layer Connection Weights table.
A key element in a neural network is the weights for the connections between nodes. In this example, we chose one hidden layer with 12 nodes. XLMiner's output includes a section with the final values for the weights between the input layer and the hidden layer, between hidden layers, and between the last hidden layer and the output layer. This information is useful for viewing the inside of the neural network; however, it is unlikely to be of use to the data analyst.
Click the NNC_TrainingScore worksheet tab to view the classifications assigned to each record in the Training Set. The class with the largest assigned probability becomes the assigned class. Here you can compare the Predicted Class with the Actual Class. Highlighted records indicate the record was misclassified.
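The assignment rule can be sketched as taking the class with the largest posterior probability (the probabilities below are hypothetical, for one record):

```python
# Hypothetical posterior probabilities for one record (classes A, B, C).
probs = {"A": 0.15, "B": 0.70, "C": 0.15}

# The class with the largest assigned probability becomes the assigned class.
predicted_class = max(probs, key=probs.get)
```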
Click the NNC_ValidationScore worksheet tab to view the classifications assigned to each record in the Validation Set. Again, the class with the largest assigned probability becomes the assigned class.
See the Scoring New Data section for information on the Stored Model Sheet, NNC_Stored1.
With this particular data set, we have seen that the best fit to the data came from the ensemble methods.