Introduction

XLMiner supports all facets of the data mining process, including data partition, classification, prediction, and association. The third stage, prediction, estimates the value of a response variable from one or more predictor variables. XLMiner offers four prediction methods: multiple linear regression, k-nearest neighbors, regression tree, and neural network. Each method has its own characteristics, and the choice among them is typically determined by the nature of the variables involved.

 

How to Access Prediction Methods in Excel

  1. Launch Excel.
  2. In the toolbar, click XLMINER PLATFORM.
  3. In the ribbon's Data Mining section, click Predict.
  4. In the drop-down menu, select a prediction method.


Prediction Methods

 

Multiple Linear Regression

This method fits a linear model to a dataset, either to predict the response variable from one or more predictor variables or to study the relationship between the response and the predictors. One example is relating student test scores to demographic information such as income and parents' education.
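
As an illustration outside XLMiner, the idea can be sketched in a few lines of plain Python. This is a minimal sketch, not XLMiner's implementation: the function names, the normal-equations approach, and the toy test-score data are all assumptions made up for the example.

```python
def fit_linear_regression(X, y):
    """Return [intercept, b1, ..., bp] that minimize the squared error."""
    # Prepend a constant 1.0 to every row so the first coefficient is the intercept.
    rows = [[1.0] + [float(v) for v in r] for r in X]
    p = len(rows[0])
    # Normal equations: (X^T X) b = X^T y, stored as an augmented matrix.
    M = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; no unique solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(p):
            if r != col:
                factor = M[r][col] / M[col][col]
                M[r] = [a - factor * b for a, b in zip(M[r], M[col])]
    return [M[i][p] / M[i][i] for i in range(p)]


def predict_linear(coeffs, x):
    """Apply the fitted model to one record of predictor values."""
    return coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], x))


# Hypothetical data: [household income (thousands), parents' years of education].
mlr_X = [[30, 12], [45, 16], [60, 12], [80, 18], [100, 14]]
# Made-up test scores generated as 50 + 0.5 * income + 2 * education.
mlr_y = [89.0, 104.5, 104.0, 126.0, 128.0]

mlr_coeffs = fit_linear_regression(mlr_X, mlr_y)
print(mlr_coeffs)
print(predict_linear(mlr_coeffs, [70, 15]))
```

Because the toy scores were generated from an exact linear rule, the fitted coefficients recover that rule; real data would leave residual error.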

 

k-Nearest Neighbors

Like the classification method of the same name, this prediction method relies on similarity between “neighbors”. For each member of the validation set, it uses a Euclidean distance measure to find the k most similar observations in the training dataset. The response values of those k neighbors are then used to predict the response for that member, typically by averaging them.
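
The neighbor lookup can be sketched in plain Python as follows. This is an illustrative sketch only, assuming the prediction is the unweighted mean of the k nearest responses; the function name and the toy data are hypothetical and not taken from XLMiner.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Predict a numeric response as the mean of the k nearest training responses."""
    # Pair each training response with its Euclidean distance to the query record.
    by_distance = sorted(
        (math.dist(row, query), target) for row, target in zip(train_X, train_y)
    )
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical training data: two predictors per record and a numeric response.
knn_X = [[1, 1], [2, 2], [3, 3], [8, 8], [9, 9]]
knn_y = [10.0, 20.0, 30.0, 80.0, 90.0]

print(knn_predict(knn_X, knn_y, [2, 2], k=3))      # 20.0
print(knn_predict(knn_X, knn_y, [8.5, 8.5], k=2))  # 85.0
```

In practice the predictors are usually rescaled first, since Euclidean distance is dominated by whichever variable has the largest range.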

 

Regression Tree

A regression tree may be considered a variant of a decision tree, designed to approximate real-valued functions rather than to classify. As with all regression techniques, XLMiner assumes the existence of a single output (response) variable and one or more input (predictor) variables. The output variable is numerical. The general regression tree building methodology allows input variables to be a mixture of continuous and categorical variables. In the resulting tree, each decision node contains a test on the value of some input variable, and the terminal nodes contain the predicted output variable values.
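
A small sketch of this structure in plain Python is shown here. It rests on several assumptions that are not from XLMiner: inputs are numeric only (categorical inputs are not handled), splits are chosen to minimize the sum of squared errors, and the names build_tree, predict_tree, min_size, and max_depth are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def sse(values):
    """Sum of squared deviations from the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(X, y, min_size=2, max_depth=3, depth=0):
    """Recursively split the data; terminal nodes store the mean response."""
    if depth >= max_depth or len(y) < 2 * min_size or len(set(y)) == 1:
        return {"value": mean(y)}
    best = None
    for feature in range(len(X[0])):
        for threshold in sorted({row[feature] for row in X}):
            left = [i for i, row in enumerate(X) if row[feature] <= threshold]
            right = [i for i, row in enumerate(X) if row[feature] > threshold]
            if len(left) < min_size or len(right) < min_size:
                continue
            cost = sse([y[i] for i in left]) + sse([y[i] for i in right])
            if best is None or cost < best[0]:
                best = (cost, feature, threshold, left, right)
    if best is None:
        return {"value": mean(y)}
    _, feature, threshold, left, right = best
    return {
        # Decision node: a test on one input variable's value.
        "feature": feature,
        "threshold": threshold,
        "left": build_tree([X[i] for i in left], [y[i] for i in left],
                           min_size, max_depth, depth + 1),
        "right": build_tree([X[i] for i in right], [y[i] for i in right],
                            min_size, max_depth, depth + 1),
    }


def predict_tree(tree, row):
    """Follow the tests from the root down to a terminal node."""
    while "value" not in tree:
        branch = "left" if row[tree["feature"]] <= tree["threshold"] else "right"
        tree = tree[branch]
    return tree["value"]


# Hypothetical data: one numeric input, response jumps between two levels.
tree_X = [[1], [2], [3], [10], [11], [12]]
tree_y = [5.0, 5.0, 5.0, 20.0, 20.0, 20.0]

tree = build_tree(tree_X, tree_y)
print(predict_tree(tree, [2]), predict_tree(tree, [11]))
```

On this toy data the root node tests whether the input is at most 3, and each branch ends in a terminal node holding the mean response of its records.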

 

Neural Network

Artificial neural networks are based on the operation and structure of the human brain. These networks process one record at a time and “learn” by comparing their prediction for the record (which at the beginning is largely arbitrary) with the known actual value of the response variable. Errors from the initial predictions are fed back into the network and used to adjust the network's weights before the next pass over the records. This process is repeated for many iterations.
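
The feedback loop can be illustrated with the simplest possible case: a single linear neuron trained one record at a time. This is a deliberately reduced sketch, not XLMiner's network, which would normally include hidden layers and nonlinear activations; the function name, learning rate, epoch count, and data are all assumptions chosen for the example.

```python
import random


def train_neuron(records, targets, learning_rate=0.1, epochs=1000, seed=0):
    """Train one linear neuron by feeding each record's error back into its weights."""
    rng = random.Random(seed)
    # Initial weights are arbitrary, so the first predictions are too.
    weights = [rng.uniform(-0.5, 0.5) for _ in records[0]]
    bias = rng.uniform(-0.5, 0.5)
    for _ in range(epochs):
        for x, actual in zip(records, targets):
            predicted = bias + sum(w * v for w, v in zip(weights, x))
            error = actual - predicted
            # Nudge each weight in the direction that reduces this record's error.
            weights = [w + learning_rate * error * v for w, v in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical records; targets follow 2 * x1 - 1 * x2 + 0.5 exactly.
nn_records = [[0, 0], [0, 1], [1, 0], [1, 1]]
nn_targets = [0.5, -0.5, 2.5, 1.5]

nn_weights, nn_bias = train_neuron(nn_records, nn_targets)
print(nn_weights, nn_bias)
```

After enough passes the weights settle close to the values that generated the targets, which is the "learning" the paragraph above describes.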

 

Prediction Methods Summary

  • Prediction is a technique performed on a dataset either to predict the response variable value from one or more predictor variables or to study the relationship between the response variable and the predictor variables.
  • XLMiner supports the use of four prediction methods: multiple linear regression, k-nearest neighbors, regression tree, and neural network.

 

Resources