XLMiner provides eight types of charts to visually explore your data: Bar Charts, Line Charts, ScatterPlots, Boxplots, Histograms, Parallel Coordinates, ScatterPlot Matrices, and Variable Plots. On the XLMiner ribbon, from the Data Analysis tab, select Explore - Chart Wizard to use the Chart Wizard to create a chart. A description of each chart type follows.
The bar chart is one of the easiest plots to create. The best application for this type of chart is comparing an individual statistic (i.e., mean, count) across a group of variables. The bar height represents the statistic, while the bars represent the different groups. Following is an example of a bar chart.
Box Whisker Plot
A box plot graph summarizes a data set and is often used in exploratory data analysis. This type of graph illustrates the shape of the distribution (its central value), and the range of the data. The plot consists of the most extreme values in the data set (maximum and minimum values), the lower and upper quartiles, and the median.
Box plots are useful when a large number of observations in a data set are involved, when two or more data sets are being compared, or when a distribution is skewed. The box plot cannot be influenced by extreme values or outliers.
Following are statistical terms that should be reviewed.
Median: The median value in a data set is the value that appears in the middle of a sorted data set. If the data set has an even number of values, the median is the average of the two middle values in the data set.
Quartiles: By definition, quartiles separate a quarter of data points from the rest. This means that the first quartile is the value under which 25% of the data is found, and the third quartile is the value over which 25% of the data is found. (Note: this indicates that the second quartile is the median itself.)
First Quartile, Q1: The first quartile is the median of the lower-half of the data. If the number of data points is odd, the lower-half includes the median.
Third Quartile, Q3: The third quartile is the median of the upper-half of the data. If the number of data points is odd, the upper-half of the data includes the median, as in the following example.
Consider the following data set.
52, 57, 60, 63, 71, 72, 73, 76, 98, 110, 120
The data set has 11 values sorted in ascending order. The median is the middle value (i.e., 6th value in this case).
Median = 72
Q1 is the median of the first 6 values (i.e., the mean of 3rd and 4th values)
25th Percentile = 61.5
Q3 is the median of the last 6 values (i.e., the mean of the 8th and 9th values)
75th Percentile = 87
The mean is the average of all the data values ((52 + 57 + 60 + 63 + 71 + 72 + 73 + 76 + 98 + 110 + 120) / 11)
Mean = 77.45
Interquartile Range = 25.5 The Interquartile range is a useful measure of the amount of variation in a data set and is simply the 75th Percentile - 25th Percentile (87 - 61.5 = 25.5)
The box extends from Q1 to Q3 and includes Q2. The extreme points are included in the 'whiskers.' This means the box includes the middle one-half of the data. In XLMiner, the mean is denoted with a dotted line, and the median with a solid line. XLMiner completes the box plot by extending its 'whiskers' to the most extreme points, 52 and 120.
A Histogram, or Frequency Histogram, is a bar graph that depicts the range and scale of the observations on the x axis, and the number of data points (or frequency) of the various intervals on the y axis. These graphs are popular among statisticians. Although these graphs do not show the exact values of the data points, they provide a good idea about the spread and shape of the data.
Consider the percentages below from a college final exam.
82.5, 78.3, 76.2, 81.2, 72.3, 73.2, 76.3, 77.3, 78.2, 78.5, 75.6, 79.2, 78.3, 80.2, 76.4, 77.9, 75.8, 76.5, 77.3, 78.2
The folowing graph displays the value of a histogram. This plot efficiently illustrates the shape and size of the data set above. Note: XLMiner determines the number and size of the intervals when drawing the histogram.
A line chart is best suited for time series data sets. In the example below, the line chart plots the number of airline passengers from January 1949 to December 1960, where the x axis represent the number of months starting with January 1949 as 1.
A Parallel Coordinates plot consists of n number of vertical axes where n is the number of variables selected to be included in the plot. A line is drawn connecting the values for observations for each different variable (each different axis) creating a multivariate profile. These graphs can be useful for prediction and possible data binning. In addition, these graphs can expose clusters, outliers, and variable overlap. Axes can be reordered by simply dragging and moving the axis to the desired location. Following is an example of a Parallel Coordinates plot.
One of the most common plots is the Scatterplot. These graphs are used to compare the relationships between two variables, and are useful in identifying clusters and variable overlap.
A Scatterplot Matrix plot combines several scatterplots into one panel to see pairwise relationships between variables. Given a set of variables Var1, Var2, Var3, ...., Var N the scatterplot matrix plot contains all the pairwise scatterplots of the variables on a single page in a matrix format. The names of the variables are on the diagonals. In other words, if there are k variables, there will be k rows and k columns in the matrix, and the ith row and jth column will be the plot of Vari versus Varj.
The axes titles and the values of the variables appear at the edge of the respective row or column. The comparison of the variables and their interactions can be studied easily,which is why scatterplot matrix plots are becoming increasingly common in general purpose statistical software programs. Following is an example of a scatterplot matrix.
Variable plots simply plot the distribution for each selected variable. Following is an example of a variable plot.
Export to PowerBI
Use the Chart Wizard to export your data to Microsoft's Power BI. Used with Office 365, Microsoft's PowerBI is a cloud-based service that works with Excel to visualize data using various charts and reports. XLMiner V2016 introduces the ability to export a data set directly into PowerBI.
Export to Tableau
Tableau is a popular interactive software package that explores and analyzes data. Tableau can import data from a wide range of sources (i.e., Excel workbooks), and is often used in conjunction with Excel. Using XLMiner V2016, a single click converts the results of an optimization model into a Tableau Data Extract (.tde) file, and can be viewed in Tableau.