This example illustrates the XLMiner Association Rules method. On the XLMiner ribbon, from the Applying Your Model tab, select Help - Examples, then Forecasting/Data Mining Examples to open the Associations.xlsx example file. A portion of the data set is shown below.
Select a cell in the data set, then on the XLMiner Ribbon, from the Data Mining tab, select Associate - Association Rules to open the Association Rule dialog.
Since the data contained in the Associations.xlsx data set are all 0s and 1s, under Input Data Format, select Data in binary matrix format. This option should be selected if each column in the data represents a distinct item. XLMiner treats the data as a matrix of two entities, zeros and nonzeros. All nonzeros are treated as 1s. A 0 signifies that the item is absent in that transaction, and a 1 signifies the item is present. When each row of data consists of item codes or names that are present in that transaction, select Data in item list.
Enter 100 for the Minimum support (# transactions). This option specifies the minimum number of transactions in which a particular item set must appear to qualify for inclusion in an association rule.
Enter 90 for Minimum confidence (%). This option specifies the minimum confidence threshold for rule generation. If A is the set of Antecedents and C the set of Consequents, then only those A =>C (Antecedent implies Consequent) rules will qualify, for which the ratio (support of A U C) / (support of A) at least equals this percentage.
Click OK. The output worksheet, AssocRules_Output, is inserted immediately to the right of the Assoc_binary worksheet.
Rule 2 indicates that if a Youth book, a Reference book, and a Geography book are purchased, then with 90.35% confidence a Child book will also be purchased. The Support for A column indicates that the rule has the support of 114 transactions, meaning that 114 people bought a Youth book, Reference book, and a Geography book. The Support for C column indicates the number of transactions involving the purchase of Child books. The Support for A & C columns indicate the number of transactions where a Youth book, Reference book, Geography book, and Child book were purchased.
The Lift Ratio indicates how likely a transaction will be found where all four book types (Youth, Reference, Geography, and Child) are purchased, as compared to the entire population of transactions. In other words, the Lift Ratio is the Confidence divided by the value for Support for C. For Rule 2, with a confidence of 90.35%, support is calculated as 846/2000 = .423. The Lift Ratio is calculated as .9035/.423 or 2.136. Given support at 90.35% and a Lift Ratio of 2.136, this rule can be considered useful.