library(here)
library(arules)
library(arulesViz)
library(tidyverse)
library(reshape2)
library(ggplot2)
library(plotly)
library(dplyr)
library(DT)
data <- read_csv(here("data", "basketAnalysis.csv")) %>%
  janitor::clean_names()
DT::datatable(data)
Introduction
Market basket analysis, also called frequent itemset analysis, association analysis, or association rule learning, is one of the most popular data mining methods.
It is typically applied to the analysis of consumer buying trends in supermarkets and chain stores because it models the many-to-many relationship between items and baskets (also called transactions).
Using data from a grocery store, we will look for patterns and/or correlations between the items purchased.
Data
Our dataset has 999 observations or transactions and 17 variables including the index variable representing the transaction or basket number.
Data cleaning
data <- data %>%
  rename(transaction = x1) %>%
  mutate(transaction = transaction + 1)
DT::datatable(data)
Data Exploration
skimr::skim(data)
Name | data |
Number of rows | 999 |
Number of columns | 17 |
_______________________ | |
Column type frequency: | |
logical | 16 |
numeric | 1 |
________________________ | |
Group variables | None |
Variable type: logical
skim_variable | n_missing | complete_rate | mean | count |
---|---|---|---|---|
apple | 0 | 1 | 0.38 | FAL: 616, TRU: 383 |
bread | 0 | 1 | 0.38 | FAL: 615, TRU: 384 |
butter | 0 | 1 | 0.42 | FAL: 579, TRU: 420 |
cheese | 0 | 1 | 0.40 | FAL: 595, TRU: 404 |
corn | 0 | 1 | 0.41 | FAL: 592, TRU: 407 |
dill | 0 | 1 | 0.40 | FAL: 601, TRU: 398 |
eggs | 0 | 1 | 0.38 | FAL: 615, TRU: 384 |
ice_cream | 0 | 1 | 0.41 | FAL: 589, TRU: 410 |
kidney_beans | 0 | 1 | 0.41 | FAL: 591, TRU: 408 |
milk | 0 | 1 | 0.41 | FAL: 594, TRU: 405 |
nutmeg | 0 | 1 | 0.40 | FAL: 598, TRU: 401 |
onion | 0 | 1 | 0.40 | FAL: 596, TRU: 403 |
sugar | 0 | 1 | 0.41 | FAL: 590, TRU: 409 |
unicorn | 0 | 1 | 0.39 | FAL: 610, TRU: 389 |
yogurt | 0 | 1 | 0.42 | FAL: 579, TRU: 420 |
chocolate | 0 | 1 | 0.42 | FAL: 578, TRU: 421 |
Variable type: numeric
skim_variable | n_missing | complete_rate | mean | sd | p0 | p25 | p50 | p75 | p100 | hist |
---|---|---|---|---|---|---|---|---|---|---|
transaction | 0 | 1 | 500 | 288.53 | 1 | 250.5 | 500 | 749.5 | 999 | ▇▇▇▇▇ |
Notes:
There are no missing values.
For a logical column, the mean is the proportion of transactions in which the item is TRUE, so it is the item's `support`, which tells us how frequently a set of items appears in transactions. apple, for example, has a support of 0.383 (38.3%) because it appears in 383 of the 999 transactions.
To help us extract the most relevant itemsets in our data, we will use three measures: support, confidence, and lift.
Support: indicates how frequently a set of items appears in baskets, where n is the total number of transactions. \[ Support(X) = \frac{frequency(X)}{n} \]
Confidence: indicates how often the rule X ⇒ Y has been found to be true, that is, how often baskets containing X also contain Y. We treat 50% as high here; below that, a rule has little practical effect. \[ Confidence(X \Rightarrow Y) = \frac{Support(X \cup Y)}{Support(X)} = \frac{P(X \cap Y)}{P(X)} = P(Y \mid X) \]
Lift: is a measure of association using both support and confidence. \[ Lift(X \Rightarrow Y) = \frac{P(X \cap Y)}{P(X) \, P(Y)} = \frac{Confidence(X \Rightarrow Y)}{Support(Y)} \]
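As a worked illustration of the three measures, the sketch below computes them by hand for the rule {milk} ⇒ {chocolate}, using counts reported in this post (405 transactions with milk, 421 with chocolate, 211 with both, 999 in total). The variable names are ours and purely illustrative.

```r
# Counts taken from the post: item frequencies and the milk/chocolate pair count
n               <- 999  # total number of transactions
count_milk      <- 405  # transactions containing milk
count_chocolate <- 421  # transactions containing chocolate
count_both      <- 211  # transactions containing milk and chocolate

# Support of the rule: share of all transactions containing both items
support_rule <- count_both / n

# Confidence: share of milk transactions that also contain chocolate
confidence_rule <- count_both / count_milk

# Lift: confidence relative to the overall support of chocolate
lift_rule <- confidence_rule / (count_chocolate / n)

print(round(c(support = support_rule,
              confidence = confidence_rule,
              lift = lift_rule), 4))
```

The values (about 0.211, 0.521, and 1.236) agree with the row that `apriori()` reports for this rule.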
Let us visualize the frequency of each item to find out which items are most popular and which ones are the least popular:
p <- data %>%
  melt(., id.vars = "transaction",
       variable.name = "Item",
       value.name = "Purchased") %>%
  group_by(Item) %>%
  summarise(Frequency = sum(Purchased)) %>%
  ggplot(aes(x = reorder(Item, -Frequency), y = Frequency)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(title = "Item Distribution", x = "Item", y = "Frequency") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1))
ggplotly(p)
Chocolate is the most purchased item (421 transactions), closely followed by butter and yogurt (420 each), while bread, eggs, and apple are the least purchased (384, 384, and 383).
Even so, the gap between the most and least purchased items is small.
Next, we visualize which pairs of items are usually bought together:
# Generate all possible item pairs
generate_pairs <- function(items) {
  if (length(items) >= 2) {
    combn(items, 2, simplify = FALSE)
  } else {
    NULL
  }
}
# Create a co-occurrence matrix
item_pairs <- do.call(rbind, lapply(1:nrow(data), function(i) {
  row <- data[i, -1]
  items <- names(row)[row == TRUE]
  pairs <- generate_pairs(items)
  if (!is.null(pairs)) {
    do.call(rbind, lapply(pairs, function(pair) data.frame(Item1 = pair[1], Item2 = pair[2])))
  } else {
    NULL
  }
}))
# Count the frequency of each item pair
pair_counts <- item_pairs %>%
  group_by(Item1, Item2) %>%
  summarise(Frequency = n()) %>%
  ungroup()
# Create a bubble chart using ggplot2
p2 <- ggplot(pair_counts, aes(x = Item1, y = Item2, size = Frequency, color = Frequency)) +
  geom_point(alpha = 0.7) +
  labs(title = "Items Frequently Bought Together", x = "Item 1", y = "Item 2") +
  theme_minimal() +
  theme(axis.text.x = element_text(angle = 45, hjust = 1)) +
  scale_color_gradientn(colors = c("steelblue", "yellow"))
DT::datatable(pair_counts)
# Make the plot interactive using plotly
ggplotly(p2)
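Empty cells in the chart do not mean that two products are never bought together: combn() lists each pair only once, so every pair appears in a single cell. Bread and ice cream, for example, are bought together 181 times and eggs and onion 174 times. Some pairs are bought together particularly often (e.g. milk and chocolate 211 times, ice cream and butter 207 times, and ice cream and chocolate 202 times).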
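As an aside, pair counts like these can also be obtained with a single matrix product instead of looping over rows. The sketch below is an alternative to the code above, not the method used in this post, and runs on a small made-up data frame (`purchases` and its values are hypothetical).

```r
# Hypothetical toy data: one logical column per item, plus a transaction id
purchases <- data.frame(
  transaction = 1:4,
  milk        = c(TRUE, TRUE, FALSE, TRUE),
  chocolate   = c(TRUE, FALSE, TRUE, TRUE),
  bread       = c(FALSE, TRUE, TRUE, TRUE)
)

# Drop the id column and turn TRUE/FALSE into 1/0
item_matrix <- as.matrix(purchases[, -1]) * 1

# t(M) %*% M: cell [i, j] counts the transactions containing both item i and item j;
# the diagonal holds the single-item frequencies
co_occurrence <- crossprod(item_matrix)
print(co_occurrence)
```

The resulting matrix is symmetric, which makes it explicit that a pair such as (milk, chocolate) and (chocolate, milk) share the same count.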
Having looked at pairs of products frequently bought together, we now ask: what is the average number of items in a transaction?
p3 <- data %>%
  rowwise() %>%
  mutate(item_count = sum(c_across(-transaction))) %>%
  ungroup() %>%
  group_by(item_count) %>%
  summarise(Frequency = n()) %>%
  ungroup() %>%
  ggplot(aes(item_count, Frequency)) +
  geom_bar(stat = "identity", fill = "steelblue") +
  labs(title = "Distribution of the Number of Items per Transaction",
       x = "Number of Items",
       y = "Frequency") +
  theme_minimal()
ggplotly(p3)
The most common basket sizes are 7 to 9 items; the mean is about 6.5 items per transaction and the median is 7.
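For reference, the number of items per transaction can also be computed without `rowwise()`, since summing a logical row counts its TRUE values. The sketch below uses a small made-up data frame (`purchases` is hypothetical) and is only an alternative to the code above.

```r
# Hypothetical toy data with the same layout as the post's dataset
purchases <- data.frame(
  transaction = 1:5,
  milk        = c(TRUE, TRUE, FALSE, TRUE, FALSE),
  chocolate   = c(TRUE, FALSE, FALSE, TRUE, TRUE),
  bread       = c(FALSE, TRUE, FALSE, TRUE, TRUE)
)

# rowSums() treats TRUE as 1, so this is the basket size of each transaction
item_count <- rowSums(purchases[, -1])

# Distribution of basket sizes and their average
print(table(item_count))
print(mean(item_count))
```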
Modelling
While there are many algorithms we could use, we will focus on the Apriori algorithm because it is widely used and easy to implement. Furthermore, it uses the support and confidence measures described above when generating itemsets and rules.
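To give an intuition for how Apriori works, here is a simplified base R sketch of its level-wise search: find frequent single items first, then build candidate pairs only from those items. It is an illustration on made-up baskets, not the implementation used by the arules package; `baskets`, `min_support`, and `itemset_support()` are hypothetical names.

```r
# Hypothetical toy baskets: rows are transactions, columns are items
baskets <- matrix(
  c(TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  TRUE,
    FALSE, FALSE, TRUE,
    TRUE,  FALSE, FALSE,
    FALSE, TRUE,  TRUE),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("milk", "chocolate", "bread"))
)
min_support <- 0.3

# Support of an itemset: share of rows in which all of its items are TRUE
itemset_support <- function(items) {
  mean(rowSums(baskets[, items, drop = FALSE]) == length(items))
}

# Level 1: keep the single items that reach the support threshold
item_support   <- colMeans(baskets)
frequent_items <- names(item_support)[item_support >= min_support]

# Level 2: candidate pairs are built only from frequent items (the pruning step),
# because a pair cannot be frequent if one of its items is not
candidates   <- combn(frequent_items, 2, simplify = FALSE)
pair_support <- vapply(candidates, itemset_support, numeric(1))
names(pair_support) <- vapply(candidates, paste, character(1), collapse = ", ")

frequent_pairs <- pair_support[pair_support >= min_support]
print(frequent_pairs)
```

The same step repeats for larger itemsets (triples from frequent pairs, and so on) until no candidate reaches the threshold.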
Data transformation
Our dataset only contains boolean values, so it does not need the further pre-processing that is required when dealing with numerical features.
However, we do need to transform our tibble into a set of transactions:
trans <- data %>%
  select(-transaction) %>%
  transactions()
trans
transactions in sparse format with
999 transactions (rows) and
16 items (columns)
We have 999 transactions and 16 unique items which are:
[1] "apple" "bread" "butter" "cheese" "corn"
[6] "dill" "eggs" "ice_cream" "kidney_beans" "milk"
[11] "nutmeg" "onion" "sugar" "unicorn" "yogurt"
[16] "chocolate"
Let us have a look at the first 2 transactions:
items transactionID
[1] {bread,
corn,
dill,
ice_cream,
sugar,
yogurt,
chocolate} 1
[2] {milk} 2
A summary can be made about all the transactions:
transactions as itemMatrix in sparse format with
999 rows (elements/itemsets/transactions) and
16 columns (items) and a density of 0.4032783
most frequent items:
chocolate butter yogurt ice_cream sugar (Other)
421 420 420 410 409 4366
element (itemset/transaction) length distribution:
sizes
1 2 3 4 5 6 7 8 9 10 11 12 13
59 63 72 81 99 99 132 115 115 82 55 23 4
Min. 1st Qu. Median Mean 3rd Qu. Max.
1.000 4.000 7.000 6.452 9.000 13.000
includes extended item information - examples:
labels variables levels
1 apple apple TRUE
2 bread bread TRUE
3 butter butter TRUE
includes extended transaction information - examples:
transactionID
1 1
2 2
3 3
The summary shows the sparse matrix and its density, the most frequent items, and the distribution of transaction lengths (e.g. 59 transactions contain only 1 item).
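Outputs like the ones above can be obtained with arules helpers such as `itemLabels()`, `inspect()`, and `summary()`. The sketch below shows these calls on a small made-up logical matrix (`basket_matrix` and `toy_trans` are hypothetical names); we have not reproduced the exact calls behind the outputs in this post.

```r
library(arules)

# Hypothetical toy baskets: rows are transactions, columns are items
basket_matrix <- matrix(
  c(TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  TRUE,
    FALSE, FALSE, TRUE,
    TRUE,  FALSE, FALSE,
    FALSE, TRUE,  TRUE),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("milk", "chocolate", "bread"))
)

# Coerce the logical matrix to the sparse transactions class
toy_trans <- as(basket_matrix, "transactions")

itemLabels(toy_trans)        # names of the unique items
size(toy_trans)              # number of items in each transaction
inspect(head(toy_trans, 2))  # contents of the first 2 transactions
summary(toy_trans)           # density, most frequent items, length distribution
```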
Before running the algorithm, we have to set the support and confidence thresholds:
Rules generation
Based on our EDA, the item frequencies are all similar, so a moderate support threshold should still capture rules involving the less frequent items. Furthermore, setting a high confidence threshold (50%) will strengthen our belief in the rules found.
Support threshold: 15%
Confidence threshold: 50%
rules <- apriori(trans,
                 parameter = list(supp = 0.15,
                                  conf = 0.5,
                                  maxlen = 10,
                                  target = "rules"))
Apriori
Parameter specification:
confidence minval smax arem aval originalSupport maxtime support minlen
0.5 0.1 1 none FALSE TRUE 5 0.15 1
maxlen target ext
10 rules TRUE
Algorithmic control:
filter tree heap memopt load sort verbose
0.1 TRUE TRUE FALSE TRUE 2 TRUE
Absolute minimum support count: 149
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[16 item(s), 999 transaction(s)] done [0.00s].
sorting and recoding items ... [16 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 done [0.00s].
writing ... [5 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
inspect(rules) %>%
  DT::datatable()
lhs rhs support confidence coverage lift count
[1] {bread} => {yogurt} 0.1931932 0.5026042 0.3843844 1.195480 193
[2] {dill} => {chocolate} 0.1991992 0.5000000 0.3983984 1.186461 199
[3] {milk} => {chocolate} 0.2112112 0.5209877 0.4054054 1.236263 211
[4] {chocolate} => {milk} 0.2112112 0.5011876 0.4214214 1.236263 211
[5] {ice_cream} => {butter} 0.2072072 0.5048780 0.4104104 1.200889 207
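Based on our parameters, we extracted the above 5 rules:
Yogurt is likely to be bought if bread is bought
Chocolate is likely to be bought if dill is bought
Chocolate is likely to be bought if milk is bought, and milk if chocolate is bought
Butter is likely to be bought if ice cream is bought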
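When several rules are returned, it can help to rank them by lift before reading them. The sketch below mines rules from a small made-up set of baskets and sorts them; the toy data and the thresholds (`supp = 0.3`, `conf = 0.6`) are assumptions chosen for illustration, not the values used in this post.

```r
library(arules)

# Hypothetical toy baskets: rows are transactions, columns are items
basket_matrix <- matrix(
  c(TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  TRUE,
    FALSE, FALSE, TRUE,
    TRUE,  FALSE, FALSE,
    FALSE, TRUE,  TRUE),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("milk", "chocolate", "bread"))
)
toy_trans <- as(basket_matrix, "transactions")

# Mine rules with at least two items, silencing the progress log
toy_rules <- apriori(toy_trans,
                     parameter = list(supp = 0.3,
                                      conf = 0.6,
                                      minlen = 2,
                                      target = "rules"),
                     control = list(verbose = FALSE))

# Rank the rules from strongest to weakest association
inspect(sort(toy_rules, by = "lift"))

# The quality measures are also available as a data frame
quality(toy_rules)
```

On the real data, the same `sort()` call on `rules` would put {milk} ⇒ {chocolate} and its reverse at the top, since they have the highest lift in the table above.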
Should we want a more aggressive marketing approach, we could lower the confidence threshold while keeping the same support threshold in order to extract more rules from our data:
Support threshold: 15%
Confidence threshold: 40%
rules <- apriori(trans,
                 parameter = list(supp = 0.15,
                                  conf = 0.4,
                                  maxlen = 10,
                                  minlen = 2,
                                  target = "rules"))
Apriori
Parameter specification:
confidence minval smax arem aval originalSupport maxtime support minlen
0.4 0.1 1 none FALSE TRUE 5 0.15 2
maxlen target ext
10 rules TRUE
Algorithmic control:
filter tree heap memopt load sort verbose
0.1 TRUE TRUE FALSE TRUE 2 TRUE
Absolute minimum support count: 149
set item appearances ...[0 item(s)] done [0.00s].
set transactions ...[16 item(s), 999 transaction(s)] done [0.00s].
sorting and recoding items ... [16 item(s)] done [0.00s].
creating transaction tree ... done [0.00s].
checking subsets of size 1 2 3 done [0.00s].
writing ... [238 rule(s)] done [0.00s].
creating S4 object ... done [0.00s].
inspect(rules) %>%
  DT::datatable()
lhs rhs support confidence coverage lift
[1] {eggs} => {bread} 0.1571572 0.4088542 0.3843844 1.0636597
[2] {bread} => {eggs} 0.1571572 0.4088542 0.3843844 1.0636597
[3] {eggs} => {apple} 0.1561562 0.4062500 0.3843844 1.0596443
[4] {apple} => {eggs} 0.1561562 0.4073107 0.3833834 1.0596443
[5] {eggs} => {unicorn} 0.1681682 0.4375000 0.3843844 1.1235540
[6] {unicorn} => {eggs} 0.1681682 0.4318766 0.3893894 1.1235540
[7] {eggs} => {dill} 0.1571572 0.4088542 0.3843844 1.0262445
[8] {eggs} => {cheese} 0.1691692 0.4401042 0.3843844 1.0882774
[9] {cheese} => {eggs} 0.1691692 0.4183168 0.4044044 1.0882774
[10] {eggs} => {nutmeg} 0.1721722 0.4479167 0.3843844 1.1158822
[11] {nutmeg} => {eggs} 0.1721722 0.4289277 0.4014014 1.1158822
[12] {eggs} => {onion} 0.1741742 0.4531250 0.3843844 1.1232553
[13] {onion} => {eggs} 0.1741742 0.4317618 0.4034034 1.1232553
[14] {eggs} => {corn} 0.1801802 0.4687500 0.3843844 1.1505682
[15] {corn} => {eggs} 0.1801802 0.4422604 0.4074074 1.1505682
[16] {eggs} => {kidney_beans} 0.1691692 0.4401042 0.3843844 1.0776080
[17] {kidney_beans} => {eggs} 0.1691692 0.4142157 0.4084084 1.0776080
[18] {eggs} => {sugar} 0.1701702 0.4427083 0.3843844 1.0813340
[19] {sugar} => {eggs} 0.1701702 0.4156479 0.4094094 1.0813340
[20] {eggs} => {milk} 0.1761762 0.4583333 0.3843844 1.1305556
[21] {milk} => {eggs} 0.1761762 0.4345679 0.4054054 1.1305556
[22] {eggs} => {ice_cream} 0.1571572 0.4088542 0.3843844 0.9962081
[23] {eggs} => {yogurt} 0.1861862 0.4843750 0.3843844 1.1521205
[24] {yogurt} => {eggs} 0.1861862 0.4428571 0.4204204 1.1521205
[25] {eggs} => {butter} 0.1731732 0.4505208 0.3843844 1.0715960
[26] {butter} => {eggs} 0.1731732 0.4119048 0.4204204 1.0715960
[27] {eggs} => {chocolate} 0.1821822 0.4739583 0.3843844 1.1246660
[28] {chocolate} => {eggs} 0.1821822 0.4323040 0.4214214 1.1246660
[29] {bread} => {apple} 0.1541542 0.4010417 0.3843844 1.0460591
[30] {apple} => {bread} 0.1541542 0.4020888 0.3833834 1.0460591
[31] {bread} => {unicorn} 0.1681682 0.4375000 0.3843844 1.1235540
[32] {unicorn} => {bread} 0.1681682 0.4318766 0.3893894 1.1235540
[33] {bread} => {dill} 0.1601602 0.4166667 0.3843844 1.0458543
[34] {dill} => {bread} 0.1601602 0.4020101 0.3983984 1.0458543
[35] {bread} => {cheese} 0.1731732 0.4505208 0.3843844 1.1140354
[36] {cheese} => {bread} 0.1731732 0.4282178 0.4044044 1.1140354
[37] {bread} => {nutmeg} 0.1711712 0.4453125 0.3843844 1.1093945
[38] {nutmeg} => {bread} 0.1711712 0.4264339 0.4014014 1.1093945
[39] {bread} => {onion} 0.1781782 0.4635417 0.3843844 1.1490772
[40] {onion} => {bread} 0.1781782 0.4416873 0.4034034 1.1490772
[41] {bread} => {corn} 0.1741742 0.4531250 0.3843844 1.1122159
[42] {corn} => {bread} 0.1741742 0.4275184 0.4074074 1.1122159
[43] {bread} => {kidney_beans} 0.1671672 0.4348958 0.3843844 1.0648552
[44] {kidney_beans} => {bread} 0.1671672 0.4093137 0.4084084 1.0648552
[45] {bread} => {sugar} 0.1791792 0.4661458 0.3843844 1.1385811
[46] {sugar} => {bread} 0.1791792 0.4376528 0.4094094 1.1385811
[47] {bread} => {milk} 0.1741742 0.4531250 0.3843844 1.1177083
[48] {milk} => {bread} 0.1741742 0.4296296 0.4054054 1.1177083
[49] {bread} => {ice_cream} 0.1811812 0.4713542 0.3843844 1.1484947
[50] {ice_cream} => {bread} 0.1811812 0.4414634 0.4104104 1.1484947
[51] {bread} => {yogurt} 0.1931932 0.5026042 0.3843844 1.1954799
[52] {yogurt} => {bread} 0.1931932 0.4595238 0.4204204 1.1954799
[53] {bread} => {butter} 0.1801802 0.4687500 0.3843844 1.1149554
[54] {butter} => {bread} 0.1801802 0.4285714 0.4204204 1.1149554
[55] {bread} => {chocolate} 0.1851852 0.4817708 0.3843844 1.1432044
[56] {chocolate} => {bread} 0.1851852 0.4394299 0.4214214 1.1432044
[57] {apple} => {unicorn} 0.1661662 0.4334204 0.3833834 1.1130770
[58] {unicorn} => {apple} 0.1661662 0.4267352 0.3893894 1.1130770
[59] {apple} => {dill} 0.1791792 0.4673629 0.3833834 1.1731044
[60] {dill} => {apple} 0.1791792 0.4497487 0.3983984 1.1731044
[61] {apple} => {cheese} 0.1621622 0.4229765 0.3833834 1.0459246
[62] {cheese} => {apple} 0.1621622 0.4009901 0.4044044 1.0459246
[63] {apple} => {nutmeg} 0.1721722 0.4490862 0.3833834 1.1187957
[64] {nutmeg} => {apple} 0.1721722 0.4289277 0.4014014 1.1187957
[65] {apple} => {onion} 0.1671672 0.4360313 0.3833834 1.0808816
[66] {onion} => {apple} 0.1671672 0.4143921 0.4034034 1.0808816
[67] {apple} => {corn} 0.1861862 0.4856397 0.3833834 1.1920247
[68] {corn} => {apple} 0.1861862 0.4570025 0.4074074 1.1920247
[69] {apple} => {kidney_beans} 0.1761762 0.4595300 0.3833834 1.1251728
[70] {kidney_beans} => {apple} 0.1761762 0.4313725 0.4084084 1.1251728
[71] {apple} => {sugar} 0.1821822 0.4751958 0.3833834 1.1606861
[72] {sugar} => {apple} 0.1821822 0.4449878 0.4094094 1.1606861
[73] {apple} => {milk} 0.1841842 0.4804178 0.3833834 1.1850305
[74] {milk} => {apple} 0.1841842 0.4543210 0.4054054 1.1850305
[75] {apple} => {ice_cream} 0.1721722 0.4490862 0.3833834 1.0942368
[76] {ice_cream} => {apple} 0.1721722 0.4195122 0.4104104 1.0942368
[77] {apple} => {yogurt} 0.1871872 0.4882507 0.3833834 1.1613391
[78] {yogurt} => {apple} 0.1871872 0.4452381 0.4204204 1.1613391
[79] {apple} => {butter} 0.1881882 0.4908616 0.3833834 1.1675494
[80] {butter} => {apple} 0.1881882 0.4476190 0.4204204 1.1675494
[81] {apple} => {chocolate} 0.1831832 0.4778068 0.3833834 1.1337981
[82] {chocolate} => {apple} 0.1831832 0.4346793 0.4214214 1.1337981
[83] {unicorn} => {dill} 0.1681682 0.4318766 0.3893894 1.0840320
[84] {dill} => {unicorn} 0.1681682 0.4221106 0.3983984 1.0840320
[85] {unicorn} => {cheese} 0.1701702 0.4370180 0.3893894 1.0806460
[86] {cheese} => {unicorn} 0.1701702 0.4207921 0.4044044 1.0806460
[87] {unicorn} => {nutmeg} 0.1651652 0.4241645 0.3893894 1.0567091
[88] {nutmeg} => {unicorn} 0.1651652 0.4114713 0.4014014 1.0567091
[89] {unicorn} => {onion} 0.1751752 0.4498715 0.3893894 1.1151901
[90] {onion} => {unicorn} 0.1751752 0.4342432 0.4034034 1.1151901
[91] {unicorn} => {corn} 0.1771772 0.4550129 0.3893894 1.1168497
[92] {corn} => {unicorn} 0.1771772 0.4348894 0.4074074 1.1168497
[93] {unicorn} => {kidney_beans} 0.1841842 0.4730077 0.3893894 1.1581733
[94] {kidney_beans} => {unicorn} 0.1841842 0.4509804 0.4084084 1.1581733
[95] {unicorn} => {sugar} 0.1811812 0.4652956 0.3893894 1.1365045
[96] {sugar} => {unicorn} 0.1811812 0.4425428 0.4094094 1.1365045
[97] {unicorn} => {milk} 0.1831832 0.4704370 0.3893894 1.1604113
[98] {milk} => {unicorn} 0.1831832 0.4518519 0.4054054 1.1604113
[99] {unicorn} => {ice_cream} 0.1851852 0.4755784 0.3893894 1.1587874
[100] {ice_cream} => {unicorn} 0.1851852 0.4512195 0.4104104 1.1587874
[101] {unicorn} => {yogurt} 0.1841842 0.4730077 0.3893894 1.1250826
[102] {yogurt} => {unicorn} 0.1841842 0.4380952 0.4204204 1.1250826
[103] {unicorn} => {butter} 0.1821822 0.4678663 0.3893894 1.1128535
[104] {butter} => {unicorn} 0.1821822 0.4333333 0.4204204 1.1128535
[105] {unicorn} => {chocolate} 0.1861862 0.4781491 0.3893894 1.1346103
[106] {chocolate} => {unicorn} 0.1861862 0.4418052 0.4214214 1.1346103
[107] {dill} => {cheese} 0.1771772 0.4447236 0.3983984 1.0997002
[108] {cheese} => {dill} 0.1771772 0.4381188 0.4044044 1.0997002
[109] {dill} => {nutmeg} 0.1731732 0.4346734 0.3983984 1.0828895
[110] {nutmeg} => {dill} 0.1731732 0.4314214 0.4014014 1.0828895
[111] {dill} => {onion} 0.1921922 0.4824121 0.3983984 1.1958552
[112] {onion} => {dill} 0.1921922 0.4764268 0.4034034 1.1958552
[113] {dill} => {corn} 0.1801802 0.4522613 0.3983984 1.1100959
[114] {corn} => {dill} 0.1801802 0.4422604 0.4074074 1.1100959
[115] {dill} => {kidney_beans} 0.1721722 0.4321608 0.3983984 1.0581584
[116] {kidney_beans} => {dill} 0.1721722 0.4215686 0.4084084 1.0581584
[117] {dill} => {sugar} 0.1791792 0.4497487 0.3983984 1.0985306
[118] {sugar} => {dill} 0.1791792 0.4376528 0.4094094 1.0985306
[119] {dill} => {milk} 0.1901902 0.4773869 0.3983984 1.1775544
[120] {milk} => {dill} 0.1901902 0.4691358 0.4054054 1.1775544
[121] {dill} => {ice_cream} 0.1851852 0.4648241 0.3983984 1.1325836
[122] {ice_cream} => {dill} 0.1851852 0.4512195 0.4104104 1.1325836
[123] {dill} => {yogurt} 0.1851852 0.4648241 0.3983984 1.1056174
[124] {yogurt} => {dill} 0.1851852 0.4404762 0.4204204 1.1056174
[125] {dill} => {butter} 0.1751752 0.4396985 0.3983984 1.0458543
[126] {butter} => {dill} 0.1751752 0.4166667 0.4204204 1.0458543
[127] {dill} => {chocolate} 0.1991992 0.5000000 0.3983984 1.1864608
[128] {chocolate} => {dill} 0.1991992 0.4726841 0.4214214 1.1864608
[129] {cheese} => {nutmeg} 0.1921922 0.4752475 0.4044044 1.1839708
[130] {nutmeg} => {cheese} 0.1921922 0.4788030 0.4014014 1.1839708
[131] {cheese} => {onion} 0.1851852 0.4579208 0.4044044 1.1351436
[132] {onion} => {cheese} 0.1851852 0.4590571 0.4034034 1.1351436
[133] {cheese} => {corn} 0.1821822 0.4504950 0.4044044 1.1057606
[134] {corn} => {cheese} 0.1821822 0.4471744 0.4074074 1.1057606
[135] {cheese} => {kidney_beans} 0.2002002 0.4950495 0.4044044 1.2121433
[136] {kidney_beans} => {cheese} 0.2002002 0.4901961 0.4084084 1.2121433
[137] {cheese} => {sugar} 0.1871872 0.4628713 0.4044044 1.1305829
[138] {sugar} => {cheese} 0.1871872 0.4572127 0.4094094 1.1305829
[139] {cheese} => {milk} 0.1721722 0.4257426 0.4044044 1.0501650
[140] {milk} => {cheese} 0.1721722 0.4246914 0.4054054 1.0501650
[141] {cheese} => {ice_cream} 0.1871872 0.4628713 0.4044044 1.1278254
[142] {ice_cream} => {cheese} 0.1871872 0.4560976 0.4104104 1.1278254
[143] {cheese} => {yogurt} 0.1811812 0.4480198 0.4044044 1.0656471
[144] {yogurt} => {cheese} 0.1811812 0.4309524 0.4204204 1.0656471
[145] {cheese} => {butter} 0.1821822 0.4504950 0.4044044 1.0715347
[146] {butter} => {cheese} 0.1821822 0.4333333 0.4204204 1.0715347
[147] {cheese} => {chocolate} 0.1861862 0.4603960 0.4044044 1.0924837
[148] {chocolate} => {cheese} 0.1861862 0.4418052 0.4214214 1.0924837
[149] {nutmeg} => {onion} 0.1951952 0.4862843 0.4014014 1.2054541
[150] {onion} => {nutmeg} 0.1951952 0.4838710 0.4034034 1.2054541
[151] {nutmeg} => {corn} 0.1811812 0.4513716 0.4014014 1.1079120
[152] {corn} => {nutmeg} 0.1811812 0.4447174 0.4074074 1.1079120
[153] {nutmeg} => {kidney_beans} 0.1891892 0.4713217 0.4014014 1.1540450
[154] {kidney_beans} => {nutmeg} 0.1891892 0.4632353 0.4084084 1.1540450
[155] {nutmeg} => {sugar} 0.1931932 0.4812968 0.4014014 1.1755879
[156] {sugar} => {nutmeg} 0.1931932 0.4718826 0.4094094 1.1755879
[157] {nutmeg} => {milk} 0.1821822 0.4538653 0.4014014 1.1195345
[158] {milk} => {nutmeg} 0.1821822 0.4493827 0.4054054 1.1195345
[159] {nutmeg} => {ice_cream} 0.1871872 0.4663342 0.4014014 1.1362630
[160] {ice_cream} => {nutmeg} 0.1871872 0.4560976 0.4104104 1.1362630
[161] {nutmeg} => {yogurt} 0.1921922 0.4788030 0.4014014 1.1388671
[162] {yogurt} => {nutmeg} 0.1921922 0.4571429 0.4204204 1.1388671
[163] {nutmeg} => {butter} 0.1981982 0.4937656 0.4014014 1.1744567
[164] {butter} => {nutmeg} 0.1981982 0.4714286 0.4204204 1.1744567
[165] {nutmeg} => {chocolate} 0.1861862 0.4638404 0.4014014 1.1006569
[166] {chocolate} => {nutmeg} 0.1861862 0.4418052 0.4214214 1.1006569
[167] {onion} => {corn} 0.1841842 0.4565757 0.4034034 1.1206858
[168] {corn} => {onion} 0.1841842 0.4520885 0.4074074 1.1206858
[169] {onion} => {kidney_beans} 0.1701702 0.4218362 0.4034034 1.0328784
[170] {kidney_beans} => {onion} 0.1701702 0.4166667 0.4084084 1.0328784
[171] {onion} => {sugar} 0.1901902 0.4714640 0.4034034 1.1515710
[172] {sugar} => {onion} 0.1901902 0.4645477 0.4094094 1.1515710
[173] {onion} => {milk} 0.1821822 0.4516129 0.4034034 1.1139785
[174] {milk} => {onion} 0.1821822 0.4493827 0.4054054 1.1139785
[175] {onion} => {ice_cream} 0.1921922 0.4764268 0.4034034 1.1608546
[176] {ice_cream} => {onion} 0.1921922 0.4682927 0.4104104 1.1608546
[177] {onion} => {yogurt} 0.1921922 0.4764268 0.4034034 1.1332152
[178] {yogurt} => {onion} 0.1921922 0.4571429 0.4204204 1.1332152
[179] {onion} => {butter} 0.1971972 0.4888337 0.4034034 1.1627260
[180] {butter} => {onion} 0.1971972 0.4690476 0.4204204 1.1627260
[181] {onion} => {chocolate} 0.1961962 0.4863524 0.4034034 1.1540760
[182] {chocolate} => {onion} 0.1961962 0.4655582 0.4214214 1.1540760
[183] {corn} => {kidney_beans} 0.1951952 0.4791155 0.4074074 1.1731283
[184] {kidney_beans} => {corn} 0.1951952 0.4779412 0.4084084 1.1731283
[185] {corn} => {sugar} 0.1871872 0.4594595 0.4074074 1.1222494
[186] {sugar} => {corn} 0.1871872 0.4572127 0.4094094 1.1222494
[187] {corn} => {milk} 0.1931932 0.4742015 0.4074074 1.1696970
[188] {milk} => {corn} 0.1931932 0.4765432 0.4054054 1.1696970
[189] {corn} => {ice_cream} 0.1921922 0.4717445 0.4074074 1.1494457
[190] {ice_cream} => {corn} 0.1921922 0.4682927 0.4104104 1.1494457
[191] {corn} => {yogurt} 0.1901902 0.4668305 0.4074074 1.1103896
[192] {yogurt} => {corn} 0.1901902 0.4523810 0.4204204 1.1103896
[193] {corn} => {butter} 0.1911912 0.4692875 0.4074074 1.1162338
[194] {butter} => {corn} 0.1911912 0.4547619 0.4204204 1.1162338
[195] {corn} => {chocolate} 0.1921922 0.4717445 0.4074074 1.1194127
[196] {chocolate} => {corn} 0.1921922 0.4560570 0.4214214 1.1194127
[197] {kidney_beans} => {sugar} 0.1871872 0.4583333 0.4084084 1.1194988
[198] {sugar} => {kidney_beans} 0.1871872 0.4572127 0.4094094 1.1194988
[199] {kidney_beans} => {milk} 0.1991992 0.4877451 0.4084084 1.2031046
[200] {milk} => {kidney_beans} 0.1991992 0.4913580 0.4054054 1.2031046
[201] {kidney_beans} => {ice_cream} 0.1961962 0.4803922 0.4084084 1.1705165
[202] {ice_cream} => {kidney_beans} 0.1961962 0.4780488 0.4104104 1.1705165
[203] {kidney_beans} => {yogurt} 0.1941942 0.4754902 0.4084084 1.1309874
[204] {yogurt} => {kidney_beans} 0.1941942 0.4619048 0.4204204 1.1309874
[205] {kidney_beans} => {butter} 0.2022022 0.4950980 0.4084084 1.1776261
[206] {butter} => {kidney_beans} 0.2022022 0.4809524 0.4204204 1.1776261
[207] {kidney_beans} => {chocolate} 0.1911912 0.4681373 0.4084084 1.1108530
[208] {chocolate} => {kidney_beans} 0.1911912 0.4536817 0.4214214 1.1108530
[209] {sugar} => {milk} 0.1861862 0.4547677 0.4094094 1.1217604
[210] {milk} => {sugar} 0.1861862 0.4592593 0.4054054 1.1217604
[211] {sugar} => {ice_cream} 0.1951952 0.4767726 0.4094094 1.1616972
[212] {ice_cream} => {sugar} 0.1951952 0.4756098 0.4104104 1.1616972
[213] {sugar} => {yogurt} 0.1911912 0.4669927 0.4094094 1.1107754
[214] {yogurt} => {sugar} 0.1911912 0.4547619 0.4204204 1.1107754
[215] {sugar} => {butter} 0.1961962 0.4792176 0.4094094 1.1398533
[216] {butter} => {sugar} 0.1961962 0.4666667 0.4204204 1.1398533
[217] {sugar} => {chocolate} 0.1881882 0.4596577 0.4094094 1.0907317
[218] {chocolate} => {sugar} 0.1881882 0.4465558 0.4214214 1.0907317
[219] {milk} => {ice_cream} 0.1771772 0.4370370 0.4054054 1.0648780
[220] {ice_cream} => {milk} 0.1771772 0.4317073 0.4104104 1.0648780
[221] {milk} => {yogurt} 0.1901902 0.4691358 0.4054054 1.1158730
[222] {yogurt} => {milk} 0.1901902 0.4523810 0.4204204 1.1158730
[223] {milk} => {butter} 0.1981982 0.4888889 0.4054054 1.1628571
[224] {butter} => {milk} 0.1981982 0.4714286 0.4204204 1.1628571
[225] {milk} => {chocolate} 0.2112112 0.5209877 0.4054054 1.2362629
[226] {chocolate} => {milk} 0.2112112 0.5011876 0.4214214 1.2362629
[227] {ice_cream} => {yogurt} 0.1821822 0.4439024 0.4104104 1.0558537
[228] {yogurt} => {ice_cream} 0.1821822 0.4333333 0.4204204 1.0558537
[229] {ice_cream} => {butter} 0.2072072 0.5048780 0.4104104 1.2008885
[230] {butter} => {ice_cream} 0.2072072 0.4928571 0.4204204 1.2008885
[231] {ice_cream} => {chocolate} 0.2022022 0.4926829 0.4104104 1.1690980
[232] {chocolate} => {ice_cream} 0.2022022 0.4798100 0.4214214 1.1690980
[233] {yogurt} => {butter} 0.1911912 0.4547619 0.4204204 1.0816837
[234] {butter} => {yogurt} 0.1911912 0.4547619 0.4204204 1.0816837
[235] {yogurt} => {chocolate} 0.1981982 0.4714286 0.4204204 1.1186630
[236] {chocolate} => {yogurt} 0.1981982 0.4703088 0.4214214 1.1186630
[237] {butter} => {chocolate} 0.2022022 0.4809524 0.4204204 1.1412623
[238] {chocolate} => {butter} 0.2022022 0.4798100 0.4214214 1.1412623
count
[1] 157
[2] 157
[3] 156
[4] 156
[5] 168
[6] 168
[7] 157
[8] 169
[9] 169
[10] 172
[11] 172
[12] 174
[13] 174
[14] 180
[15] 180
[16] 169
[17] 169
[18] 170
[19] 170
[20] 176
[21] 176
[22] 157
[23] 186
[24] 186
[25] 173
[26] 173
[27] 182
[28] 182
[29] 154
[30] 154
[31] 168
[32] 168
[33] 160
[34] 160
[35] 173
[36] 173
[37] 171
[38] 171
[39] 178
[40] 178
[41] 174
[42] 174
[43] 167
[44] 167
[45] 179
[46] 179
[47] 174
[48] 174
[49] 181
[50] 181
[51] 193
[52] 193
[53] 180
[54] 180
[55] 185
[56] 185
[57] 166
[58] 166
[59] 179
[60] 179
[61] 162
[62] 162
[63] 172
[64] 172
[65] 167
[66] 167
[67] 186
[68] 186
[69] 176
[70] 176
[71] 182
[72] 182
[73] 184
[74] 184
[75] 172
[76] 172
[77] 187
[78] 187
[79] 188
[80] 188
[81] 183
[82] 183
[83] 168
[84] 168
[85] 170
[86] 170
[87] 165
[88] 165
[89] 175
[90] 175
[91] 177
[92] 177
[93] 184
[94] 184
[95] 181
[96] 181
[97] 183
[98] 183
[99] 185
[100] 185
[101] 184
[102] 184
[103] 182
[104] 182
[105] 186
[106] 186
[107] 177
[108] 177
[109] 173
[110] 173
[111] 192
[112] 192
[113] 180
[114] 180
[115] 172
[116] 172
[117] 179
[118] 179
[119] 190
[120] 190
[121] 185
[122] 185
[123] 185
[124] 185
[125] 175
[126] 175
[127] 199
[128] 199
[129] 192
[130] 192
[131] 185
[132] 185
[133] 182
[134] 182
[135] 200
[136] 200
[137] 187
[138] 187
[139] 172
[140] 172
[141] 187
[142] 187
[143] 181
[144] 181
[145] 182
[146] 182
[147] 186
[148] 186
[149] 195
[150] 195
[151] 181
[152] 181
[153] 189
[154] 189
[155] 193
[156] 193
[157] 182
[158] 182
[159] 187
[160] 187
[161] 192
[162] 192
[163] 198
[164] 198
[165] 186
[166] 186
[167] 184
[168] 184
[169] 170
[170] 170
[171] 190
[172] 190
[173] 182
[174] 182
[175] 192
[176] 192
[177] 192
[178] 192
[179] 197
[180] 197
[181] 196
[182] 196
[183] 195
[184] 195
[185] 187
[186] 187
[187] 193
[188] 193
[189] 192
[190] 192
[191] 190
[192] 190
[193] 191
[194] 191
[195] 192
[196] 192
[197] 187
[198] 187
[199] 199
[200] 199
[201] 196
[202] 196
[203] 194
[204] 194
[205] 202
[206] 202
[207] 191
[208] 191
[209] 186
[210] 186
[211] 195
[212] 195
[213] 191
[214] 191
[215] 196
[216] 196
[217] 188
[218] 188
[219] 177
[220] 177
[221] 190
[222] 190
[223] 198
[224] 198
[225] 211
[226] 211
[227] 182
[228] 182
[229] 207
[230] 207
[231] 202
[232] 202
[233] 191
[234] 191
[235] 198
[236] 198
[237] 202
[238] 202
Visualizations
Now let us visualize our rules:
plot(rules, engine = "html") %>%
  layout(title = "Interactive visualization of rules")
Note that a lift above 1 means that item Y is more likely to be purchased when item X is purchased than it is overall; the higher the lift, the stronger the association.
Interactive network plot of rules and items:
References
Michael Hahsler (2024). An R Companion for Introduction to Data Mining. figshare. DOI: 10.6084/m9.figshare.26750404 URL: https://mhahsler.github.io/Introduction_to_Data_Mining_R_Examples/book/
Jan Kirenz (2024). Introduction to Association Rule Mining in R.