The table function does what you want:
table(df$V1, df$V2, useNA = "ifany")
table works on all distinct values. If you want blanks ("") to be treated the same as missing values (NA), you need to make that change in your data:
df[df == ""] = NA
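To see what that recoding changes, here is a minimal sketch on made-up data (the values below are hypothetical, not taken from your data set). Before the recoding, "" and NA show up as two separate rows; afterwards they are counted together under NA.

```r
# Toy data (hypothetical values): one blank and one NA in V1
df <- data.frame(
  V1 = c("a", "", NA, "a"),
  V2 = c("x", "x", "y", "y"),
  stringsAsFactors = FALSE
)

# Before: "" and NA are tabulated as two separate rows
before <- table(df$V1, df$V2, useNA = "ifany")

# Recode blanks as NA, then tabulate again
df[df == ""] <- NA
after <- table(df$V1, df$V2, useNA = "ifany")

print(before)
print(after)
```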
Similarly, if the "1 x" or "2 x" prefixes don't matter, get rid of them, for example by adding a new column without them:
df$goodname = gsub(pattern = "^[0-9]+ x ", replacement = "", x = df$V1)
table(df$goodname, df$V2, useNA = "ifany")
          BUX1_T10963 BUX1_T10964 BUX1_T10965 BUX1_T10966 BUX2_T10076
Bruit (U)           3           5           3           5           1
TAMAN (M)           0           0           1           0           0
TIKIam(T)           0           0           0           3           0
<NA>                0           4           1           2           0
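As a quick check of what that pattern matches, here is a small sketch on made-up strings (hypothetical values). The pattern only strips a run of digits followed by " x " at the very start of the string; strings without such a prefix pass through unchanged, and NA stays NA, which is where the <NA> row comes from.

```r
# Hypothetical inputs: with prefix, without prefix, multi-digit prefix, missing
x <- c("1 x Bruit (U)", "2 x Bruit (U)", "TAMAN (M)", "10 x TIKIam(T)", NA)

# "^" anchors at the start, "[0-9]+" matches one or more digits, " x " is literal
cleaned <- gsub(pattern = "^[0-9]+ x ", replacement = "", x = x)

print(cleaned)
```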
Pulling out the quantity into its own column and tabulating:
library(stringr)
# extract the number
df$quantity = as.numeric(str_extract(df$V1, "^[0-9]+"))
# any missing values assume to be 1
df$quantity[is.na(df$quantity)] = 1
library(reshape2)
dcast(data = df, formula = goodname ~ V2, value.var = "quantity", fun.aggregate = sum, na.rm = TRUE)
#    goodname BUX1_T10963 BUX1_T10964 BUX1_T10965 BUX1_T10966 BUX2_T10076
# 1 Bruit (U)           5           9           4           6           2
# 2 TAMAN (M)           0           0           1           0           0
# 3 TIKIam(T)           0           0           0           6           0
# 4      <NA>           0           4           1           2           0
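If you would rather not depend on stringr and reshape2, roughly the same idea can be sketched in base R with sub() and xtabs(). This is an alternative under my own assumptions, not the code above: the toy data and the has_qty helper are made up for illustration, and by default xtabs() leaves out rows whose goodname is NA, unlike the dcast output shown.

```r
# Toy data (hypothetical values, same layout as df above)
df <- data.frame(
  V1 = c("2 x Bruit (U)", "Bruit (U)", "1 x TAMAN (M)", "3 x Bruit (U)"),
  V2 = c("A", "A", "B", "B"),
  stringsAsFactors = FALSE
)

# Strip the leading "<n> x " prefix to get the item name
df$goodname <- sub("^[0-9]+ x ", "", df$V1)

# Rows without a numeric prefix are assumed to have quantity 1
has_qty <- grepl("^[0-9]+ x ", df$V1)
df$quantity <- rep(1, nrow(df))
df$quantity[has_qty] <- as.numeric(sub("^([0-9]+) x .*$", "\\1", df$V1[has_qty]))

# Sum quantity for each goodname / V2 combination
totals <- xtabs(quantity ~ goodname + V2, data = df)
print(totals)
```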