[Solved] Calculate conditional mean in R with dplyr (like group by in SQL) [duplicate]


I think what you are looking for (if you want to use dplyr) is a combination of the functions group_by and mutate.

library(dplyr)
city <- c("a", "a", "b", "b", "c")
temp <- 1:5
df <- data.frame(city, temp)

df %>% group_by(city) %>% mutate(mean(temp))

Which would output:

    city  temp mean(temp)
  (fctr) (int)      (dbl)
1      a     1        1.5
2      a     2        1.5
3      b     3        3.5
4      b     4        3.5
5      c     5        5.0
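
If you want one row per city instead, closer to what GROUP BY returns in SQL, summarise might be a better fit than mutate. Here is a minimal sketch on the same data; the names mean_temp, with_mean and per_city are my own choices for illustration and are not part of the original question.

```r
library(dplyr)

df <- data.frame(city = c("a", "a", "b", "b", "c"), temp = 1:5)

# mutate() keeps all 5 rows; naming the column avoids the "mean(temp)" header
with_mean <- df %>%
  group_by(city) %>%
  mutate(mean_temp = mean(temp)) %>%
  ungroup()

# summarise() collapses to one row per city, roughly like
# SELECT city, AVG(temp) FROM df GROUP BY city
per_city <- df %>%
  group_by(city) %>%
  summarise(mean_temp = mean(temp))
```

Calling ungroup() after mutate is optional, but it stops later steps from silently operating per group.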

On a side note, I do not think 50,000 rows is that big a data set for dplyr. I would not worry too much unless this code is going to run inside some kind of loop or you have 1M+ rows. As Heroka suggested in the comments, data.table is a better alternative in most cases when performance matters.
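
For reference, a rough data.table equivalent could look like the sketch below. It assumes the data.table package is installed; the names dt, mean_temp and per_city_dt are only illustrative.

```r
library(data.table)

dt <- data.table(city = c("a", "a", "b", "b", "c"), temp = 1:5)

# := adds the column by reference (no copy of dt is made);
# by = city computes the mean separately for each city
dt[, mean_temp := mean(temp), by = city]

# One row per city, the counterpart of summarise()
per_city_dt <- dt[, .(mean_temp = mean(temp)), by = city]
```

Whether this is actually faster depends on the data, so it is worth timing both versions on your own table before switching.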

