[Solved] colSums – shifted results

Accepted Answer

Now that you’ve posted your actual attempted solution in the comment to @ChristopherLouden’s answer (it looks suspiciously like the solution @Jilber offered to a question from earlier today), I can finally reproduce your problem and offer a solution.

For the sake of simplicity, here’s a much smaller data.frame to work with. Note that it has two non-numeric columns (one character and one factor). Something this small is enough to demonstrate your problem and is much easier for others to follow.

data <- structure(list(Name = c("a", "b", "c", "d"), 
    time1 = c(6692.50136510743, 41682.9111356503, 405946.374877924, 
    4640.34876265179), time2 = c(14404.8414547167, 40466.9047986558, 
    638019.540242027, 2397.71968447607), time3 = c(10146.3608040476, 
    34148.4389867747, 459639.431186888, 10490.8359468475), 
    New = structure(1:4, .Label = c("A", "B", "C", "D"), 
    class = "factor")), .Names = c("Name", "time1", "time2", "time3", 
    "New"), class = "data.frame", row.names = c(NA, 4L))
data
#   Name      time1     time2     time3 New
# 1    a   6692.501  14404.84  10146.36   A
# 2    b  41682.911  40466.90  34148.44   B
# 3    c 405946.375 638019.54 459639.43   C
# 4    d   4640.349   2397.72  10490.84   D

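Here is your current solution, complete with the strange “shifting” of column means. The cause: colMeans returns a vector of only three values, and rbind fills the new row by position rather than by column name, recycling the vector across all five columns. So the time1 mean lands in Name, the time2 mean in time1, the time3 mean in time2, the time1 mean again in time3, and New ends up NA because the recycled value is not a valid factor level (that is the warning suppressWarnings hides).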

df <- suppressWarnings(
  rbind(data, colMeans=colMeans(data[, sapply(data, is.numeric)])))
df
#                      Name      time1     time2     time3  New
# 1                       a   6692.501  14404.84  10146.36    A
# 2                       b  41682.911  40466.90  34148.44    B
# 3                       c 405946.375 638019.54 459639.43    C
# 4                       d   4640.349   2397.72  10490.84    D
# colMeans 114740.534035333 173822.252 128606.27 114740.53 <NA>
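
To make the positional recycling concrete, here is a minimal sketch that mimics it by hand. It does not call rbind's internals; it only assumes, as the output above suggests, that the three means are recycled across the five columns in order. The `toy` data.frame and its values are made up for illustration.

```r
# Toy data with the same layout as above (values are made up for illustration).
toy <- data.frame(
  Name  = c("a", "b"),
  time1 = c(1, 3),
  time2 = c(10, 30),
  time3 = c(100, 300),
  New   = factor(c("A", "B")),
  stringsAsFactors = FALSE
)

# Three means only: time1 = 2, time2 = 20, time3 = 200.
means <- colMeans(toy[sapply(toy, is.numeric)])

# Recycle the three means across all five columns, by position only.
recycled <- rep_len(unname(means), ncol(toy))

# Pair each column with the value that would land in it.
landing <- data.frame(column = names(toy), value = recycled)
print(landing)
#   column value
# 1   Name     2
# 2  time1    20
# 3  time2   200
# 4  time3     2
# 5    New    20
```

Every mean sits one column to the left of where it belongs, which is the same shift shown above. In the real rbind call the last value is additionally turned into NA, because it is not a level of the factor column.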

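The solution I’m offering uses rbind.fill from “plyr” to bind the results to your original data.frame. The means are calculated only on the numeric columns; transposing them with t() and wrapping the result in data.frame() gives a one-row data.frame with named columns, which rbind.fill matches by name, filling the columns missing from that row (Name and New) with NA.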

library(plyr) ## For `rbind.fill`
useme <- sapply(data, is.numeric)
rbind.fill(data, data.frame(t(colMeans(data[useme]))))
#   Name      time1     time2     time3  New
# 1    a   6692.501  14404.84  10146.36    A
# 2    b  41682.911  40466.90  34148.44    B
# 3    c 405946.375 638019.54 459639.43    C
# 4    d   4640.349   2397.72  10490.84    D
# 5 <NA> 114740.534 173822.25 128606.27 <NA>

mean(data$time1) ## Just for verification...
# [1] 114740.5
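
If you would rather not load “plyr”, a base R alternative (not the approach used above, just a sketch of another option) is to assign the means into a new last row while addressing only the numeric columns. The `toy` data.frame and its values are again made up for illustration.

```r
# Toy data (made-up values) with a character, three numeric, and a factor column.
toy <- data.frame(
  Name  = c("a", "b"),
  time1 = c(1, 3),
  time2 = c(10, 30),
  time3 = c(100, 300),
  New   = factor(c("A", "B")),
  stringsAsFactors = FALSE
)

useme <- vapply(toy, is.numeric, logical(1))  # TRUE for time1, time2, time3

out <- toy
# The means are computed from the original rows first, then written into a
# new last row in the numeric columns only; the other columns become NA.
out[nrow(toy) + 1, useme] <- colMeans(toy[useme])
out
```

Because the assignment names the target columns explicitly, nothing is recycled into Name or New, so the means stay under the columns they were computed from.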