[Solved] R create a matrix of occurrence [closed]

OK, this is still not perfectly clear, but I THINK that presence is now an adjacency matrix, where the rows represent users and the columns represent events, so presence[i,j] indicates that user i attended event j. If I’m interpreting it correctly, then counts seems to be the co-occurrence matrix, correct? counts[i,j] should record the number of events … Read more
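
To make the co-occurrence idea concrete, here is a minimal sketch. It assumes presence is a 0/1 matrix with one row per user and one column per event; the toy data and the user/event names are hypothetical, and the original answer may compute counts differently.

```r
# Hypothetical toy data: 3 users (rows) x 4 events (columns), 1 = attended.
presence <- matrix(c(1, 1, 0, 0,
                     1, 0, 1, 0,
                     1, 1, 1, 0),
                   nrow = 3, byrow = TRUE,
                   dimnames = list(paste0("user", 1:3), paste0("event", 1:4)))

# tcrossprod(presence) is presence %*% t(presence).
# Entry [i, j] counts the events that both user i and user j attended;
# the diagonal holds each user's total attendance.
counts <- tcrossprod(presence)
print(counts)
```

If the matrix were stored the other way round (events in rows), crossprod(presence) would give the user-by-user counts instead.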

[Solved] Split a group of lists R [closed]

Suppose you have a list of lists, and each element is named:

    ll <- list(jan = list(1:3), feb = list(4:6), mar = list(7:9))
    ls()
    # [1] "ll"

You can use list2env to assign the list components into the global environment:

    list2env(ll, globalenv())
    ls()
    # [1] "feb" "jan" "ll"  "mar"
    jan
    # [[1]]
    # [1] 1 … Read more

[Solved] How to create a loop that will make regression models in R? [closed]

Here’s a solution without loops.

    # some artificial data
    set.seed(1)
    daf <- data.frame(species = factor(paste0("species", c(rep(1:3, 10)))),
                      year = rep(2000:2009, 3),
                      x = sample(1:100, 30))

    library(dplyr)
    library(broom)

    lm_fit <- daf %>%
      group_by(species) %>%
      do(fit = lm(x ~ year, .))
    tidy(lm_fit, fit)
    # or as.data.frame(tidy(lm_fit, fit)) to get a data.frame
    #
    # A tibble: 6 x … Read more
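
For comparison, the same one-model-per-group idea can be sketched in base R alone. This is an alternative to the dplyr/broom approach, not the answer's own method; it reuses the artificial data from the excerpt, and the names fits and coefs are made up for illustration.

```r
# Same artificial data as in the excerpt.
set.seed(1)
daf <- data.frame(species = factor(paste0("species", rep(1:3, 10))),
                  year = rep(2000:2009, 3),
                  x = sample(1:100, 30))

# Split the data by species and fit one linear model per piece.
fits <- lapply(split(daf, daf$species), function(d) lm(x ~ year, data = d))

# Collect intercept and slope into a matrix with one row per species.
coefs <- t(sapply(fits, coef))
print(coefs)
```

Keeping the fitted models in a named list makes it easy to call summary() or predict() on any single species later.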

[Solved] I want to summarize by a column and then have it take the sum of 1 column and the mean of another column

The crucial point in OP’s approach is the staggered aggregation (see the related question row not consolidating duplicates in R when using multiple months in Date Filter). The OP wants to aggregate data across a number of files which apparently are too large to be loaded altogether and combined into a large data.table. Instead, each … Read more

[Solved] Can somebody help me understand “for” functions?

An easy way to understand how loops work in any programming language is to follow, step by step, what they are doing. Here is a simple example of how you can “debug” on paper. This could be useful in the future. The table below represents each iteration of your loops and the values of each operation. … Read more
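
The same step-by-step tracing can be done in code by printing the state on every pass. The loop below is a hypothetical example (a running total of 1 to 4), not the loop from the question.

```r
# Hypothetical loop: accumulate a running total of 1..n.
n <- 4
total <- 0
totals <- numeric(n)

for (i in seq_len(n)) {
  total <- total + i
  totals[i] <- total
  # Print the state after each iteration, like one row of a paper table.
  cat("iteration", i, ": total =", total, "\n")
}

# The full trace as a table: one row per iteration.
trace <- data.frame(i = seq_len(n), total = totals)
print(trace)
```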

[Solved] How do I replace all NA with mean in R? [duplicate]

We can use na.aggregate from zoo. Loop through the columns of the dataset (assuming all the columns are numeric), apply na.aggregate to replace each NA with the column mean (the default), and assign the result back to the dataset.

    library(zoo)
    df[] <- lapply(df, na.aggregate)

By default, the FUN argument of na.aggregate is mean: Default S3 method: … Read more
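
If zoo is not available, the same replacement can be sketched in base R. This assumes, as above, that every column is numeric; the data frame here is a hypothetical toy example.

```r
# Hypothetical data frame with missing values in numeric columns.
df <- data.frame(a = c(1, NA, 3), b = c(NA, 4, 6))

# Replace each NA with the mean of the non-missing values in its column.
df[] <- lapply(df, function(col) {
  col[is.na(col)] <- mean(col, na.rm = TRUE)
  col
})
print(df)
```

Assigning into df[] rather than df keeps the result a data frame instead of a plain list.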

[Solved] combine duplicates, do not publish blanks, dplyr::distinct

Perhaps the reason that you are having problems is that you are using empty strings when you should be using NAs. This is what I would assume is the idiomatic code.

    df <- data.frame(unique_id = c(rep(1,3),rep(2,3)),
                     school = c(rep('great',3),rep('spring',3)),
                     subject = rep(c("Math", "English", "History"),2),
                     grade = c(88,78,98,65,72,84),
                     sex = c(NA,NA, "male", NA, "female", NA))
    r2 … Read more
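
If the data already contains empty strings, one way to get to the NA form is to convert them first. This is a minimal sketch on a hypothetical data frame; the column names only loosely mirror the example above.

```r
# Hypothetical input that uses "" where a value is missing.
df <- data.frame(unique_id = c(1, 1, 2),
                 sex = c("", "male", ""),
                 stringsAsFactors = FALSE)

# Turn empty strings into proper missing values.
df$sex[df$sex == ""] <- NA
print(df)
```

Once the blanks are NA, functions such as is.na() and na.omit() treat them as missing, which empty strings do not allow.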

[Solved] data type conversion in R [closed]

It seems your code needs to be cleaned up a bit:

    if(lastFlag!=1){
      lastDay <<- rawData$Date[nrow(rawData)] # last complete day
      rawData <<- subset.data.frame(rawData,rawData$Date < lastDay)
    }

I think “rawData” could be an empty data frame here, and that case is not checked. Let’s suppose:

    rawData <- data.frame(x=c(), y=c())
    tProcRows <- 100

So:

    tProcRows <- tProcRows + as.numeric(row.names(rawData)[nrow(rawData)])
    print(tProcRows)

Output: … Read more
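
One possible guard for the empty case is to check nrow() before using the last row name. This is a sketch under the same assumptions as the example (an empty rawData and tProcRows starting at 100), not necessarily the fix the original answer goes on to propose.

```r
rawData <- data.frame(x = c(), y = c())  # empty, as in the example
tProcRows <- 100

# Only add the last row name when there is at least one row;
# with zero rows the indexed row name would be a zero-length vector.
if (nrow(rawData) > 0) {
  tProcRows <- tProcRows + as.numeric(row.names(rawData)[nrow(rawData)])
}
print(tProcRows)
```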

[Solved] Comparing two version of the same string

Here’s a tidyverse approach:

    library(dplyr)
    library(tidyr)

    # put data in a data.frame
    data_frame(string = unlist(data)) %>%
      # add ID column so we can recombine later
      add_rownames('id') %>%
      # add a lagged column to compare against
      mutate(string2 = lag(string)) %>%
      # break strings into words
      separate_rows(string) %>%
      # evaluate the following calls rowwise (until regrouped)
      rowwise() … Read more