[Solved] Dataframe with string columns – each column needs to be split into multiple columns at the word “and” – R [closed]


Here is what worked for me, using inputs from above and various other threads on SO. I am a complete newbie to R, and my objective is to migrate work from Excel to R.

# returns string w/o leading or trailing whitespace
trim <- function (x) gsub("^\\s+|\\s+$", "", x)

#--------------------------------------------------------------------------------
# OBJECTIVE - migrate this activity from Excel + VBA to R
#
# Split each string and find the maximum number of columns required. Each
# element of the split result is a vector of variable length; the objective is
# to convert it into individual columns, with the number of columns equal to
# the maximum vector length. For rows with fewer entries, the additional
# columns will contain NA.
#--------------------------------------------------------------------------------

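# split at the whole word "and" only, so that words such as "sand" stay intact
# as.character() guards against PREV being a factor
temp_split <- strsplit(as.character(src.df$PREV), "\\band\\b", perl = TRUE)
max_col <- max(lengths(temp_split))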

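# pad each split result to exactly max_col entries
# rows with fewer entries get NA in the extra columns

add_list <- function(x, max_col) {
  u_l <- trim(unlist(x))               # strip whitespace left around each piece
  pad_col <- max_col - length(u_l)     # number of missing entries
  c(u_l, rep(NA_character_, pad_col))  # a real NA, not the string "NA"
}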

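test <- lapply(temp_split, add_list, max_col)
# one row per original row, one column per split piece
test_matrix <- data.frame(matrix(unlist(test), nrow = NROW(src.df), byrow = TRUE),
                          stringsAsFactors = FALSE)

# append the new columns to the original data frame
t.df <- test_matrix
c.df <- cbind(src.df, t.df)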
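
For anyone who wants to try this without the original data, here is a minimal self-contained sketch of the same split-and-pad idea. The sample `src.df` and its values are made up for illustration; only the column name `PREV` comes from the code above. The sketch assumes the split should happen only at the whole word "and" and that short rows should be padded with real `NA` values, and `pieces`, `padded` and `split.df` are names chosen here.

```r
# Hypothetical sample data; only the column name PREV comes from the answer.
src.df <- data.frame(
  ID   = 1:3,
  PREV = c("apple and banana and cherry", "sand and gravel", "solo"),
  stringsAsFactors = FALSE
)

# Same helper as in the answer: strips leading and trailing whitespace.
trim <- function(x) gsub("^\\s+|\\s+$", "", x)

# Split at the whole word "and" (so "sand" stays intact), then trim each piece.
pieces <- lapply(strsplit(src.df$PREV, "\\band\\b", perl = TRUE), trim)
max_col <- max(lengths(pieces))

# Pad every row to max_col entries with NA, then stack the rows into a matrix.
padded <- lapply(pieces, function(p) c(p, rep(NA_character_, max_col - length(p))))
split.df <- data.frame(do.call(rbind, padded), stringsAsFactors = FALSE)

# Append the new columns (named X1, X2, ... by data.frame) to the original data.
c.df <- cbind(src.df, split.df)
print(c.df)
```

Binding the padded vectors with `do.call(rbind, ...)` avoids having to pass the row count by hand, which is one less thing to keep in sync with `src.df`.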
