[Solved] Using R – Read CSV, aggregate/percent column

The first step (reading in the CSV) is pretty simple:

    data = read.table("filename.csv", sep = ",", quote = "", header = TRUE)

The sep argument states that the separator is a comma, and header = TRUE tells R to treat the first row as column names rather than as data. Does that work for you? -Y
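For the aggregate/percent half of the question, a minimal sketch might look like the following. The column names (group, amount) and the sample rows are made up for illustration and do not come from the original question; the file is written to a temporary path only so the example is self-contained.

```r
# Hypothetical data written to a temporary CSV so the example runs on its own
csv_path <- tempfile(fileext = ".csv")
writeLines(c("group,amount", "a,10", "b,30", "a,20", "b,40"), csv_path)

# read.csv() wraps read.table() with sep = "," and header = TRUE as defaults
data <- read.csv(csv_path, stringsAsFactors = FALSE)

# Sum the (assumed) amount column within each group
totals <- aggregate(amount ~ group, data = data, FUN = sum)

# Add each group's share of the grand total as a percentage
totals$percent <- 100 * totals$amount / sum(totals$amount)
totals
```

Dividing by sum(totals$amount) keeps the percentages consistent with whatever rows were actually read, so they always add up to 100.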

[Solved] how to remove all text before a pattern? [duplicate]

To solve this problem you can use a positive lookahead, '.*(?=START)', as follows:

    # load environment
    library(stringr)

    # create text vector
    text = c('hello guys this is it START hi',
             'one two START this is good',
             'a longer example. I cannot believe it! START hello')

    # remove pattern
    text = str_remove(text, '.*(?=START)')

    # print output
    …
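The same lookahead also works without stringr. Here is a sketch using base R's sub() with perl = TRUE, offered as an alternative rather than as the answer's own method; the variable name cleaned is just for illustration.

```r
text <- c("hello guys this is it START hi",
          "one two START this is good",
          "a longer example. I cannot believe it! START hello")

# perl = TRUE enables lookahead syntax. '.*' is greedy, so everything up to
# the last occurrence of START is removed while START itself is kept.
cleaned <- sub(".*(?=START)", "", text, perl = TRUE)
cleaned
```

Strings that do not contain START are left unchanged, because the pattern then fails to match at all.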

[Solved] Adding standard deviation to barplot() in R

With base R, you can use the function arrows():

    barCenters <- barplot(height = Ymeans12stdev$mean,
                          main = "Average Time per Group",
                          xlab = "Group", ylab = "Time")
    arrows(barCenters, Ymeans12stdev$mean - Ymeans12stdev$sd,
           barCenters, Ymeans12stdev$mean + Ymeans12stdev$sd,
           angle = 90, code = 3)

The argument angle = 90 draws "flat" arrowheads (i.e. a horizontal bar on top of a vertical one), and the argument code = 3 specifies to …
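To try this end to end, here is a self-contained sketch. The contents of Ymeans12stdev are invented stand-ins (only the mean and sd column names come from the answer), and the ylim and length settings are additions of mine so the error bars fit inside the plot and get narrower caps.

```r
# Hypothetical summary table standing in for Ymeans12stdev
Ymeans12stdev <- data.frame(group = c("A", "B", "C"),
                            mean  = c(4.2, 5.1, 3.8),
                            sd    = c(0.5, 0.8, 0.3))

# Draw to a null PDF device so the sketch also runs non-interactively
pdf(NULL)

# Leave room above the tallest bar for its error bar
upper <- Ymeans12stdev$mean + Ymeans12stdev$sd
barCenters <- barplot(height = Ymeans12stdev$mean,
                      names.arg = Ymeans12stdev$group,
                      ylim = c(0, max(upper) * 1.1),
                      main = "Average Time per Group",
                      xlab = "Group", ylab = "Time")

# length sets the size of the flat caps, in inches
arrows(barCenters, Ymeans12stdev$mean - Ymeans12stdev$sd,
       barCenters, upper,
       angle = 90, code = 3, length = 0.1)

dev.off()
```

barplot() returns the x positions of the bar midpoints, which is why its result can be passed straight to arrows().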

[Solved] Can we develop a Python API which wraps R code [closed]

As mentioned in an earlier post, things that are not easy in R can be relatively simple in other languages. Another example would be connecting to Amazon Web Services. In relation to S3, although there are a number of existing packages, many of them seem to be deprecated, premature or platform-dependent. (I consider the cloudyr …

[Solved] Summarize data.frame by dates in R

I would use data.table and do something like the following:

    library(data.table)
    setDT(df)
    ndf <- df[, .(Date = paste(Date[1], "to", Date[.N]),
                  weather = Type.of.Weather[1]),
              rleid(Type.of.Weather)
              ][, rleid := NULL
              ][]
    ndf

                   Date weather
    1: 01-Jan to 03-Jan  Cloudy
    2: 04-Jan to 10-Jan   Rainy
    3: 11-Jan to 15-Jan  Cloudy
    4: 16-Jan to 20-Jan   Sunny
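If data.table is not available, the same run-length idea can be sketched in base R with rle(). This is an alternative technique, not the answer's method, and the six sample rows below are invented; only the column names Date and Type.of.Weather are taken from the answer.

```r
# Hypothetical input using the column names from the answer
df <- data.frame(
  Date = c("01-Jan", "02-Jan", "03-Jan", "04-Jan", "05-Jan", "06-Jan"),
  Type.of.Weather = c("Cloudy", "Cloudy", "Rainy", "Rainy", "Rainy", "Sunny"),
  stringsAsFactors = FALSE
)

# rle() finds consecutive runs of identical values
runs   <- rle(df$Type.of.Weather)
ends   <- cumsum(runs$lengths)       # last row index of each run
starts <- ends - runs$lengths + 1    # first row index of each run

ndf <- data.frame(Date = paste(df$Date[starts], "to", df$Date[ends]),
                  weather = runs$values,
                  stringsAsFactors = FALSE)
ndf
```

Like rleid(), this groups only consecutive rows, so a weather type that reappears later starts a new range instead of being merged with the earlier one.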

[Solved] Merge data frame based on column names in r

Working in Azure ML, this was the best option I found to automate this merge.

    library(data.table)
    df <- maml.mapInputPort(1)
    df2 <- maml.mapInputPort(2)
    # the := syntax below only works on data.tables, so convert first
    setDT(df)
    setDT(df2)
    # add the columns each table is missing, filled with NA
    if (length(df2.toAdd <- setdiff(names(df), names(df2))))
      df2[, c(df2.toAdd) := NA]
    if (length(df.toAdd <- setdiff(names(df2), names(df))))
      df[, c(df.toAdd) := NA]
    df3 <- rbind(df, df2, use.names = TRUE)
    maml.mapOutputPort("df3");
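Outside Azure ML, the same padding idea can be tried on plain data frames. The sketch below is a base R variant under my own assumptions: the two small frames and their columns (id, a, b) are invented, and the maml.* port functions are left out because they only exist inside Azure ML.

```r
# Two hypothetical frames that share only the id column
df  <- data.frame(id = 1:2, a = c("x", "y"), stringsAsFactors = FALSE)
df2 <- data.frame(id = 3:4, b = c(1.5, 2.5))

# Add each column that one frame lacks, filled with NA
df2[setdiff(names(df), names(df2))] <- NA
df[setdiff(names(df2), names(df))]  <- NA

# Base rbind() matches data frame columns by name, not by position
df3 <- rbind(df, df2)
df3
```

After padding, both frames have the same set of column names, which is the condition rbind() needs for data frames.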

[Solved] how to download all MS access attachments using R [closed]

Attachments in MS Access are special data types that actually involve nested tables for metadata (filename and filedata) information. Hence, you cannot access this data with DML SQL statements alone via RODBC, but you can through a COM interface, specifically by calling the DAO SaveToFile() method. Consider the following using the RDCOMClient package which allows interfacing …

[Solved] Replacing row elements in a dataframe based on values from another dataframe [duplicate]

Here's a stab:

    tableresults <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
      ACTIVITY_X ACTIVITY_Y ACTIVITY_Z winning_cluster
    1         19         21         28        cluster3
    2         20         14         24        cluster3
    3         34         35         49        cluster3
    4         18          5         19        cluster2
    5         23         27         35        cluster3
    6         33         20         39        cluster3")

    averagetable <- read.table(header=TRUE, stringsAsFactors=FALSE, text="
       Group.1  Standing
    1 cluster1 0.5642857
    2 cluster2 0.7795848
    …

[Solved] read mixed data into R

I have saved the one line of text you provided into a file called 'parseJSON.txt'. You can then read the file in as usual using read.table, then use library(jsonlite) to parse the 3rd column. I've also formatted the line of text to include quotes around the JSON code:

    factor1 param1 {"type": …

[Solved] Regression model to predict student’s grade in R

What you're looking for is a linear regression model. In R, it's invoked with lm(). You'd want to fit a model predicting the grade, and then run the model on the data with the Age incremented by one, since presumably that is the only attribute that will be changing next …
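A minimal sketch of that workflow follows. The data frame, its columns (Age, StudyHours, Grade) and every value in it are invented for illustration; only lm(), the Age attribute and the "increment Age by one" step come from the answer.

```r
# Hypothetical student data; the columns and values are made up
students <- data.frame(
  Age        = c(14, 15, 16, 15, 17, 16, 14, 17),
  StudyHours = c(5, 6, 8, 4, 9, 7, 3, 10),
  Grade      = c(62, 68, 80, 60, 88, 75, 55, 91)
)

# Fit a linear model predicting the grade from the other attributes
fit <- lm(Grade ~ Age + StudyHours, data = students)

# Same students one year later: only Age changes
next_year <- transform(students, Age = Age + 1)

# Predicted grades for next year
predicted <- predict(fit, newdata = next_year)
predicted
```

Because the model is linear and only Age changes, each prediction differs from this year's fitted value by exactly the Age coefficient, which is a useful sanity check on whether such a model suits the problem.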

[Solved] Code needed to convert call variable to two separate variables

This is how I would do it:

    data = data.frame(
      Well = c("A01", "A02", "A03", "A04", "A05", "A06", "A07", "A08"),
      Call = c("No Call", "No Call", "Allele 1", "Heterozygote",
               "Allele 1", "Allele 2", "Heterozygote", "Heterozygote"),
      stringsAsFactors = FALSE
    )
    g121map = c("No Call" = "", "Allele 1" = "+",
                "Allele 2" = "G1^{1384M}", "Heterozygote" = "G1^{1384M}")
    g122map = c("No Call" = "", "Allele 1" = "+",
                "Allele 2" = "G1^{1384M}", "Heterozygote" = "+")
    …
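The excerpt stops before the maps are applied. One common way to use such named vectors is as lookup tables, sketched below; this is my assumption about the next step, not the answer's confirmed code, and the output column names allele1 and allele2 are hypothetical.

```r
# Shortened version of the answer's data, one row per Call value
data <- data.frame(
  Well = c("A01", "A03", "A04", "A06"),
  Call = c("No Call", "Allele 1", "Heterozygote", "Allele 2"),
  stringsAsFactors = FALSE
)
g121map <- c("No Call" = "", "Allele 1" = "+",
             "Allele 2" = "G1^{1384M}", "Heterozygote" = "G1^{1384M}")
g122map <- c("No Call" = "", "Allele 1" = "+",
             "Allele 2" = "G1^{1384M}", "Heterozygote" = "+")

# Indexing a named vector by Call looks up one value per row;
# unname() drops the names carried over from the map.
# allele1 / allele2 are hypothetical column names.
data$allele1 <- unname(g121map[data$Call])
data$allele2 <- unname(g122map[data$Call])
data
```

A Call value missing from the maps would come back as NA, which makes unexpected categories easy to spot.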

[Solved] R computing not so fast

This isn't quite right, but maybe gives some indication of how to make this type of operation faster. Here's the data:

    url <- "http://pastebin.com/raw.php?i=hsGACr2L"
    dfi <- read.csv(url)

I calculate the product and cumulative sum of the product of price and volume. The calculation is vectorized, so it is fast.

    pv <- with(dfi, Price * Volume)
    cpv <- …
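To illustrate why the vectorized form is preferred, here is a sketch on a tiny invented data set. The four rows are made up (only the Price and Volume column names come from the answer), and using cumsum() for the running total is my assumption about the elided step.

```r
# Hypothetical trades standing in for the pastebin data
dfi <- data.frame(Price  = c(10, 11, 12, 11),
                  Volume = c(100, 200, 100, 50))

# Vectorized: one multiplication and one cumulative sum over whole columns
pv  <- with(dfi, Price * Volume)
cpv <- cumsum(pv)

# Equivalent explicit loop, shown only for comparison
cpv_loop <- numeric(nrow(dfi))
running <- 0
for (i in seq_len(nrow(dfi))) {
  running <- running + dfi$Price[i] * dfi$Volume[i]
  cpv_loop[i] <- running
}

cpv
```

Both versions give the same numbers; the vectorized one pushes the iteration into compiled code, which is where the speed-up on large inputs comes from.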