[Solved] How to use a loop to do linear regression in R

Question

If you really want to do this, it’s pretty trivial with lapply(), where we use it to “loop” over the other columns of df. A custom function takes each variable in turn as x and fits a model for that covariate.

df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

# pass the data frame through lapply()'s ... so that dat is defined inside the function
mods <- lapply(df[, -1], function(x, dat) lm(crim ~ x, data = dat), dat = df)

mods is now a list of lm objects. The names of mods contain the names of the covariates used to fit each model. The main drawback is that every model is fitted using a variable called x, so the slope is labelled x in each model's output rather than by the covariate's real name. More effort could probably solve this, but I doubt that effort is worth the time.
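If the real covariate names are wanted in each fit, one possible approach is to build the formulas from the column names rather than passing the columns themselves. This is only a sketch and not part of the answer above: it assumes base R's reformulate(), and the names covars and v are illustrative.

```r
set.seed(1)  # arbitrary seed, only so the dummy data are reproducible
df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

# every column except the response
covars <- setdiff(names(df), "crim")

# loop over the *names*; reformulate("rm", response = "crim") gives crim ~ rm
mods <- lapply(setNames(covars, covars), function(v) {
  lm(reformulate(v, response = "crim"), data = df)
})

# each slope is now labelled with its own covariate name instead of x
sapply(mods, function(m) names(coef(m))[2])
```

The list is still named by covariate, so mods$rm, mods$ad and mods$wd pick out the individual fits.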

If you are just selecting models, which may be dubious, there are other ways to achieve this. For example via the leaps package and its regsubsets function:

library("leaps")
a <- regsubsets(crim ~ ., data = df, nvmax = 1, nbest = ncol(df) - 1)
summa <- summary(a)

Then plot(a) will show which of the models is “best”, for example.
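For a numeric rather than graphical comparison, the object returned by summary() can be inspected directly. The sketch below assumes the leaps package is installed and uses the which and rsq components of the summary; the names covars and ranking are illustrative only.

```r
library("leaps")  # assumes the leaps package is installed

set.seed(1)  # arbitrary seed, only so the dummy data are reproducible
df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

a <- regsubsets(crim ~ ., data = df, nvmax = 1, nbest = ncol(df) - 1)
summa <- summary(a)

# summa$which is a logical matrix: one row per model, one column per term;
# drop the intercept column and read off the single covariate in each model
covars <- apply(summa$which[, -1, drop = FALSE], 1, function(r) names(r)[r])

# pair each single-covariate model with its R-squared, largest first
ranking <- data.frame(covariate = unname(covars), rsq = summa$rsq)
ranking <- ranking[order(ranking$rsq, decreasing = TRUE), ]
ranking
```

With random dummy data the R-squared values will all be small; the point is only how to get at them.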

Original

If I understand what you want (crim is a covariate and the other variables are the responses you want to predict/model using crim), then you don’t need a loop. You can do this using a matrix response in a standard lm().

Using some dummy data:

df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

We create a matrix (multivariate) response via cbind(), passing it the three response variables we’re interested in. The rest of the call to lm() is exactly the same as for a univariate response:

mods <- lm(cbind(rm, ad, wd) ~ crim, data = df)
mods

Call:
lm(formula = cbind(rm, ad, wd) ~ crim, data = df)

Coefficients:
             rm        ad        wd      
(Intercept)  -0.12026  -0.47653  -0.26419
crim         -0.26548   0.07145   0.68426

The summary() method produces a standard summary.lm output for each of the responses.
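As a rough illustration of working with the multivariate fit, the sketch below pulls out the coefficient matrix and the per-response summaries, and checks one column against a separate univariate fit. The seed and the names summ and single are assumptions made for the example.

```r
set.seed(42)  # arbitrary seed, only so the dummy data are reproducible
df <- data.frame(crim = rnorm(20), rm = rnorm(20), ad = rnorm(20), wd = rnorm(20))

mods <- lm(cbind(rm, ad, wd) ~ crim, data = df)

# coefficients come back as a matrix: one column per response
coef(mods)

# summary() returns one summary.lm per response, named "Response rm" etc.
summ <- summary(mods)
names(summ)

# e.g. extract R-squared for each response
sapply(summ, function(s) s$r.squared)

# each column matches the corresponding separate univariate fit
single <- lm(rm ~ crim, data = df)
all.equal(coef(mods)[, "rm"], coef(single))
```

So the matrix-response call is a shorthand for fitting the three regressions separately on the same covariate.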

Accepted Answer