# Multivariate Regression with 5 Variables in R

I am trying to analyse how different packaging factors affect the packing time. This is what I have so far.

```r
# Read the semicolon-separated export ("file" stands for the path to the CSV)
Shipset.data.frame <- read.csv2("file", header = TRUE, strip.white = TRUE)

# Elapsed days between Start and Outgoing (Excel serial dates, origin 1899-12-30)
over.all.raw <- as.Date(Shipset.data.frame$Outgoing, origin = "1899-12-30") -
                as.Date(Shipset.data.frame$Start, origin = "1899-12-30")
over.all.raw <- as.numeric(over.all.raw)

# Combine the response with the five predictors
new.df <- cbind(over.all.raw, Shipset.data.frame$Weight1,
                              Shipset.data.frame$Weightbrutto,
                              Shipset.data.frame$Volume,
                              Shipset.data.frame$ComponentsLC,
                              Shipset.data.frame$ComponentsPl)

new.df1 <- as.data.frame(new.df)
colnames(new.df1) <- c("Transporttime", "Weight1", "WeightBrutto",
                       "Volume", "ComponentsLC", "ComponentsPl")

# Keep only rows without missing values
clean.new.df1 <- new.df1[complete.cases(new.df1), ]
```

In this example I want to test how the weight (gross or net), the volume, and the number of components from different locations influence the packing time. This is how I built my linear model.

```r
# Named "fit" so the object does not mask the lm() function
fit <- lm(Transporttime ~ Weight1 + WeightBrutto + Volume + ComponentsLC + ComponentsPl,
          data = new.df1)

summary(fit)

Residuals:
    Min      1Q  Median      3Q     Max
-25.955  -5.074   2.408   7.676  27.353

Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  25.81903    0.49060  52.627  < 2e-16 ***
Weight1      -0.17205    0.09469  -1.817 0.069601 .
WeightBrutto  0.08748    0.07226   1.211 0.226390
Volume       -6.59973    1.65135  -3.997 7.04e-05 ***
ComponentsLC  0.04362    0.29912   0.146 0.884107
ComponentsPl  0.52863    0.14467   3.654 0.000276 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 10.57 on 774 degrees of freedom
  (94 observations deleted due to missingness)
Multiple R-squared:  0.09307,   Adjusted R-squared:  0.08722
F-statistic: 15.89 on 5 and 774 DF,  p-value: 6.453e-15
```
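The model above is fitted on `new.df1`, not on the complete-case frame, and the summary reports 94 observations deleted due to missingness. The sketch below uses simulated stand-in data (the values, the sample size, and the names `sim`, `fit.raw`, and `fit.clean` are assumptions for illustration, not the real shipment data) to show that `lm()` with the default `na.action` drops incomplete rows on its own, so fitting on the raw or the cleaned frame should give the same coefficients.

```r
# Simulated stand-in data; these are NOT the real shipment records.
set.seed(1)
n <- 200
sim <- data.frame(
  Weight1      = runif(n, 1, 50),
  Volume       = runif(n, 0.1, 2),
  ComponentsPl = rpois(n, 5)
)
sim$Transporttime <- 25 + 0.5 * sim$ComponentsPl + rnorm(n, sd = 10)

# Blank out 20 Volume values to mimic missing entries.
sim$Volume[sample(n, 20)] <- NA

# lm() drops incomplete rows itself (default na.action is na.omit) ...
fit.raw <- lm(Transporttime ~ Weight1 + Volume + ComponentsPl, data = sim)

# ... so fitting on the complete cases uses exactly the same rows.
sim.clean <- sim[complete.cases(sim), ]
fit.clean <- lm(Transporttime ~ Weight1 + Volume + ComponentsPl, data = sim.clean)

nobs(fit.raw)                               # number of rows actually used
nrow(sim.clean)                             # number of complete rows
all.equal(coef(fit.raw), coef(fit.clean))   # compares the two coefficient vectors
```

If the two fits agree in this way on the real data, the missing rows are handled consistently and are unlikely to explain the unexpected signs.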

There is one problem I already noticed: the result for ComponentsPl contradicts what I expected. Components from that location (PL) are delivered to outgoing faster, yet the model says that the more components come from there, the longer the packing takes. So "the more components from the faster storage location, the more time"... but anyway. The second fishy thing is Volume. I can think of some reasons why a high volume could lead to a shorter time, but the negative coefficient does not match my initial expectation.
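One check that might be relevant to coefficients with unexpected signs is whether the predictors are strongly correlated with each other, since net and gross weight presumably move together. The following is only a sketch on simulated data: the assumption that gross weight equals net weight plus a small packaging weight, the helper `vif.by.hand`, and all values are hypothetical. It computes the predictor correlations and a variance inflation factor as 1 / (1 - R²) in base R.

```r
# Simulated stand-in data; NOT the real shipment records.
set.seed(42)
n <- 500
net    <- runif(n, 1, 50)
gross  <- net + runif(n, 0.5, 3)   # assumption: gross = net + packaging weight
volume <- runif(n, 0.1, 2)
sim <- data.frame(
  Transporttime = 25 + 0.1 * gross + rnorm(n, sd = 10),
  Weight1       = net,
  WeightBrutto  = gross,
  Volume        = volume
)

preds <- c("Weight1", "WeightBrutto", "Volume")

# Pairwise correlations among the predictors
round(cor(sim[, preds]), 2)

# Hypothetical helper: VIF of one predictor, from regressing it on the others
vif.by.hand <- function(var, data, predictors) {
  others <- setdiff(predictors, var)
  r2 <- summary(lm(reformulate(others, response = var), data = data))$r.squared
  1 / (1 - r2)
}

vifs <- sapply(preds, vif.by.hand, data = sim, predictors = preds)
vifs
```

A common rule of thumb treats values above roughly 5 to 10 as a sign that the individual coefficients of those predictors are hard to interpret separately. Whether that applies to the real data would have to be checked on `clean.new.df1` itself.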

Can somebody review this model and point out where I made mistakes?