AE 11: MLR Inference + conditions

October 26, 2022


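library(tidyverse)   # read_csv(), ggplot2, dplyr
library(tidymodels)  # linear_reg(), set_engine(), fit(), tidy(), augment()
library(knitr)       # kable()

rail_trail <- read_csv("data/rail_trail.csv")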

Exercise 1

Below is the model predicting volume from hightemp and season.

rt_mlr_main_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(volume ~ hightemp + season, data = rail_trail)

tidy(rt_mlr_main_fit) |>
  kable(digits = 2)
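
| term         | estimate | std.error | statistic | p.value |
|:-------------|---------:|----------:|----------:|--------:|
| (Intercept)  |  -125.23 |     71.66 |     -1.75 |    0.08 |
| hightemp     |     7.54 |      1.17 |      6.43 |    0.00 |
| seasonSpring |     5.13 |     34.32 |      0.15 |    0.88 |
| seasonSummer |   -76.84 |     47.71 |     -1.61 |    0.11 |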

Add an interaction effect between hightemp and season to the model. Do the data provide evidence of a significant interaction effect? Comment on the significance of the interaction terms.

## add code
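As a reminder of the syntax, here is a minimal sketch of fitting an interaction between a quantitative and a categorical predictor. It deliberately uses the built-in mtcars data rather than rail_trail, so the variable names (mpg, wt, cyl) and the object names (cars, cars_int_fit) are illustrative assumptions, not part of this exercise.

```r
# Illustration only: built-in mtcars data, not rail_trail.
library(tidymodels)

# Treat cylinder count as categorical so it plays the role of a factor such as season.
cars <- mtcars %>%
  mutate(cyl = factor(cyl))

# `wt * cyl` expands to wt + cyl + wt:cyl (both main effects plus the interaction).
cars_int_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt * cyl, data = cars)

# Keep only the interaction rows; their terms contain a colon (wt:cyl6, wt:cyl8).
cars_int_terms <- tidy(cars_int_fit) %>%
  filter(str_detect(term, ":"))

cars_int_terms
```

Each interaction row tests whether the slope of the quantitative predictor in that level differs from the slope in the baseline level, so the p-values in these rows are the ones to comment on.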

Exercise 2

Below is the model predicting volume from all available predictors.

rt_full_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(volume ~ ., data = rail_trail)

tidy(rt_full_fit) |>
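  kable(digits = 2)

| term            | estimate | std.error | statistic | p.value |
|:----------------|---------:|----------:|----------:|--------:|
| (Intercept)     |    17.62 |     76.58 |      0.23 |    0.82 |
| hightemp        |     7.07 |      2.42 |      2.92 |    0.00 |
| avgtemp         |    -2.04 |      3.14 |     -0.65 |    0.52 |
| seasonSpring    |    35.91 |     32.99 |      1.09 |    0.28 |
| seasonSummer    |    24.15 |     52.81 |      0.46 |    0.65 |
| cloudcover      |    -7.25 |      3.84 |     -1.89 |    0.06 |
| precip          |   -95.70 |     42.57 |     -2.25 |    0.03 |
| day_typeWeekend |    35.90 |     22.43 |      1.60 |    0.11 |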

Fill in the code to plot the histogram of residuals with an overlay of the normal distribution based on the results of the model.
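For reference, the same kind of plot can be sketched on a different model. The example below uses the built-in mtcars data, and the names cars_fit, cars_aug, and resid_hist are hypothetical. It assumes the underlying lm object is reached through `$fit`, which is one way to pass a parsnip fit to broom's augment().

```r
# Illustration only: built-in mtcars data, not rail_trail.
library(tidymodels)  # also attaches ggplot2 and broom

cars_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt + hp, data = mtcars)

# augment() on the underlying lm object adds .fitted, .resid, and other columns.
cars_aug <- augment(cars_fit$fit)

# Density-scaled histogram so the normal curve sits on the same vertical scale.
resid_hist <- ggplot(cars_aug, aes(x = .resid)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1) +
  stat_function(
    fun = dnorm,
    args = list(mean = mean(cars_aug$.resid), sd = sd(cars_aug$.resid)),
    lwd = 2,
    color = "red"
  )

resid_hist
```

The histogram is drawn on the density scale because dnorm() returns densities, not counts. With raw counts on the y-axis the overlaid curve would look nearly flat.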


rt_full_aug <- augment(_______)

ggplot(rt_full_aug, aes(.resid)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 50) +
  stat_function(
    fun = dnorm,
    args = list(mean = mean(rt_full_aug$____), sd = ______), 
    lwd = 2, 
    color = "red"