AE 11: MLR Inference + conditions

Trail riders

Published

October 26, 2022

Important

Go to the course GitHub organization and locate your ae-11- repo to get started.

The AE is due on GitHub by Saturday, October 29 at 11:59pm.

Packages

library(tidyverse)
library(tidymodels)
library(knitr)

Data

rail_trail <- read_csv("data/rail_trail.csv")

Exercise 1

Below is the model predicting volume from hightemp and season.

rt_mlr_main_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(volume ~ hightemp + season, data = rail_trail)

tidy(rt_mlr_main_fit) |>
  kable(digits = 2)
|term         | estimate| std.error| statistic| p.value|
|:------------|--------:|---------:|---------:|-------:|
|(Intercept)  |  -125.23|     71.66|     -1.75|    0.08|
|hightemp     |     7.54|      1.17|      6.43|    0.00|
|seasonSpring |     5.13|     34.32|      0.15|    0.88|
|seasonSummer |   -76.84|     47.71|     -1.61|    0.11|
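
Reading the estimates off the table above, the fitted main-effects model can be written out as below. This is a sketch under one assumption: the season that has no row in the table is the baseline level. The indicator notation is introduced here for illustration only.

```latex
\widehat{\text{volume}} = -125.23
  + 7.54 \times \text{hightemp}
  + 5.13 \times \mathbb{1}(\text{season} = \text{Spring})
  - 76.84 \times \mathbb{1}(\text{season} = \text{Summer})
```

For example, at a fixed high temperature the predicted volume on a summer day is 76.84 lower than on a baseline-season day.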

Add an interaction effect between hightemp and season to the model. Do the data provide evidence of a significant interaction effect? Comment on the significance of the interaction terms.

## add code
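
As a minimal sketch of the mechanics, the chunk below fits a main-effects model and an interaction model on R's built-in iris data rather than on rail_trail. The variables (Sepal.Length, Petal.Length, Species) and the object names iris_main_fit and iris_int_fit are stand-ins and are not part of this exercise. It shows how `*` in the formula adds the interaction terms, and one possible way to assess them jointly with a nested F-test.

```r
library(tidymodels)

# Main-effects model: one numeric and one three-level categorical predictor
iris_main_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(Sepal.Length ~ Petal.Length + Species, data = iris)

# Interaction model: `a * b` expands to a + b + a:b
iris_int_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(Sepal.Length ~ Petal.Length * Species, data = iris)

# Individual interaction terms appear as Petal.Length:Species<level>
tidy(iris_int_fit)

# Nested F-test comparing the underlying lm objects:
# do the interaction terms, taken together, improve the fit?
anova(iris_main_fit$fit, iris_int_fit$fit)
```

The p-values in the tidy output test each interaction term separately, while the F-test asks whether the interaction terms matter as a group. The two views can disagree, so it is worth looking at both.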

Exercise 2

Below is the model predicting volume from all available predictors.

rt_full_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(volume ~ ., data = rail_trail)

tidy(rt_full_fit) |>
  kable(digits = 2)
|term            | estimate| std.error| statistic| p.value|
|:---------------|--------:|---------:|---------:|-------:|
|(Intercept)     |    17.62|     76.58|      0.23|    0.82|
|hightemp        |     7.07|      2.42|      2.92|    0.00|
|avgtemp         |    -2.04|      3.14|     -0.65|    0.52|
|seasonSpring    |    35.91|     32.99|      1.09|    0.28|
|seasonSummer    |    24.15|     52.81|      0.46|    0.65|
|cloudcover      |    -7.25|      3.84|     -1.89|    0.06|
|precip          |   -95.70|     42.57|     -2.25|    0.03|
|day_typeWeekend |    35.90|     22.43|      1.60|    0.11|

Fill in the code to plot the histogram of residuals with an overlay of the normal distribution based on the results of the model.

Note

Change the chunk option to eval: true once you have filled in the code.

rt_full_aug <- augment(_______)

ggplot(rt_full_aug, aes(.resid)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 50) +
  stat_function(
    fun = dnorm, 
    args = list(mean = mean(rt_full_aug$____), sd = ______), 
    lwd = 2, 
    color = "red"
  )
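
For reference, here is a self-contained sketch of the same kind of plot on R's built-in mtcars data. The model (mpg ~ wt + hp), the object names mt_fit, mt_aug and resid_hist, and the bin width are hypothetical choices for illustration, not taken from this exercise. It assumes the residuals come from calling augment() on the underlying lm object, and that the normal curve uses the sample mean and standard deviation of those residuals.

```r
library(tidymodels)  # attaches broom (augment) and ggplot2

# Hypothetical stand-in model fit on built-in data
mt_fit <- linear_reg() %>%
  set_engine("lm") %>%
  fit(mpg ~ wt + hp, data = mtcars)

# augment() on the underlying lm object adds .fitted and .resid columns
mt_aug <- augment(mt_fit$fit)

# Histogram on the density scale so the normal curve is comparable
resid_hist <- ggplot(mt_aug, aes(x = .resid)) +
  geom_histogram(aes(y = after_stat(density)), binwidth = 1) +
  stat_function(
    fun = dnorm,
    args = list(mean = mean(mt_aug$.resid), sd = sd(mt_aug$.resid)),
    lwd = 1,
    color = "red"
  )

resid_hist
```

Because the model includes an intercept, the mean of the residuals is zero up to rounding error, so the curve is centered at zero. Its spread is set entirely by the standard deviation you supply.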