9 Making rigorous conclusions

In this part, we introduce modelling and statistical inference for making data-based conclusions. We discuss building, interpreting, and selecting models; visualizing interaction effects; and prediction and model validation. Statistical inference is introduced from a simulation-based perspective, and the Central Limit Theorem is discussed briefly to lay the foundation for future coursework in statistics.

An RStudio Cloud workspace is available for the Data Science Course in a Box project. You can join the workspace and experiment with the sample application exercises.

9.1 Slides, videos, and application exercises

9.1.1 Modelling data

Unit 4 - Deck 1: The language of models

Unit 4 - Deck 2: Fitting and interpreting models

Unit 4 - Deck 3: Modelling nonlinear relationships

Unit 4 - Deck 4: Models with multiple predictors

Unit 4 - Deck 5: More models with multiple predictors

9.1.2 Classification and model building

Unit 4 - Deck 6: Logistic regression

Unit 4 - Deck 7: Prediction and overfitting

tidymodels :: Build a model

Unit 4 - Deck 8: Feature engineering

9.1.3 Model validation

Unit 4 - Deck 9: Cross validation

The Office + Feature engineering, Pt. 1

The Office + Cross validation, Pt. 2

9.1.4 Uncertainty quantification

Unit 4 - Deck 10: Quantifying uncertainty

Unit 4 - Deck 11: Bootstrapping

Unit 4 - Deck 12: Hypothesis testing

Unit 4 - Deck 13: Inference overview

9.2 Labs

Lab 10: Grading the professor, Pt. 1

Fitting and interpreting simple linear regression models

Lab 11: Grading the professor, Pt. 2

Fitting and interpreting multiple linear regression models

Lab 12: Smoking while pregnant

Constructing confidence intervals, conducting hypothesis tests, and interpreting results in the context of the data

9.3 Homework assignments

HW 7: Bike rentals in DC

Exploratory data analysis and fitting and interpreting models

HW 8: Exploring the GSS

Fitting and interpreting models

HW 9: Modelling the GSS

Model validation and inference