# Linear Models in Data Science (Statistical Linear Models)

undergraduate course, *IE-3500452*, 2023

# Description

- The course first reviews basic linear models in statistics, including simple regression, multiple regression, regression with categorical variables, and polynomial regression.
- After studying these basic linear regression models, we will focus on the general F-test, basic model selection methods, and building appropriate models.
- Time permitting, we will also consider how to deal with outliers and influential observations.
- The R statistical language, Python 3, and Minitab will be used in this class.
- We will also consider practical applications that are widely used in engineering.

# Objectives

Upon successful completion of this course, students will be able to:

- Program in statistical software (Minitab and R).
- Derive parameter estimates under the simple linear regression model.
- Perform basic statistical inference for the simple linear regression model.
- Use matrix algebra in regression models.
- Extend the simple linear regression model to the multiple linear regression model using matrix algebra.
- Set up polynomial regression models.
- Fit the multiple linear regression model and perform inference on it.
- Diagnose problems in fitted regression models.
- Know the general linear *F*-test.
- Use categorical predictor variables in the regression model setup.
- Use *all possible regressions*.
- Understand several model selection procedures.
- Build an appropriate model.
- Detect outliers and influential observations.
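
As a rough illustration of the first few objectives, the sketch below fits a simple linear regression in two ways (the closed-form estimates and the normal equations in matrix form) and then computes a general linear *F* statistic comparing an intercept-only model with the full model. The data and the function names (`fit_simple`, `fit_matrix`, `general_f`) are made up for this example and are not taken from the course materials; it uses plain Python rather than R or Minitab to stay self-contained.

```python
# Illustrative data (invented for this sketch, not from the course).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]


def fit_simple(x, y):
    """Closed-form least-squares estimates (b0, b1) for y = b0 + b1*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    return b0, b1


def fit_matrix(x, y):
    """Same estimates from the normal equations (X'X) b = X'y,
    where X has a column of ones and a column of x values."""
    n = len(x)
    sum_x = sum(x)
    sum_xx = sum(xi * xi for xi in x)
    sum_y = sum(y)
    sum_xy = sum(xi * yi for xi, yi in zip(x, y))
    # X'X = [[n, sum_x], [sum_x, sum_xx]] and X'y = [sum_y, sum_xy];
    # invert the 2x2 matrix by hand.
    det = n * sum_xx - sum_x * sum_x
    b0 = (sum_xx * sum_y - sum_x * sum_xy) / det
    b1 = (n * sum_xy - sum_x * sum_y) / det
    return b0, b1


def general_f(x, y):
    """General linear F statistic: reduced model (intercept only)
    versus full model (intercept and slope)."""
    n = len(x)
    b0, b1 = fit_simple(x, y)
    y_bar = sum(y) / n
    sse_reduced = sum((yi - y_bar) ** 2 for yi in y)           # df = n - 1
    sse_full = sum((yi - (b0 + b1 * xi)) ** 2
                   for xi, yi in zip(x, y))                    # df = n - 2
    df_reduced, df_full = n - 1, n - 2
    numerator = (sse_reduced - sse_full) / (df_reduced - df_full)
    denominator = sse_full / df_full
    return numerator / denominator


if __name__ == "__main__":
    print(fit_simple(x, y))
    print(fit_matrix(x, y))
    print(general_f(x, y))
```

Both fitting functions should agree up to floating-point rounding, which is the point of the matrix formulation: the same normal equations carry over unchanged to multiple regression, where the closed-form sums no longer apply. Turning the *F* statistic into a p-value requires the *F* distribution with (1, n - 2) degrees of freedom, which is left to a statistics package.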

# Why regression?

- Key Types of Regressions: Which One to Use?
- What is “linear” regression model?
- R Vs Python: What’s the Difference?
- Why R for Data Science – and not Python?

# Links

- Syllabus
- Class Notes
- R and Minitab Codes
- R4pda (R written in Korean)
- Minitab (trial SW)
- Minitab (manual)