# Model Diagnostics
The following supplemental notes were created by Dr. Maria Tackett for STA 210. They are provided for students who want to dive deeper into the mathematics behind regression and reflect some of the material covered in STA 211: Mathematics of Regression. Additional supplemental notes will be added throughout the semester.
This document discusses some of the mathematical details of the model diagnostics: leverage, standardized residuals, and Cook's distance. We assume the reader is familiar with the matrix form for multiple linear regression. Please see Matrix Form of Linear Regression for a review.
## Introduction
Suppose we have $n$ observations, each consisting of a response variable $y$ and $p$ explanatory variables $x_1, x_2, \dots, x_p$. We assume the relationship between the response and the explanatory variables takes the form

$$
y = \beta_0 + \beta_1 x_1 + \dots + \beta_p x_p + \epsilon
$$ {#eq-model}

We can write the response for the $i^{th}$ observation as

$$
y_i = \beta_0 + \beta_1 x_{i1} + \dots + \beta_p x_{ip} + \epsilon_i
$$ {#eq-model-obs}

such that $\epsilon_i$ is the amount by which $y_i$ deviates from the mean response for its combination of explanatory variables, with $\epsilon_i \sim N(0, \sigma^2)$, where $\sigma^2$ is constant across observations.
## Matrix Form for the Regression Model
We can represent @eq-model and @eq-model-obs using matrix notation. Let

$$
\mathbf{Y} = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{bmatrix} \quad
\mathbf{X} = \begin{bmatrix} 1 & x_{11} & \dots & x_{1p} \\ 1 & x_{21} & \dots & x_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{n1} & \dots & x_{np} \end{bmatrix} \quad
\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_p \end{bmatrix} \quad
\boldsymbol{\epsilon} = \begin{bmatrix} \epsilon_1 \\ \epsilon_2 \\ \vdots \\ \epsilon_n \end{bmatrix}
$$

Thus,

$$
\mathbf{Y} = \mathbf{X}\boldsymbol{\beta} + \boldsymbol{\epsilon}
$$
Therefore the estimated response for a given combination of explanatory variables and the associated residuals can be written as
$$
\hat{\mathbf{Y}} = \mathbf{X}\hat{\boldsymbol{\beta}} \qquad \mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}}
$$ {#eq-matrix_mean}
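To make these quantities concrete, here is a minimal numerical sketch in Python with NumPy (the simulated data, seed, and variable names are illustrative assumptions, not part of the original notes). The fitted values and residuals are computed with a generic least-squares solver; the explicit formula for $\hat{\boldsymbol{\beta}}$ is recalled in the next section.

```python
import numpy as np

rng = np.random.default_rng(210)              # illustrative seed

n, p = 50, 2                                  # n observations, p predictors
X = np.column_stack([np.ones(n),              # leading column of 1s for the intercept
                     rng.normal(size=(n, p))])
beta = np.array([1.0, 2.0, -0.5])             # "true" coefficients for the simulation
Y = X @ beta + rng.normal(scale=0.5, size=n)  # Y = X beta + epsilon

beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0]  # least-squares estimate
Y_hat = X @ beta_hat                          # estimated responses
e = Y - Y_hat                                 # residuals
```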
## Hat Matrix & Leverage
Recall from the notes Matrix Form of Linear Regression that
$$
\hat{\boldsymbol{\beta}} = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y}
$$ {#eq-beta-hat}
Combining @eq-matrix_mean and @eq-beta-hat, we can write

$$
\hat{\mathbf{Y}} = \mathbf{X}\hat{\boldsymbol{\beta}} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{Y}
$$ {#eq-y-hat}
We define the hat matrix as the $n \times n$ matrix $\mathbf{H} = \mathbf{X}(\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T$, so that @eq-y-hat becomes

$$
\hat{\mathbf{Y}} = \mathbf{H}\mathbf{Y}
$$ {#eq-y-hat-matrix}
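Continuing the sketch above, we can form the hat matrix directly and confirm that multiplying it by $\mathbf{Y}$ reproduces the fitted values:

```python
# Hat matrix H = X (X^T X)^{-1} X^T; explicit inversion is fine at this small scale
H = X @ np.linalg.inv(X.T @ X) @ X.T

assert np.allclose(H @ Y, Y_hat)    # H "puts the hat on Y"

leverage = np.diag(H)               # h_ii, the leverage of each observation
```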
The diagonal elements of the hat matrix are a measure of how far the predictor variables of each observation are from the means of the predictor variables. For example, in simple linear regression,

$$
h_{ii} = \frac{1}{n} + \frac{(x_i - \bar{x})^2}{\sum_{j=1}^{n}(x_j - \bar{x})^2}
$$

We call these diagonal elements the leverage of each observation.
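As a quick numerical check of this closed-form expression, here is a hypothetical single-predictor example (separate from the multi-predictor sketch above):

```python
# Simple linear regression: compare diag(H) to the closed-form leverage
m = 30
x = rng.normal(size=m)
X1 = np.column_stack([np.ones(m), x])
H1 = X1 @ np.linalg.inv(X1.T @ X1) @ X1.T

h_closed = 1/m + (x - x.mean())**2 / np.sum((x - x.mean())**2)
assert np.allclose(np.diag(H1), h_closed)
```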
The diagonal elements of the hat matrix have the following properties:
- $\sum_{i=1}^{n} h_{ii} = p + 1$, where $p$ is the number of predictor variables in the model.
- The mean hat value is $\bar{h} = \frac{\sum_{i=1}^{n} h_{ii}}{n} = \frac{p+1}{n}$.

Using these properties, we consider a point to have high leverage if it has a leverage value that is more than 2 times the average. In other words, observations with leverage greater than $\frac{2(p+1)}{n}$ are considered to have high leverage.
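In code, the rule of thumb applied to the multi-predictor sketch might look like:

```python
threshold = 2 * (p + 1) / n                  # twice the mean hat value
high_leverage = np.flatnonzero(leverage > threshold)
print("high-leverage observations:", high_leverage)
```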
When there are high leverage points in the data, the regression line will tend towards those points; therefore, one property of high leverage points is that they tend to have small residuals. We will show this by rewriting the residuals from @eq-matrix_mean using @eq-y-hat-matrix.
$$
\mathbf{e} = \mathbf{Y} - \hat{\mathbf{Y}} = \mathbf{Y} - \mathbf{H}\mathbf{Y} = (\mathbf{I} - \mathbf{H})\mathbf{Y}
$$ {#eq-resid-hat}
Note that the identity matrix and the hat matrix are idempotent, i.e. $\mathbf{I}\mathbf{I} = \mathbf{I}$ and $\mathbf{H}\mathbf{H} = \mathbf{H}$, so $\mathbf{I} - \mathbf{H}$ is also idempotent. Applying this to @eq-resid-hat, the variance of the residuals is

$$
Var(\mathbf{e}) = Var\big((\mathbf{I} - \mathbf{H})\mathbf{Y}\big) = (\mathbf{I} - \mathbf{H})\,Var(\mathbf{Y})\,(\mathbf{I} - \mathbf{H})^T = \sigma^2(\mathbf{I} - \mathbf{H})
$$ {#eq-resid-var}

where $Var(\mathbf{Y}) = \sigma^2\mathbf{I}$, and the last step uses the symmetry and idempotency of $\mathbf{I} - \mathbf{H}$. In particular, the variance of the $i^{th}$ residual is $Var(e_i) = \sigma^2(1 - h_{ii})$: the higher the leverage of an observation, the smaller the variance of its residual, so its residual tends to be closer to 0.
## Standardized Residuals
In general, we standardize a value by shifting by the expected value and rescaling by the standard deviation (or standard error). Thus, the $i^{th}$ standardized residual takes the form

$$
std.res_i = \frac{e_i - E(e_i)}{SE(e_i)}
$$
The expected value of the residuals is 0, i.e. $E(e_i) = 0$, and from @eq-resid-var the standard error of the $i^{th}$ residual is $SE(e_i) = \hat{\sigma}\sqrt{1 - h_{ii}}$, where $\hat{\sigma}^2$ is the estimated regression variance. Thus,

$$
std.res_i = \frac{e_i}{\hat{\sigma}\sqrt{1 - h_{ii}}}
$$ {#eq-std-resid}
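Continuing the sketch above, a minimal computation of the standardized residuals (assuming the usual variance estimate $\hat{\sigma}^2 = \sum_i e_i^2 / (n - p - 1)$):

```python
# Estimated regression variance: SSE / (n - p - 1)
sigma2_hat = np.sum(e**2) / (n - p - 1)

# std.res_i = e_i / (sigma_hat * sqrt(1 - h_ii))
std_res = e / np.sqrt(sigma2_hat * (1 - leverage))
```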
## Cook’s Distance
Cook’s distance is a measure of how much each observation influences the model coefficients, and thus the predicted values. The Cook’s distance for the $i^{th}$ observation is

$$
D_i = \frac{(\hat{\mathbf{Y}} - \hat{\mathbf{Y}}_{(i)})^T(\hat{\mathbf{Y}} - \hat{\mathbf{Y}}_{(i)})}{(p+1)\hat{\sigma}^2}
$$ {#eq-cooksd}
where $\hat{\mathbf{Y}}_{(i)}$ is the vector of predicted values obtained when the model is fit with the $i^{th}$ observation removed. We can rewrite @eq-cooksd in terms of the standardized residuals and the leverage:

$$
D_i = \frac{1}{p+1}\, std.res_i^2 \left[\frac{h_{ii}}{1 - h_{ii}}\right]
$$ {#eq-cooksd-v2}
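Both forms can be checked against each other numerically, continuing the sketch above: the first refits the model without each observation (@eq-cooksd), while the second uses only the standardized residuals and leverage (@eq-cooksd-v2).

```python
# Cook's distance via leave-one-out refits (eq-cooksd)
D_refit = np.empty(n)
for i in range(n):
    keep = np.arange(n) != i
    beta_i = np.linalg.lstsq(X[keep], Y[keep], rcond=None)[0]
    diff = Y_hat - X @ beta_i       # Y_hat minus Y_hat_(i) at all n points
    D_refit[i] = (diff @ diff) / ((p + 1) * sigma2_hat)

# Cook's distance from standardized residuals and leverage (eq-cooksd-v2)
D_short = std_res**2 / (p + 1) * leverage / (1 - leverage)

assert np.allclose(D_refit, D_short)
```

The shortcut form shows that an observation needs both a large standardized residual and high leverage to have a large Cook’s distance, which is why it is the form typically used in practice.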