Understanding RSS, TSS, ESS & R² in Regression
In regression analysis, we measure how well a model explains variation in data using three core quantities: Total Sum of Squares (TSS), Residual Sum of Squares (RSS), and Explained Sum of Squares (ESS).
Learning Goal
Understand how total variation in data is decomposed into explained and unexplained parts.
Key Definitions
1️⃣ Total Sum of Squares (TSS)
Definition: Measures total variation in y around its mean.
TSS = Σ (y_i - y_mean)^2
- y_i → actual values
- y_mean → mean of y
2️⃣ Residual Sum of Squares (RSS)
Definition: Measures unexplained variation (model error).
RSS = Σ (y_i - y_hat_i)^2
- y_hat_i → predicted values
3️⃣ Explained Sum of Squares (ESS)
Definition: Measures variation explained by the model.
ESS = Σ (y_hat_i - y_mean)^2
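The three definitions can be sketched directly in plain Python. The data below is a toy example (assumed values), and `y_hat` holds predictions from a least-squares line fitted to that data:

```python
# Toy data (assumed values). y_hat comes from the least-squares
# line y_hat = 1.9 * x fitted to x = [1, 2, 3, 4].
y     = [2.0, 4.0, 5.0, 8.0]   # actual values y_i
y_hat = [1.9, 3.8, 5.7, 7.6]   # predicted values y_hat_i

y_mean = sum(y) / len(y)       # mean of y (here 4.75)

tss = sum((yi - y_mean) ** 2 for yi in y)              # total variation
rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # unexplained (error)
ess = sum((yh - y_mean) ** 2 for yh in y_hat)          # explained by the model

print(tss, rss, ess)  # ≈ 18.75, 0.7, 18.05
```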
The Fundamental Relationship
TSS = ESS + RSS
The total variability in y is split into:
- Explained part (ESS)
- Unexplained part (RSS)
Visual Interpretation (Conceptual)
Think of It Geometrically
- TSS → distance from actual points to the mean
- ESS → distance from predictions to the mean
- RSS → distance from actual points to predictions
Graphically:
- Mean line → baseline model
- Regression line → improved model
- Vertical gaps → residuals
Coefficient of Determination (R²)
R^2 = ESS / TSS
R^2 = 1 - (RSS / TSS)
Interpretation
- R² = 1 → Perfect fit
- R² = 0 → No improvement over mean
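A quick sketch showing that the two formulas agree whenever TSS = ESS + RSS holds. The sums below are assumed values from a hypothetical least-squares fit:

```python
# Assumed sums from a hypothetical least-squares fit.
tss, rss, ess = 18.75, 0.70, 18.05

r2_explained = ess / tss        # R^2 = ESS / TSS
r2_residual  = 1 - rss / tss    # R^2 = 1 - RSS / TSS
print(round(r2_explained, 4), round(r2_residual, 4))  # both ≈ 0.9627

# Baseline check: predicting y_mean for every point makes RSS equal to
# TSS, so R^2 = 1 - TSS/TSS = 0 — no improvement over the mean model.
```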
Step-by-Step Example Logic
How to Compute These in Practice
- Compute y_mean
- Calculate TSS using actual values
- Fit regression → obtain y_hat
- Calculate RSS
- Compute ESS = TSS − RSS
- Compute R²
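The six steps above can be sketched as one function (`regression_sums` is a hypothetical helper name; the fit is simple least squares with an intercept):

```python
def regression_sums(x, y):
    n = len(x)
    y_mean = sum(y) / n                                    # 1. compute y_mean
    tss = sum((yi - y_mean) ** 2 for yi in y)              # 2. TSS from actuals
    x_mean = sum(x) / n                                    # 3. fit regression
    slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
             / sum((xi - x_mean) ** 2 for xi in x))
    intercept = y_mean - slope * x_mean
    y_hat = [intercept + slope * xi for xi in x]           #    ...obtain y_hat
    rss = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))  # 4. RSS
    ess = tss - rss                                        # 5. ESS = TSS - RSS
    r2 = 1 - rss / tss                                     # 6. R²
    return tss, rss, ess, r2
```

Example call on toy data: `regression_sums([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])` returns TSS, RSS, ESS, and R² for that fit.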
Final Summary
- TSS → Total variability
- ESS → Explained variability
- RSS → Unexplained variability
End of Interactive Learning Guide