📘 Linear Regression — Full Concept + Math + Intuition
📑 Table of Contents
- What is Linear Regression
- Why Do We Need It?
- Deep Intuition
- Dataset
- RSS Explained
- Full Derivation
- Solution
- Code
- Key Takeaways
📌 What is Linear Regression?
Linear Regression is a statistical and machine learning technique used to model the relationship between variables.
It answers a simple question: "Can we predict an output (y) from an input (x)?"
The model assumes a linear relationship:
ŷ = β0 + β1x
- β0 → Intercept (the predicted value when x = 0)
- β1 → Slope (how much ŷ changes for a one-unit increase in x)
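To make this concrete, here is a minimal Python sketch of what the model computes once the coefficients are known (the values 1 and 1 used below are the ones we derive later in this post):

```python
def predict(x, beta0, beta1):
    """Linear model: y-hat = beta0 + beta1 * x."""
    return beta0 + beta1 * x

# With beta0 = 1 and beta1 = 1 (derived below), x = 3 predicts 4
print(predict(3, beta0=1, beta1=1))  # 4
```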
❓ Why Do We Need Linear Regression?
In real life, relationships exist everywhere:
- Hours studied → Marks scored
- Ad spend → Sales
- Experience → Salary
Linear regression helps us quantify and predict these relationships.
🧠 Deep Intuition
Imagine plotting points on a graph. There are infinite lines you could draw.
But we want the "best" line.
Best means:
- Closest to all points
- Minimum total error
Instead of guessing, we use math to find this optimal line.
📊 Dataset
| x | y |
|---|---|
| 1 | 2 |
| 2 | 3 |
📉 Residual Sum of Squares (RSS)
Residual = Actual - Predicted
RSS measures total squared error.
RSS = (2 - (β0 + β1*1))^2 + (3 - (β0 + β1*2))^2
Why square?
- Avoid negative cancellation
- Penalize large errors more
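RSS is easy to compute directly for our two-point dataset. The helper below (the names are mine, not from any library) evaluates it at any candidate (β0, β1):

```python
# Two-point dataset from the table above
xs = [1, 2]
ys = [2, 3]

def rss(beta0, beta1):
    """Residual Sum of Squares: sum of (actual - predicted)^2."""
    return sum((y - (beta0 + beta1 * x)) ** 2 for x, y in zip(xs, ys))

print(rss(0, 0))  # 13 -- matches the constant term in the expansion below
print(rss(1, 1))  # 0  -- the best-fit line passes through both points
```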
📐 Full Step-by-Step Derivation (Deep Explanation)
Step 1: Start with RSS
RSS = (2 - β0 - β1)^2 + (3 - β0 - 2β1)^2
Step 2: Expand each term
(2 - β0 - β1)^2 = (2 - β0 - β1)(2 - β0 - β1)
                = 4 - 4β0 - 4β1 + β0^2 + 2β0β1 + β1^2

(3 - β0 - 2β1)^2 = (3 - β0 - 2β1)(3 - β0 - 2β1)
                 = 9 - 6β0 - 12β1 + β0^2 + 4β0β1 + 4β1^2
Step 3: Add both expressions
RSS = (4 + 9)
+ (β0^2 + β0^2)
+ (β1^2 + 4β1^2)
+ (2β0β1 + 4β0β1)
+ (-4β0 - 6β0)
+ (-4β1 - 12β1)
RSS = 13 + 2β0^2 + 5β1^2 + 6β0β1 - 10β0 - 16β1
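A quick numerical cross-check of this expansion: evaluated at an arbitrary (β0, β1), it should agree with the direct definition of RSS. A minimal sketch:

```python
def rss_expanded(b0, b1):
    """Expanded form: 13 + 2*b0^2 + 5*b1^2 + 6*b0*b1 - 10*b0 - 16*b1."""
    return 13 + 2 * b0**2 + 5 * b1**2 + 6 * b0 * b1 - 10 * b0 - 16 * b1

# Compare against the direct definition at an arbitrary point
b0, b1 = 0.3, -1.7
direct = (2 - b0 - b1) ** 2 + (3 - b0 - 2 * b1) ** 2
print(abs(direct - rss_expanded(b0, b1)) < 1e-9)  # True
```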
Step 4: Take derivative w.r.t β0
Only the terms containing β0 matter; the constant 13 and the 5β1^2 term differentiate to zero:
d(RSS)/dβ0 = d/dβ0 (2β0^2 + 6β0β1 - 10β0) = 4β0 + 6β1 - 10
Step 5: Take derivative w.r.t β1
Likewise, only the terms containing β1 survive:
d(RSS)/dβ1 = d/dβ1 (5β1^2 + 6β0β1 - 16β1) = 6β0 + 10β1 - 16
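If you have SymPy available, both partial derivatives can be verified symbolically; a small sanity-check sketch:

```python
import sympy as sp

b0, b1 = sp.symbols('b0 b1')
RSS = (2 - b0 - b1) ** 2 + (3 - b0 - 2 * b1) ** 2

print(sp.expand(sp.diff(RSS, b0)))  # 4*b0 + 6*b1 - 10
print(sp.expand(sp.diff(RSS, b1)))  # 6*b0 + 10*b1 - 16
```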
Step 6: Set derivatives to zero
4β0 + 6β1 = 10
6β0 + 10β1 = 16
Step 7: Solve using elimination
Multiply the first equation by 3: 12β0 + 18β1 = 30
Multiply the second equation by 2: 12β0 + 20β1 = 32
Subtract: (12β0 + 20β1) - (12β0 + 18β1) = 32 - 30
2β1 = 2
β1 = 1
Substitute into the first equation:
4β0 + 6(1) = 10
4β0 = 4
β0 = 1
Final Result:
β0 = 1, β1 = 1
ŷ = x + 1
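The same 2x2 system can also be handed to NumPy as a sanity check on the elimination; a minimal sketch:

```python
import numpy as np

# Normal equations from Step 6: 4*b0 + 6*b1 = 10 and 6*b0 + 10*b1 = 16
A = np.array([[4.0, 6.0], [6.0, 10.0]])
b = np.array([10.0, 16.0])

beta0, beta1 = np.linalg.solve(A, b)
print(beta0, beta1)  # 1.0 1.0 -- matching the elimination above
```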
🧮 Solving Equations
Set derivatives = 0 to find minimum:
4β0 + 6β1 = 10
6β0 + 10β1 = 16
Solving gives:
β0 = 1, β1 = 1
Final Model:
ŷ = x + 1
💻 Code Example
```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Dataset from the table above; sklearn expects X as a 2-D column
X = np.array([1, 2]).reshape(-1, 1)
y = np.array([2, 3])

# Fit ordinary least squares: minimizes the same RSS we derived by hand
model = LinearRegression()
model.fit(X, y)

print(model.intercept_)  # beta0
print(model.coef_)       # beta1
```
🖥 CLI Output
```
1.0
[1.]
```
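Continuing from the block above, the fitted model can be used for new predictions, which fall on the same line ŷ = x + 1:

```python
print(model.predict(np.array([[3], [4]])))  # [4. 5.]
```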
💡 Key Takeaways
- Linear regression models relationships
- RSS measures error
- Derivatives minimize error
- Gives best-fit line mathematically