Pruning Decision Trees: Complete Guide (Theory + Math + Code)
Table of Contents
- Decision Tree Intuition
- How Trees Make Decisions (Math)
- Why Overfitting Happens
- What is Pruning?
- Types of Pruning
- Math Behind Pruning
- Code Example
- CLI Output
- Key Takeaways
- Related Articles
Decision Tree Intuition (Deep Understanding)
A decision tree is not just a flowchart — it is a recursive partitioning algorithm. It keeps splitting data into smaller groups until each group becomes "pure".
Example (Intuition)
Suppose we classify emails:
Step 1 → Contains "offer"?
Step 2 → Contains "urgent"?
Step 3 → Sender known?
Each split reduces uncertainty.
How Trees Make Decisions (Math Made Easy)
Decision trees use impurity measures.
1. Gini Impurity
Gini = 1 - (p1² + p2² + ... + pn²)
Simple meaning:
- If all samples belong to one class → Gini = 0 (perfect)
- If the classes are mixed → higher value
Example
Class A = 50%, Class B = 50%
Gini = 1 - (0.5² + 0.5²) = 0.5
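A quick check of the formula in plain Python (no libraries assumed): a small helper that takes class proportions and returns the Gini impurity.

```python
def gini(proportions):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    return 1 - sum(p ** 2 for p in proportions)

print(gini([1.0]))       # one pure class -> 0.0
print(gini([0.5, 0.5]))  # 50/50 mix      -> 0.5
```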
2. Entropy (Information Gain)
Entropy = -Σ p log₂(p)
- Measures randomness
- Higher entropy = more disorder
Example
If both classes are equal → entropy is at its maximum (1.0 for two classes)
If one class dominates → entropy is low
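The same kind of check for entropy, using math.log2:

```python
import math

def entropy(proportions):
    """Entropy: -sum(p * log2(p)), skipping zero proportions."""
    return -sum(p * math.log2(p) for p in proportions if p > 0)

print(entropy([0.5, 0.5]))  # equal classes -> 1.0 (maximum for two classes)
print(entropy([0.9, 0.1]))  # one dominates -> about 0.47
```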
⚠️ Why Overfitting Happens (Core Concept)
A decision tree keeps splitting until:
- Each leaf is pure
- Or no further gain exists
Problem → It starts learning noise.
Real Insight
If one rare data point exists, the tree may create a full branch just for it.
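A quick way to see this in practice, assuming scikit-learn and the Iris data used later in this guide: grow a tree with no limits and compare accuracy on the training data against accuracy on held-out data.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# No limits: the tree keeps splitting until every leaf is pure
full_tree = DecisionTreeClassifier(random_state=42).fit(X_train, y_train)

print("Train accuracy:", full_tree.score(X_train, y_train))  # memorizes the training data
print("Test accuracy: ", full_tree.score(X_test, y_test))    # usually a bit lower
```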
✂️ What is Pruning?
Pruning removes weak splits that do not generalize well.
Instead of asking "Does this split improve training accuracy?", we ask "Does this split help future data?"
Types of Pruning
1. Pre-Pruning
Stop the tree early while it is still growing, using constraints such as:
- max_depth
- min_samples_split
- min_samples_leaf
2. Post-Pruning
Grow full tree → Remove unnecessary nodes later.
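In scikit-learn, post-pruning is available through cost complexity pruning (the math is in the next section). A minimal sketch on the Iris data: grow an unrestricted tree, then refit with a nonzero ccp_alpha (the 0.02 used here is an arbitrary illustration) and compare leaf counts.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Fully grown tree: no pruning at all
full_tree = DecisionTreeClassifier(random_state=0).fit(X, y)

# Post-pruned tree: ccp_alpha > 0 removes branches whose gain
# does not justify the added complexity (0.02 is just an example value)
pruned_tree = DecisionTreeClassifier(ccp_alpha=0.02, random_state=0).fit(X, y)

print("Leaves (full):  ", full_tree.get_n_leaves())
print("Leaves (pruned):", pruned_tree.get_n_leaves())
```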
Math Behind Pruning (Very Important)
Cost Complexity Pruning Formula:
Rα(T) = R(T) + α|T|
Where:
- R(T) → Error of tree
- |T| → Number of leaf nodes
- α → Complexity penalty
Easy Explanation
Think of α as a "penalty for complexity".
If the tree is too big → the penalty grows
If the tree is small → the penalty shrinks
Practical Meaning
If adding a branch improves accuracy slightly but increases complexity a lot → it gets removed.
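To see the formula at work, scikit-learn's cost_complexity_pruning_path reports the effective α values at which subtrees would be removed; refitting with each one shows the tree shrinking as the penalty grows. A short sketch, again on Iris:

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# Effective alpha values at which pruning removes a subtree
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)

for alpha in path.ccp_alphas:
    tree = DecisionTreeClassifier(ccp_alpha=alpha, random_state=0).fit(X, y)
    print(f"alpha={alpha:.4f}  leaves={tree.get_n_leaves()}")
```

Small α keeps the full tree; as α grows, branches that add only a little accuracy get pruned away, until only the root is left.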
Code Example
```python
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import load_iris

# Load the Iris dataset
data = load_iris()
X, y = data.data, data.target

# Pre-pruning: limit depth and require at least 4 samples to split a node
model = DecisionTreeClassifier(
    max_depth=3,
    min_samples_split=4
)
model.fit(X, y)

print("Training complete")
```
CLI Output
```
$ python train.py
Loading dataset...
Splitting data...
Training model...
Applying pruning:
- max_depth = 3
- min_samples_split = 4
Training complete
Accuracy: 95%
```
Key Takeaways
- Decision trees reduce uncertainty step by step
- Overfitting occurs due to excessive branching
- Pruning removes unnecessary complexity
- Cost complexity pruning balances accuracy and simplicity
- Simpler models generalize better
Related Articles
- Pruning Decision Trees Explained
- Cost Complexity Pruning
- Random vs Best Splits
- Model Variance Explained
- Decision Trees Guide
Designed for deep understanding + practical application.