🌳 Cost Complexity Pruning (CCP)
Cost Complexity Pruning (CCP) is a pruning technique used in decision tree algorithms to reduce overfitting by balancing model accuracy with tree simplicity.
❓ What Is Cost Complexity Pruning?
CCP introduces a penalty for tree complexity using a parameter called alpha (α). Larger trees are penalized more heavily, encouraging simpler models.
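In the standard CART formulation, the penalty is usually written as below. The symbols are the conventional ones rather than anything defined in this post: $R(T)$ is the error of tree $T$ (for example its misclassification rate on the training data) and $|\tilde{T}|$ is its number of leaves.

```latex
R_\alpha(T) = R(T) + \alpha \, |\tilde{T}|
```

With $\alpha = 0$ the penalty vanishes and the full tree has the lowest cost. As $\alpha$ grows, each extra leaf has to reduce the error by at least $\alpha$ to be worth keeping.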
⚙️ How CCP Works
The decision tree is grown to its maximum depth, often resulting in overfitting.
Each candidate subtree is evaluated using the cost complexity function: its error plus α times its number of leaves.
The value of α sets the trade-off:
- Low α → larger tree, higher training accuracy
- High α → smaller tree, stronger pruning
The subtree with the lowest cost complexity is selected as the final model.
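The selection step above can be sketched in a few lines of Python. This is only an illustration: the candidate subtrees, their error counts, and the helper names `cost_complexity` and `select_subtree` are hypothetical, and the cost is assumed to be the usual "error plus α times number of leaves".

```python
# Hypothetical candidate subtrees: (name, misclassified samples, number of leaves).
# These numbers are invented for illustration only.
CANDIDATES = [
    ("full tree", 0, 8),
    ("medium subtree", 3, 4),
    ("stump", 10, 2),
]


def cost_complexity(error, n_leaves, alpha):
    """Return the cost complexity: error + alpha * number of leaves."""
    return error + alpha * n_leaves


def select_subtree(candidates, alpha):
    """Pick the candidate with the lowest cost complexity for a given alpha."""
    return min(candidates, key=lambda c: cost_complexity(c[1], c[2], alpha))


# Sweep alpha to see how the selected subtree changes.
for alpha in (0.0, 1.0, 5.0):
    name, error, n_leaves = select_subtree(CANDIDATES, alpha)
    cost = cost_complexity(error, n_leaves, alpha)
    print(f"alpha={alpha}: {name} (cost {cost})")
```

With these made-up numbers, α = 0 keeps the full tree, α = 1 picks the medium subtree, and α = 5 prunes down to the stump, which matches the low-α and high-α behavior described above.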
🍎 CCP Example: Fruit Classification
Dataset: Apples, Bananas, Cherries
Features: Weight, Color
Subtree A has the lowest cost complexity (11) and is selected as the final model.
✅ Final Pruned Tree
The pruned tree is:
- Simpler
- Less prone to overfitting
- More generalizable to unseen data
Key takeaways:
- CCP balances accuracy and complexity
- Alpha (α) controls pruning strength
- The lowest cost complexity is not the same as the lowest error
- Simpler trees often generalize better
- Used in CART-based decision trees
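For CART-style trees in practice, scikit-learn exposes this technique through the `ccp_alpha` parameter and the `cost_complexity_pruning_path` method of `DecisionTreeClassifier`. The sketch below assumes scikit-learn is installed and uses the built-in Iris dataset as a stand-in, since the fruit data from the example is not available here.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Iris is a stand-in dataset (an assumption, not the fruit data from the post).
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Grow an unpruned tree, then compute the alphas at which subtrees get pruned.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)

# Refit one tree per candidate alpha and record its size and test accuracy.
leaf_counts = []
for alpha in path.ccp_alphas:
    alpha = max(alpha, 0.0)  # guard against tiny negative values from rounding
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha)
    tree.fit(X_train, y_train)
    leaf_counts.append(tree.get_n_leaves())
    print(f"alpha={alpha:.4f}  leaves={leaf_counts[-1]}  "
          f"test accuracy={tree.score(X_test, y_test):.3f}")
```

A common way to choose among these candidates is to pick the α with the best held-out or cross-validated accuracy, rather than the one with the best training accuracy.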