
Monday, September 16, 2024

Decision Tree Splitting Criteria Explained: MSE, Friedman MSE, Poisson, and MAE


When building decision trees in machine learning, the criterion determines how the tree decides the best split at each step. Choosing the right one has a direct impact on model accuracy, robustness, and performance.

This guide explains MSE, Friedman MSE, Poisson, and MAE in simple terms — and when to use each.

What Is a Decision Tree?

A decision tree works like a flowchart. Each internal node asks a question, each branch represents an answer, and each leaf produces a final prediction.

The criterion defines how the tree evaluates possible splits and chooses the one that best separates the data.

1. Mean Squared Error (MSE)

📌 When to use

Regression problems with continuous targets.

๐Ÿ” What it is

MSE measures how far predictions are from actual values by squaring the errors. Larger errors are penalized more heavily.

⚙️ How it works

At each split, the tree chooses the point that minimizes the average squared error within the resulting child nodes.

Example use cases:
  • House price prediction
  • Stock price forecasting
  • Sales or temperature prediction
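
The split search described above can be sketched in a few lines of plain Python (a toy illustration of the idea, not any library's actual implementation; the function names `node_mse` and `best_split_mse` are made up for this example):

```python
def node_mse(values):
    # MSE impurity of a node: variance of the target values around
    # the node mean (the prediction an MSE tree would make here).
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split_mse(x, y):
    # Scan midpoints between consecutive sorted feature values and
    # return the threshold that minimizes the size-weighted child MSE.
    pairs = sorted(zip(x, y))
    best_t, best_score = None, float("inf")
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between equal feature values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [v for f, v in pairs if f <= t]
        right = [v for f, v in pairs if f > t]
        score = (len(left) * node_mse(left)
                 + len(right) * node_mse(right)) / len(pairs)
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Two clusters of targets: the best threshold lands between them (6.5).
threshold, score = best_split_mse([1, 2, 3, 10, 11, 12], [5, 6, 5, 20, 21, 20])
```

A real tree repeats this search over every feature at every node, which is why squared error's cheap incremental updates matter in practice.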

2. Friedman MSE

📌 When to use

Regression trees inside gradient boosting ensembles.

๐Ÿ” What it is

A modified version of MSE introduced by Jerome Friedman alongside gradient boosting. Rather than scoring a split only by child impurity, it scores how well the split separates the mean responses of the two child nodes.

⚙️ How it works

It rates each candidate split with Friedman's improvement score: the squared difference between the child-node means, weighted by the child sizes. For the shallow trees used inside boosting ensembles, this tends to pick slightly better splits and help the ensemble converge faster.

Example use cases:
  • Fraud detection
  • Customer churn prediction
  • High-performance ML pipelines
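
The improvement score described above can be sketched directly (a toy version of the formula from Friedman's 2001 gradient boosting paper; the function name `friedman_improvement` is made up for this example):

```python
def friedman_improvement(left, right):
    # Friedman's improvement score for a candidate split:
    # the squared gap between the child means, weighted by the
    # child sizes. Larger is better (unlike MSE, which is minimized).
    n_l, n_r = len(left), len(right)
    mean_l, mean_r = sum(left) / n_l, sum(right) / n_r
    return n_l * n_r * (mean_l - mean_r) ** 2 / (n_l + n_r)

# A split that cleanly separates low and high targets scores far
# higher than one that mixes the two groups.
clean = friedman_improvement([5, 6, 5], [20, 21, 20])
mixed = friedman_improvement([5, 6, 20], [5, 21, 20])
```

Note the score rewards separated child means directly, which is exactly the quantity a boosting stage wants each small tree to maximize.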

3. Poisson Criterion

📌 When to use

Count-based regression

๐Ÿ” What it is

Designed for predicting non-negative integer values. Assumes the target follows a Poisson distribution.

⚙️ How it works

Instead of squared error, it minimizes the Poisson deviance, a loss suited to event-count data. It requires non-negative targets with a positive mean in each node, and because leaf predictions are means of those targets, predictions stay non-negative.

Example use cases:
  • Website sign-ups per day
  • Traffic volume prediction
  • Call center demand forecasting
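
The node loss described above can be sketched as the mean Poisson deviance, with the node mean as the prediction (a toy illustration, not a library implementation; the name `poisson_deviance` is made up here, and the y = 0 convention follows the standard deviance definition):

```python
import math

def poisson_deviance(values):
    # Mean Poisson deviance of a node, using the node mean as the
    # prediction. By convention the y*log(y/mu) term is 0 when y == 0.
    # Requires non-negative counts with a positive node mean.
    mu = sum(values) / len(values)
    total = 0.0
    for y in values:
        if y > 0:
            total += y * math.log(y / mu) - y + mu
        else:
            total += mu
    return 2 * total / len(values)

# A pure node (all counts equal) has zero deviance; separating low
# and high counts lowers the size-weighted deviance of the children.
mixed_node = poisson_deviance([0, 1, 0, 8, 9, 8])
split_children = (3 * poisson_deviance([0, 1, 0])
                  + 3 * poisson_deviance([8, 9, 8])) / 6
```

A tree using this criterion picks the split that most reduces the weighted child deviance, just as an MSE tree does with variance.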

4. Mean Absolute Error (MAE)

📌 When to use

Regression with outliers

๐Ÿ” What it is

MAE measures the average absolute difference between predictions and true values. Unlike MSE, it does not heavily penalize large errors.

⚙️ How it works

Each error contributes linearly, so extreme values pull the split score far less than under MSE. Leaf predictions use the median rather than the mean, which is what makes MAE robust, though typically slower to compute.

Example use cases:
  • Income prediction
  • Robust pricing models
  • Median-based forecasting
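
The robustness claim above is easy to see side by side (a toy sketch; the names `mae_impurity` and `mse_impurity` are made up for this comparison):

```python
from statistics import median

def mae_impurity(values):
    # MAE impurity: mean absolute deviation around the median.
    # The median minimizes absolute error, so an MAE tree predicts
    # it in each leaf instead of the mean.
    m = median(values)
    return sum(abs(v - m) for v in values) / len(values)

def mse_impurity(values):
    # Plain MSE impurity, for comparison: the outlier below gets
    # squared, so it dominates this score.
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

# One outlier (100) inflates MSE quadratically but MAE only linearly.
with_outlier = [1, 2, 3, 100]
```

On `with_outlier`, the MSE impurity is driven almost entirely by the single extreme value, while the MAE impurity stays on the same scale as the typical errors.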

Choosing the Right Criterion

  • MSE → Standard regression, smooth optimization
  • Friedman MSE → Gradient boosting models
  • Poisson → Count data and event prediction
  • MAE → Robust regression with outliers

💡 Key Takeaway

  • The criterion controls how your tree learns
  • Match the criterion to your target data type
  • Wrong choice can reduce accuracy or stability
  • Simpler criteria often outperform complex ones when well-matched
