Pasting in Machine Learning (Simple & Clear Guide)
📑 Table of Contents
- What is Pasting?
- Core Idea (Simple)
- How Pasting Works
- Why It Works
- When to Use
- When NOT to Use
- Pasting vs Bagging vs Boosting
- Code Example
- CLI Output
- Key Takeaways
- Related Articles
📌 What is Pasting?
Pasting is an ensemble learning technique in which multiple models are trained on different subsets of the training data, sampled without replacement, and their predictions are combined.
🧠 Core Idea (Simple)
Imagine this:
You ask 5 people to guess something. Each person sees different information.
- Each gives a different answer
- You take the average
- The average is usually closer to the truth than most individual guesses
⚙️ How Pasting Works
- Draw a different subset of the dataset for each model, sampling without replacement
- Train one model on each part
- Get predictions from all models
- Combine predictions (average or voting)
Important:
- Each model sees different data
- Within each subset, no data point is repeated (sampling without replacement)
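In scikit-learn, the steps above are handled by `BaggingClassifier` with `bootstrap=False`. A minimal sketch (the dataset and hyperparameters here are illustrative, not from the article):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

# Toy dataset (illustrative)
X, y = make_classification(n_samples=500, random_state=42)

# bootstrap=False -> each tree gets a subset drawn WITHOUT replacement (pasting)
pasting = BaggingClassifier(
    DecisionTreeClassifier(),
    n_estimators=10,
    max_samples=0.5,   # each tree sees 50% of the rows
    bootstrap=False,
    random_state=42,
)
pasting.fit(X, y)
print(pasting.score(X, y))
```

The ensemble combines the 10 trees' predictions by voting automatically when you call `predict`.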
❓ Why Pasting Works
Single models can make mistakes.
But when multiple models:
- See different data
- Learn different patterns
Their errors are partly independent, so averaging cancels many of them out.
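A quick way to see this cancellation effect with plain NumPy, simulating five "models" that each estimate the same number with independent noise (the noise level and model count are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 10.0

# 5 noisy "models", each guessing true_value with independent error;
# 100,000 repetitions so the average error is measured reliably
estimates = true_value + rng.normal(0.0, 2.0, size=(100_000, 5))

# Error of one model alone vs. error of the 5-model average
single_error = np.abs(estimates[:, 0] - true_value).mean()
ensemble_error = np.abs(estimates.mean(axis=1) - true_value).mean()

print(single_error, ensemble_error)
```

The averaged estimate lands much closer to the true value than any single noisy guess, which is exactly why combining models trained on different data helps.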
✅ When to Use Pasting
- Large dataset available
- Model has high variance (unstable predictions)
- Want simple ensemble method
❌ When NOT to Use
- Small dataset (data gets divided too much)
- You need the highest possible accuracy (boosting usually performs better)
- Limited computing power
⚖️ Pasting vs Bagging vs Boosting
- Pasting: each model's subset is sampled without replacement (no repeated rows)
- Bagging: each model's subset is a bootstrap sample, drawn with replacement (rows can repeat)
- Boosting: Models learn from mistakes step-by-step
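In scikit-learn the three mainly differ in how data reaches each model. A hedged sketch of all three side by side (dataset and hyperparameters are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, random_state=0)

# Pasting: subsets drawn WITHOUT replacement
pasting = BaggingClassifier(DecisionTreeClassifier(), n_estimators=20,
                            max_samples=0.6, bootstrap=False, random_state=0)

# Bagging: bootstrap subsets drawn WITH replacement
bagging = BaggingClassifier(DecisionTreeClassifier(), n_estimators=20,
                            max_samples=0.6, bootstrap=True, random_state=0)

# Boosting: models trained one after another, each focusing on
# the examples the previous ones got wrong
boosting = AdaBoostClassifier(n_estimators=20, random_state=0)

for name, model in [("pasting", pasting), ("bagging", bagging), ("boosting", boosting)]:
    print(name, model.fit(X, y).score(X, y))
```

Note that pasting and bagging differ by a single flag (`bootstrap`), while boosting is a fundamentally different, sequential procedure.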
💻 Code Example
```python
from sklearn.tree import DecisionTreeClassifier
import numpy as np

# Sample data: two well-separated classes
X = np.array([[1], [2], [3], [10], [11], [12]])
y = np.array([0, 0, 0, 1, 1, 1])

# Split manually into two disjoint subsets (pasting)
X1, y1 = X[:3], y[:3]
X2, y2 = X[3:], y[3:]

model1 = DecisionTreeClassifier().fit(X1, y1)
model2 = DecisionTreeClassifier().fit(X2, y2)

# Each model predicts for a new point, then we average
# (toy example: each subset here contains only one class)
pred1 = model1.predict([[5]])
pred2 = model2.predict([[5]])
final = (pred1 + pred2) / 2
print(final)
```
🖥️ CLI Output
[0.5]
Interpretation:
- 0 → class 0
- 1 → class 1
- 0.5 → uncertain (average result)
🎯 Key Takeaways
- Pasting trains many models on different subsets sampled without replacement
- Combining their predictions (averaging or voting) reduces variance
- It works best with large datasets and high-variance models such as decision trees
- In scikit-learn, pasting is simply bagging with bootstrap=False
🔗 Related Articles
🚀 Final Thought
Pasting is simple but powerful. It shows an important lesson in machine learning: multiple simple models together can outperform one complex model.