Wednesday, September 25, 2024

Finding the Right Number of Neighbors in K-Nearest Neighbors (KNN)

KNN Explained – How to Choose the Best K Value

K-Nearest Neighbors (KNN) – How to Choose the Right K

The value of K strongly affects how well a KNN model generalizes. If K is too small, the model overfits to noise in the training data. If K is too large, it smooths over real structure and underfits.


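What is KNN?

KNN is a simple algorithm that predicts the label of a data point from the labels of its K nearest neighbors in the training data.

It doesn't fit an explicit model during training; it stores the training data and does the work at prediction time.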

Math Behind KNN (Simple)

1. Distance Calculation

\[ d = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2} \]

This is the Euclidean distance between two points x and y, each with n features.

It measures the straight-line distance between the two points in feature space.
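The distance formula above can be sketched in a few lines of plain Python. The function name `euclidean_distance` is only illustrative and not part of any library.

```python
import math

def euclidean_distance(x, y):
    """Return the Euclidean distance between two equal-length points."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of features")
    # Sum the squared per-feature differences, then take the square root.
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

print(euclidean_distance([0, 0], [3, 4]))  # 5.0
```

Because every feature contributes to the sum on its raw scale, features with large numeric ranges dominate the distance unless the data is scaled first.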

2. Prediction Rule

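For classification, the prediction is the most common label among the K nearest neighbors, where y_i is the label of the i-th neighbor:

\[ y = \text{majority}(y_1, \dots, y_K) \]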

For regression:

\[ y = \frac{1}{K} \sum_{i=1}^{K} y_i \]
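Both prediction rules can be illustrated with a small from-scratch sketch. The helper `knn_predict` and the toy data are hypothetical, chosen only to make the majority vote and the mean easy to follow; ties in the vote are resolved in favor of the closest tied label, which is one possible convention among several.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k, regression=False):
    """Predict for one query point from its k nearest training points."""
    if not 1 <= k <= len(train_X):
        raise ValueError("k must be between 1 and the number of training points")
    # Pair each training point's distance to the query with its target.
    distances = [(math.dist(p, query), t) for p, t in zip(train_X, train_y)]
    # Sort by distance only and keep the targets of the k closest points.
    nearest = [t for _, t in sorted(distances, key=lambda d: d[0])[:k]]
    if regression:
        return sum(nearest) / k  # mean of the neighbor values
    return Counter(nearest).most_common(1)[0][0]  # majority label

# Toy data (illustrative only): two well-separated clusters.
train_X = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 7.0), (6.5, 8.0)]
labels = ["A", "A", "A", "B", "B", "B"]
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

print(knn_predict(train_X, labels, (2.0, 2.0), k=3))                   # A
print(knn_predict(train_X, values, (2.0, 2.0), k=3, regression=True))  # 2.0
```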


Role of K

K Value | Effect
Small K | High variance (overfitting)
Large K | High bias (underfitting)

The goal is a K that balances the two.

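Factors to Consider

  • Dataset size: with few samples, a large K pulls in neighbors that are far from the query point
  • Data distribution: noisy or overlapping classes usually benefit from a larger K, while imbalanced classes can make a large K favor the majority class
  • Number of features: distances become less informative as dimensionality grows, so scale the features and consider reducing dimensions
  • Problem type: for binary classification, an odd K avoids tied votes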

Methods to Find Optimal K

1. Cross Validation

Evaluate each candidate K with k-fold cross-validation and keep the K with the best average validation score.

2. Elbow Method

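\[ \text{Error}(K) = 1 - \text{Accuracy}(K) \]

Plot the validation error against K and look for the "elbow point", where increasing K further no longer reduces the error noticeably.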
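Reading an elbow off a plot is subjective. One simple heuristic that approximates it, offered here as an assumption rather than as the elbow method itself, is to take the smallest K whose error is within a small tolerance of the lowest error. The function name `pick_k_near_best` and the default tolerance of 0.01 are illustrative; the accuracies are the sample values shown in this post's output.

```python
def pick_k_near_best(accuracy_by_k, tolerance=0.01):
    """Return the smallest K whose error is within `tolerance` of the lowest error."""
    # Convert accuracy to error rate for each candidate K.
    error_by_k = {k: 1.0 - acc for k, acc in accuracy_by_k.items()}
    best_error = min(error_by_k.values())
    # Smallest K that is "good enough" under the chosen tolerance.
    return min(k for k, err in error_by_k.items() if err <= best_error + tolerance)

sample_accuracy = {1: 0.91, 5: 0.95, 10: 0.94}
print(pick_k_near_best(sample_accuracy))  # 5
```

A larger tolerance makes the rule more willing to accept a slightly worse error in exchange for a smaller K.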

3. Grid Search

Define a set of candidate K values and evaluate every one of them, typically with cross-validation, then keep the best.
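In scikit-learn, a grid search over K can be sketched with `GridSearchCV`. The Iris dataset is used here only as stand-in data, and the candidate range 1 to 19 is an assumption that should be adapted to the dataset at hand.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier

# Example data only; swap in your own feature matrix and labels.
X, y = load_iris(return_X_y=True)

# Candidate K values to try (assumed range, adjust to your data).
param_grid = {"n_neighbors": list(range(1, 20))}

search = GridSearchCV(
    KNeighborsClassifier(),
    param_grid,
    cv=5,                # 5-fold cross-validation for every candidate
    scoring="accuracy",
)
search.fit(X, y)

best_k = search.best_params_["n_neighbors"]
print("Best K:", best_k)
print("Mean CV accuracy:", round(search.best_score_, 3))
```

The same grid can include other settings, such as `weights` or `metric`, if K is not the only hyperparameter being tuned.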


Code Example

from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

# X (feature matrix) and y (labels) are assumed to be loaded already.
k_values = range(1, 20)
scores = []
for k in k_values:
    model = KNeighborsClassifier(n_neighbors=k)
    # Mean accuracy over 5 cross-validation folds for this K
    score = cross_val_score(model, X, y, cv=5).mean()
    scores.append(score)

print(scores)

CLI Output

K=1  → Accuracy: 0.91
K=5  → Accuracy: 0.95
K=10 → Accuracy: 0.94

Best K = 5
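Going from a list of scores to the best K takes one more step. A minimal sketch, using the three sample values from the output above in place of a full `k_values` and `scores` run:

```python
# Sample values copied from the output above; in practice, use the
# k_values and scores built in the cross-validation loop.
k_values = [1, 5, 10]
scores = [0.91, 0.95, 0.94]

# index() returns the first maximum, so ties go to the smaller K.
best_index = scores.index(max(scores))
best_k = k_values[best_index]
print(f"Best K = {best_k}")  # Best K = 5
```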

Key Takeaways

  • K controls model complexity
  • Small K → overfitting
  • Large K → underfitting
  • Use validation to find best K

Final Thought

Choosing K is not guesswork; it is experimentation backed by math.

Once you understand the balance between bias and variance, KNN becomes a powerful and intuitive tool.
