Thursday, September 19, 2024

Euclidean Distance in KNN Explained with Simple Examples

K-Nearest Neighbors (KNN) & Euclidean Distance

📖 Introduction

K-Nearest Neighbors (KNN) is a simple supervised learning algorithm: it classifies a new data point by looking at the classes of the training points most similar to it.

💡 Core Idea: Nearby points tend to belong to the same class.

๐Ÿ“ Euclidean Distance

Formula:

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \]

This represents the straight-line distance between two points.
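As a quick sketch, the formula maps directly to Python. The points (1, 2) and (4, 6) here are just an illustration, chosen to make a 3-4-5 right triangle; they are not part of the worked example below.

```python
import math

# Straight-line distance between two illustrative points (1, 2) and (4, 6).
x1, y1 = 1, 2
x2, y2 = 4, 6

d = math.sqrt((x2 - x1) ** 2 + (y2 - y1) ** 2)
print(d)  # 5.0, since the differences form a 3-4-5 right triangle

# math.hypot computes the same quantity directly from the differences.
assert d == math.hypot(x2 - x1, y2 - y1)
```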

๐Ÿ“ Mathematical Deep Dive

This comes from the Pythagorean theorem:

\[ a^2 + b^2 = c^2 \]

  • a = horizontal difference (x2 - x1)
  • b = vertical difference (y2 - y1)
  • c = the distance d

\[ d = \sqrt{(x_2 - x_1)^2 + (y_2 - y_1)^2} \]

🔽 Why squaring?

Squaring makes each coordinate difference non-negative, so it doesn't matter which point you subtract from which, and the resulting distance is never negative (it is zero only when the two points coincide).
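A two-line check makes this concrete: the signed difference depends on which point comes first, but its square does not. The coordinates 1 and 3 here are hypothetical.

```python
# Hypothetical coordinates: moving from x = 1 to x = 3, or back again.
dx_forward = 3 - 1   #  2
dx_backward = 1 - 3  # -2

# The squares agree, so distance(A, B) == distance(B, A) and is never negative.
print(dx_forward ** 2, dx_backward ** 2)  # 4 4
assert dx_forward ** 2 == dx_backward ** 2 == 4
```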

📊 Worked Example

Suppose we have three labeled points and one new, unlabeled point (the class labels are assumed here for illustration):

  • A (1, 2): Class 1
  • B (2, 3): Class 1
  • C (4, 5): Class 2
  • New point (3, 3): class unknown

Distances from the new point:

\[ d(A) = \sqrt{(3-1)^2 + (3-2)^2} = \sqrt{4 + 1} = \sqrt{5} \approx 2.24 \]

\[ d(B) = \sqrt{(3-2)^2 + (3-3)^2} = \sqrt{1 + 0} = 1 \]

\[ d(C) = \sqrt{(3-4)^2 + (3-5)^2} = \sqrt{1 + 4} = \sqrt{5} \approx 2.24 \]

🤖 KNN Classification

Ranking the points by distance to (3, 3):

  • Closest: B (distance 1)
  • Next: A and C (tied at √5 ≈ 2.24)

Prediction: Class 1, the class of the nearest neighbor B.
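The classification step above can be sketched as a small voting function. The class labels (A and B as Class 1, C as Class 2) are assumptions for illustration, chosen to be consistent with the prediction:

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort the (point, label) pairs by Euclidean distance to the query.
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    # Count the class labels of the k closest points and take the majority.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]

# The worked example, with assumed labels: A and B are Class 1, C is Class 2.
train = [((1, 2), 1), ((2, 3), 1), ((4, 5), 2)]
print(knn_predict(train, (3, 3), k=3))  # 1
```

With k = 3 all three points vote, and Class 1 wins 2 to 1; with k = 1 only B votes, giving the same answer.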

📦 Higher Dimensions

\[ d = \sqrt{\sum_{i=1}^{n}(x_i - y_i)^2} \]

This allows KNN to work with multiple features.
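For example, with four features each coordinate simply contributes one more squared term. The vectors here are made up for illustration:

```python
import numpy as np

# Two illustrative points with four features each.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])

# Same formula: square the differences, sum them, take the square root.
d = np.sqrt(np.sum((x - y) ** 2))
print(d)  # sqrt(1 + 4 + 9 + 16) = sqrt(30), about 5.477

# np.linalg.norm of the difference vector is the same quantity.
assert np.isclose(d, np.linalg.norm(x - y))
```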

💻 Python Example

Code

import numpy as np

def distance(p1, p2):
    """Euclidean distance between two points with any number of coordinates."""
    return np.sqrt(np.sum((np.array(p1) - np.array(p2)) ** 2))

print(distance([3,3],[1,2]))
print(distance([3,3],[2,3]))
print(distance([3,3],[4,5]))

Output

$ python knn.py
2.23606797749979
1.0
2.23606797749979

🔽 Explanation

The function subtracts the coordinates, squares each difference, sums the squares, and takes the square root: exactly the formula above. The printed values match the worked example (√5 ≈ 2.24 and 1).
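In practice you would rarely hand-roll KNN. A minimal sketch with scikit-learn's KNeighborsClassifier, assuming scikit-learn is installed and reusing the assumed labels (A and B as Class 1, C as Class 2):

```python
from sklearn.neighbors import KNeighborsClassifier

# Training points A, B, C with assumed class labels.
X = [[1, 2], [2, 3], [4, 5]]
y = [1, 1, 2]

# Euclidean distance is the default metric (Minkowski with p=2).
clf = KNeighborsClassifier(n_neighbors=3)
clf.fit(X, y)

print(clf.predict([[3, 3]]))  # [1]
```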

🎯 Key Takeaways

  • Euclidean distance is a similarity measure: smaller distance means more similar points
  • KNN classifies a new point by a majority vote among its k nearest neighbors
  • The same formula works in any number of dimensions, so KNN can use many features
  • It all rests on the Pythagorean theorem

📘 Final Thoughts

Euclidean distance turns raw feature values into measurable relationships between points, and that notion of closeness is the entire basis of KNN.
