Q: How do you choose K in KNN?

Small K (K=1 or 3) captures fine-grained local structure but is sensitive to noise: a single mislabelled training point can flip a prediction. Large K produces smoother decision boundaries but may underfit, blurring real class boundaries and letting the majority class dominate. The usual way to pick K is to test several values on a validation set (or with cross-validation) and keep the one that scores best. Values between 3 and 15 are a common starting range, though the best K depends on the size and noise level of the dataset. Use odd K for binary classification to avoid tied votes.
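As a rough illustration of the validation approach, here is a minimal Python sketch. The helper names (knn_predict, validation_accuracy, choose_k), the candidate list, and the toy data with one deliberately mislabelled point are all assumptions made for this example, not part of any library.

```python
import math
from collections import Counter

def knn_predict(train, query, k):
    """Majority vote among the k training points closest to query."""
    nearest = sorted(train, key=lambda p: math.dist(p[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def validation_accuracy(train, validation, k):
    """Fraction of validation points that KNN labels correctly for this k."""
    hits = sum(knn_predict(train, x, k) == y for x, y in validation)
    return hits / len(validation)

def choose_k(train, validation, candidates=(1, 3, 5, 7, 9, 11, 13, 15)):
    """Return the candidate k with the highest validation accuracy.

    Candidates larger than the training set are skipped; ties go to the smaller k.
    """
    usable = [k for k in candidates if k <= len(train)]
    return max(usable, key=lambda k: (validation_accuracy(train, validation, k), -k))

# Hypothetical toy data: two clusters, plus one mislabelled point at (0.5, 0.5).
train_points = [
    ((0.0, 0.0), "a"), ((0.0, 1.0), "a"), ((1.0, 0.0), "a"), ((1.0, 1.0), "a"),
    ((0.5, 0.5), "b"),  # noise: sits inside the "a" cluster
    ((5.0, 5.0), "b"), ((5.0, 6.0), "b"), ((6.0, 5.0), "b"), ((6.0, 6.0), "b"),
]
validation_points = [
    ((0.6, 0.6), "a"), ((0.4, 0.5), "a"),
    ((5.5, 5.5), "b"), ((5.2, 5.1), "b"),
]

best_k = choose_k(train_points, validation_points)  # 3 on this toy data
```

On this data, K=1 is fooled by the mislabelled point, while K=9 (the whole training set) always predicts the majority label. The values in between avoid both failure modes.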

Q: Is KNN used in production today?

Yes, in specific contexts. Approximate nearest neighbour (ANN) search, the core of vector databases and semantic search systems, is essentially high-dimensional KNN made fast through specialised index structures such as HNSW graphs, available in libraries such as FAISS. Recommendation systems and image similarity search also use nearest-neighbour logic at scale with approximate methods.

KNN (K-Nearest Neighbours)

Q: What is KNN in simple terms?

KNN is one of the simplest approaches to classification: to predict what a new data point is, look at the K most similar points you have already seen and take a majority vote. If 4 of your 5 nearest neighbours are labelled 'cat', predict 'cat'. There is no complex training. KNN just memorises all examples and compares new ones to what it knows.

Q: What are the limitations of KNN?

Slow at prediction time: for each new point, brute-force KNN must compute its distance to every training point, which becomes prohibitively slow with millions of training examples. High memory use: all training data must be stored. Curse of dimensionality: distances become less informative as the feature count grows, so KNN tends to perform well in low dimensions and degrade in high-dimensional spaces.
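The curse of dimensionality can be seen in a small experiment. The sketch below assumes uniformly random points in a unit hypercube, and relative_contrast is a hypothetical helper written for this example. It measures how much farther the farthest point is than the nearest one, relative to the nearest distance. When that ratio shrinks towards zero, "nearest" carries little information.

```python
import math
import random

def relative_contrast(dims, n_points=200, seed=0):
    """(farthest - nearest) / nearest distance from a random query to random points."""
    rng = random.Random(seed)  # local generator so results are repeatable
    query = [rng.random() for _ in range(dims)]
    distances = [
        math.dist(query, [rng.random() for _ in range(dims)])
        for _ in range(n_points)
    ]
    return (max(distances) - min(distances)) / min(distances)

# Print the contrast for a few dimensionalities; it should fall as dims grows.
for dims in (2, 10, 100, 500):
    print(dims, round(relative_contrast(dims), 2))
```

Real datasets are rarely uniform, so the effect is often milder in practice, but the trend is the reason dimensionality reduction is commonly applied before KNN.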

⚡ KNN (K-Nearest Neighbours) classifies a new data point by finding the K most similar points in the training set and taking a majority vote of their labels. No training phase — just memorise all examples and compare new ones to what you have seen. One of the simplest ML algorithms and the conceptual foundation of modern vector search and recommendation systems.

Category: Machine Learning · Difficulty: Beginner · Last updated: 15 May 2026 · 4 min read

KNN — How the Simplest ML Algorithm Makes Predictions by Asking Its Neighbours

What is KNN?

Tell me who your friends are and I will tell you who you are. KNN is this principle made into an algorithm.

To classify a new data point, KNN asks: who are this point’s nearest neighbours in the training data? Find the K most similar training examples (the “nearest” in terms of distance in feature space). Look at their labels. Take a majority vote. That majority label is the prediction.

There is no training phase. No model is built. No weights are learned. KNN simply stores all training examples and uses them directly at prediction time. It is called a lazy learner because it does all its work at prediction time rather than at training time.

How KNN works

Store all training examples — each with its features and label.
A new unlabelled point arrives.
Calculate the distance from this new point to every training example (typically Euclidean distance).
Find the K training examples with the smallest distances — the K nearest neighbours.
For classification: take the majority vote of the K neighbours’ labels. That is the prediction.
For regression: take the mean of the K neighbours’ values. That is the prediction.
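The steps above can be sketched in a few lines of Python. This is a minimal brute-force version using only the standard library. The function names and the toy data are assumptions for illustration, and a production system would normally use an optimised library instead.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_nearest(train, query, k):
    """Return the k (features, target) pairs closest to query."""
    # Brute force: measure the distance to every stored example, then sort.
    return sorted(train, key=lambda pair: euclidean(pair[0], query))[:k]

def knn_classify(train, query, k):
    """Majority vote over the labels of the k nearest neighbours."""
    labels = [label for _, label in k_nearest(train, query, k)]
    return Counter(labels).most_common(1)[0][0]

def knn_regress(train, query, k):
    """Mean of the target values of the k nearest neighbours."""
    values = [value for _, value in k_nearest(train, query, k)]
    return sum(values) / len(values)

# Hypothetical toy data: (features, label) for classification.
animals = [
    ((1.0, 1.0), "cat"), ((1.2, 0.8), "cat"), ((0.9, 1.1), "cat"),
    ((5.0, 5.0), "dog"), ((5.2, 4.8), "dog"),
]
label = knn_classify(animals, (1.1, 1.0), k=3)  # "cat"

# Hypothetical toy data: (features, value) for regression.
prices = [((1.0,), 10.0), ((2.0,), 20.0), ((3.0,), 30.0), ((10.0,), 100.0)]
estimate = knn_regress(prices, (2.0,), k=3)  # 20.0
```

"Storing the training examples" is just keeping the list around. All the work happens inside k_nearest at prediction time, which is what makes KNN a lazy learner.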

Real-world examples

Not theory — what real teams actually shipped using this technique.

Netflix’s earliest recommendation system was essentially KNN: find the users most similar to you (nearest neighbours in the space of viewing history), see what they watched and liked, and recommend those titles. Neighbourhood-based collaborative filtering is still conceptually KNN at scale.
Medical diagnosis support: given a patient’s symptoms and test results, find the K most similar past patients in the records, look at their diagnoses — a simple KNN provides a baseline for diagnostic suggestion that is fully interpretable.
FAISS (Facebook AI Similarity Search) is a library of similarity-search indexes, and HNSW (Hierarchical Navigable Small World graphs) is a graph-based index for approximate nearest-neighbour search. Tools like these power modern vector databases, making semantic search fast enough for production by finding approximate nearest neighbours without comparing the query against every stored vector.
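To make the recommendation example concrete, here is a toy sketch of user-based nearest-neighbour recommendation. It is not how any real service is implemented. The Jaccard similarity measure, the function names, and the viewing data are all assumptions chosen to keep the example short.

```python
from collections import Counter

def jaccard(a, b):
    """Overlap between two sets of liked titles, from 0 (disjoint) to 1 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend(likes, user, k=2):
    """Titles liked by the k most similar users that `user` has not liked yet.

    Results are ordered by how many of those neighbours liked each title.
    """
    others = [u for u in likes if u != user]
    neighbours = sorted(
        others, key=lambda u: jaccard(likes[user], likes[u]), reverse=True
    )[:k]
    votes = Counter(
        title for u in neighbours for title in likes[u] - likes[user]
    )
    return [title for title, _ in votes.most_common()]

# Hypothetical viewing history: user -> set of liked titles.
likes = {
    "ana": {"A", "B", "C"},
    "ben": {"A", "B", "C", "D"},
    "cy": {"A", "B", "D", "E"},
    "dee": {"X", "Y"},
}
suggestions = recommend(likes, "ana")  # ["D", "E"]
```

Here the "feature space" is the set of titles each user liked, and the vote is taken over unseen titles rather than class labels.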

Common pitfalls

Slow inference at scale: brute-force KNN computes the distance to every training point for every prediction, which costs O(n) distance computations per query (O(n·d) arithmetic for n training points with d features). With millions of training points, this is impractical. Approximate nearest neighbour methods trade a little accuracy for much faster lookups.
Curse of dimensionality: in high dimensions, the distances from a point to its nearest and farthest neighbours tend to converge, so all points look approximately equidistant. With hundreds of weakly informative features, the concept of “nearest neighbour” loses much of its meaning. Dimensionality reduction or feature selection is usually needed before KNN in high-dimensional settings.
Feature scaling is critical: KNN is purely distance-based, so a feature on a 0-100,000 scale dominates a feature on a 0-1 scale. Normalise or standardise features before using KNN, using statistics computed on the training data.
Memory scales with training data: all training examples must be stored and searched. Unlike a trained neural network that compresses knowledge into weights, KNN has a memory footprint that grows linearly with dataset size.
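The scaling pitfall is easy to reproduce. The sketch below assumes min-max scaling fitted on the training rows, which is one common choice among several. The helper names and the two-feature toy data (an income-like column and a 0-1 ratio) are assumptions for illustration.

```python
import math

def fit_min_max(rows):
    """Learn per-feature (min, max) from the training rows only."""
    return [(min(col), max(col)) for col in zip(*rows)]

def apply_min_max(row, ranges):
    """Rescale one row per feature using the learned ranges."""
    return tuple(
        (value - lo) / (hi - lo) if hi > lo else 0.0
        for value, (lo, hi) in zip(row, ranges)
    )

def nearest_index(rows, query):
    """Index of the row with the smallest Euclidean distance to query."""
    return min(range(len(rows)), key=lambda i: math.dist(rows[i], query))

# Hypothetical rows: (feature on a large scale, feature on a 0-1 scale).
rows = [(30_000, 0.10), (31_000, 0.90), (90_000, 0.15)]
query = (30_900, 0.12)

# Unscaled: the large-scale feature decides everything, so row 1 wins.
raw_nearest = nearest_index(rows, query)  # 1

# Scaled: both features contribute, and row 0 becomes the nearest.
ranges = fit_min_max(rows)
scaled_rows = [apply_min_max(r, ranges) for r in rows]
scaled_nearest = nearest_index(scaled_rows, apply_min_max(query, ranges))  # 0
```

The query is rescaled with the ranges learned from the training rows, not with its own values, so training and prediction share the same feature space.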

Frequently asked questions

QUESTION 1 What is KNN in simple terms?

ANSWER 1 Find the K most similar examples you have seen and take a majority vote of their labels. There is no training: KNN just memorises all examples and compares at prediction time.

QUESTION 2 How do you choose K?

ANSWER 2 Test multiple values on a validation set. Small K: sensitive to noise. Large K: over-smooths and can underfit. Values of 3 to 15 are a common starting range. Use odd K for binary classification to avoid ties.

QUESTION 3 What are the limitations of KNN?

ANSWER 3 Slow at prediction time (scales with training set size), degrades in high dimensions (curse of dimensionality), and requires feature scaling.

QUESTION 4 Is KNN still used in production?

ANSWER 4 Yes — approximate nearest neighbour search (FAISS, HNSW) is essentially fast KNN powering vector databases, semantic search, and recommendation systems.
