⚡ Regression predicts a continuous numerical value: house price, sales forecast, patient recovery time, energy demand. Unlike classification (“which category?”), regression answers “how much?” or “how many?” Linear regression fits a straight line; more complex models capture curved relationships. It is one of the most widely used ML techniques across quantitative fields.
Category: Machine Learning · Difficulty: Beginner · Last updated: 15 May 2026 · 4 min read
Regression — What It Is, How AI Predicts Numbers & The Difference from Classification
What is Regression?
Not every prediction is a category. “Will this customer churn?” is classification — yes or no. “How much will this customer spend next month?” is regression — a number. “Is this tumour malignant?” is classification. “How large is this tumour in cubic centimetres?” is regression.
Regression is any ML task where the output is a continuous number rather than a discrete category. The model learns a mapping from input features to a numerical prediction. Linear regression learns the simplest possible mapping — a straight line. Polynomial regression fits curves. Gradient boosting and neural networks learn highly non-linear relationships between inputs and a numerical output.
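As a minimal sketch of the "straight line" case, the snippet below fits a one-feature linear regression by ordinary least squares using only the Python standard library. The function names (`fit_line`, `predict`) and the toy floor-area and price figures are hypothetical, chosen for illustration only.

```python
def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel out.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through the point (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(slope, intercept, x):
    """Apply the learned mapping from input feature to numerical output."""
    return slope * x + intercept


# Hypothetical toy data: floor area (square metres) vs. price (thousands).
areas = [50, 70, 90, 110, 130]
prices = [150, 210, 260, 330, 380]

slope, intercept = fit_line(areas, prices)
estimate = predict(slope, intercept, 100)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, estimate={estimate:.1f}")
```

On this toy data the fitted slope is about 2.9 and the intercept about 5, so a 100 square metre property is estimated at roughly 295. With more than one feature the same idea applies, but the coefficients are usually found with a linear algebra routine rather than by hand.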
Regression underpins many of the quantitative forecasts used in business: revenue projections, inventory planning, pricing models, risk scoring, demand forecasting, energy consumption prediction, and financial valuations.
TYPES OF REGRESSION
Linear regression — the output is a weighted sum of input features. Assumes the relationship is approximately linear. Interpretable, fast, and often surprisingly effective as a baseline.
Polynomial regression — extends linear regression with polynomial terms (x², x³) to capture curved relationships. Still interpretable but prone to overfitting with high-degree polynomials.
Ridge and Lasso regression — linear regression with regularisation. Ridge (L2) shrinks all coefficients toward zero. Lasso (L1) can set some coefficients to exactly zero (feature selection). Both reduce overfitting on high-dimensional data.
Non-linear regression — gradient boosting (XGBoost, LightGBM), random forests, and neural networks can learn highly non-linear relationships between inputs and continuous outputs. Many production regression models use one of these.
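To make the Ridge and Lasso distinction concrete, here is a sketch for the simplest possible case: one centred feature and one coefficient, where both penalised problems have closed-form solutions. The function names, the penalty strength `lam`, and the summary statistics are hypothetical; real libraries handle many features and use their own scaling of the penalty.

```python
def soft_threshold(value, threshold):
    """Move value toward zero by threshold; return 0.0 if it would cross zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def ols_slope(sxy, sxx):
    # Unpenalised least squares: minimises sum((y - b*x)^2).
    return sxy / sxx


def ridge_slope(sxy, sxx, lam):
    # Minimises sum((y - b*x)^2) + lam * b^2 for centred x and y.
    return sxy / (sxx + lam)


def lasso_slope(sxy, sxx, lam):
    # Minimises 0.5 * sum((y - b*x)^2) + lam * |b| for centred x and y.
    return soft_threshold(sxy, lam) / sxx


# Hypothetical summary statistics for a weakly predictive, centred feature:
# sxy = sum(x * y) and sxx = sum(x * x).
sxy, sxx = 3.0, 10.0

for lam in (0.0, 1.0, 5.0):
    print(
        f"lam={lam}: ols={ols_slope(sxy, sxx):.3f}, "
        f"ridge={ridge_slope(sxy, sxx, lam):.3f}, "
        f"lasso={lasso_slope(sxy, sxx, lam):.3f}"
    )
```

As the penalty grows, the Ridge coefficient keeps shrinking but stays non-zero, while the Lasso coefficient reaches exactly zero once the penalty exceeds the feature's correlation with the target. That is the mechanism behind Lasso's feature selection.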
EVALUATION METRICS
MSE (Mean Squared Error): average squared difference between predicted and actual values. Large errors are penalised heavily.
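RMSE (Root Mean Squared Error): square root of MSE, expressed in the same units as the prediction. A house price model with an RMSE of $25,000 has a typical error of roughly $25,000. RMSE weights large misses more heavily than a plain average does, so it is always at least as large as MAE.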
MAE (Mean Absolute Error): average absolute difference. More robust to outliers than MSE/RMSE.
R² (R-squared / Coefficient of Determination): proportion of variance in the target explained by the model. R²=0 means no better than always predicting the mean. R²=1 means perfect prediction. For a house price model, R²=0.85 means the model explains 85% of the variance in house prices. R² can be negative on held-out data when the model does worse than predicting the mean.
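The four metrics above can be computed in a few lines. This is a minimal sketch using only the standard library; the helper name `regression_metrics` and the sample values are hypothetical.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE and R² for paired actual and predicted values."""
    n = len(y_true)
    errors = [pred - actual for actual, pred in zip(y_true, y_pred)]

    mse = sum(e ** 2 for e in errors) / n   # mean squared error
    rmse = math.sqrt(mse)                   # same units as the target
    mae = sum(abs(e) for e in errors) / n   # mean absolute error

    # R² compares the model's squared error with that of a baseline
    # that always predicts the mean of the actual values.
    mean_true = sum(y_true) / n
    ss_res = sum(e ** 2 for e in errors)
    ss_tot = sum((actual - mean_true) ** 2 for actual in y_true)
    r2 = 1 - ss_res / ss_tot

    return {"mse": mse, "rmse": rmse, "mae": mae, "r2": r2}


# Hypothetical house prices in thousands: actual vs. predicted.
actual = [200, 300, 400, 500]
predicted = [210, 290, 420, 480]

metrics = regression_metrics(actual, predicted)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

For these values MSE is 250, RMSE is about 15.8, MAE is 15, and R² is 0.98. RMSE comes out slightly above MAE because the two larger misses count for more once they are squared.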
Real-world examples
Not theory — what real teams actually shipped using this technique.
- Zillow’s Zestimate — a regression model that predicts residential property values from hundreds of features (location, size, age, recent sales of similar properties). Used by millions of buyers and sellers as a valuation reference.
- Uber’s ETA prediction — regression models predict trip duration from origin, destination, time of day, traffic conditions, and driver behaviour history. Accuracy directly affects user experience and driver dispatching.
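- Google’s Ad auction — models predict click-through rate and conversion probability for each ad-query pair, which feed into ad ranking and pricing in real-time auctions. Because the targets are probabilities learned from binary click labels, this is usually framed as logistic regression, which sits on the boundary between regression and classification.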
Common pitfalls
- Extrapolation failure — regression models are unreliable outside the range of training data. A house price model trained on prices up to $2M may produce nonsensical predictions for a $10M property.
- Outlier sensitivity — MSE loss heavily penalises large errors, which can cause the model to fit outliers at the expense of typical cases. Use MAE or robust regression when outliers are prevalent.
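- Feature scaling — distance-based and gradient-based regression algorithms (k-nearest neighbours, neural networks, linear models trained by gradient descent) and regularised models such as Ridge and Lasso are sensitive to feature scale; tree-based models are largely unaffected. Standardise or normalise features before training, fitting the scaler on the training set only, so that no single feature dominates by magnitude.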
- Confusing correlation with causation — regression finds correlations that predict the target. High correlation between ice cream sales and drowning rates does not mean ice cream causes drowning (both rise in summer). Regression identifies associations; causal inference requires additional design.
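The outlier pitfall is easiest to see with the simplest possible model, one that predicts a single constant for every input. Under squared error the best constant is the mean; under absolute error it is the median. The sketch below uses hypothetical prices and helper names to show how one extreme value drags the squared-error solution away from the typical cases.

```python
import statistics


def mse(values, guess):
    """Mean squared error of predicting the same guess for every value."""
    return sum((v - guess) ** 2 for v in values) / len(values)


def mae(values, guess):
    """Mean absolute error of predicting the same guess for every value."""
    return sum(abs(v - guess) for v in values) / len(values)


# Hypothetical house prices in thousands; the last value is an outlier.
prices = [200, 210, 220, 230, 2000]

mean_guess = statistics.mean(prices)      # minimises MSE
median_guess = statistics.median(prices)  # minimises MAE

print(f"mean guess:   {mean_guess}  (MSE={mse(prices, mean_guess):.0f}, "
      f"MAE={mae(prices, mean_guess):.1f})")
print(f"median guess: {median_guess}  (MSE={mse(prices, median_guess):.0f}, "
      f"MAE={mae(prices, median_guess):.1f})")
```

Here the squared-error optimum is 572, far above four of the five prices, while the absolute-error optimum is 220. A full regression model trained with MSE loss is pulled toward outliers in the same way, which is why MAE or a robust loss is often preferred when outliers are common.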
Frequently asked questions
QUESTION 1 What is regression in machine learning?
ANSWER 1 Predicting a continuous number — how much, how many, what value. The output is a number, not a category.
QUESTION 2 What is the difference between regression and classification?
ANSWER 2 Regression predicts a continuous number (house price: $423,000). Classification predicts a category (cheap / mid / expensive). Same data, different output type.
QUESTION 3 What is linear regression?
ANSWER 3 Fits the best straight line through data points — output is a weighted sum of input features. Simple, interpretable, surprisingly effective when relationships are approximately linear.
QUESTION 4 How do you evaluate regression models?
ANSWER 4 MSE (penalises large errors heavily), RMSE (same units as the target), MAE (more robust to outliers), and R² (proportion of variance explained). Report several together, since each highlights a different aspect of the errors.
Sources & further reading
- Hastie, Tibshirani & Friedman (2009). The Elements of Statistical Learning (2nd ed.). Springer. Available free at: web.stanford.edu/~hastie/ElemStatLearn/ — comprehensive treatment of regression and ML.
- James et al. (2023). An Introduction to Statistical Learning. Available free at: statlearning.com — accessible introduction with R and Python code.
- Bishop (2006). Pattern Recognition and Machine Learning. Springer — Chapter 3 covers linear regression thoroughly.
- Scikit-learn documentation: scikit-learn.org/stable/supervised_learning.html — practical implementations with examples.