Supervised learning is one of the most fundamental and widely used techniques in machine learning. It powers systems that classify emails, predict prices, help detect diseases, and much more.
This tutorial introduces the core concepts of supervised learning, its types, practical examples, and a basic Python implementation. Whether you're a beginner starting out or a professional looking to refresh your knowledge, this guide will provide a clear understanding of the topic.
Supervised learning is a type of machine learning where a model is trained using a labeled dataset. Each input (described by one or more features) is paired with a known output (also called a label or target). The model learns the relationship between the inputs and the outputs so it can make predictions on new, unseen data.
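To make the idea of inputs, labels, and prediction concrete, here is a minimal sketch. The dataset (hours studied and hours slept versus pass/fail) is made up for illustration, and the choice of a k-nearest-neighbors classifier is an assumption; any scikit-learn estimator follows the same `fit` then `predict` pattern.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical labeled dataset: each row is one input with two features
# (hours studied, hours slept); each label is the known output
# (1 = passed the exam, 0 = failed).
X = np.array([[1, 4], [2, 5], [3, 6], [7, 7], [8, 6], [9, 8]])
y = np.array([0, 0, 0, 1, 1, 1])

# Training: the model stores the labeled examples and uses them
# to relate inputs to outputs.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X, y)

# Prediction: estimate the label for a new input the model has not seen.
new_input = np.array([[8, 7]])
prediction = model.predict(new_input)
print("Predicted label:", prediction[0])
```

The three training points closest to the new input all carry the label 1, so the model predicts 1 for it.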
| Term | Description |
|---|---|
| Labeled Data | Dataset where each input has a known output |
| Training | Teaching the model to find patterns in the data |
| Prediction | Estimating the output for new, unseen data |
| Evaluation | Measuring model performance using metrics |
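As a rough illustration of the evaluation step, the sketch below computes one common metric for each kind of task using `sklearn.metrics`. The true values and predictions are invented numbers, not results from a real model, and the choice of mean absolute error and accuracy is only one reasonable option among many.

```python
from sklearn.metrics import accuracy_score, mean_absolute_error

# Regression: hypothetical true values and model predictions.
y_true_reg = [3.0, 5.0, 7.0]
y_pred_reg = [2.5, 5.0, 8.0]
# Average of the absolute errors: (0.5 + 0.0 + 1.0) / 3
mae = mean_absolute_error(y_true_reg, y_pred_reg)

# Classification: hypothetical true labels and predicted labels.
y_true_cls = [1, 0, 1, 1]
y_pred_cls = [1, 0, 0, 1]
# Fraction of labels predicted correctly: 3 out of 4
acc = accuracy_score(y_true_cls, y_pred_cls)

print(f"Mean absolute error: {mae:.2f}")
print(f"Accuracy: {acc:.2f}")
```

A lower mean absolute error and a higher accuracy both indicate better performance, but they are not interchangeable: each metric only makes sense for its own kind of output.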
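Supervised learning can be broadly classified into two categories: regression, which predicts a continuous numeric value, and classification, which predicts a discrete category. The table below lists common applications of each: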
| Application | Type | Example |
|---|---|---|
| House Price Prediction | Regression | Predicting prices based on location and features |
| Email Spam Detection | Classification | Classifying messages as spam or not spam |
| Credit Risk Assessment | Classification | Predicting whether a customer will default on a loan |
| Temperature Forecasting | Regression | Predicting tomorrow’s temperature based on weather data |
| Disease Diagnosis | Classification | Predicting whether a patient has a certain disease |
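For the classification rows in the table, a spam detector might be sketched as follows. This is only an illustration under stated assumptions: the two features (number of links and number of exclamation marks per message) and all data values are hypothetical, and logistic regression is just one of several classifiers that could be used here.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical features per message: [number of links, number of exclamation marks]
X = np.array([
    [0, 0], [1, 0], [0, 1], [1, 1],   # ordinary messages
    [5, 4], [6, 5], [4, 6], [7, 5],   # spam messages
])
# Labels: 0 = not spam, 1 = spam
y = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# Train a classifier on the labeled messages.
clf = LogisticRegression()
clf.fit(X, y)

# Classify two new messages: one with many links and exclamation marks,
# one with almost none.
new_messages = np.array([[6, 6], [0, 1]])
predictions = clf.predict(new_messages)
print("Predicted classes:", predictions)
```

Unlike a regression model, which returns a number on a continuous scale, the classifier returns one of the discrete labels it saw during training.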
Let’s build a simple linear regression model using scikit-learn.
```python
# Install required libraries if not already installed:
# pip install scikit-learn matplotlib
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
import numpy as np
import matplotlib.pyplot as plt

# Sample dataset: Years of experience vs. Salary
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([30000, 35000, 40000, 45000, 50000])

# Split data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create and train the model
model = LinearRegression()
model.fit(X_train, y_train)

# Make predictions
y_pred = model.predict(X_test)

# Display predictions
print("Predicted salaries:", y_pred)

# Plotting the results
plt.scatter(X, y, color='blue', label='Actual Data')
plt.plot(X, model.predict(X), color='red', label='Regression Line')
plt.xlabel("Years of Experience")
plt.ylabel("Salary")
plt.title("Simple Linear Regression")
plt.legend()
plt.show()
```

Output:

```
Predicted salaries: [35000.]
```

This is a basic example that demonstrates how a supervised learning algorithm (in this case, linear regression) learns from labeled data and makes predictions.
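One way to see what the model has learned is to inspect the fitted line directly. The sketch below reuses the same sample data but, as a simplification, fits on all five points rather than on a train/test split. Because the sample salaries rise by exactly 5000 per year, the slope and intercept can be checked by hand. The 6-year query is an extra illustrative input, not part of the original dataset.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Same sample dataset: years of experience vs. salary
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([30000, 35000, 40000, 45000, 50000])

# Fit on all five points (simplification: no train/test split here).
model = LinearRegression().fit(X, y)

# The learned relationship is: salary = slope * years + intercept
slope = model.coef_[0]
intercept = model.intercept_
print(f"Slope: {slope:.1f}")          # approximately 5000
print(f"Intercept: {intercept:.1f}")  # approximately 25000

# Predict for an input outside the training data (6 years).
six_year_salary = model.predict(np.array([[6]]))[0]
print(f"Predicted salary for 6 years: {six_year_salary:.1f}")  # approximately 55000
```

Real datasets are rarely this clean, so the fitted line will usually only approximate the data, which is why evaluating on held-out examples matters.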
scikit-learn makes it easy to get started with Python.