Sklearn predict Function

August 29, 2024 | 5 min read

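This tutorial shows how to use the predict function of Sklearn (scikit-learn) to generate predictions from a trained Python machine learning model.

We briefly summarize what the function does, review its syntax, and then work through examples that use the method with several machine learning models.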

Overview of Sklearn predict

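To understand what the predict method does, you need to be familiar with the standard machine learning workflow.

Although a machine learning model is developed and deployed in several stages, two of them matter here:

Training the model with a learning algorithm
Using the trained model to predict values

In practice it is a little more involved than that. We often need to tune the model's hyperparameters or try a different optimizer to improve its performance.

At its simplest, however, we train the model on data whose outcomes are known and then use it for other tasks, such as prediction.

Machine learning models are mainly used to predict outcomes for new data based on patterns learned from past data. The goal of a machine learning algorithm is to predict the target value accurately.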
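
The two steps described above can be sketched in a few lines. This is a minimal illustration rather than a recommended setup: the toy data (where y is simply twice x) and the variable names are made up for the example.

```python
from sklearn.linear_model import LinearRegression

# Toy data, made up for illustration: the target is twice the feature
X_train = [[1], [2], [3], [4]]
y_train = [2, 4, 6, 8]

# Step 1: train the model on data whose outcomes are known
model = LinearRegression()
model.fit(X_train, y_train)

# Step 2: use the trained model to predict values for new inputs
predictions = model.predict([[5], [6]])
print(predictions)  # values close to 10 and 12
```

Because the toy relationship is exactly linear, the predictions land on the line y = 2x up to floating-point error.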

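Predicting Data with Machine Learning Algorithms

Given a set of input values, the main goal of any machine learning model is to predict the value of some quantity.

For example, consider a model that predicts house prices. The inputs, also called features, might include the house's postal code, its square footage, the number of bedrooms, the number of bathrooms, and various other amenities.

Once we have trained a model on observations with these features, we can feed it entirely new data and it should return the output we are interested in, in this case an estimated price.

Most machine learning models follow the same approach. Machine learning algorithms can be used to make predictions such as the following:

Predicting whether a person with particular characteristics will respond to a given marketing campaign
Predicting whether an email message from a particular source is "spam"
Predicting whether a given input image shows a cat, a dog, or some other object

Making some form of prediction is the main goal of many machine learning algorithms, such as regression and classification models.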
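
As a rough sketch of the house price scenario, the snippet below trains a LinearRegression model on three numeric features and predicts the price of an unseen house. All figures are invented for illustration, and the choice of features (square footage, bedrooms, bathrooms) is an assumption; a categorical feature such as the postal code would need encoding first and is left out here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical training data: [square feet, bedrooms, bathrooms]
X_houses = np.array([
    [1100, 2, 1],
    [1400, 3, 2],
    [1600, 3, 2],
    [1700, 4, 2],
    [1875, 4, 3],
    [2350, 5, 3],
])
# Hypothetical sale prices for the houses above
y_prices = np.array([199000, 245000, 312000, 279000, 308000, 405000])

price_model = LinearRegression()
price_model.fit(X_houses, y_prices)

# A new house the model has never seen: one row, three features
new_house = np.array([[1500, 3, 2]])
predicted_price = price_model.predict(new_house)
print(predicted_price)
```

The result is an array with one entry per input row, so a single house yields an array of length one.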

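Syntax of the Sklearn predict Method

Now that we know what the predict method of the sklearn library does, let's look at its syntax.

This section covers only the syntax of predict. We therefore assume that you have already imported scikit-learn into your working environment and trained a model, such as LinearRegression or RandomForestRegressor, on a dataset.

Sklearn predict Syntax

The predict method must be called on an instance of a machine learning model class that has already been trained on a training dataset. Support vector machines, decision tree regressors, logistic regression, and linear regression are all examples of models provided by scikit-learn.

After training the model, we call the predict method using "dot" syntax:

trained_model.predict(X_test)

Inside the parentheses we pass the name of the variable that holds the data we want predictions for (here, the test dataset).

For example, suppose we use a LinearRegression object to perform ordinary linear regression and have trained it by calling fit() on an instance named regressor_model. To predict values for some input features, we use the following call:

regressor_model.predict(input_features)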
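
The requirement that the model be trained first can be checked directly. In the sketch below, calling predict on an unfitted LinearRegression raises scikit-learn's NotFittedError, and the same call succeeds once fit() has run. The names regressor_model and input_features follow the text above; the numeric values are made up.

```python
from sklearn.exceptions import NotFittedError
from sklearn.linear_model import LinearRegression

regressor_model = LinearRegression()
input_features = [[1.0], [2.0]]  # illustrative values

# Calling predict before fit fails because nothing has been learned yet
try:
    regressor_model.predict(input_features)
    raised = False
except NotFittedError as err:
    raised = True
    print("predict() failed:", err)

# After fitting, the same call returns one prediction per input row
regressor_model.fit([[0.0], [1.0], [2.0]], [1.0, 3.0, 5.0])
predicted = regressor_model.predict(input_features)
print(predicted)
```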

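Format of the Input Data

One more thing before we continue.

The X_test data must be passed to predict() as a two-dimensional array-like of shape (n_samples, n_features), with one row per sample and one column per feature. For example, the features can be stored in a two-dimensional NumPy array.

If X_test is not two-dimensional, for example a one-dimensional array holding a single sample, scikit-learn raises an error. In that case we must convert the data (if it is not array-like) or reshape it (if it has a different shape) before passing it to the predict() method.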
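
A common case is predicting for a single sample stored as a one-dimensional array. The sketch below, which uses made-up data, shows the ValueError that scikit-learn raises and the reshape(1, -1) call that turns the sample into one row with n_features columns.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data with two features per sample
X_train = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 4.0], [4.0, 3.0]])
y_train = np.array([5.0, 4.0, 11.0, 10.0])
model = LinearRegression().fit(X_train, y_train)

single_sample = np.array([5.0, 6.0])  # shape (2,): one-dimensional

# A 1D array is rejected because predict expects (n_samples, n_features)
try:
    model.predict(single_sample)
    failed = False
except ValueError as err:
    failed = True
    print("predict() failed:", err)

# Reshape to one row and two columns, then predict
sample_2d = single_sample.reshape(1, -1)
prediction = model.predict(sample_2d)
print(sample_2d.shape, prediction.shape)
```

For a single feature with many samples, reshape(-1, 1) does the opposite job and produces one column with one row per sample.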

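Implementing the Python predict() Function

The first step is to load the required dataset into the working environment. A dataset stored on disk can be loaded with the pandas.read_csv() method. Here we use one of the datasets built into the sklearn library.

In this first example the model is fitted and evaluated on the full dataset, so the reported score is the accuracy on the training data. The later examples use the train_test_split() method to divide the dataset into training and test sets.

Code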

# Python program to show how to use the predict() method to predict the values using the trained model

# Importing the required classes and dataset
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Loading and separating the independent and the dependent features of the dataset
X, Y = load_iris(return_X_y = True)

# Creating an instance of the Logistic Regression model class
logreg = LogisticRegression(random_state = 0)

# Fitting the model to the full iris dataset
logreg.fit(X, Y)

# Predicting the values using the predict method in the model class
Y_pred = logreg.predict(X)

# Calculating the accuracy score of the model based on the predicted values and the true values of the dependent feature
score = accuracy_score(Y, Y_pred)
print(score)

Output

0.9733333333333334
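
The score above is measured on the same data the model was trained on, which tends to be optimistic. A variant that holds out a test set might look like the sketch below. The random_state=0 on the split and max_iter=1000 on the model are assumptions added here for reproducibility and to give the solver room to converge; they are not part of the original example.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, Y = load_iris(return_X_y=True)

# Hold out 30% of the samples; the model never sees them during training
X_train, X_test, Y_train, Y_test = train_test_split(
    X, Y, test_size=0.3, random_state=0
)

logreg = LogisticRegression(max_iter=1000, random_state=0)
logreg.fit(X_train, Y_train)

# Predict labels for the held-out samples and compare with the true labels
Y_pred = logreg.predict(X_test)
score = accuracy_score(Y_test, Y_pred)
print(score)
```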

Using the predict() Method with a Decision Tree

Next we apply a decision tree to the same dataset and predict the target labels of the test set.

Code

# Python program to show how to use the predict() method with the decision tree model

# Importing the required classes and dataset
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Loading and separating the independent and the dependent features of the dataset
X, Y = load_iris(return_X_y = True)

# Splitting the dataset
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.3)

# Creating an instance of the Decision Tree model class
DT_model = DecisionTreeClassifier(max_depth = 5)

# Fitting the model to the training portion of the iris dataset
DT_model.fit(X_train, Y_train)

# Predicting the values using the predict method in the model class
Y_pred = DT_model.predict(X_test)

# Calculating the accuracy score of the model based on the predicted values and the true values of the dependent feature
score = accuracy_score(Y_test, Y_pred)
print(score)

Output (the exact score can vary between runs because train_test_split shuffles the data randomly)

0.9777777777777777

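Using the predict() Function with the KNN Algorithm

Here a KNN model is trained on the dataset and used for prediction. We follow the same steps: split the data into training and test sets, then train a KNeighborsClassifier() on the training data.

We then use the accuracy_score() function to measure the model's accuracy on the test set.

Code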

# Python program to show how to use the predict() method with the KNN model

# Importing the required classes and dataset
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Loading and separating the independent and the dependent features of the dataset
X, Y = load_iris(return_X_y = True)

# Splitting the dataset
X_train, X_test, Y_train, Y_test = train_test_split(X, Y, test_size = 0.3)

# Creating an instance of the KNN classifier class
knn_model = KNeighborsClassifier()

# Fitting the model to the training portion of the iris dataset
knn_model.fit(X_train, Y_train)

# Predicting the values using the predict method in the model class
Y_pred = knn_model.predict(X_test)

# Calculating the accuracy score of the model based on the predicted values and the true values of the dependent feature
score = accuracy_score(Y_test, Y_pred)
print(score)

Output (the exact score can vary between runs because train_test_split shuffles the data randomly)

0.9777777777777777
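
Beyond scoring a test set, predict is typically used on individual new observations. The sketch below trains a KNN classifier on the full iris dataset and predicts the species of one flower. The measurements in new_flower are chosen for illustration, and mapping the numeric label back to a name through target_names is just one convenient way to present the result.

```python
from sklearn.datasets import load_iris
from sklearn.neighbors import KNeighborsClassifier

iris = load_iris()
knn_model = KNeighborsClassifier().fit(iris.data, iris.target)

# One flower: sepal length, sepal width, petal length, petal width (cm)
# Note the double brackets: one row with four features
new_flower = [[5.1, 3.5, 1.4, 0.2]]

# predict returns an array of labels, one per input row
label = knn_model.predict(new_flower)[0]
species = iris.target_names[label]
print(label, species)
```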
