Eigenvalues and Eigenvectors for Machine Learning

June 18, 2025 | 7 min read

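Eigenvector: A vector (a list of numbers) has a direction when it is plotted on X and Y axes. An eigenvector of a linear transformation (for example, multiplication by a matrix) is a nonzero vector whose direction the transformation leaves unchanged: the vector is only scaled, or flipped when the scale factor is negative.

Eigenvalue: The scalar by which the transformation stretches or shrinks the corresponding eigenvector.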
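The two definitions can be checked numerically. The sketch below uses a small 2×2 matrix chosen only for illustration (it is not from the tutorial's dataset); its eigenvalues are 3 and 1. For each eigenpair it confirms that multiplying by the matrix gives the same result as scaling the eigenvector by its eigenvalue.

```python
import numpy as np

# Illustrative matrix, an assumption for this sketch.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])

eigenvalues, eigenvectors = np.linalg.eig(A)

# np.linalg.eig returns the eigenvectors as the COLUMNS of the second result.
for i in range(len(eigenvalues)):
    lam = eigenvalues[i]
    v = eigenvectors[:, i]
    # A @ v keeps the direction of v and scales it by lam.
    assert np.allclose(A @ v, lam * v)
    print(f"eigenvalue = {lam:.3f}, eigenvector = {v}")
```

The check `A @ v == lam * v` is the defining equation of an eigenpair, so it works for any square matrix, not only this one.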

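Working with the eigenvectors and eigenvalues of a dataset can reduce noise in the data and make many computationally intensive tasks more efficient. Removing features that are highly correlated with one another also helps reduce overfitting.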
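As a rough illustration of how eigenvalues expose redundant features, the sketch below builds two synthetic features where the second is almost a scaled copy of the first. The data, the seed, and the noise level are assumptions made for this example. Nearly all of the variance ends up on one eigenvalue of the covariance matrix, which suggests that one direction is enough to describe both features.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, highly correlated features (illustrative only).
x1 = rng.normal(size=500)
x2 = 2.0 * x1 + 0.05 * rng.normal(size=500)  # nearly a scaled copy of x1
X = np.column_stack([x1, x2])

# Covariance matrix with features in columns.
cov = np.cov(X, rowvar=False)

# eigvalsh is meant for symmetric matrices and returns ascending eigenvalues.
eigenvalues = np.linalg.eigvalsh(cov)[::-1]

# Share of total variance carried by each eigen-direction.
share = eigenvalues / eigenvalues.sum()
print(share)
```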

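When we build predictive models trained on images, audio, or text, the input feature set can end up containing a very large number of features, and data with more than three dimensions is hard to understand and visualize. For example, text or categorical features are often one-hot encoded, which turns each distinct value into its own numeric column; those extra columns add dimensions and take up disk space. Principal component analysis (PCA) is an important strategy for reducing the dimensionality of the feature space without losing key information, and eigenvalues and eigenvectors are at its core.

Code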

import numpy as np

M = np.array([[3, 1], [1, 2]])

# Computing the eigenvalues and eigenvectors of M
values_eig, _ = np.linalg.eig(M)

# Displaying the eigenvalues of M
print("Eigenvalues of M:", values_eig)

# Computing the squared eigenvalues
print("Squared Eigenvalues of M:", np.square(values_eig))

# Computing N = M^T * M
M_T = M.T  # Transpose of M
N = np.dot(M_T, M)

# Computing the eigenvalues and eigenvectors of N
values_eig_N, _ = np.linalg.eig(N)

# Displaying the eigenvalues of N
print("Eigenvalues of N:", values_eig_N)

Code

import numpy as np

# Define a new matrix M
M = np.array([[3, 1], [1, 2]])

# Compute the eigenvalues and eigenvectors of M
eigen_vals_M, eigen_vecs_M = np.linalg.eig(M)

eigen_vals_M_diag = np.diag(eigen_vals_M)
print("Eigenvalues of M:")
print(eigen_vals_M_diag)
print()

print("Eigenvectors of M:")
print(eigen_vecs_M)
print()

# Compute matrix N as the product of M^T and M
M_transpose = np.transpose(M)
N = np.dot(M_transpose, M)

print("Matrix N (M^T * M):")
print(N)
print()

# Compute the eigenvalues and eigenvectors of N using the properties of M.
# M is symmetric, so N = M^T * M = M^2: same eigenvectors, squared eigenvalues.
eigen_vecs_N = eigen_vecs_M  # Eigenvectors remain the same
eigen_vals_N = np.square(eigen_vals_M_diag)  # Squaring the eigenvalues

print("Eigenvalues of N using M:")
print(eigen_vals_N)
print()

print("Eigenvectors of N using M:")
print(eigen_vecs_N)
print()

# Compute the actual eigenvalues and eigenvectors of N
eigen_vals_N_actual, eigen_vecs_N_actual = np.linalg.eig(N)

eigen_vals_N_actual_diag = np.diag(eigen_vals_N_actual)
print("Eigenvalues of N (direct computation):")
print(eigen_vals_N_actual_diag)
print()

print("Eigenvectors of N (direct computation):")
print(eigen_vecs_N_actual)
print()
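
The shortcut used above (same eigenvectors, squared eigenvalues) relies on M being symmetric, because then M^T M equals M². The sketch below checks that case and then tries a non-symmetric matrix, picked here only as an illustration, where the shortcut no longer holds. What does hold for any real matrix is that the eigenvalues of A^T A are the squared singular values of A.

```python
import numpy as np

# Symmetric case: M.T @ M equals M @ M, so the eigenvalues are squared.
M = np.array([[3.0, 1.0], [1.0, 2.0]])
sym_direct = np.sort(np.linalg.eigvalsh(M.T @ M))
sym_squared = np.sort(np.linalg.eigvalsh(M) ** 2)
assert np.allclose(sym_direct, sym_squared)

# Non-symmetric case (illustrative matrix): the shortcut is not valid.
A = np.array([[1.0, 1.0], [0.0, 1.0]])
direct = np.sort(np.linalg.eigvalsh(A.T @ A))
squared = np.sort(np.linalg.eigvals(A).real ** 2)
shortcut_holds = np.allclose(direct, squared)
print("Shortcut holds for non-symmetric A:", shortcut_holds)

# General fact: eigenvalues of A.T @ A are the squared singular values of A.
singular_sq = np.sort(np.linalg.svd(A, compute_uv=False) ** 2)
assert np.allclose(direct, singular_sq)
```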

We will now apply dimensionality reduction based on eigenvectors and eigenvalues.

Importing libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler
%matplotlib inline

Reading the dataset

dFrame = pd.read_csv(
     filepath_or_buffer='https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data', 
     header=None, 
     sep=',')
dFrame.columns=['sepal_length(cm)', 'sepal_width(cm)', 'petal_length(cm)', 'petal_width(cm)', 'class']

# Check whether the dataset contains any null values
print(dFrame.isnull().values.any())

Output

False

Code

# Separate the four independent feature columns (X) from the target column,
# which holds the class labels (y).

X = dFrame.iloc[:,0:4].values
y = dFrame.iloc[:,4].values
X.shape, y.shape

Output

((150, 4), (150,))

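Standardization

In the 150×4 dataset, each of the 150 samples is a row of the matrix and each column is a feature.

As the shapes printed above show, each sample row x can be thought of as a four-dimensional vector.

When the features of a dataset are measured on different scales, standardization transforms them so that each has mean 0 and variance 1. It is a key step in PCA because PCA is very sensitive to the variance of each feature: if some features span much larger ranges than others, the larger-scale features dominate the principal components. Standardizing balances the influence of the features so that PCA captures the most meaningful patterns in the data.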
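The later code in this walkthrough uses an array named std_X, but the step that creates it is not shown. Given the StandardScaler import, it was presumably produced with something like `std_X = StandardScaler().fit_transform(X)`; that is an assumption. The NumPy sketch below performs the same kind of scaling on a small made-up array, so the helper name `standardize` and the sample values are hypothetical.

```python
import numpy as np

def standardize(X):
    """Scale each column to mean 0 and variance 1 (population std, ddof=0)."""
    X = np.asarray(X, dtype=float)
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Hypothetical sample whose two features live on very different scales.
X_demo = np.array([[5.1, 350.0],
                   [4.9, 300.0],
                   [6.3, 330.0],
                   [5.8, 270.0]])

std_demo = standardize(X_demo)
print(std_demo.mean(axis=0))  # each column mean is (numerically) zero
print(std_demo.std(axis=0))   # each column standard deviation is one

# Assumed equivalent with scikit-learn, which also divides by the population std:
#   std_X = StandardScaler().fit_transform(X)
```

In the walkthrough, X would be the 150×4 Iris feature array rather than this toy sample.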

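A traditional way to perform PCA is through the eigendecomposition of the covariance matrix Σ, a d×d matrix whose entries are the covariances between pairs of features. Here d is the number of dimensions of the original dataset (d = 4 for Iris).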
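Written out, the covariance matrix and its eigendecomposition look as follows. The notation is introduced here for illustration: n is the number of samples (150 for Iris), x_i is the i-th sample as a d-dimensional vector, x̄ is the vector of feature means, and (λ_j, v_j) are the eigenvalue and eigenvector pairs.

```latex
\Sigma = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})(x_i - \bar{x})^{\top},
\qquad
\Sigma_{jk} = \frac{1}{n-1} \sum_{i=1}^{n} (x_{ij} - \bar{x}_j)(x_{ik} - \bar{x}_k)

\Sigma v_j = \lambda_j v_j, \qquad j = 1, \dots, d
```

After standardization the mean vector is zero, so the first formula reduces to X^T X / (n − 1) with X the standardized data matrix.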

Code

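# Manual computation of the covariance matrix, kept for reference:
# vec_mean = np.mean(std_X, axis=0)
# mat_cov = (std_X - vec_mean).T.dot(std_X - vec_mean) / (std_X.shape[0] - 1)

"""
Other ways to compute the covariance matrix:

Way 1 (valid because the columns of std_X already have mean 0) -

mat_cov = (std_X.T @ std_X) / (std_X.shape[0] - 1)

--------
Way 2 -

mat_cov = np.cov(std_X, rowvar=False)

"""

# np.cov treats rows as variables by default, so pass the transpose.
mat_cov = np.cov(std_X.T)
# Column i of vecs_eig is the eigenvector paired with vals_eig[i].
vals_eig, vecs_eig = np.linalg.eig(mat_cov)
print('Covariance matrix \n%s' % mat_cov)
print('\nEigenvectors \n%s' % vecs_eig)
print('\nEigenvalues \n%s' % vals_eig)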

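The eigenvectors returned by np.linalg.eig are normalized to unit length, so the squares of the entries of each eigenvector should sum to 1. Let us check this to confirm that the eigenvectors were computed successfully.

Code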

# Eigenvectors are the columns of vecs_eig, so iterate over the transpose.
for ev in vecs_eig.T:
    squares = ev ** 2
    print(squares)
    print("Sum of squares of the values in this eigenvector:", squares.sum())
    # Each eigenvector returned by np.linalg.eig should have unit length.
    np.testing.assert_array_almost_equal(1.0, np.linalg.norm(ev))

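The rule is to sort the eigenvalues in descending order and keep the k eigenvectors that correspond to the k largest eigenvalues. These k directions are kept when the variance they capture is enough to represent the dataset. Dropping the remaining directions costs little accuracy, because they account for only a small share of the variance, or because the loss is acceptable for the task at hand.

How many to keep is a decision that depends on the problem and the business case; there is no exact rule.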
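One common heuristic, offered here as a sketch and not as a fixed rule, is to keep the smallest k whose cumulative share of variance reaches a chosen threshold. The function name `choose_k`, the 0.95 default, and the example eigenvalues are all hypothetical.

```python
import numpy as np

def choose_k(eigenvalues, threshold=0.95):
    """Return the smallest k whose top-k eigenvalues reach the variance threshold."""
    vals = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # descending
    cumulative = np.cumsum(vals) / vals.sum()
    # First position where the cumulative share is >= threshold, as a count.
    k = int(np.searchsorted(cumulative, threshold) + 1)
    # Guard against rounding when the threshold is 1.0.
    return min(k, len(vals))

# Hypothetical eigenvalues for a four-feature dataset.
example_eigenvalues = [2.93, 0.92, 0.15, 0.02]
k95 = choose_k(example_eigenvalues, threshold=0.95)
k99 = choose_k(example_eigenvalues, threshold=0.99)
print(k95, k99)
```

A stricter threshold keeps more components, so the threshold itself remains a judgment call tied to the problem.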

Code

# Create a list of (eigenvalue, eigenvector) tuples.
pairs_eig = [(np.abs(vals_eig[i]), vecs_eig[:, i]) for i in range(len(vals_eig))]
print(type(pairs_eig))
# Arrange the tuples in descending order of eigenvalue.
# Sorting on the eigenvalue alone avoids comparing arrays when two eigenvalues tie.
pairs_eig.sort(key=lambda pair: pair[0], reverse=True)
print("\n", pairs_eig)
# Verify visually that the list is arranged by decreasing eigenvalue.
print('\n\n\nEigenvalues in descending order:')
eigenValues_sorted = []
eigenVectors_sorted = []

for value, vector in pairs_eig:
    print(value)
    eigenValues_sorted.append(value)
    eigenVectors_sorted.append(list(vector))

eigenVectors_sorted

Code

tto = sum(vals_eig)
print("\n",tto)
exp_Var = [(i / tto)*100 for i in sorted(vals_eig, reverse=True)]
print("\n\n1. Variance Explained\n",exp_Var)
exp_cum_var = np.cumsum(exp_Var)
print("\n\n2. Cumulative Variance Explained\n",exp_cum_var)
print("\n\n3. Percentage of variance the first two principal components each contain\n ",exp_Var[0:2])
print("\n\n4. Percentage of variance the first two principal components together contain\n",sum(exp_Var[0:2]))

Code

int_X = range(1, len(exp_cum_var) + 1)
plt.plot(int_X, exp_cum_var)

plt.xlabel("Number of components")
plt.ylabel("Cumulative explained variance")
plt.xticks(int_X)
plt.xlim(1, 4)  # xlim takes only the left and right limits

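The plot shows that the two largest principal components explain more than 95% of the variance, so they can be used to form the projection matrix W.

This projection matrix maps the Iris dataset into a new, lower-dimensional feature subspace. It is built by concatenating, as columns, the k eigenvectors with the largest eigenvalues. Here, only the two eigenvectors with the largest eigenvalues are taken from the original 4-dimensional feature space, giving a 2-dimensional subspace. This simplifies the dataset while retaining as much variance as possible, and makes visualization and further analysis easier.

Code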

print(pairs_eig[0][1])
print(pairs_eig[1][1])
w_matrix = np.hstack((pairs_eig[0][1].reshape(4,1), 
                      pairs_eig[1][1].reshape(4,1)))
#Arrays are stacked horizontally (column-wise) using the hstack function.
print('Matrix W:\n', w_matrix)

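The final step uses the 4×2 projection matrix W to transform the standardized dataset X (std_X in the code) into a lower-dimensional space through the equation Y = X × W, where Y is the transformed dataset. Because X consists of 150 samples with 4 features, the resulting matrix Y has dimensions 150×2. Among all 2-dimensional linear projections, this one retains the most variance in the data.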
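The projection can be sanity-checked on its own. The sketch below uses synthetic data as a stand-in for the standardized Iris array (the data, seed, and variable names are assumptions). It builds W from the top two eigenvectors, projects, and confirms that the covariance of Y is diagonal with the two largest eigenvalues on the diagonal, meaning the new components are uncorrelated and carry exactly the variance their eigenvalues describe.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in for standardized data: 150 samples, 4 correlated features.
latent = rng.normal(size=(150, 4)) * np.array([3.0, 1.5, 0.5, 0.1])
mixing = rng.normal(size=(4, 4))
X = latent @ mixing
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Eigendecomposition of the covariance matrix (eigh suits symmetric matrices).
cov = np.cov(X_std, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]  # indices for descending eigenvalues
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

W = eigenvectors[:, :2]  # 4 x 2 projection matrix
Y = X_std @ W            # 150 x 2 transformed data

# Covariance of the projected data: diagonal, with the top two eigenvalues.
cov_Y = np.cov(Y, rowvar=False)
print(Y.shape)
print(cov_Y)
```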

Code

Y = std_X.dot(w_matrix)
dFrame_principal = pd.DataFrame(data = Y
          , columns = ['principal component 1', 'principal component 2'])
dFrame_principal.head()

Code

dFrame_final = pd.concat([dFrame_principal,pd.DataFrame(y,columns = ['species'])], axis = 1)
dFrame_final.head()

Code

fig = plt.figure(figsize = (8,5))
ax = fig.add_subplot(1,1,1) 
ax.set_xlabel('Principal Component 1', fontsize = 15)
ax.set_ylabel('Principal Component 2', fontsize = 15)
ax.set_title('2 Component PCA', fontsize = 20)
targets = ['Iris-setosa', 'Iris-versicolor', 'Iris-virginica']
colors = ['r', 'b', 'y']
for target, color in zip(targets,colors):
    indicesToKeep = dFrame_final['species'] == target
    ax.scatter(dFrame_final.loc[indicesToKeep, 'principal component 1']
               , dFrame_final.loc[indicesToKeep, 'principal component 2']
               , c = color
               , s = 50)
ax.legend(targets)
ax.grid()

Code

# n_components=2 keeps two principal components. A fraction can be passed
# instead, e.g. PCA(0.95), in which case PCA keeps as many components as are
# needed to explain about 95% of the variance. The computations above show
# that two components are sufficient here, so we pass 2.
pca = PCA(n_components=2)
prin_compo = pca.fit_transform(std_X) 
dFrame_principal = pd.DataFrame(data = prin_compo
              , columns = ['principal component 1', 'principal component 2']) 
dFrame_principal.head(5) # prints the top 5 rows

Code

dFrame_final = pd.concat([dFrame_principal, dFrame_final[['species']]], axis = 1)
dFrame_final.head(5)

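The first two principal components cover 95.80% of the available information: the first explains 72.77% of the variance and the second 23.03%. The third and fourth principal components account for the remaining variance in the dataset.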
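When the manual eigenvector route is compared with a library PCA, the component values may differ in sign, because an eigenvector multiplied by -1 is still an eigenvector. The sketch below illustrates this with NumPy only: it compares the covariance eigendecomposition with an SVD of the centered data, which is a common way for libraries to compute PCA. The synthetic data and seed are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic data (illustrative): 150 samples, 4 features with distinct variances.
X = rng.normal(size=(150, 4)) * np.array([2.0, 1.0, 0.5, 0.25])
X_centered = X - X.mean(axis=0)

# Route 1: eigendecomposition of the covariance matrix.
cov = np.cov(X_centered, rowvar=False)
eigenvalues, eigenvectors = np.linalg.eigh(cov)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
Y_eig = X_centered @ eigenvectors[:, :2]

# Route 2: singular value decomposition of the centered data.
U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
Y_svd = X_centered @ Vt[:2].T

# Explained-variance ratios from both routes.
ratio_eig = eigenvalues / eigenvalues.sum()
ratio_svd = S**2 / np.sum(S**2)

# Projections agree up to a sign flip per component; the ratios agree exactly.
same_up_to_sign = np.allclose(np.abs(Y_eig), np.abs(Y_svd), atol=1e-6)
same_ratios = np.allclose(ratio_eig, ratio_svd)
print(same_up_to_sign, same_ratios)
```

A mirrored scatter plot between the two approaches is therefore expected and does not indicate an error.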
