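Pearson Correlation Coefficient in C++

May 10, 2025 | 4 min read

This article presents C++ programs that compute the Pearson correlation coefficient from two sequences of floating-point values representing the variables X and Y.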

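The Pearson correlation coefficient measures the linear relationship between two variables. It always lies between -1 and 1 and is denoted by the symbol r.

r = 1 indicates a perfect positive linear relationship,
r = −1 indicates a perfect negative linear relationship,
r = 0 indicates no linear relationship (the variables may still be related in a nonlinear way).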

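Steps to implement the Pearson correlation coefficient in C++

Find the mean of each of the two variables.
Compute the covariance between the two variables.
Compute the standard deviation of each variable.
Compute the correlation coefficient using the following formula: r = cov(X, Y) / (σX × σY).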
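
Written out in full, the quantities used in these steps can be expressed as follows. This is a reference sketch that assumes the population (divide-by-n) definitions, which is what the code in the examples uses; the symbols are introduced here for illustration.

```latex
\begin{align*}
\bar{x} &= \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad
\bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i \\
\operatorname{cov}(X, Y) &= \frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \\
\sigma_X &= \sqrt{\frac{1}{n}\sum_{i=1}^{n} (x_i - \bar{x})^2}, \qquad
\sigma_Y = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \bar{y})^2} \\
r &= \frac{\operatorname{cov}(X, Y)}{\sigma_X \, \sigma_Y}
   = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
          {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
\end{align*}
```

The factors of 1/n cancel in the last line, so dividing by n or by n - 1 throughout gives the same value of r.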

Example 1

Let us take an example to illustrate the Pearson correlation coefficient in C++.

#include <iostream>
#include <vector>
#include <cmath>

// Returns the arithmetic mean of the values; data is assumed to be non-empty.
double calculateMean(const std::vector<double>& data) {
    double sum = 0.0;
    for (double value : data) {
        sum += value;
    }
    return sum / data.size();
}

// Returns the population covariance of x and y (divides by n).
// On a size mismatch it reports an error and returns 0.0.
double calculateCovariance(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size()) {
        std::cerr << "Error: Size mismatch between x and y vectors\n";
        return 0.0;
    }

    double xMean = calculateMean(x);
    double yMean = calculateMean(y);

    // Accumulate the products of the deviations from each mean.
    double covariance = 0.0;
    for (size_t i = 0; i < x.size(); ++i) {
        covariance += (x[i] - xMean) * (y[i] - yMean);
    }
    return covariance / x.size();
}

// Returns the population standard deviation of the values around the given mean.
double calculateStandardDeviation(const std::vector<double>& data, double mean) {
    double variance = 0.0;
    for (double value : data) {
        variance += std::pow(value - mean, 2);
    }
    return std::sqrt(variance / data.size());
}

// Returns r = cov(x, y) / (stddev(x) * stddev(y)).
// No guard is applied: if either input is constant, the denominator is zero.
double calculatePearsonCorrelation(const std::vector<double>& x, const std::vector<double>& y) {
    double covariance = calculateCovariance(x, y);
    double xStdDev = calculateStandardDeviation(x, calculateMean(x));
    double yStdDev = calculateStandardDeviation(y, calculateMean(y));

    return covariance / (xStdDev * yStdDev);
}

int main() {
    std::vector<double> x = {1, 2, 3, 4, 5};
    std::vector<double> y = {2, 4, 6, 8, 10};
    
    double correlation = calculatePearsonCorrelation(x, y);
    std::cout << "Pearson Correlation Coefficient: " << correlation << std::endl;
    
    return 0;
}

Output

Pearson Correlation Coefficient: 1
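
As a check on this output, the arithmetic for the two vectors in Example 1 can be worked by hand. The sketch below assumes the same population (divide-by-n) definitions that the code uses, with x = (1, 2, 3, 4, 5) and y = (2, 4, 6, 8, 10).

```latex
\begin{align*}
\bar{x} &= \tfrac{1+2+3+4+5}{5} = 3, \qquad
\bar{y} = \tfrac{2+4+6+8+10}{5} = 6 \\
\sum (x_i - \bar{x})(y_i - \bar{y}) &= (-2)(-4) + (-1)(-2) + 0 \cdot 0 + 1 \cdot 2 + 2 \cdot 4 = 20 \\
\operatorname{cov}(X, Y) &= \tfrac{20}{5} = 4 \\
\sigma_X &= \sqrt{\tfrac{4+1+0+1+4}{5}} = \sqrt{2}, \qquad
\sigma_Y = \sqrt{\tfrac{16+4+0+4+16}{5}} = \sqrt{8} \\
r &= \frac{4}{\sqrt{2}\,\sqrt{8}} = \frac{4}{\sqrt{16}} = 1
\end{align*}
```

Since y is exactly 2x, every point lies on a straight line with positive slope, which is why r equals 1.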

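Example 2

Let us take another example to illustrate the Pearson correlation coefficient in C++. This version reads the data from the user and works with dynamically allocated arrays.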

#include <iostream>
#include <cmath>

// Function to calculate the mean of an array
double calculateMean(double *arr, int n) {
    double sum = 0.0;
    for (int i = 0; i < n; ++i) {
        sum += *(arr + i);
    }
    return sum / n;
}

// Function to calculate the covariance of two arrays
double calculateCovariance(double *arr1, double *arr2, int n) {
    double mean1 = calculateMean(arr1, n);
    double mean2 = calculateMean(arr2, n);
    
    double covariance = 0.0;
    for (int i = 0; i < n; ++i) {
        covariance += (*(arr1 + i) - mean1) * (*(arr2 + i) - mean2);
    }
    return covariance / n;
}

// Function to calculate the (population) standard deviation of an array
double calculateStandardDeviation(double *arr, int n, double mean) {
    double variance = 0.0;
    for (int i = 0; i < n; ++i) {
        variance += std::pow(*(arr + i) - mean, 2);
    }
    return std::sqrt(variance / n);
}

// Function to calculate the Pearson correlation coefficient
double calculatePearsonCorrelation(double *arr1, double *arr2, int n) {
    double covariance = calculateCovariance(arr1, arr2, n);
    double mean1 = calculateMean(arr1, n);
    double mean2 = calculateMean(arr2, n);
    
    double stdDev1 = calculateStandardDeviation(arr1, n, mean1);
    double stdDev2 = calculateStandardDeviation(arr2, n, mean2);
    
    return covariance / (stdDev1 * stdDev2);
}

int main() {
    int n;
    std::cout << "Enter the number of elements in the arrays: ";
    std::cin >> n;

    double *x = new double[n];
    double *y = new double[n];

    std::cout << "Enter the elements of the first array:\n";
    for (int i = 0; i < n; ++i) {
        std::cin >> x[i];
    }

    std::cout << "Enter the elements of the second array:\n";
    for (int i = 0; i < n; ++i) {
        std::cin >> y[i];
    }

    double correlation = calculatePearsonCorrelation(x, y, n);

    std::cout << "Pearson Correlation Coefficient: " << correlation << std::endl;

    // Free dynamically allocated memory
    delete[] x;
    delete[] y;

    return 0;
}

Output

Enter the number of elements in the arrays: 5
Enter the elements of the first array:
1 2 5 6 8
Enter the elements of the second array:
12 232 45 61 76 
Pearson Correlation Coefficient: -0.202537
===============================================================
Enter the number of elements in the arrays: 5
Enter the elements of the first array:
1 2 5 6 8
Enter the elements of the second array:
12 23 45 61 76
Pearson Correlation Coefficient: 0.995226
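
The two runs above differ only in the second element of the second array (232 versus 23), yet r moves from about 0.995 to about -0.203, which shows how strongly a single unusual value can affect the coefficient. To reproduce both runs without typing input, a compact variant might look like the sketch below. The function name pearsonCompact is hypothetical and not part of the article's code; the sketch assumes the inputs are non-empty, of equal length, and not constant.

```cpp
#include <cmath>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper (not from the article's listings): computes Pearson's r
// from the sums of squared deviations and cross-deviations.
// Assumes x and y are non-empty, equally sized, and neither is constant.
double pearsonCompact(const std::vector<double>& x, const std::vector<double>& y) {
    const double n = static_cast<double>(x.size());
    const double meanX = std::accumulate(x.begin(), x.end(), 0.0) / n;
    const double meanY = std::accumulate(y.begin(), y.end(), 0.0) / n;

    double sxy = 0.0;  // sum of (x_i - meanX) * (y_i - meanY)
    double sxx = 0.0;  // sum of (x_i - meanX)^2
    double syy = 0.0;  // sum of (y_i - meanY)^2
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - meanX;
        const double dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }

    // The 1/n factors of the covariance and the standard deviations cancel,
    // so the raw sums can be used directly.
    return sxy / std::sqrt(sxx * syy);
}
```

Calling pearsonCompact with {1, 2, 5, 6, 8} and each of the two second arrays should agree with the values printed above to the displayed precision.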

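Conclusion

In conclusion, the Pearson correlation coefficient quantifies the strength and direction of the linear relationship between two variables. As the C++ code above shows, the calculation involves computing the means of the two variables, their covariance, and their standard deviations. The coefficient is only meaningful under several assumptions: the relationship between the variables is linear, the data is numeric, and the data contains no significant outliers. The implementations shown here have very little error handling. Neither guards against a division by zero, which occurs when either variable is constant and its standard deviation is therefore zero, and the second example does not validate the input size. Such cases can produce invalid results or run-time problems. Despite these limitations, the code conveys the basics of computing a correlation coefficient in C++. It could be made more reliable for real-world use by adding stronger error checking and input validation.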
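
One way such checks might look is sketched below. This is only an illustration under stated assumptions: the function name checkedPearson and the choice to report problems by throwing exceptions are hypothetical design decisions, not part of the article's code. The exact comparison against zero is also a simplification, since nearly constant data can still give a tiny nonzero spread.

```cpp
#include <cmath>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical checked variant of the correlation function (illustrative only).
// Throws std::invalid_argument for unusable sizes and std::domain_error when
// the correlation is undefined because one input has no spread.
double checkedPearson(const std::vector<double>& x, const std::vector<double>& y) {
    if (x.size() != y.size()) {
        throw std::invalid_argument("x and y must have the same size");
    }
    if (x.size() < 2) {
        throw std::invalid_argument("at least two data points are required");
    }

    const double n = static_cast<double>(x.size());
    double sumX = 0.0, sumY = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sumX += x[i];
        sumY += y[i];
    }
    const double meanX = sumX / n;
    const double meanY = sumY / n;

    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - meanX;
        const double dy = y[i] - meanY;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }

    // A constant input has zero spread, which would make the denominator zero.
    if (sxx == 0.0 || syy == 0.0) {
        throw std::domain_error("correlation is undefined for constant input");
    }
    return sxy / std::sqrt(sxx * syy);
}
```

A caller would wrap the call in a try block and decide how to report the failure. Returning an error code or an optional value would be equally reasonable alternatives.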
