Optimal Merge Pattern in C

January 7, 2025 | 6 min read

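The optimal merge pattern problem is a well-known algorithmic problem that arises when several sorted files have to be merged, for example in a file management system. This article presents an algorithm for merging a given set of files of different sizes at minimum total cost.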

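Suppose we have several files, each holding some number of records. Merging any two of them takes work proportional to their combined size, and the goal is to minimize the total cost of merging all the files into one. A poor merge order can make the total cost much higher than necessary, so it matters to find the merge sequence that minimizes it.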

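Take three files of sizes 10, 20, and 30 units. Suppose we merge them in a non-optimal order: first the files of sizes 20 and 30 at a cost of 50, then that result with the file of size 10 at a cost of 60. The total cost is 50 + 60 = 110. If instead we first merge the files of sizes 10 and 20 at a cost of 30, and then merge that result with the file of size 30 at a cost of 60, the total cost is 30 + 60 = 90. This small example shows how much the cost can differ between merge sequences and why an optimal method is needed.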

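The total cost is the sum of the costs of all individual merge operations. The problem has practical applications wherever many merges must be performed, such as external sorting algorithms and file handling in operating systems. Using the optimal merge pattern ensures that these operations finish with the least total merge work.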

Program

Let us take an example to illustrate the optimal merge pattern in C.

 
#include <stdio.h>
#include <stdlib.h>
// Define a structure to represent a file
typedef struct File {
    int size;  // Size of the file
} File;
// A function to swap two elements in an array
void swap(File* a, File* b) {
    File temp = *a;
    *a = *b;
    *b = temp;
}
// A utility function to heapify a subtree rooted at index i in a min-heap
// n is the size of the heap
void heapify(File files[], int n, int i) {
    int smallest = i; // Initialize smallest as root
    int left = 2 * i + 1; // Left child index
    int right = 2 * i + 2; // Right child index
    // If the left child is smaller than the root
    if (left < n && files[left].size < files[smallest].size) {
        smallest = left;
    }
    // If the right child is smaller than the smallest so far
    if (right < n && files[right].size < files[smallest].size) {
        smallest = right;
    }
    // If the smallest is not the root
    if (smallest != i) {
        swap(&files[i], &files[smallest]); // Swap the root with the smallest
        heapify(files, n, smallest); // Recursively heapify the affected subtree
    }
}
//Function to build a min-heap from the Array of files
void buildMinHeap(File files[], int n) {
    for (int i = n / 2 - 1; i >= 0; i--) {
        heapify(files, n, i);
    }
}
// A utility function to extract the smallest element from the heap
File extractMin(File files[], int* n) {
    File minFile = files[0]; // The root is the minimum element
    files[0] = files[*n - 1]; // Move the last element to root
    (*n)--; // Reduce heap size
    heapify(files, *n, 0); // Heapify the root
    return minFile;
}
// A utility function to insert a new file into the heap
void insertHeap(File files[], int* n, File newFile) {
    (*n)++; // Increase the size of the heap
    int i = *n - 1;
    files[i] = newFile; // Insert the new file at the end of the heap
    // Fix the min-heap property if violated
    while (i != 0 && files[(i - 1) / 2].size > files[i].size) {
        swap(&files[i], &files[(i - 1) / 2]);
        i = (i - 1) / 2;
    }
}
//Function to calculate the optimal merge pattern cost
int calculateOptimalMergeCost(File files[], int n) {
    int totalCost = 0; // Initialize total cost
    // Build a min-heap from the Array of files
    buildMinHeap(files, n);
    // Continue merging files until only one file remains
    while (n > 1) {
        // Extract the two smallest files
        File file1 = extractMin(files, &n);
        File file2 = extractMin(files, &n);
        // Merge them and calculate the cost
        int mergeCost = file1.size + file2.size;
        totalCost += mergeCost;
        // Create a new merged file and insert it back into the heap
        File mergedFile;
        mergedFile.size = mergeCost;
        insertHeap(files, &n, mergedFile);
    }
    return totalCost; // Return the total merge cost
}
// A function to print the Array of file sizes
void printFileSizes(File files[], int n) {
    printf("File sizes: ");
    for (int i = 0; i < n; i++) {
        printf("%d ", files[i].size);
    }
    printf("\n");
}
// A function to initialize an array of files with given sizes
void initializeFiles(File files[], int sizes[], int n) {
    for (int i = 0; i < n; i++) {
        files[i].size = sizes[i];
    }
}
// Main Function
int main() {
    int sizes[] = {5, 10, 15, 20, 25, 30, 35}; //Array of file sizes
    int n = sizeof(sizes) / sizeof(sizes[0]); // Number of files
    // Allocate memory for an array of File structures
    File* files = (File*)malloc(n * sizeof(File));
    if (files == NULL) {
        fprintf(stderr, "Memory allocation failed\n");
        return 1;
    }
    // Initialize the files with their sizes
    initializeFiles(files, sizes, n);
    // Print the file sizes before merging
    printf("Before merging:\n");
    printFileSizes(files, n);
    // Calculate the optimal merge pattern cost
    int totalCost = calculateOptimalMergeCost(files, n);
    // Print the total cost
    printf("Optimal merge cost: %d\n", totalCost);
    // Free the allocated memory
    free(files);
    return 0;
}   

Output

 
Before merging:
File sizes: 5 10 15 20 25 30 35 
Optimal merge cost: 370

Explanation

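In this example, the code above solves the optimal merge pattern problem with a min-heap. It merges files of different sizes in an order that minimizes the total merge cost, where the cost of each merge is proportional to the combined size of the two files being merged.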

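The problem is solved efficiently by a greedy algorithm: always merge the two smallest files first, so that each step adds as little as possible to the cost. This is the same strategy used to build a Huffman tree for data compression. Practical implementations usually rely on a min-heap or priority queue: insert all file sizes into the heap, then repeatedly remove the two smallest files, merge them, and put the result back. The process repeats until only one file is left in the heap.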

File structure: A File structure stores the size of each file. Wrapping the size in a structure leaves room to add more file attributes later.

Heap functions

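heapify: Restores the min-heap property for the subtree rooted at index i, so that the smallest element of that subtree ends up at its root. It is called whenever the heap has been disturbed.
extractMin: Removes and returns the smallest file from the heap.
insertHeap: Puts the newly merged file back into the heap while preserving the min-heap property.
buildMinHeap: Turns an unsorted array into a min-heap so that the smallest file can be extracted efficiently.

Optimal merge cost calculation: The core of the algorithm is the calculateOptimalMergeCost function. It removes the two smallest files from the heap, merges them, and inserts the result back into the heap. This repeats until only one file remains, and the cost of each merge is added to a running total.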

Helper functions

initializeFiles sets the file sizes.
printFileSizes prints the file sizes so that the input can be inspected.

Complexity Analysis

Time Complexity

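Heap construction: Building the initial heap takes O(n) time, where n is the number of files.
Merging: Each extraction and insertion takes O(log n) time because the heap must be repaired afterward, and there are n - 1 merges, so the total time complexity is O(n log n). The min-heap is what makes finding the two smallest files cheap at every step.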
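The bound can be written out term by term. In the program above, each of the n - 1 merges performs two extractMin calls and one insertHeap call, and each of those costs O(log n):

```latex
T(n) = \underbrace{O(n)}_{\text{buildMinHeap}}
     + (n-1)\,\bigl(\underbrace{2\,O(\log n)}_{\text{two extractMin calls}}
     + \underbrace{O(\log n)}_{\text{one insertHeap call}}\bigr)
     = O(n \log n)
```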

Space Complexity

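The space complexity is O(n): the array that stores the file sizes has n entries, and the heap is built in place in that same array. The auxiliary variables that hold file sizes and merge costs take only constant extra space, and the recursive heapify call uses at most O(log n) stack depth, so overall space usage is O(n).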
