Interpolation Search

March 17, 2025 | 8 min read

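This article takes a detailed look at interpolation search: how it works, its advantages and limitations, and where it is used in practice.

Introduction

Interpolation search is a search algorithm that uses an interpolation formula to estimate the position of a target value in a sorted array or list.

Unlike binary search, which always probes the middle element, interpolation search makes a more informed guess based on how the values are distributed: the closer the target is to the largest value in the current range, the closer the probe lands to the end of that range.

It is most effective when the elements are uniformly distributed.

A uniformly distributed dataset is one in which the gaps between consecutive elements are roughly equal, with no large jumps.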
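As a rough illustration of what "uniform" means here, the sketch below lists the gaps between consecutive elements of two sorted lists. The helper name gaps and the second list are assumptions made for this example only.

```python
def gaps(sorted_values):
    """Return the differences between consecutive elements."""
    return [b - a for a, b in zip(sorted_values, sorted_values[1:])]


uniform = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]
skewed = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512]  # hypothetical non-uniform data

print(gaps(uniform))  # [2, 3, 2, 3, 3, 3, 2, 3, 2]
print(gaps(skewed))   # [1, 2, 4, 8, 16, 32, 64, 128, 256]
```

The first list has gaps that stay within a narrow band, so it suits interpolation search; the second has gaps that double at every step, so it does not.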

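How Interpolation Search Works

The algorithm assumes that values grow roughly linearly with their index, so the probe position can be estimated from the bounds of the current range and the values stored at those bounds.

The formula is only defined when A[high] differs from A[low].

pos = low + [(high - low) * (X - A[low])] / (A[high] - A[low])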

low - Left pointer
high - Right pointer
A[low] - Element at the left pointer
A[high] - Element at the right pointer
X - Target element to be searched
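To make the formula concrete, here is a minimal sketch that evaluates it once with integer division. The function name probe_position and the sample list are illustrative assumptions, and the sketch assumes arr[high] != arr[low].

```python
def probe_position(arr, low, high, target):
    """Estimate the index of target within arr[low..high].

    Assumes arr is sorted and arr[high] != arr[low].
    """
    return low + ((high - low) * (target - arr[low])) // (arr[high] - arr[low])


values = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]

# (9 - 0) * (70 - 10) // (100 - 10) = 540 // 90 = 6
print(probe_position(values, 0, len(values) - 1, 70))  # 6
```

Because the values are perfectly evenly spaced, the estimate lands exactly on the index that holds 70.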

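This estimate guides the algorithm in narrowing the search range, which is what makes retrieval faster.

Algorithm

The algorithm can be summarized in the following steps:

Initialize the low and high indices to the start and end of the array, respectively.
Compute the probe position using the interpolation formula.
Compare the probe element with the target element.
1. If they are equal, the search succeeds.
2. If the probe element is larger, update the high index to the probe position minus one.
3. If the probe element is smaller, update the low index to the probe position plus one.
Repeat the probe and comparison steps until the target element is found, the low index exceeds the high index, or the target no longer lies between the elements at the low and high indices.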

Python Implementation

# Function
def interpolation_search(arr, target):
    
    low = 0  # Starting index
    high = len(arr) - 1  # Ending index

    # Loop until low <= high and
    # target is between arr[low] and arr[high]
    while low <= high and arr[low] <= target <= arr[high]:

        if low == high:
            # If the target is at the low index
            if arr[low] == target:
                return low
            
            # Element not found
            return -1

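        # If all remaining values are equal, arr[low] == target here;
        # return early to avoid dividing by zero below
        if arr[high] == arr[low]:
            return low

        # Estimate the position of the target element using the interpolation formula
        pos = low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])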

        if arr[pos] == target:
            return pos
        
        elif arr[pos] < target:
            # Search in the right portion of the array
            low = pos + 1
        else:
            # Search in the left portion of the array
            high = pos - 1

    # If the search key is not found in the array, return -1
    return -1

# Example usage:
sorted_list = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]
print("List =", sorted_list)
# Target element
target_element = 20
print("Target Element =", target_element)

# Calling the interpolation_search function
result = interpolation_search(sorted_list, target_element)

# Printing the result
if result != -1:
    print("Element found at index:", result)
else:
    print("Element not found")

Output

List = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]
Target Element = 20
Element found at index: 7

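Explanation

We start with the array [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]. The gaps between consecutive elements are all 2 or 3, so the data can be treated as uniformly distributed.

Let us find the target element 20 in this array.

We call interpolation_search with the required arguments and store the returned index in result.

Target element = 20

First iteration

low = 0, arr[low] = 2

high = 9, arr[high] = 25

Here, low <= high, and the target 20 lies between arr[low] = 2 and arr[high] = 25.

So both conditions of the while loop are satisfied.

Next, we check whether low equals high.

if low == high:
    # If the target is at the low index
    if arr[low] == target:
        return low

    # Element not found
    return -1

If it does, we check whether arr[low] equals the target and, if so, return the index low.

Otherwise, we return -1 to indicate that the target element was not found.

In this iteration low = 0 and high = 9, so this check is skipped and we compute the probe position with the interpolation formula.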

pos = low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])
    = 0 + ((20 - 2) * (9 - 0)) // (25 - 2)
    = 0 + (18 * 9) // 23
    = 162 // 23
    = 7

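We check whether the element at position pos equals the target.

if arr[pos] == target:
    return pos

Here, arr[pos] = arr[7] = 20 equals the target, so we return pos = 7.

result = 7

Finally, we print the result to the console.

if result != -1:
    print("Element found at index:", result)
else:
    print("Element not found")

Output: Element found at index: 7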
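The walkthrough above finishes in a single probe. To see searches that take more than one iteration, or that fail, the following sketch records every probed position. The function name interpolation_probes and its tuple return value are assumptions for illustration, and it returns early when all remaining values are equal to avoid dividing by zero.

```python
def interpolation_probes(arr, target):
    """Return (index, probes): the result index and every probed position."""
    low, high = 0, len(arr) - 1
    probes = []
    while low <= high and arr[low] <= target <= arr[high]:
        if arr[low] == arr[high]:
            # All remaining values are equal, so arr[low] == target here.
            return low, probes
        pos = low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])
        probes.append(pos)
        if arr[pos] == target:
            return pos, probes
        if arr[pos] < target:
            low = pos + 1   # continue in the right portion
        else:
            high = pos - 1  # continue in the left portion
    return -1, probes


data = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]
print(interpolation_probes(data, 20))  # (7, [7])
print(interpolation_probes(data, 7))   # (2, [1, 2])
print(interpolation_probes(data, 10))  # (-1, [3])
```

For the target 7, the first estimate lands on index 1 (value 4), so low moves to 2 and the second estimate hits the target. For the missing value 10, the loop stops as soon as 10 falls outside the remaining range.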

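Time Complexity Analysis

Average case: O(log log n), when the data is uniformly distributed.

Worst case: O(n), when the data is heavily skewed (for example, values that grow exponentially). In that case interpolation search is slower than binary search, which is always O(log n).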
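The gap between the two cases can be observed by counting probes. The sketch below is an illustration under assumptions, not a benchmark: the counting helpers and both test arrays are made up for this example, and the skewed array is deliberately constructed so that every estimate collapses to the low index.

```python
def count_interpolation_probes(arr, target):
    """Count how many positions interpolation search inspects."""
    low, high = 0, len(arr) - 1
    probes = 0
    while low <= high and arr[low] <= target <= arr[high]:
        if arr[low] == arr[high]:
            return probes + 1  # one final comparison, no division by zero
        pos = low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])
        probes += 1
        if arr[pos] == target:
            return probes
        if arr[pos] < target:
            low = pos + 1
        else:
            high = pos - 1
    return probes


def count_binary_probes(arr, target):
    """Count how many positions binary search inspects."""
    low, high = 0, len(arr) - 1
    probes = 0
    while low <= high:
        mid = (low + high) // 2
        probes += 1
        if arr[mid] == target:
            return probes
        if arr[mid] < target:
            low = mid + 1
        else:
            high = mid - 1
    return probes


uniform = list(range(0, 3000, 3))       # 1000 evenly spaced values
skewed = list(range(999)) + [10 ** 9]   # 999 small values plus one huge outlier

print(count_interpolation_probes(uniform, 2394))  # 1
print(count_binary_probes(uniform, 2394))
print(count_interpolation_probes(skewed, 998))    # 999
print(count_binary_probes(skewed, 998))
```

On the evenly spaced array the first estimate is exact. On the skewed array the outlier makes the denominator so large that each estimate equals low, so the search advances one element at a time, while binary search needs at most 10 probes on either 1000-element array.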

C++ Implementation

// Interpolation Search Algorithm
#include <iostream>
using namespace std;

// Search Function
int interpolation_search(int arr[], int n, int target)
{
    int low = 0; // Left Pointer
    int high = n - 1; // Right Pointer

    // If there is only one element
    if (low == high)
    {
        // If it is the target element
        if (arr[low] == target)
        {
            return low;
        }
        // Target not found
        return -1;
    }
    
    
    int pos = -1;

    // Loop until low <= high and
    // target is between arr[low] and arr[high]
    while (low <= high && arr[low] <= target && target <= arr[high])
    {
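        // All remaining values are equal, so arr[low] == target here;
        // return early to avoid dividing by zero below
        if (arr[high] == arr[low]){
            return low;
        }

        // Estimate the position of the target element using the interpolation formula
        // (note: the product can overflow int when the values are very large)
        pos = low + ((target - arr[low]) * (high - low)) / (arr[high] - arr[low]);

        if (arr[pos] == target){
            return pos;
        }

        else if (arr[pos] < target){
            // Search in the right portion of the array
            low = pos + 1;
        }

        else{
            // Search in the left portion of the array
            high = pos - 1;
        }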
    }
    // Element not found
    return -1;
}

// Driver Function
int main()
{
    // An array
    int arr[10] = {2, 4, 7, 9, 12, 15, 18, 20, 23, 25};
    int n = 10;
    // Print the array
    cout << "Array = [ ";
    for (int i = 0; i < n; i++)
    {
        cout << arr[i] << ", ";
    }
    cout << "]" << endl;
    
    // Target element
    int target_element = 7;
    cout << "Target Element = " << target_element << endl;
    
    // Calling the interpolation_search function 
    int result = interpolation_search(arr, n, target_element);

    // Printing the result.
    if (result != -1){
        cout << "Element found at index = " << result;
    }
    else{
        cout << "Element not found.";
    }
    return 0;
}

Output

Array = [ 2, 4, 7, 9, 12, 15, 18, 20, 23, 25, ]
Target Element = 7
Element found at index = 2

C Implementation

// Interpolation Search Algorithm
#include <stdio.h>

// Search Function
int interpolation_search(int arr[], int n, int target)
{
    int low = 0; // Left Pointer
    int high = n - 1; // Right Pointer

    // If there is only one element
    if (low == high)
    {
        // If it is the target element
        if (arr[low] == target)
        {
            return low;
        }
        // Target not found
        return -1;
    }
    
    
    int pos = -1;

    // Loop until low <= high and
    // target is between arr[low] and arr[high]
    while (low <= high && arr[low] <= target && target <= arr[high])
    {
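        // All remaining values are equal, so arr[low] == target here;
        // return early to avoid dividing by zero below
        if (arr[high] == arr[low]){
            return low;
        }

        // Estimate the position of the target element using the interpolation formula
        // (note: the product can overflow int when the values are very large)
        pos = low + ((target - arr[low]) * (high - low)) / (arr[high] - arr[low]);

        if (arr[pos] == target){
            return pos;
        }

        else if (arr[pos] < target){
            // Search in the right portion of the array
            low = pos + 1;
        }

        else{
            // Search in the left portion of the array
            high = pos - 1;
        }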
    }
    // Element not found
    return -1;
}

// Driver Function
int main()
{
    // An array
    int arr[10] = {2, 4, 7, 9, 12, 15, 18, 20, 23, 25};
    int n = 10;
    // Print the array
    printf("Array = [ ");
    for (int i = 0; i < n; i++)
    {
        printf("%d, ", arr[i]);
    }
    printf("]\n");
    
    // Target element
    int target_element = 18;
    printf("Target Element = %d\n", target_element);
    
    // Calling the interpolation_search function 
    int result = interpolation_search(arr, n, target_element);

    // Printing the result.
    if (result != -1){
       printf("Element found at index = %d", result);
    }
    else{
        printf("Element not found.");
    }
    return 0;
}

Output

Array = [ 2, 4, 7, 9, 12, 15, 18, 20, 23, 25, ]
Target Element = 18
Element found at index = 6

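Advantages of Interpolation Search

Faster search - It narrows the search space according to the distribution of the data, so values can be located in fewer steps.
Better than binary search on suitable data - On large, uniformly distributed datasets it needs fewer comparisons than binary search, which makes it a time-efficient choice.
Efficient on uniformly distributed datasets - The whole idea of the algorithm rests on the data distribution. When the dataset is uniformly distributed, the average time complexity drops to O(log log n).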

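Limitations of Interpolation Search

The main drawback is that it depends on a uniformly distributed dataset. On non-uniform data its performance degrades, and in the worst case it takes O(n) time, which is no better than linear search and worse than binary search.

In addition, the formula can produce a position outside the valid range if it is applied when the target lies outside the values at the low and high indices, which is why the implementations above check arr[low] <= target <= arr[high] before computing the probe. In languages with fixed-width integers, very large values can also overflow the multiplication in the formula.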
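The out-of-range behavior is easy to reproduce by evaluating the formula without the range check. This is a sketch only; unguarded_probe is a hypothetical helper that deliberately omits the guard used in the implementations above.

```python
def unguarded_probe(arr, low, high, target):
    """Apply the interpolation formula with no check on the target's range."""
    return low + ((target - arr[low]) * (high - low)) // (arr[high] - arr[low])


data = [2, 4, 7, 9, 12, 15, 18, 20, 23, 25]  # valid indices are 0..9

print(unguarded_probe(data, 0, len(data) - 1, 100))  # 38
print(unguarded_probe(data, 0, len(data) - 1, -50))  # -21
```

Both results fall outside the valid indices 0 to 9, so indexing data with them would raise an IndexError in Python and read out of bounds in C or C++.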

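Another drawback is that each probe requires extra arithmetic (a multiplication and a division), which makes each step more expensive and the algorithm somewhat more complex than binary search.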

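Practical Applications of Interpolation Search

Databases - It can be used to search sorted data in databases, shortening retrieval time.
Scientific data analysis - It can be applied to large, uniformly distributed datasets in scientific data analysis.
Time-sensitive applications - It can be used in time-sensitive applications where fast retrieval is a key factor.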

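Conclusion

Interpolation search is a fast search algorithm that, on sorted and uniformly distributed data, offers a more efficient alternative to linear search and binary search. It uses an interpolation formula to estimate the position of the target element and narrow the search space.

Although it has limitations on non-uniform datasets, it remains a valuable tool in applications where search speed and efficiency are critical.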
