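Data Structures and Algorithms - Binary Search

I. Binary Search

Binary search, also called half-interval search, is a highly efficient algorithm for finding a value in a sorted array.

Time complexity

Worst case: O(log n)
Best case: O(1), when the target happens to sit at the middle of the array and the loop runs only once

Space complexity

Recursive: O(log n); iterative: O(1)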
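
The best and worst cases above can be observed by counting loop iterations. The following is a minimal sketch, not part of the original notes: the class name IterationCountDemo and the method iterations are hypothetical, and the search itself is the closed-interval form used in the basic version.

```java
final class IterationCountDemo {
    // Closed-interval binary search that returns how many loop iterations
    // ran, instead of the index (hypothetical helper for illustration).
    static int iterations(int[] a, int target) {
        int i = 0, j = a.length - 1, count = 0;
        while (i <= j) {
            count++;
            int m = (i + j) >>> 1;
            if (target < a[m]) {
                j = m - 1;
            } else if (a[m] < target) {
                i = m + 1;
            } else {
                return count;          // found
            }
        }
        return count;                  // not found
    }

    public static void main(String[] args) {
        int n = 1023;
        int[] a = new int[n];
        for (int k = 0; k < n; k++) {
            a[k] = 2 * k;              // sorted even numbers 0, 2, 4, ...
        }
        System.out.println(iterations(a, a[n / 2])); // 1: target is the first midpoint
        System.out.println(iterations(a, a[0]));     // 10 = floor(log2(1023)) + 1
        System.out.println(iterations(a, -1));       // 10: absent target, same path
    }
}
```

With 1023 elements the interval shrinks 1023, 511, 255, ..., 1, so no search needs more than 10 iterations.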

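1. Basic version

Requirement: in a sorted array a, find the value target

If it is found, return its index
If it is not found, return -1

Algorithm description

Precondition	Given a sorted array A of n elements with A_0 ≤ A_1 ≤ A_2 ≤ ... ≤ A_{n-1}, and a value target to search for
1	Set i = 0, j = n - 1
2	If i > j, stop: the search has failed
3	Set m = floor((i + j) / 2); m is the middle index and floor rounds down
4	If target < A_m, set j = m - 1 and go to step 2
5	If target > A_m, set i = m + 1 and go to step 2
6	If A_m = target, stop: the target is found at index m

Java implementation: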

public static int binarySearch(int[] a, int target) {
    int i = 0, j = a.length - 1;
    while (i <= j) {
        int m = (i + j) >>> 1;
        if (target < a[m]) {			// target is in the left half
            j = m - 1;
        } else if (a[m] < target) {		// target is in the right half
            i = m + 1;
        } else {
            return m;
        }
    }
    return -1;
}

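i and j delimit the search interval [0, a.length - 1] (note that it is closed at both ends). i <= j means the interval still contains elements that have not been compared, and the elements at i and j are themselves candidates for comparison.
- Question: would it work to use i < j, that is, to leave out the i == j case?
- Answer: no, because the element that i and j both point to would then be skipped without being compared.
m is the middle position. The number of elements to its left and to its right may differ by one, which does not affect the result.
When a comparison does not find the target, the narrowed interval excludes m.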
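
The implementation above computes the midpoint as (i + j) >>> 1 rather than (i + j) / 2. One likely reason is that i + j can overflow int for very large indices; the unsigned shift still yields the intended midpoint, while signed division does not. The sketch below illustrates this under that assumption; the class and method names are hypothetical.

```java
final class MidpointOverflowDemo {
    // Signed division: wrong once i + j overflows into a negative int.
    static int midSigned(int i, int j) {
        return (i + j) / 2;
    }

    // Unsigned shift: treats the overflowed sum as an unsigned 32-bit value.
    static int midUnsigned(int i, int j) {
        return (i + j) >>> 1;
    }

    public static void main(String[] args) {
        int i = Integer.MAX_VALUE - 1;
        int j = Integer.MAX_VALUE;
        System.out.println(midSigned(i, j));   // -1, not a valid index
        System.out.println(midUnsigned(i, j)); // 2147483646
        System.out.println(midSigned(3, 8) == midUnsigned(3, 8)); // true for small indices
    }
}
```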

2. Alternative version

public static int binarySearch(int[] a, int target) {
    int i = 0, j = a.length;
    while (i < j) {
        int m = (i + j) >>> 1;
        if (target < a[m]) {			// target is in the left half
            j = m;
        } else if (a[m] < target) {		// target is in the right half
            i = m + 1;
        } else {
            return m;
        }
    }
    return -1;
}

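i and j delimit the search interval [0, a.length) (note that it is closed on the left and open on the right). i < j means the interval still contains elements that have not been compared; the position at j is never the target.
- Question: why is the i == j case left out of the loop condition this time?
- Answer: j no longer points at a candidate. If i == j were allowed, the position at j would be compared again, and when the target is absent the loop would never terminate, because j = m stops shrinking the interval.
When the right boundary has to shrink, set j = m, because a[m] has already been ruled out as the target.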
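
The non-termination described in the answer above can be reproduced safely with a step cap. This is a sketch for illustration only: the names LoopConditionDemo, searchWrongCondition, STUCK and maxSteps are hypothetical, and the cap exists solely so the demo ends.

```java
final class LoopConditionDemo {
    static final int STUCK = Integer.MIN_VALUE; // marker: step cap was reached

    // Half-open variant with the WRONG condition i <= j.
    // Note: for a target larger than every element this version would
    // instead read a[a.length] and throw ArrayIndexOutOfBoundsException.
    static int searchWrongCondition(int[] a, int target, int maxSteps) {
        int i = 0, j = a.length, steps = 0;
        while (i <= j) {
            if (++steps > maxSteps) {
                return STUCK;
            }
            int m = (i + j) >>> 1;
            if (target < a[m]) {
                j = m;                 // no progress once i == j == m
            } else if (a[m] < target) {
                i = m + 1;
            } else {
                return m;
            }
        }
        return -1;
    }

    // Same search with the correct condition i < j.
    static int searchHalfOpen(int[] a, int target) {
        int i = 0, j = a.length;
        while (i < j) {
            int m = (i + j) >>> 1;
            if (target < a[m]) {
                j = m;
            } else if (a[m] < target) {
                i = m + 1;
            } else {
                return m;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5};
        System.out.println(searchWrongCondition(a, 2, 100) == STUCK); // true
        System.out.println(searchHalfOpen(a, 2));                     // -1
    }
}
```

For a = {1, 3, 5} and target 2, the wrong condition reaches i == j == 1 and then repeats j = m = 1 forever.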

3. Balanced version

public static int binarySearchBalance(int[] a, int target) {
    int i = 0, j = a.length;
    while (1 < j - i) {
        int m = (i + j) >>> 1;
        if (target < a[m]) {
            j = m;
        } else {
            i = m;
        }
    }
    // i < j guards the empty array, where reading a[i] would be out of bounds
    return (i < j && a[i] == target) ? i : -1;
}

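Idea:

The interval is closed on the left and open on the right: the element at i may be the target, while the position at j is not.
The loop does not try to find the target through m. It only narrows the interval until one element remains, and that element (at index i) may be the one being sought.
- j - i > 1 means that more than one element in the range is still waiting to be compared.
When the i boundary moves, a[m] may still be the target, so i is set to m rather than m + 1.
The average number of comparisons per loop iteration is reduced.
Time complexity is Θ(log n): the loop never exits early, so the best and worst cases cost the same.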
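
To make the claim about comparisons concrete, the sketch below instruments both versions with a counter of element comparisons. It is an illustration under stated assumptions, not the author's code: the class name ComparisonCountDemo and the static field comparisons are hypothetical, and the balanced method includes an i < j guard for empty arrays.

```java
final class ComparisonCountDemo {
    static int comparisons; // element comparisons made by the most recent call

    // Basic version: going left costs 1 comparison, going right or finding costs 2.
    static int basic(int[] a, int target) {
        comparisons = 0;
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            comparisons++;
            if (target < a[m]) {
                j = m - 1;
                continue;
            }
            comparisons++;
            if (a[m] < target) {
                i = m + 1;
                continue;
            }
            return m;
        }
        return -1;
    }

    // Balanced version: exactly 1 comparison per iteration, plus 1 at the end.
    static int balanced(int[] a, int target) {
        comparisons = 0;
        int i = 0, j = a.length;
        while (1 < j - i) {
            int m = (i + j) >>> 1;
            comparisons++;
            if (target < a[m]) {
                j = m;
            } else {
                i = m;
            }
        }
        if (i < j) {
            comparisons++;
            if (a[i] == target) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        int[] a = new int[1024];
        for (int k = 0; k < a.length; k++) {
            a[k] = k;
        }
        basic(a, 1023);
        System.out.println("basic:    " + comparisons); // 22 (always goes right)
        balanced(a, 1023);
        System.out.println("balanced: " + comparisons); // 11
    }
}
```

The trade-off is that the balanced version can never stop early, even when the very first midpoint is the target.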

4. Java version - basic

private static int binarySearch0(long[] a, int fromIndex, int toIndex,
                                     long key) {
    int low = fromIndex;
    int high = toIndex - 1;

    while (low <= high) {
        int mid = (low + high) >>> 1;
        long midVal = a[mid];

        if (midVal < key)
            low = mid + 1;
        else if (midVal > key)
            high = mid - 1;
        else
            return mid; // key found
    }
    return -(low + 1);  // key not found.
}

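For example, to insert 2 into [1,3,5,6], find the position whose left-hand elements are all smaller than the value being inserted.
- When the loop ends without finding the key, every element to the left of low is smaller than the key, so low is the insertion point.
The insertion point is negated to distinguish this result from a successful search.
The extra -1 distinguishes an insertion point at index 0 from the key being found at index 0.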
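
The encoded return value can be decoded back into an insertion point with -(result + 1). The sketch below does this through the public java.util.Arrays.binarySearch(int[], int) method and then inserts the key with System.arraycopy; the class name InsertionPointDemo and the helper insertIfAbsent are hypothetical.

```java
import java.util.Arrays;

final class InsertionPointDemo {
    // Returns a sorted array containing key: the input itself if key is
    // already present, otherwise a new array with key inserted.
    static int[] insertIfAbsent(int[] sorted, int key) {
        int r = Arrays.binarySearch(sorted, key);
        if (r >= 0) {
            return sorted;                      // key found at index r
        }
        int insertAt = -(r + 1);                // undo the -(low + 1) encoding
        int[] out = new int[sorted.length + 1];
        System.arraycopy(sorted, 0, out, 0, insertAt);
        out[insertAt] = key;
        System.arraycopy(sorted, insertAt, out, insertAt + 1,
                sorted.length - insertAt);
        return out;
    }

    public static void main(String[] args) {
        int[] a = {1, 3, 5, 6};
        System.out.println(Arrays.binarySearch(a, 2));             // -2
        System.out.println(Arrays.toString(insertIfAbsent(a, 2))); // [1, 2, 3, 5, 6]
        System.out.println(Arrays.toString(insertIfAbsent(a, 0))); // [0, 1, 3, 5, 6]
    }
}
```

Searching for 0 returns -1, which decodes to insertion point 0; without the extra -1 this case would be indistinguishable from finding the key at index 0.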

5. Leftmost

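Sometimes we want the leftmost of several duplicate elements. With the basic binary search:

For the array [1, 2, 3, 4, 4, 5, 6, 7], searching for 4 returns index 3
For the array [1, 2, 4, 4, 4, 5, 6, 7], searching for 4 also returns index 3, which is not the leftmost occurrence (index 2)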

public static int binarySearchLeftmost1(int[] a, int target) {
    int i = 0, j = a.length - 1;
    int candidate = -1;
    while (i <= j) {
        int m = (i + j) >>> 1;
        if (target < a[m]) {
            j = m - 1;
        } else if (a[m] < target) {
            i = m + 1;
        } else {
            candidate = m; // record the candidate position
            j = m - 1;     // keep searching to the left
        }
    }
    return candidate;
}

6. Rightmost

To return the rightmost occurrence instead:

public static int binarySearchRightmost1(int[] a, int target) {
    int i = 0, j = a.length - 1;
    int candidate = -1;
    while (i <= j) {
        int m = (i + j) >>> 1;
        if (target < a[m]) {
            j = m - 1;
        } else if (a[m] < target) {
            i = m + 1;
        } else {
            candidate = m; // record the candidate position
            i = m + 1;	   // keep searching to the right
        }
    }
    return candidate;
}

7. Applications

Leftmost and Rightmost can return a value more useful than -1

Leftmost becomes

public static int binarySearchLeftmost(int[] a, int target) {
    int i = 0, j = a.length - 1;
    while (i <= j) {
        int m = (i + j) >>> 1;
        if (target <= a[m]) {
            j = m - 1;
        } else {
            i = m + 1;
        }
    }
    return i; 
}

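Another reading of the Leftmost return value: the number of elements smaller than target, which is also the leftmost index whose element is ≥ target
Whenever target is less than or equal to the middle value, search to the left

Rightmost becomes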

public static int binarySearchRightmost(int[] a, int target) {
    int i = 0, j = a.length - 1;
    while (i <= j) {
        int m = (i + j) >>> 1;
        if (target < a[m]) {
            j = m - 1;
        } else {
            i = m + 1;
        }
    }
    return i - 1;
}

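Another reading of the Rightmost return value: the rightmost index whose element is ≤ target
Whenever target is greater than or equal to the middle value, search to the right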
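
One consequence of these two return values is that the number of occurrences of target is rightmost - leftmost + 1, which comes out as 0 when target is absent. The sketch below copies the two methods above into a hypothetical class OccurrenceCountDemo to show this.

```java
final class OccurrenceCountDemo {
    // Number of elements smaller than target.
    static int leftmost(int[] a, int target) {
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            if (target <= a[m]) {
                j = m - 1;
            } else {
                i = m + 1;
            }
        }
        return i;
    }

    // Rightmost index whose element is <= target, or -1 if there is none.
    static int rightmost(int[] a, int target) {
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            if (target < a[m]) {
                j = m - 1;
            } else {
                i = m + 1;
            }
        }
        return i - 1;
    }

    // (count of elements <= target) - (count of elements < target)
    static int count(int[] a, int target) {
        return rightmost(a, target) - leftmost(a, target) + 1;
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 4, 4, 4, 5, 6, 7};
        System.out.println(count(a, 4)); // 3
        System.out.println(count(a, 3)); // 0
    }
}
```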

8. Some terminology

Range queries

Query x < 4: 0 ... leftmost(4) - 1
Query x ≤ 4: 0 ... rightmost(4)
Query 4 < x: rightmost(4) + 1 ... ∞
Query 4 ≤ x: leftmost(4) ... ∞
Query 4 ≤ x ≤ 7: leftmost(4) ... rightmost(7)
Query 4 < x < 7: rightmost(4) + 1 ... leftmost(7) - 1
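
The index ranges above translate directly into array slices. The following sketch is one possible rendering, assuming the leftmost and rightmost definitions from section 7 and an example array {1, 2, 4, 4, 4, 7} chosen for illustration; the class name RangeQueryDemo and the methods closedRange and openRange are hypothetical.

```java
import java.util.Arrays;

final class RangeQueryDemo {
    static int leftmost(int[] a, int target) {
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            if (target <= a[m]) {
                j = m - 1;
            } else {
                i = m + 1;
            }
        }
        return i;
    }

    static int rightmost(int[] a, int target) {
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            if (target < a[m]) {
                j = m - 1;
            } else {
                i = m + 1;
            }
        }
        return i - 1;
    }

    // Elements x with lo <= x <= hi (assumes lo <= hi).
    static int[] closedRange(int[] a, int lo, int hi) {
        int from = leftmost(a, lo);
        int to = rightmost(a, hi) + 1;          // copyOfRange's end is exclusive
        return Arrays.copyOfRange(a, from, to);
    }

    // Elements x with lo < x < hi; an empty result if nothing qualifies.
    static int[] openRange(int[] a, int lo, int hi) {
        int from = rightmost(a, lo) + 1;
        int to = Math.max(from, leftmost(a, hi)); // guard against from > to
        return Arrays.copyOfRange(a, from, to);
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 4, 4, 4, 7};
        System.out.println(Arrays.toString(closedRange(a, 4, 7))); // [4, 4, 4, 7]
        System.out.println(Arrays.toString(openRange(a, 4, 7)));   // []
        System.out.println(Arrays.toString(openRange(a, 1, 7)));   // [2, 4, 4, 4]
    }
}
```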

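Rank: leftmost(target) + 1

(The examples below are consistent with a sorted array such as [1, 2, 4, 4, 4, 7].)

target need not be present, e.g. leftmost(5) + 1 = 6
target may also be present, e.g. leftmost(4) + 1 = 3

Predecessor: leftmost(target) - 1

leftmost(3) - 1 = 1, so the predecessor is a[1] = 2
leftmost(4) - 1 = 1, so the predecessor is a[1] = 2

Successor: rightmost(target) + 1

rightmost(5) + 1 = 5, so the successor is a[5] = 7
rightmost(4) + 1 = 5, so the successor is a[5] = 7

Nearest neighbor

Whichever of the predecessor and the successor is closer to target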
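
These four terms can be sketched as small helpers on top of leftmost and rightmost. The code below is an illustration only: the class NeighborDemo, its method names, the example array {1, 2, 4, 4, 4, 7}, and the choice to break distance ties in favor of the predecessor are all assumptions, not part of the original notes.

```java
final class NeighborDemo {
    static int leftmost(int[] a, int target) {
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            if (target <= a[m]) {
                j = m - 1;
            } else {
                i = m + 1;
            }
        }
        return i;
    }

    static int rightmost(int[] a, int target) {
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            if (target < a[m]) {
                j = m - 1;
            } else {
                i = m + 1;
            }
        }
        return i - 1;
    }

    static int rank(int[] a, int target) {
        return leftmost(a, target) + 1;
    }

    // Index of the largest element < target, or -1 if there is none.
    static int predecessorIndex(int[] a, int target) {
        return leftmost(a, target) - 1;
    }

    // Index of the smallest element > target, or a.length if there is none.
    static int successorIndex(int[] a, int target) {
        return rightmost(a, target) + 1;
    }

    // Index of the closer of predecessor and successor, or -1 if neither exists.
    static int nearestIndex(int[] a, int target) {
        int p = predecessorIndex(a, target);
        int s = successorIndex(a, target);
        boolean hasP = p >= 0;
        boolean hasS = s < a.length;
        if (!hasP && !hasS) {
            return -1;
        }
        if (!hasP) {
            return s;
        }
        if (!hasS) {
            return p;
        }
        long distP = (long) target - a[p];   // long avoids int overflow
        long distS = (long) a[s] - target;
        return distP <= distS ? p : s;       // tie goes to the predecessor
    }

    public static void main(String[] args) {
        int[] a = {1, 2, 4, 4, 4, 7};
        System.out.println(rank(a, 5));             // 6
        System.out.println(predecessorIndex(a, 4)); // 1
        System.out.println(successorIndex(a, 4));   // 5
        System.out.println(nearestIndex(a, 4));     // 1 (2 is closer to 4 than 7 is)
    }
}
```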

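9. Exercises

9.1 Binary search

Given a sorted (ascending) integer array nums of n elements and a target value target, write a function that searches for target in nums. Return its index if it exists; otherwise return -1.

Example 1:

Input: nums = [-1,0,3,5,9,12], target = 9
Output: 4
Explanation: 9 appears in nums at index 4

Example 2:

Input: nums = [-1,0,3,5,9,12], target = 2
Output: -1
Explanation: 2 does not appear in nums, so the result is -1

Constraints:

You may assume that all elements in nums are distinct.
n is in the range [1, 10000].
Every element of nums is in the range [-9999, 9999].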

class Solution {
    public int search(int[] nums, int target) {
        int i = 0, j = nums.length;
        while(i < j) {
            int m = (i + j) >>> 1;
            if(nums[m] < target) {
                i = m + 1;
            } else if(target < nums[m]) {
                j = m;
            } else {
                return m;
            }
        }

        return -1;
    }
}

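9.2 Search insert position

Given a sorted array and a target value, find the target in the array and return its index. If the target is not in the array, return the index at which it would be inserted to keep the array sorted.

You must use an algorithm with O(log n) time complexity.

Example 1:

Input: nums = [1,3,5,6], target = 5
Output: 2

Example 2:

Input: nums = [1,3,5,6], target = 2
Output: 1

Example 3:

Input: nums = [1,3,5,6], target = 7
Output: 4

Constraints:

1 <= nums.length <= 10^4
-10^4 <= nums[i] <= 10^4
nums is sorted in ascending order and contains no duplicate elements
-10^4 <= target <= 10^4

Solution 1: adapt the basic binary search (written here on a half-open interval, as in the alternative version). If the target is found, return m; if it is not, i is the insertion point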

class Solution {
    public int searchInsert(int[] nums, int target) {
        int i = 0, j = nums.length;
        while(i < j) {
            int m = (i + j) >>> 1;
            if(target < nums[m]) {
                j = m;
            } else if(nums[m] < target) {
                i = m + 1;
            } else {
                return m;
            }
        }

        return i;
    }
}

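Solution 2: adapt the balanced binary search. In the balanced version:

If target == a[i], return i to indicate the target was found
If target < a[i], for example target = 2 and a[i] = 3, then 2 should be inserted at position i
If a[i] < target, for example a[i] = 3 and target = 4, then 4 should be inserted at position i + 1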

class Solution {
    public int searchInsert(int[] nums, int target) {
        int i = 0, j = nums.length;
        while(1 < j - i) {
            int m = (i + j) >>> 1;
            if(target < nums[m]) {
                j = m;
            } else {
                i = m;
            }
        }
        return (target <= nums[i]) ? i : i + 1;
    }
}

Solution 3: use the leftmost version, whose return value is exactly the insertion position (and which also handles duplicate elements)

class Solution {
    public int searchInsert(int[] nums, int target) {
        int i = 0, j = nums.length;
        while(i < j) {
            int m = (i + j) >>> 1;
            if(target <= nums[m]) {
                j = m;
            } else {
                i = m + 1;
            }
        }
        return i;
    }
}
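
A quick way to gain confidence in a solution like this is to compare it against a linear scan, since the insertion position is simply the number of elements smaller than target. The harness below is a sketch; the class name SearchInsertCheck and the method linear are hypothetical, and searchInsert repeats Solution 3 as a static method.

```java
final class SearchInsertCheck {
    // Solution 3: leftmost search on a half-open interval.
    static int searchInsert(int[] nums, int target) {
        int i = 0, j = nums.length;
        while (i < j) {
            int m = (i + j) >>> 1;
            if (target <= nums[m]) {
                j = m;
            } else {
                i = m + 1;
            }
        }
        return i;
    }

    // Reference answer: count the elements smaller than target.
    static int linear(int[] nums, int target) {
        int count = 0;
        for (int x : nums) {
            if (x < target) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        int[] nums = {1, 3, 5, 6};
        for (int t = 0; t <= 7; t++) {
            System.out.println(t + " -> " + searchInsert(nums, t)
                    + " (linear: " + linear(nums, t) + ")");
        }
        // With duplicates, the leftmost matching position is returned.
        System.out.println(searchInsert(new int[] {1, 3, 3, 3, 5}, 3)); // 1
    }
}
```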

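9.3 Find the first and last position of an element in a sorted array

Given an integer array nums sorted in non-decreasing order and a target value target, find the starting and ending positions of target in the array.

If target does not appear in the array, return [-1, -1].

You must design and implement an algorithm with O(log n) time complexity.

Example 1:

Input: nums = [5,7,7,8,8,10], target = 8
Output: [3,4]

Example 2:

Input: nums = [5,7,7,8,8,10], target = 6
Output: [-1,-1]

Example 3:

Input: nums = [], target = 0
Output: [-1,-1]

Constraints:

0 <= nums.length <= 10^5
-10^9 <= nums[i] <= 10^9
nums is a non-decreasing array
-10^9 <= target <= 10^9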

class Solution {
    public static int left(int[] a, int target) {
        int i = 0, j = a.length;
        int candidate = -1;
        while(i < j) {
            int m = (i + j) >>> 1;
            if(target < a[m]) {
                j = m;
            } else if(a[m] < target) {
                i = m + 1;
            } else {
                candidate = m;
                j = m;
            }
        }
        return candidate;
    }

    public static int right(int[] a, int target) {
        int i = 0, j = a.length;
        int candidate = -1;
        while(i < j) {
            int m = (i + j) >>> 1;
            if(target < a[m]) {
                j = m;
            } else if(a[m] < target) {
                i = m + 1;
            } else {
                candidate = m;
                i = m + 1;
            }
        }
        return candidate;
    }

    public int[] searchRange(int[] nums, int target) {
        int x = left(nums, target);
        if(x == -1) {
            return new int[] {-1, -1};
        } else {
            return new int[] {x, right(nums, target)};
        }
    }
}
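
As an alternative to the candidate-tracking helpers above, the same problem can be sketched with the leftmost and rightmost forms from section 7, which return positions rather than -1. This is one possible variant, not the solution given in the notes; the class name SearchRangeAlt is hypothetical.

```java
final class SearchRangeAlt {
    // Number of elements smaller than target.
    static int leftmost(int[] a, int target) {
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            if (target <= a[m]) {
                j = m - 1;
            } else {
                i = m + 1;
            }
        }
        return i;
    }

    // Rightmost index whose element is <= target, or -1 if there is none.
    static int rightmost(int[] a, int target) {
        int i = 0, j = a.length - 1;
        while (i <= j) {
            int m = (i + j) >>> 1;
            if (target < a[m]) {
                j = m - 1;
            } else {
                i = m + 1;
            }
        }
        return i - 1;
    }

    static int[] searchRange(int[] nums, int target) {
        int left = leftmost(nums, target);
        // target is present only if the slot at 'left' exists and holds it
        if (left == nums.length || nums[left] != target) {
            return new int[] {-1, -1};
        }
        return new int[] {left, rightmost(nums, target)};
    }

    public static void main(String[] args) {
        int[] r = searchRange(new int[] {5, 7, 7, 8, 8, 10}, 8);
        System.out.println(r[0] + "," + r[1]); // 3,4
    }
}
```

The explicit presence check replaces the candidate variable: leftmost always returns a valid insertion position, so the caller decides whether that position actually holds target.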