Binary Search on Unsorted Array - 1point3acres


https://docs.google.com/document/d/1qxA2wps0IhVRWULulQ55W4SGPMu2AE5MkBB37h8Dr58/

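Elements of a random array that binary search cannot find

Question 4, from an Indian interviewer: given an unsorted random array, run binary search for each element of the array and return how many elements binary search can never find. The required complexity is O(n), and I did not know how to do it. The interviewer suggested trying some examples to look for a pattern. I tried and found none... In the end I gave an O(n log n) solution, and I suspect this is where I failed.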

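Approach:
We can reuse the idea behind checking whether a tree is a BST: maintain a value range R = (lo, hi), meaning that within the current index range [i, j] only numbers inside R can possibly be found by binary search. Compute the midpoint mid of [i, j]. If A[mid] lies in the current R, then A[mid] is guaranteed to be found. Then recurse on the two sides of mid, [i, mid-1] and [mid+1, j]. For the left side the allowed value range is (lo, min(hi, A[mid])); likewise for the right side it is (max(lo, A[mid]), hi). The min and max matter when A[mid] itself lies outside R, because the range must never widen. If at any point R no longer contains any integer, no number in the index range [i, j] can be found.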

code
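   private int find(int[] nums) {
    if (nums == null || nums.length == 0) return 0;
    // long bounds, so elements equal to Integer.MIN_VALUE / Integer.MAX_VALUE are handled correctly
    return dfsUtil(nums, 0, nums.length - 1, Long.MIN_VALUE, Long.MAX_VALUE);
   }
   // Counts the elements of nums[left..right] that binary search can never find, given that
   // only targets in the open interval (lower, upper) can reach this index range.
   // Assumes the values are distinct.
   private int dfsUtil(int[] nums, int left, int right, long lower, long upper) {
    if (left > right) return 0;
    // empty value range: nothing in [left, right] can be found
    if (lower >= upper) return right - left + 1;

    int mid = left + (right - left) / 2;
    // clamp with min/max so the value range never widens when nums[mid] is out of range
    int l = dfsUtil(nums, left, mid - 1, lower, Math.min(nums[mid], upper));
    int r = dfsUtil(nums, mid + 1, right, Math.max(nums[mid], lower), upper);
    boolean found = nums[mid] > lower && nums[mid] < upper;
    return l + r + (found ? 0 : 1);
   }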

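Clarify with the interviewer whether the array may contain duplicates; if it does, things get a bit more complicated. First, you need to define whether the binary search looks for the left boundary or the right boundary; second, whether equal numbers at different positions count as the same value. Here is a Python version.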
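from collections import defaultdict

def bs_in_unsorted_arr(unsorted):
    # Value bounds; assumes every element lies in [-2**32, 2**32).
    lo, hi = -1 << 32, 1 << 32
    # Half-open index range [left, right).
    left, right = 0, len(unsorted)
    # found[v] is True if value v can be reached; equal values share one entry.
    found = defaultdict(bool)

    def bs(left, right, lo, hi):
        # Stop on an empty index range or an empty value range [lo, hi).
        if left >= right or lo >= hi:
            return
        mid = (left + right) >> 1
        if lo <= unsorted[mid] < hi:
            found[unsorted[mid]] = True
        # Narrow the value range on each side; min/max keep it from widening.
        bs(left, mid, lo, min(unsorted[mid], hi))
        bs(mid + 1, right, max(unsorted[mid], lo), hi)

    bs(left, right, lo, hi)
    return found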

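Binary search on an unsorted array: the numbers that cannot be found
Suppose you are given a sorted array. If you binary-search for each of its numbers, every one of them will be found.
If you are given the same array but shuffled, and run the same binary search for each number, some numbers will not be found.
Return which numbers cannot be found, in O(N).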

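Run binarySearch directly on this unsorted array: [ 2, 7, 9, 10, 1 ]

l=0, h=4, m=2: searching for 9 succeeds on the first probe.
Searching for 2: m=2 holds 9, so go left. Then l=0, h=1, m=0: found.
Searching for 7: m=2 holds 9, so go left. Then l=0, h=1, m=0, then l=1, h=1, m=1: found.
Searching for 10: m=2 holds 9, so go right. Then l=3, h=4, m=3: found.
Searching for 1: m=2 holds 9, so go left. Then l=0, h=1, m=0 holds 2, so go left again and the range is empty: not found.

So for this input, the number that cannot be found is [1].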
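
The walk-through above can be reproduced with a small brute-force sketch in Python. The function names `binary_search` and `unfindable_brute_force` are made up for illustration, and the search uses the common closed-range variant with `mid = (lo + hi) // 2`, which is an assumption about the binary search the interviewer had in mind. It runs in O(n log n), so it is only a reference, not the O(N) answer.

```python
def binary_search(arr, target):
    """Standard binary search; returns an index or -1. arr need not be sorted."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == target:
            return mid
        if arr[mid] < target:
            lo = mid + 1   # target can only be to the right
        else:
            hi = mid - 1   # target can only be to the left
    return -1


def unfindable_brute_force(arr):
    """Values that binary search fails to locate, listed in array order."""
    return [x for x in arr if binary_search(arr, x) == -1]


print(unfindable_brute_force([2, 7, 9, 10, 1]))  # [1]
```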


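If we know the median, then numbers in the left half that are greater than the median and numbers in the right half that are smaller than the median cannot be found. (Strictly, binary search compares against the element at the middle index, which coincides with the median only when the array is sorted.)
So how do we find the median in O(n) time? Two heaps would work, or a balanced binary tree.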

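One more example:
Given nums: [22, 92, 25, 16, 29, 15, 49, 22, 38, 61]
Can't find: [16, 49, 25, 92, 15]
Any element of the array that a standard binary search fails to find belongs to the answer. Of course, the snippet below is only test code; it does not meet the O(N) requirement of the original post.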
for (int j : nums)
  if (Arrays.binarySearch(nums, j) < 0)
    System.out.println("Can't find " + j);

With heaps or a balanced BST the time complexity is no longer O(N); to find the median, just use Quick Select.
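
For reference, here is a minimal Quick Select sketch in Python. The name `quick_select`, the random pivot and the three-way partition are illustrative choices, not something the thread specifies. It runs in expected O(n) time, but the worst case is still O(n^2).

```python
import random


def quick_select(values, k):
    """Return the k-th smallest element (0-based) of values in expected O(n)."""
    if not 0 <= k < len(values):
        raise IndexError("k out of range")
    items = list(values)  # work on a copy so the caller's list is untouched
    lo, hi = 0, len(items) - 1
    while True:
        if lo == hi:
            return items[lo]
        pivot = items[random.randint(lo, hi)]
        # Three-way partition of items[lo..hi]: < pivot | == pivot | > pivot
        lt, i, gt = lo, lo, hi
        while i <= gt:
            if items[i] < pivot:
                items[lt], items[i] = items[i], items[lt]
                lt += 1
                i += 1
            elif items[i] > pivot:
                items[i], items[gt] = items[gt], items[i]
                gt -= 1
            else:
                i += 1
        if k < lt:
            hi = lt - 1      # answer is in the "< pivot" part
        elif k > gt:
            lo = gt + 1      # answer is in the "> pivot" part
        else:
            return pivot     # k falls inside the "== pivot" block


nums = [2, 7, 9, 10, 1]
print(quick_select(nums, len(nums) // 2))  # 7
```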

Next, consider a sample array such as
[4, 3, 5, 7, 6, 8, 2, 9, 1]

Suppose we are searching for 8. The output then looks like this:
  Looking at position 4, value = 6
  Now decide whether left [0, 3] or right [5, 8]
  Go to right side!
  Looking at position 6, value = 2
  Now decide whether left [5, 5] or right [7, 8]
  Go to right side!
  Looking at position 7, value = 9
  Now decide whether left [7, 6] or right [8, 8]
  Go to left side!
  Not found
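So 8 cannot be found. The problem is that in the second step the value at position = 6 is 2, so the left range [5, 5] gets discarded.
So what is the condition for entering the range [5, 5]? The search value must be greater than the value probed in the first step at position = 4 (namely 6), so that the search goes right, and less than the value probed in the second step at position = 6 (namely 2), so that the search goes left. In other words, the search value must be greater than 6 and less than 2. That condition can never hold, so it is not just 8: whatever value is placed at position = 5 can never be found.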
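
The trace above can be reproduced with a short Python sketch that also records, for each probe, the open interval a target must lie in to get there. The name `trace_search` and the use of infinities for the initial bounds are assumptions made for illustration.

```python
def trace_search(arr, target):
    """Binary-search arr (not necessarily sorted) and record each probe.

    Returns (found, steps). Each step is (mid, value, lower, upper), where
    (lower, upper) is the open interval a target must lie in to reach mid.
    """
    lo, hi = 0, len(arr) - 1
    lower, upper = float("-inf"), float("inf")
    steps = []
    while lo <= hi:
        mid = (lo + hi) // 2
        steps.append((mid, arr[mid], lower, upper))
        if arr[mid] == target:
            return True, steps
        if arr[mid] < target:
            # Going right: target must exceed every value we went right of.
            lo, lower = mid + 1, max(lower, arr[mid])
        else:
            # Going left: target must be below every value we went left of.
            hi, upper = mid - 1, min(upper, arr[mid])
    return False, steps


found, steps = trace_search([4, 3, 5, 7, 6, 8, 2, 9, 1], 8)
for mid, value, lower, upper in steps:
    print(f"position {mid}, value = {value}, needs {lower} < target < {upper}")
print("found" if found else "not found")
```

For the target 8 this probes positions 4, 6 and 7 and never looks at position 5, which matches the trace.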

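So the overall idea is to keep, for each index range, a record of the form "the search value must be greater than X and less than Y to reach this range", and to update X and Y at each level of the recursion.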

#include <algorithm>
#include <iostream>
#include <limits>
#include <vector>

// Recursive helper; hidden must already have data.size() entries.
void CheckHidden(const std::vector<int> &data,
                 std::vector<bool> &hidden,
                 const int lo, const int hi,
                 const int min_bound, const int max_bound) {
  if (lo > hi) {
    return;
  }

  const int mid = lo + (hi - lo) / 2;
  const int value = data[mid];

  // Print how the bounds are updated
  std::cout << "data[" << mid << "] = " << value << ", bound = "
            << "(" << min_bound << ", " << max_bound << ")\n";

  // hidden[i] is true when the value at position i can never be found
  hidden[mid] = (value <= min_bound || value >= max_bound);

  CheckHidden(data, hidden, lo, mid-1, min_bound, std::min(value, max_bound));
  CheckHidden(data, hidden, mid+1, hi, std::max(min_bound, value), max_bound);
}

// Wrapper
void CheckHidden(const std::vector<int> &data, std::vector<bool> &hidden) {
  hidden.assign(data.size(), false);
  // The initial bounds stand in for negative and positive infinity, so a value
  // equal to INT_MIN or INT_MAX is reported as hidden.
  CheckHidden(data, hidden, 0, static_cast<int>(data.size()) - 1,
              std::numeric_limits<int>::min(),
              std::numeric_limits<int>::max());
}
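
As a sanity check on the bound-tracking idea, the following Python sketch compares it against actually searching for every element, on random permutations. The names `unfindable_by_bounds` and `unfindable_by_search` are made up here, the explicit stack replaces recursion purely as a style choice, and distinct values are assumed, since duplicates need the extra definitions discussed earlier.

```python
import random


def unfindable_by_bounds(arr):
    """O(n): indices whose values binary search can never find (distinct values)."""
    hidden = [False] * len(arr)
    # Each entry: index range [lo, hi] and the open value interval (lower, upper)
    # that a target must lie in to reach that range.
    stack = [(0, len(arr) - 1, float("-inf"), float("inf"))]
    while stack:
        lo, hi, lower, upper = stack.pop()
        if lo > hi:
            continue
        mid = (lo + hi) // 2
        value = arr[mid]
        hidden[mid] = not (lower < value < upper)
        stack.append((lo, mid - 1, lower, min(upper, value)))
        stack.append((mid + 1, hi, max(lower, value), upper))
    return [i for i, h in enumerate(hidden) if h]


def unfindable_by_search(arr):
    """O(n log n) reference: run a real binary search for every element."""
    missing = []
    for i, target in enumerate(arr):
        lo, hi = 0, len(arr) - 1
        found = False
        while lo <= hi and not found:
            mid = (lo + hi) // 2
            if arr[mid] == target:
                found = True
            elif arr[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        if not found:
            missing.append(i)
    return missing


print(unfindable_by_bounds([4, 3, 5, 7, 6, 8, 2, 9, 1]))  # [0, 3, 5, 6, 8]

rng = random.Random(0)  # fixed seed so the check is repeatable
for _ in range(200):
    perm = rng.sample(range(50), rng.randint(0, 20))
    assert unfindable_by_bounds(perm) == unfindable_by_search(perm)
```

The returned indices 0, 3, 5, 6, 8 correspond to the values 4, 7, 8, 2 and 1 in the sample array.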


