Find Missing Number


Find Missing Number - Algorithms and Problem Solving
1. Given a set of positive numbers less than or equal to N, where one number is missing, find the missing number efficiently.
2. Given a set of positive numbers less than or equal to N, where two numbers are missing, find the missing numbers efficiently.
3. Given a sequence of positive numbers less than or equal to N, where one number is repeated and another is missing, find the repeated and the missing numbers efficiently.
4. Given a sequence of integers (positive and negative), find the first missing positive number in the sequence.
Solutions should use no more than O(n) time and constant space.
For example,
1. If A=[2,1,5,8,6,7,3,10,9] and N=10, then 4 is missing.
2. If A=[2,1,5,8,6,7,3,9] and N=10, then 4 and 10 are missing.
3. If A=[2,1,5,8,3,6,7,3,10,4] and N=10, then 3 is repeated and 9 is missing.
4. If A=[1,2,0], then the first missing positive is 3; if A=[3,4,-1,1], the first missing positive is 2.
Single Number Missing
A trivial approach is to sort the array and scan it for the first index i that does not contain the number i+1. With an in-place sort this takes constant space but O(n lg n) time. Counting sort brings the time down to O(n+k), but it needs O(k) space. How do we get O(n) time and constant space? The numbers 1 to N sum to N*(N+1)/2, so the missing number is this expected sum minus the sum of the array.
public int missingNumber(int[] nums) {
    // One number from 1..N is missing, so the array holds N-1 values.
    int n = nums.length + 1;
    int expectedSum = n*(n+1)/2;
    
    int actualSum = 0;
    for(int i=0; i<nums.length; i++){
        actualSum += nums[i];
    }
    
    return (expectedSum-actualSum);
}
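The int expression n*(n+1) overflows once N exceeds about 46,000. As a hedged alternative that is not part of the solution above, the same missing number can be found with XOR, which cannot overflow: XORing 1..N with every array element cancels each number that is present and leaves the missing one. The class and method names below are hypothetical, and the sketch assumes the array holds exactly N-1 distinct values from 1..N.

```java
public class MissingNumberXor {
    // Hypothetical helper (not from the original article): returns the one
    // value from 1..N that is absent, where N = nums.length + 1.
    public static int missingNumberXor(int[] nums) {
        int n = nums.length + 1;
        int x = 0;
        // XOR of all expected values 1..N.
        for (int i = 1; i <= n; i++) {
            x ^= i;
        }
        // Every value that is present cancels itself out.
        for (int v : nums) {
            x ^= v;
        }
        return x;
    }

    public static void main(String[] args) {
        int[] a = {2, 1, 5, 8, 6, 7, 3, 10, 9};
        System.out.println(missingNumberXor(a)); // prints 4
    }
}
```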
Two Missing Numbers
We can solve this with the same math as above. Let p and q be the missing numbers among 1 to N. Then the sum S of the given input numbers is
S = N*(N+1)/2 - p - q
=> p + q = N*(N+1)/2 - S
Also, the product of the numbers 1 to N is N!, so the product P of the given input numbers is
P = N!/(p*q)
=> p*q = N!/P
Then we can solve these two equations to find the missing numbers p and q. However, this approach has a serious limitation: the product of many numbers overflows quickly. Even a 64-bit long cannot hold N! for N > 20, and multiplication is not cheap either.
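To make the two equations concrete, here is the arithmetic for the second example, A=[2,1,5,8,6,7,3,9] with N=10. Since p and q are the roots of x^2 - (p+q)x + pq = 0, they follow from the quadratic formula:

```latex
\begin{aligned}
S &= 2+1+5+8+6+7+3+9 = 41 \\
p+q &= \frac{10 \cdot 11}{2} - S = 55 - 41 = 14 \\
P &= 2 \cdot 1 \cdot 5 \cdot 8 \cdot 6 \cdot 7 \cdot 3 \cdot 9 = 90720 \\
pq &= \frac{10!}{P} = \frac{3628800}{90720} = 40 \\
x &= \frac{14 \pm \sqrt{14^2 - 4 \cdot 40}}{2} = \frac{14 \pm 6}{2}
\;\Rightarrow\; \{p, q\} = \{4, 10\}
\end{aligned}
```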
Can we avoid multiplication? As the numbers are positive and in the range [1,N], we could use each element as an index into the array and mark that position to record that the number exists. The positions for the missing elements would then stay unmarked. But marking changes the array itself. How do we make sure that by marking a position we do not lose the information stored there? For example, take A=[2,1,5,8,6,7,3,9]. If we mark A[A[0]-1], i.e. A[1], with a special value, say 0, to record that 2 is not missing, then A becomes A’=[2,0,5,8,6,7,3,9]. We have lost the 1 that was stored at A[1], so A[A[1]-1], i.e. A[0], will never get marked to tell us that 1 is not missing.
We can overcome the overwriting issue by negating the number at index abs(A[i])-1 for each i. This way we do not lose the value; we only change its sign, and we always index by absolute value. After marking for all the numbers, we make a second pass over the array and look for unmarked, i.e. positive, elements. In the same pass we flip the negated elements back to positive, restoring the original array. Below is the implementation of this idea.
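public static void find2Missing(int[] a, int n){
 // a holds n-2 values, so the values n-1 and n have no slot of their own
 // in a. Track those two with flags instead (still constant space).
 boolean[] seenTail = new boolean[2];
 for(int i = 0; i < a.length; i++){
  int index = Math.abs(a[i])-1;
  if(index >= a.length){
   seenTail[index - a.length] = true;
  }
  else if(a[index] > 0){
   a[index] = -a[index];
  }
 }
 
 for(int i = 0; i < a.length; i++){
  if(a[i] > 0){
   // parentheses matter: "missing: "+i+1 would concatenate i and 1 as text
   System.out.println("missing: "+(i+1));
  }
  else{
   a[i] = -a[i];
  }
 }
 
 for(int k = 0; k < seenTail.length; k++){
  if(!seenTail[k]){
   System.out.println("missing: "+(a.length+k+1));
  }
 }
}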
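The above solution assumes that the input array is mutable. What if we can’t update the input array (i.e. it is immutable) and we still need to find the missing values in O(n) time and constant space? We can use XOR. XORing the numbers 1 to N together with all array elements cancels every number that is present and leaves p^q, where p and q are the missing numbers. Since p and q differ, p^q has at least one set bit, and exactly one of p and q has that bit set. If we split both the numbers 1 to N and the array elements into two groups by that bit and XOR each group separately, p is left in one group and q in the other.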
//O(n) time, O(1) space
public static void findMissing2(int a[], int n){
 int mask = 0;
 
 //O(n)
 for(int i = 1; i<=n; i++){
  mask ^= i;
 }
 
 //O(n)
 for(int i = 0; i < a.length; i++){
  mask ^= a[i];
 }
 
 //get the right most set bit
 mask &= ~(mask-1);
 int mis1=0, mis2=0;
 for(int i = 0; i<a.length; i++){
  if((a[i]&mask) == mask){
   mis1 ^= a[i];
  }
  else{
   mis2 ^= a[i];
  }
 }
 
 for(int i = 1; i<=n; i++){
  if((i&mask) == mask){
   mis1 ^= i;
  }
  else{
   mis2 ^= i;
  }
 }
 
 System.out.println("missing numbers : "+mis1+", "+mis2);
}
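One Missing, One Repeated
What if one number appears twice and another is missing? With respect to XOR arithmetic, a missing element and a repeated element behave the same way. If we XOR the numbers 1 to N together with all array elements, every number that occurs exactly once in the array cancels out. The repeated number occurs three times in total and the missing number once, so both survive, and the result is the XOR of the repeated and the missing number. That is, we can use the same procedure described above to find the two values. The XOR procedure alone does not tell us which of the two is repeated and which is missing; one more pass over the array, checking which value actually occurs, settles that.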
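The article gives no code for this case, so here is a minimal sketch of how the XOR procedure might be adapted. The class and method names are hypothetical, and the sketch assumes the array has length N and holds values from 1..N with exactly one value duplicated and one absent. It leaves the input array unmodified.

```java
public class MissingAndRepeated {
    // Hypothetical helper (not from the original article).
    // Returns {repeated, missing} for an array of length N over 1..N.
    public static int[] findRepeatedAndMissing(int[] a) {
        int n = a.length;

        // After this loop pair, xor == repeated ^ missing.
        int xor = 0;
        for (int i = 1; i <= n; i++) {
            xor ^= i;
        }
        for (int v : a) {
            xor ^= v;
        }

        // Lowest set bit: the two unknowns differ in this bit.
        int bit = xor & -xor;

        // XOR each group separately; one unknown survives in each.
        int x = 0, y = 0;
        for (int i = 1; i <= n; i++) {
            if ((i & bit) != 0) x ^= i; else y ^= i;
        }
        for (int v : a) {
            if ((v & bit) != 0) x ^= v; else y ^= v;
        }

        // One more pass: whichever value occurs in the array is the repeated one.
        for (int v : a) {
            if (v == x) {
                return new int[]{x, y};
            }
        }
        return new int[]{y, x};
    }

    public static void main(String[] args) {
        int[] a = {2, 1, 5, 8, 3, 6, 7, 3, 10, 4};
        int[] r = findRepeatedAndMissing(a);
        System.out.println("repeated: " + r[0] + ", missing: " + r[1]);
        // prints "repeated: 3, missing: 9"
    }
}
```

The final scan is what separates the two roles; without it the method could only report the unordered pair.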
First Missing Positive
For example, if A=[1,2,0] then the first missing positive is 3, and if A=[3,6,4,-1,1] then the first missing positive is 2. Can we use some of the techniques discussed above? Note that there may be more than one missing number, as well as negative numbers and zeros. If all numbers were positive, we could use the second method for finding two missing numbers, where we used each element as an index and negated the value there to mark the number as non-missing. In this problem, however, we may have non-positive numbers, i.e. zeros and negatives, so we can’t simply apply that algorithm. But if we think carefully, we notice that we don’t have to care about zeros and negative numbers at all, because we are only looking for the smallest missing positive number. That is, if we can put aside the non-positive numbers and consider only the positives, then we can apply the “element as index to mark non-missing by negating the value” method to find the missing positives.
How do we put aside the non-positive elements? We can partition the array as we do in quicksort, so that all positive elements end up on the left and all zeros and negatives on the right. If q is the number of positive elements, then A[0..q-1] contains all positives. Now we just scan the positive partition, i.e. from 0 to q-1, and mark A[abs(A[i])-1] by negating it, skipping values greater than q because they have no slot in the partition. After the marking phase we sweep through the partition again to find the first index i that holds a positive element; then i+1 is the smallest, i.e. first, missing positive. If we do not find such an index, then every slot was marked, which means each of the numbers 1 to q occurs in the array, so no number between 1 and q is missing. In that case we return the next positive number, q+1. Below is the implementation of this algorithm, which assumes we can update the original array. It runs in O(n) time and constant space.
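public static int firstMissingPositive(int[] nums) {
    if(nums.length == 0){
        return 1;
    }
    
    // Partition: move all positive values to the front of the array.
    int q = -1;
    for(int j = 0; j < nums.length; j++){
        if(nums[j] > 0){
            swap(nums, ++q, j);
        }
    }
    
    // q is now the number of positive values; they occupy nums[0..q-1].
    q++;
    // Mark value v as present by making nums[v-1] negative.
    for(int i = 0; i < q; i++){
        int index = Math.abs(nums[i])-1;
        if(index<q){
            nums[index] = -Math.abs(nums[index]);
        }
    }
    
    // The first slot that is still positive gives the answer.
    for(int i = 0; i < q; i++){
        if(nums[i] > 0){
            return i+1;
        }
    }
    
    return q+1;
}

// Helper assumed by the code above; the original does not show it.
private static void swap(int[] nums, int i, int j) {
    int tmp = nums[i];
    nums[i] = nums[j];
    nums[j] = tmp;
}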
