LeetCode 718 - Maximum Length of Repeated Subarray


https://leetcode.com/problems/maximum-length-of-repeated-subarray/
Given two integer arrays A and B, return the maximum length of an subarray that appears in both arrays.
Example 1:
Input:
A: [1,2,3,2,1]
B: [3,2,1,4,7]
Output: 3
Explanation: 
The repeated subarray with maximum length is [3, 2, 1].
Note:
  1. 1 <= len(A), len(B) <= 1000
  2. 0 <= A[i], B[i] < 100

Use dynamic programming. dp[i][j] will be the answer for inputs A[i:], B[j:].

X. Bisection
https://leetcode.com/problems/maximum-length-of-repeated-subarray/discuss/156891/Binary-Search-%2B-Rabin-Karp-%2B-Hash-Table-O(N-log-N)-Beats-100
    int findLength(vector<int>& A, vector<int>& B) {
        const int P = 19941229; //Rabin-Karp
        
        vector<int> hash1(A.size() + 1, 0), hash2(B.size() + 1, 0), p(max(A.size(), B.size()) + 1);
        int len = max(A.size(), B.size());
        p[0] = 1;
        for (int i = 1; i <= len; i++)
            p[i] = p[i - 1] * P;
        for (int i = 1; i <= A.size(); i++)
            hash1[i] = hash1[i - 1] + A[i - 1] * p[i];
        for (int i = 1; i <= B.size(); i++)
            hash2[i] = hash2[i - 1] + B[i - 1] * p[i];
        int l = 1, r = min(A.size(), B.size());
        while (l <= r) { //Binary search the answer
            int m = (l + r) / 2;
            unordered_set<int> hash;
            bool ok = false;
            for (int i = 1; i + m - 1 <= A.size(); i++) //Calculate Rabin-Karp hash for all substring s for the first string of length m, insert into hash table
                hash.insert((hash1[i + m - 1] - hash1[i - 1]) * p[len - (i + m - 1)]);
            for (int i = 1; i + m - 1 <= B.size(); i++){ //Calculate Rabin-Karp hash for the second string, query the hash table
                if (hash.count((hash2[i + m - 1] - hash2[i - 1]) * p[len - (i + m - 1)])) {
                    ok = true;
                    break;
                }
            }
            if (ok)
                l = m + 1;
            else
                r = m - 1;
        }
        return r;
    }

https://leetcode.com/articles/maximum-length-of-repeated-subarray/
Approach #4: Binary Search with Rolling Hash
As in Approach #2, we will binary search for the answer. However, we will use a rolling hash (Rabin-Karp algorithm) to store hashes in our set structure.
Algorithm
For some prime p, consider the following function modulo some prime modulus \mathcal{M}:
\text{hash}(S) = \sum_{0 \leq i &lt; len(S)} p^i * S[i]
Notably, \text{hash}(S[1:] + x) = \frac{(\text{hash}(S) - S[0])}{p} + p^{n-1} x. This shows we can get the hash of all A[i:i+\text{guess}] in linear time. We will also use the fact that p^{-1} = p^{\mathcal{M}-2} \mod \mathcal{M}.
For every i >= length - 1, we will want to record the hash of A[i-length+1], A[i-length+2], ..., A[i]. After, we will truncate the first element by h = (h - A[i - (length - 1)]) * Pinv % MOD to get ready to add the next element.


To make our algorithm air tight, we also make a naive check when our work with rolling hashes says that we have found a match.
  • Time Complexity: O((M+N) * \log{(\min(M, N))}), where M, N are the lengths of A, B. The log factor is contributed by the binary search, while creating the rolling hashes is O(M + N). The checks for duplicate hashes are O(1). If we perform a naive check to make sure our answer is correct, it adds a factor of O(\min(M, N)) to our cost of check, which keeps the complexity the same.
  • Space Complexity: O(M), the space used to store hashes and the subarrays in our final naive check.
    int P = 113;
    int MOD = 1_000_000_007;
    int Pinv = BigInteger.valueOf(P).modInverse(BigInteger.valueOf(MOD)).intValue();

    private int[] rolling(int[] source, int length) {
        int[] ans = new int[source.length - length + 1];
        long h = 0, power = 1;
        if (length == 0) return ans;
        for (int i = 0; i < source.length; ++i) {
            h = (h + source[i] * power) % MOD;
            if (i < length - 1) {
                power = (power * P) % MOD;
            } else {
                ans[i - (length - 1)] = (int) h;
                h = (h - source[i - (length - 1)]) * Pinv % MOD;
                if (h < 0) h += MOD;
            }
        }
        return ans;
    }

    private boolean check(int guess, int[] A, int[] B) {
        Map<Integer, List<Integer>> hashes = new HashMap();
        int k = 0;
        for (int x: rolling(A, guess)) {
            hashes.computeIfAbsent(x, z -> new ArrayList()).add(k++);
        }
        int j = 0;
        for (int x: rolling(B, guess)) {
            for (int i: hashes.getOrDefault(x, new ArrayList<Integer>()))
                if (Arrays.equals(Arrays.copyOfRange(A, i, i+guess),
                                  Arrays.copyOfRange(B, j, j+guess))) {
                    return true;
                }
            j++;
        }
        return false;
    }

    public int findLength(int[] A, int[] B) {
        int lo = 0, hi = Math.min(A.length, B.length) + 1;
        while (lo < hi) {
            int mi = (lo + hi) / 2;
            if (check(mi, A, B)) lo = mi + 1;
            else hi = mi;
        }
        return lo - 1;
    }

X. DP
Approach #3: Dynamic Programming
Since a common subarray of A and B must start at some A[i] and B[j], let dp[i][j] be the longest common prefix of A[i:] and B[j:]. Whenever A[i] == B[j], we know dp[i][j] = dp[i+1][j+1] + 1. Also, the answer is max(dp[i][j]) over all i, j.
We can perform bottom-up dynamic programming to find the answer based on this recurrence. Our loop invariant is that the answer is already calculated correctly and stored in dp for any larger i, j.
  • Time Complexity: O(M*N), where M, N are the lengths of A, B.
  • Space Complexity: O(M*N), the space used by dp.
    public int findLength(int[] A, int[] B) {
        int ans = 0;
        int[][] memo = new int[A.length + 1][B.length + 1];
        for (int i = A.length - 1; i >= 0; --i) {
            for (int j = B.length - 1; j >= 0; --j) {
                if (A[i] == B[j]) {
                    memo[i][j] = memo[i+1][j+1] + 1;
                    if (ans < memo[i][j]) ans = memo[i][j];
                }
            }
        }
        return ans;
    }
https://leetcode.com/problems/maximum-length-of-repeated-subarray/discuss/109039/Concise-Java-DP%3A-Same-idea-of-Longest-Common-Substring
 * dp[i][j] = a[i] == b[j] ? 1 + dp[i + 1][j + 1] : 0;
 * dp[i][j] : max lenth of common subarray start at a[i] & b[j];
    public int findLength(int[] A, int[] B) {
        if(A == null||B == null) return 0;
        int m = A.length;
        int n = B.length;
        int max = 0;
        //dp[i][j] is the length of longest common subarray ending with nums[i] and nums[j]
        int[][] dp = new int[m + 1][n + 1];
        for(int i = 0;i <= m;i++){
            for(int j = 0;j <= n;j++){
                if(i == 0 || j == 0){
                    dp[i][j] = 0;
                }
                else{
                    if(A[i - 1] == B[j - 1]){
                        dp[i][j] = 1 + dp[i - 1][j - 1];
                        max = Math.max(max,dp[i][j]);
                    }
                }
            }
        }
        return max;
    }
O(n) space


  public int findLength2(int[] A, int[] B) {
        if(A == null || B == null || A.length == 0 || B.length == 0)return 0;
        int Rows = A.length;
        int Cols = B.length;

        // dp[j] is the length of longest common subarray ending with A[i-1], B[j-1]
        int[] dp = new int[Cols + 1];
        int maxLen = 0;
        for (int i = 1; i <=Rows ; i++) {
            for (int j =  Cols; j > 0 ; j--) {
                if(A[i -1] == B[j-1])
                {
                    dp[j] = 1 + dp[j-1] ;
                    maxLen = Math.max(maxLen, dp[j]);
                }else {
                    dp[j] = 0;
                }
            }
        }
        return maxLen;
    }
X. DP, O(1) space
https://leetcode.com/problems/maximum-length-of-repeated-subarray/discuss/109040/Java-O(mn)-time-O(1)-space
maxLen[i][j] --> the max length of common subarray ending at A[i] and B[j]
=> 0 , if A[i] != A[j]
maxLen[i-1][j-1] + 1, if A[i] = B[j]
So maxLen[i][j] only depends on maxLen[i-1][j-1], so you only need to keep maxLen[i-1][j-1] which is one element.
In below code we use maxLenEnding to keep maxLen[i-1][j-1] and traverse the matrix diagonally
public int findLength(int[] A, int[] B) {
        
        int maxLen = 0;
        
        
        for (int j = 0; j < B.length; j++) {
            int maxLenEnding = 0;
            for (int i = 0, k = j; i < A.length && k < B.length; i++, k++) {
                if (A[i] != B[k]) maxLenEnding = 0;
                else {
                    maxLenEnding++;
                    maxLen = Math.max(maxLen, maxLenEnding);
                }
            }
        }
        
        for (int i =1; i < A.length; i++) {
            int maxLenEnding = 0;
            for (int j = 0, k = i; k < A.length && j < B.length; j++, k++) {
                if (A[k] != B[j]) maxLenEnding = 0;
                else {
                    maxLenEnding++;
                    maxLen = Math.max(maxLen, maxLenEnding);
                }
            }
        }
        
        return maxLen;
        
        
        
    }
https://leetcode.com/problems/maximum-length-of-repeated-subarray/discuss/109064/Easy-O(n2)-Time-O(1)-Space-solution-No-DP

Approach #2: Binary Search with Naive Check
If there is a length k subarray common to A and B, then there is a length j <= k subarray as well.
Let check(length) be the answer to the question "Is there a subarray with length length, common to Aand B?" This is a function with range that must take the form [True, True, ..., True, False, False, ..., False] with at least one True. We can binary search on this function.
Algorithm
Focusing on the binary search, our invariant is that check(hi) will always be False. We'll start with hi = min(len(A), len(B)) + 1; clearly check(hi) is False.
Now we perform our check in the midpoint mi of lo and hi. When it is possible, then lo = mi + 1, and when it isn't, hi = mi. This maintains the invariant. At the end of our binary search, hi == lo and lo is the lowest value such that check(lo) is False, so we want lo - 1.
As for the check itself, we can naively check whether any A[i:i+k] == B[j:j+k] using set structures.
  • Time Complexity: O((M + N) * \min(M, N) * \log{(\min(M, N))}), where M, N are the lengths of A, B. The log factor comes from the binary search. The complexity of our naive check of a given \text{length} is O((M+N) * \text{length}), as we will create the seen strings with complexity O(M * \text{length}), then search for them with complexity O(N * \text{length}), and our total complexity when performing our check is the addition of these two.
  • Space Complexity: O(M^2), the space used by seen.
    public boolean check(int length, int[] A, int[] B) {
        Set<String> seen = new HashSet();
        for (int i = 0; i + length <= A.length; ++i) {
            seen.add(Arrays.toString(Arrays.copyOfRange(A, i, i+length)));
        }
        for (int j = 0; j + length <= B.length; ++j) {
            if (seen.contains(Arrays.toString(Arrays.copyOfRange(B, j, j+length)))) {
                return true;
            }
        }
        return false;
    }

    public int findLength(int[] A, int[] B) {
        int lo = 0, hi = Math.min(A.length, B.length) + 1;
        while (lo < hi) {
            int mi = (lo + hi) / 2;
            if (check(mi, A, B)) lo = mi + 1;
            else hi = mi;
        }
        return lo - 1;
    }

Approach #1: Brute Force with Initial Character Map
In a typical brute force, for all starting indices i of A and j of B, we will check for the longest matching subarray A[i:i+k] == B[j:j+k] of length k. This would look roughly like the following psuedocode:
ans = 0
for i in [0 .. A.length - 1]:
    for j in [0 .. B.length - 1]:
        k = 0
        while (A[i+k] == B[j+k]): k += 1 #and i+k < A.length etc.
        ans = max(ans, k)
Our insight is that in typical cases, most of the time A[i] != B[j]. We could instead keep a hashmap Bstarts[A[i]] = all j such that B[j] == A[i], and only loop through those in our j loop.
  • Time Complexity: O(M*N*\min(M, N)), where M, N are the lengths of A, B. The worst case is when all the elements are equal.
  • Space Complexity: O(N), the space used by Bstarts. (Of course, we could amend our algorithm to make this O(\min(M, N)).)
    public int findLength(int[] A, int[] B) {
        int ans = 0;
        Map<Integer, ArrayList<Integer>> Bstarts = new HashMap();
        for (int j = 0; j < B.length; j++) {
            Bstarts.computeIfAbsent(B[j], x -> new ArrayList()).add(j);
        }

        for (int i = 0; i < A.length; i++) if (Bstarts.containsKey(A[i])) {
            for (int j: Bstarts.get(A[i])) {
                int k = 0;
                while (i+k < A.length && j+k < B.length && A[i+k] == B[j+k]) {
                    k++
                }
                ans = Math.max(ans, k);
            }
        }
        return ans;
    }



Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts