Google – Multiple Strings Pattern Matching


Google – Multiple Strings Pattern Matching
输入一个pattern,和一堆strings,做跨string的pattern matching.
比如:
patter: "abc"
strs: "ab", "cd" => true
strs: "aa", "bcd" => true
strs: "ab", "ac" => false
strings的输入是一个iterator,只能往前,不能往后。

[Solution]
Greedy
KMP
两种方法各有利弊,greedy edge cases很多。kmp虽然简单,但是把所有strings串起来内存很可能不够。
[Solution #1] – Greedy
Greedy的解法要记住前面是否有已经匹配的,随时要判断是继续匹配前面的,还是重新开始匹配当前的。
[Solution #2] – KMP
把iterator里的所有string串起来,对pattern做kmp
Cons: 注意内存不够的情况。
[Solution #3] – KMP with Cut
但是内存不够的情况也还是有解决办法。分段处理。(前提是pattern必须得放的进内存)
比如内存只能放m个字符(这里排除了pattern所占的空间,假设除去pattern之后,还能塞m个),串起来之后的总长度为n,那么每次取m个字符和pattern来匹配,
  1. 如果匹配成功,返回true。
  2. 如果完全没有匹配,把下一段m长度丢进来。
  3. 如果匹配了k个字符,这k个字符必须是m个字符的suffix,那么就丢掉前面(m – k)个字符,加入下一段(m – k),从pattern的k + 1位置开始继续往下匹配。
  4. 如果在第3种情况下,后面加入的(m – k)个字符又都被匹配成功了,那么我们可以把目前内存里所有m个字符全部丢掉,加入下一段,从pattern的m + 1位置继续往下匹配。(不需要假设pattern长度小于m)
这样时间复杂度依然是O(LEN), where LEN is the sum of length of all strings in the iterator.

class Solution {
  public boolean patternMatch(String pattern, Iterator<String> it) {
    if (pattern == null || pattern.isEmpty() || it == null) {
      return false;
    }
 
    int prevStart = 0;
    while (it.hasNext()) {
      String curr = it.next();
      int currStart = 0;
 
      for (int i = 0; i < curr.length(); i++) {
        char c = curr.charAt(i);
        if (prevStart != -1 && currStart != -1 && pattern.charAt(prevStart) == c && pattern.charAt(currStart) == c) {
          prevStart++;
          currStart++;
        }
        else if (prevStart != -1 && pattern.charAt(prevStart) == c) {
          prevStart++;
          if (c == pattern.charAt(0)) {
            currStart = 1;
          }
          else {
            currStart = -1;
          }
        }
        else if (currStart != -1 && pattern.charAt(currStart) == c) {
          currStart++;
          prevStart = -1;
        }
        else {
          if (c == pattern.charAt(0)) {
            currStart = 1;
          }
          else {
            currStart = -1;
          }
          prevStart = -1;
        }
 
        if (prevStart == pattern.length() || currStart == pattern.length()) {
          return true;
        }
 
      }
 
      if (prevStart == -1) {
        prevStart = currStart;
      }
    }
 
    return false;
  }
}
 
class Solution2 {
  public boolean patternMatch(String pattern, Iterator<String> it) {
    if (pattern == null || pattern.isEmpty() || it == null) {
      return false;
    }
 
    StringBuilder sb = new StringBuilder();
    while (it.hasNext()) {
      sb.append(it.next());
    }
 
    return strStr(sb.toString(), pattern);
  }
 
  private boolean strStr(String s, String t) {
 
    int[] lookup = preSuffix(t);
    int i = 0;
    int j = 0;
    while (i < s.length()) {
      if (s.charAt(i) == t.charAt(j)) {
        i++;
        j++;
      }
      else if (j == 0) {
        i++;
      }
      else {
        j = lookup[j];
      }
 
      if (j == t.length()) {
        return true;
      }
    }
 
    return false;
  }
 
  private int[] preSuffix(String t) {
    int[] lookup = new int[t.length()];
 
    int i = 1;
    int j = 0;
    while (i < t.length()) {
      if (t.charAt(i) == t.charAt(j)) {
        lookup[i] = j + 1;
        i++;
        j++;
      }
      else if (j == 0) {
        i++;
      }
      else {
        j = lookup[j];
      }
    }
 
    return lookup;
  }
}
Read full article from Google – Multiple Strings Pattern Matching

Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts