LeetCode 392 - Is Subsequence


http://www.cnblogs.com/grandyang/p/5842033.html
Given a string s and a string t, check if s is subsequence of t.
You may assume that there is only lower case English letters in both s and tt is potentially a very long (length ~= 500,000) string, and s is a short string (<=100).
A subsequence of a string is a new string which is formed from the original string by deleting some (can be none) of the characters without disturbing the relative positions of the remaining characters. (ie, "ace" is a subsequence of "abcde" while "aec" is not).
Example 1:
s = "abc"t = "ahbgdc"
Return true.
Example 2:
s = "axc"t = "ahbgdc"
Return false.
Follow up:
If there are lots of incoming S, say S1, S2, ... , Sk where k >= 1B, and you want to check one by one to see if T has its subsequence. In this scenario, how would you change your code?
这道题算比较简单的一种,我们可以用两个指针分别指向字符串s和t,然后如果字符相等,则i和j自增1,反之只有j自增1,最后看i是否等于s的长度,等于说明s已经遍历完了,而且字符都有在t中出现过
    bool isSubsequence(string s, string t) {
        if (s.empty()) return true;
        int i = 0, j = 0;
        while (i < s.size() && j < t.size()) {
            if (s[i] == t[j]) {
                ++i; ++j;
            } else {
                ++j;
            }
        }
        return i == s.size();
    }
};
    bool isSubsequence(string s, string t) {
        if (s.empty()) return true;
        int i = 0, j = 0;
        while (i < s.size() && j < t.size()) {
            if (s[i] == t[j]) ++i;
            ++j;
        }
        return i == s.size();
    }
http://blog.csdn.net/mebiuw/article/details/52444480
No need to convert to char[]
public boolean isSubsequence(String s, String t) { int i=0,j=0; char[] ss=s.toCharArray(); char[] tt=t.toCharArray(); while(i<ss.length && j<tt.length){ if(ss[i] == tt[j]){ i++; } j++; } return i==s.length(); }
http://www.cnblogs.com/dongling/p/5843697.html
public boolean isSubsequence(String s, String t) { if(s==null||s.length()==0) return true; int index=0; char ch; for(int i=0;i<s.length();i++){ ch=s.charAt(i); while(index<t.length()&&t.charAt(index)!=ch){ index++; } if(index>=t.length()){ return false; } index++; } return true; }
X. https://discuss.leetcode.com/topic/57205/java-only-2ms-much-faster-than-normal-2-pointers
    public boolean isSubsequence(String s, String t) 
    {
        if(t.length() < s.length()) return false;
        int prev = 0;
        for(int i = 0; i < s.length();i++)
        {
            char tempChar = s.charAt(i);
            prev = t.indexOf(tempChar,prev);
            if(prev == -1) return false;
            prev++;
        }
        
        return true;
    }
X. Follow up
https://scottduan.gitbooks.io/leetcode-review/content/is_subsequence.html
If there are lots of incoming S, say S1, S2, ... , Sk where k >= 1B, and you want to check one by one to see if T has its subsequence. In this scenario, how would you change your code?

https://leetcode.com/problems/is-subsequence/discuss/87264/Easy-to-understand-binary-search-solution
https://www.jianshu.com/p/2e2d6e82115c
Say length of s = a and length of t = b. The first for loop takes O(b). The second for loop takes O(alogb). The overall time complexity is O(alogb).
  private List<Integer>[] t_pos_map = new List[256];

  public boolean isSubsequence(String s, String t) {
    buildTPosMap(t);

    int prev = 0;
    for (int i = 0; i < s.length(); i++) {
      if (t_pos_map[s.charAt(i)] == null)
        return false;
      int j = Collections.binarySearch(t_pos_map[s.charAt(i)], prev);
      if (j < 0)
        j = -j - 1;
      if (j == t_pos_map[s.charAt(i)].size())
        return false;
      prev = t_pos_map[s.charAt(i)].get(j) + 1;
    }
    return true;
  }

  private void buildTPosMap(String t) {
    for (int i = 0; i < t.length(); i++) {
      if (t_pos_map[t.charAt(i)] == null)
        t_pos_map[t.charAt(i)] = new ArrayList<>();
      t_pos_map[t.charAt(i)].add(i);
    }

  }
https://discuss.leetcode.com/topic/57994/java-binary-search-using-treeset-got-tle
https://discuss.leetcode.com/topic/60134/java-code-for-the-problem-two-pointer-and-the-follow-up-binary-search
Java binary search using TreeSet, but got TLE. For single s, it performs worse than linear 2-pointer solution for sure, but if there are many s like in the follow up, should it be better since t only got processed once?
    public boolean isSubsequence(String s, String t) {
        int sLen = s.length(), tLen = t.length();
        if(sLen == 0) return true;
        if(sLen > tLen) return false;
        
        Map<Character, TreeSet<Integer>> map = new HashMap<>();
        for(int i = 0; i < tLen; i++) {
            char c = t.charAt(i);
            if(!map.containsKey(c)) map.put(c, new TreeSet<Integer>());
            map.get(c).add(i);
        }
        
        int lowerIndex = -1;
        for(int j = 0; j < sLen; j++) {
            char c = s.charAt(j);
            if(!map.containsKey(c)) return false;
            
            Integer index = map.get(c).higher(lowerIndex);
            if(index == null) return false;
            lowerIndex = index;
        }
        
        return true;
    }
I think the Map and TreeSet could be simplified by Array and binarySearch. Since we scan T from beginning to the end (index itself is in increasing order), List will be sufficient. Then we can use binarySearch to replace with TreeSet ability which is a little overkill for this problem.


Binary search:
  • record the indexes for each character in t, if s[i] matches t[j], then s[i+1] should match a character in t with index bigger than j. This can be reduced to find the first element larger than a value in an sorted array (find upper bound), which can be achieved using binary search.
Trie:

  • For example, if s1 has been matched, s1[last char] matches t[j]. Now, s2 comes, if s1 is a prefix of s2, i.e., s1 == s2.substr[0, i-1], we can start match s2 from s2[i], right?
  • So, the idea is to create a Trie for all string that have been matched so far. At a node, we record the position in t which matches this char represented by the node. Now, for an incoming string s, we first search the longest prefix in the Trie, find the matching position of the last prefix-char in t, say j. Then, we can start matching the first non-prefix-char of s from j+1.
  • Now, if we have done the preprocessing as stated in the binary search approach, we can be even faster.
    // Follow-up: O(N) time for pre-processing, O(Mlog?) for each S.
    // Eg-1. s="abc", t="bahbgdca"
    // idx=[a={1,7}, b={0,3}, c={6}]
    //  i=0 ('a'): prev=1
    //  i=1 ('b'): prev=3
    //  i=2 ('c'): prev=6 (return true)
    // Eg-2. s="abc", t="bahgdcb"
    // idx=[a={1}, b={0,6}, c={5}]
    //  i=0 ('a'): prev=1
    //  i=1 ('b'): prev=6
    //  i=2 ('c'): prev=? (return false)
    public boolean isSubsequence(String s, String t) {
        List<Integer>[] idx = new List[256]; // Just for clarity
        for (int i = 0; i < t.length(); i++) {
            if (idx[t.charAt(i)] == null)
                idx[t.charAt(i)] = new ArrayList<>();
            idx[t.charAt(i)].add(i);
        }
        
        int prev = 0;
        for (int i = 0; i < s.length(); i++) {
            if (idx[s.charAt(i)] == null) return false; // Note: char of S does NOT exist in T causing NPE
            int j = Collections.binarySearch(idx[s.charAt(i)], prev);
            if (j < 0) j = -j - 1;
            if (j == idx[s.charAt(i)].size()) return false;
            prev = idx[s.charAt(i)].get(j) + 1;
        }
        return true;
    }
X. https://discuss.leetcode.com/topic/57718/easy-to-understand-binary-search-solution
space complexity is length of t, we need to add an index value to our dictionary foreach element of t. You also have the overhead of the keys for t but this does not changed that fact that this is still O(length of t)
time complexity is 1 binary search through 1 dictionary entry for each character of s. Worst case the dictionary entry has all indexes of tso it's cost would be log of the length of t. Multiply that against length of s and you get O(length of s * log(length of t)) ???
https://discuss.leetcode.com/topic/58078/simple-c-code-as-well-as-the-followup-solution
My solution is to preprocessing, exactly constructing a hash map to store the positions for every character. Then scan the incoming string one by one, for every character, if there is no such character in the hash map, or the number of such character is greater than the original string, or most critically, the position is not behind the position of its previous character, it will return false. So I need another array to record the index for every character. The time complexity is just the sum of length of incoming strings???


we will build an array mem where mem[i+1][j+1] means that S[0..j] contains T[0..i] that many times as distinct subsequences. Therefor the result will be mem[T.length()][S.length()]. we can build this array rows-by-rows: the first row must be filled with 1. That's because the empty string is a subsequence of any string but only 1 time. So mem[0][j] = 1 for every j. So with this we not only make our lives easier, but we also return correct value if T is an empty string. the first column of every rows except the first must be 0. This is because an empty string cannot contain a non-empty string as a substring -- the very first item of the array: mem[0][0] = 1, because an empty string contains the empty string 1 time.
So the matrix looks like this:
  S 0123....j
T +----------+
  |1111111111|
0 |0         |
1 |0         |
2 |0         |
. |0         |
. |0         |
i |0         |
From here we can easily fill the whole grid: for each (x, y), we check if S[x] == T[y] we add the previous item and the previous item in the previous row, otherwise we copy the previous item in the same row. The reason is simple:
if the current character in S doesn't equal to current character T, then we have the same number of distinct subsequences as we had without the new character. if the current character in S equal to the current character T, then the distinct number of subsequences: the number we had before plus the distinct number of subsequences we had with less longer T and less longer S.
An example:
S: [acdabefbc] and T: [ab]
first we check with a:
           *  *
      S = [acdabefbc]
mem[1] = [0111222222]
then we check with ab:
               *  * ]
      S = [acdabefbc]
mem[1] = [0111222222]
mem[2] = [0000022244]
And the result is 4, as the distinct subsequences are:
  S = [a   b    ]
  S = [a      b ]
  S = [   ab    ]
  S = [   a   b ]

    if (s == null && t == null) {
        return true;
    }
    if (s == null || t == null) {
        return false;
    }
    int m = s.length();
    int n = t.length();
    boolean[][] dp = new boolean[n+1][m+1];
    for (int i = 0; i <= n; i++) {
        dp[i][0] = true;
    }
    for (int i = 1; i <= n; i++) {
        for (int j = 1; j <= m; j++) {
            if (t.charAt(i-1) == s.charAt(j-1)) {
                dp[i][j] = dp[i-1][j-1] || dp[i-1][j];
            } else {
                dp[i][j] = dp[i-1][j];
            }
            if (dp[i][m] == true) {
                return true;
            }
        }
    }
    return dp[n][m];
https://discuss.leetcode.com/topic/67167/java-code-for-the-follow-up-question/
public boolean isSubsequence(String s, String t) {
    if (s == null || t == null) return false;
    
    Map<Character, List<Integer>> map = new HashMap<>(); //<character, index>
    
    //preprocess t
    for (int i = 0; i < t.length(); i++) {
        char curr = t.charAt(i);
        if (!map.containsKey(curr)) {
            map.put(curr, new ArrayList<Integer>());
        }
        map.get(curr).add(i);
    }
    
    int prev = -1;  //index of previous character
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        
        if (map.get(c) == null)  {
            return false;
        } else {
            List<Integer> list = map.get(c);
            prev = binarySearch(prev, list, 0, list.size() - 1);
            if (prev == -1) {
                return false;
            }
            prev++;
        }
    }
    
    return true;
}

private int binarySearch(int index, List<Integer> list, int start, int end) {
    while (start <= end) {
        int mid = start + (end - start) / 2;
        if (list.get(mid) < index) {
            start = mid + 1;
        } else {
            end = mid - 1;
        }
    }
    
    return start == list.size() ? -1 : list.get(start);
}
X. DP
http://blog.csdn.net/u014688145/article/details/71180132
public boolean isSubsequence(String s, String t){ if (s.isEmpty()) return true; int n = s.length(); boolean[] isSeq = new boolean[n]; int index = 0; for (int i = 0; i < t.length(); i++){ if (s.charAt(index) == t.charAt(i)){ isSeq[index++] = true; if (isSeq[n-1]) return true; } } return isSeq[n-1]; }

Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts