LeetCode 139 - Word Break


Related: LeetCode 140 - Word Break II
LeetCode 139 – Word Break
Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
For example, given
s = "leetcode",
dict = ["leet", "code"].
Return true because "leetcode" can be segmented as "leet code".

Time complexity:
Time complexity is O(len(wordDict) ^ len(s / minWordLenInDict)), because there're len(wordDict) possibilities for each cut
https://stackoverflow.com/questions/39519667/whats-the-time-complexity-of-this-recursive-algorithm
Time complexity is O(2^n * n).
The number of maximal such word breaks is 2^(n-1), with the bijection to binary vectors of length n-1: For each space between two characters, you can either break the word there (1 in the vector) or not (0 in the vector), giving you 2^(n-1) such possible words.
Since creating each string, calculating its hash and adding it to the set is O(n) (the substring's length), you get O(2^n * n).
You get the worst case for:
dict = [a, aa, aaa, aaaa, ....]
s = "aaaaaaaaa...a"
Your map ensures you do not do duplicated work, using memoization. (Without that, complexity would have been factorial of n, but using it you avoid duplicate work like recalculating from scratch the prefix of "aaa...a|a|a" and "aa...a|aa"

X. DP
http://www.programcreek.com/2012/12/leetcode-solution-word-break/
  • Define an array t[] such that t[i]==true => 0-(i-1) can be segmented using dictionary
  • Initial state t[0] == true
    public boolean wordBreak(String s, Set<String> dict) {
        boolean[] t = new boolean[s.length()+1];
        t[0] = true; //set first to be true, why?
        //Because we need initial state
 
        for(int i=0; i<s.length(); i++){
            //should continue from match position
            if(!t[i]) 
                continue;
 
            for(String a: dict){
                int len = a.length();
                int end = i + len;
                if(end > s.length())
                    continue;
 
                if(t[end]) continue;
 
                if(s.substring(i, end).equals(a)){
                    t[end] = true;
                }
            }
        }
 
        return t[s.length()];
    }
Time: O(string length * dict size)
In Solution 2, if the size of the dictionary is very large, the time is bad. Instead we can solve the problem in O(n^2) time (n is the length of the string).
-Not that efficient
public boolean wordBreak(String s, Set<String> wordDict) {
    int[] pos = new int[s.length()+1];
 
    Arrays.fill(pos, -1);
 
    pos[0]=0;
 
    for(int i=0; i<s.length(); i++){
        if(pos[i]!=-1){
            for(int j=i+1; j<=s.length(); j++){
                String sub = s.substring(i, j);
                if(wordDict.contains(sub)){
                    pos[j]=i;
                }
            } 
        }
    }
 
    return pos[s.length()]!=-1;
}

https://www.cnblogs.com/lightwindy/p/8509053.html
public boolean wordBreak(String s, Set<String> dict) { 
    if(s==null || s.length()==0
        return true
    boolean[] res = new boolean[s.length()+1]; 
    res[0] = true
    for(int i=0;i<s.length();i++) 
    
        StringBuilder str = new StringBuilder(s.substring(0,i+1)); 
        for(int j=0;j<=i;j++) 
        
            if(res[j] && dict.contains(str.toString())) 
            
                res[i+1] = true
                break
            
            str.deleteCharAt(0); 
        
    
    return res[s.length()]; 
https://leetcode.com/problems/word-break/discuss/43842/Java-dp-solution-without-using-substring-O(N2)
    public boolean wordBreak(String s, List<String> wordDict) {
        //base case
        if(s == null || s.length() == 0) return false;
        //use dp and scan from right to left to avoid using substring
        boolean[] M = new boolean[s.length() + 1];
        M[s.length()] = true;
        for(int i = s.length() - 1; i >= 0; --i) {
            StringBuilder sb = new StringBuilder();
            for(int j = i; j < s.length(); ++j) {
                sb.append(s.charAt(j));
                //remember convert stringbuilder to string
                if(wordDict.contains(sb.toString()) && M[j + 1]) {
                    M[i] = true;
                    break;
                }
            }
        }
        return M[0];
    }


https://leetcode.com/discuss/63212/java-solution-using-dp
Your code can be optimized a bit. In the inner loop, start from the tail to head instead of head to tail to check if the substring exists. We can break immediately when we see a match.
We can do even better if we precalculate the min and max of words' lengths in the dictionary. Just need to check from (i-wordLengthMax) to (i-wordLengthMin),

public boolean wordBreak(String s, Set<String> wordDict) { if (s == null && wordDict == null) return true; if (s == null || wordDict == null) return false; //dp[i] represents if s.substring(0, i+1) is wordbreakable. boolean[] dp = new boolean[s.length()+1]; dp[0] = true; for (int i = 1; i <= s.length(); i++) { for (int j = i-1; j >=0; j--) { if (dp[j] && wordDict.contains(s.substring(j, i))){ dp[i] = true; break; } } } return dp[s.length()]; }

public boolean wordBreak(String s, Set<String> wordDict) {
    if (s == null && wordDict == null)
        return true;
    if (s == null || wordDict == null)
        return false;
    //dp[i] represents if s.substring(0, i) is wordbreakable.
    boolean[] dp = new boolean[s.length()+1];
    dp[0] = true;
    for (int i = 1; i <= s.length(); i++) {
        for (int j = 0; j < i; j++) {
            if (dp[j] && wordDict.contains(s.substring(j, i))) {
                dp[i] = true;
                break;
            }
        }
    }
    return dp[s.length()];
}
http://www.jiuzhang.com/solutions/word-break/
    private int getMaxLength(Set<String> dict) {
        int maxLength = 0;
        for (String word : dict) {
            maxLength = Math.max(maxLength, word.length());
        }
        return maxLength;
    }

    public boolean wordBreak(String s, Set<String> dict) {
        if (s == null || s.length() == 0) {
            return true;
        }

        int maxLength = getMaxLength(dict);
        boolean[] canSegment = new boolean[s.length() + 1];

        canSegment[0] = true;
        for (int i = 1; i <= s.length(); i++) {
            canSegment[i] = false;
            for (int lastWordLength = 1;
                     lastWordLength <= maxLength && lastWordLength <= i;
                     lastWordLength++) {
                if (!canSegment[i - lastWordLength]) {
                    continue;
                }
                String word = s.substring(i - lastWordLength, i);
                if (dict.contains(word)) {
                    canSegment[i] = true;
                    break;
                }
            }
        }

        return canSegment[s.length()];
    }

Scan the dict to get the max possible length of words first before the loop
public boolean wordBreak(String s, Set<String> wordDict) { if(s == null || s.length() == 0){ return true; } int n = s.length(); boolean [] dp = new boolean[n+1]; dp[0] = true ; int maxLength = 0; for(String t : wordDict){ maxLength = Math.max(maxLength, t.length()); } for(int i = 1; i <= n; i++){ dp[i] = false; for(int j = i-1; j >= Math.max(0, i - maxLength) && !dp[i]; j--){ if(dp[j]){ if(wordDict.contains(s.substring(j, i))){ dp[i] = true; } } } } return dp[n]; }
https://leetcode.com/discuss/8482/a-java-solution-with-similar-dp-idea
The idea is pretty similar to other DP solution. 1)keep all positions which could form substring contained in the set in a linkedlist 2) Iterate the target string, check substring between current position and stored positions. If new sub string hits the dictionary,add it the front of linkedlist 3)After iteration, check if the front element of linkedlist equals to the length of string.
It consumes 296ms
This solution is still a time O(n^2) and space O(n) one. It is better if dictionary contains long words.

public boolean wordBreak(String s, Set<String> dict) { if (s==null||s.length()==0) return false; else if (dict.contains(s)) return true; List<Integer> starts = new LinkedList<Integer>(); starts.add(0); for (int end=1;end<=s.length();end++) for (int i=0;i<starts.size();i++) if (dict.contains(s.substring(starts.get(i),end))){ starts.add(0,end); break; } return (starts.get(0)==s.length()); }
http://bangbingsyb.blogspot.com/2014/11/leetcode-word-break-i-ii.html
看到这题第一反应是DFS枚举查找,直到“探”到string尾部则算成功。但题目并不要求给出是如何break的,而只要判断是否能break。对这类判断“是”与“否”的可以用DFS暴力解决的题,可以尝试用DP做book keeping中间判断状态,避免DFS的反复搜索。比如这题,可以用一个数组dp记录S[0:i]是否能break。

dp[i]:S[0:i-1]是否能被break。例如dp[1]表示s[0]是否能被break。
dp[0] = true
dp[i] = true if and only if:
1. 存在一个i-1>=k>=0,使s[k:i-1]存在于dict中。
2. s[0: k-1]可以被break,即dp[k] = true。
http://www.cnblogs.com/jcliBlogger/p/4615620.html
 3     bool wordBreak(string s, unordered_set<string>& wordDict) {
 4         int n = s.length();
 5         vector<bool> dp(n + 1, false);
 6         dp[0] = true;
 7         for (string word : wordDict) {
 8             minlen = min(minlen, (int)word.length());
 9             maxlen = max(maxlen, (int)word.length());
10         }
11         for (int i = 1; i <= n; i++) {
12             for (int j = i - minlen; j >= max(0, j - maxlen); j--) {
13                 if (dp[j] && wordDict.find(s.substr(j, i - j)) != wordDict.end()) {
14                     dp[i] = true;
15                     break;
16                 }
17             }
18         }
19         return dp[n];
20     }
X. DFS + cache
https://leetcode.com/problems/word-break/discuss/43819/dfs-with-path-memorizing-java-solution
https://leetcode.com/problems/word-break/discuss/43798/Java-simple-recursing-with-memorization
Say length of the string s = N, when dfs() is called for first time, it takes O(N) to find a valid sub-string t.
Then, we call dfs(s, i, dict, set) and all invalid index will be kept in set when it returns.
The remaining loop iterations take O(N), since we simply look up hash tables.
So, the time complexity = O(N) + O(f(N-i)) + O(N) = O(2N + f(N-1)) = O(2N + 2(N-1) + ... + 1) = O(N^2).
    public boolean wordBreak(String s, Set<String> dict) {
        // DFS
        Set<Integer> set = new HashSet<Integer>();
        return dfs(s, 0, dict, set);
    }
    
    private boolean dfs(String s, int index, Set<String> dict, Set<Integer> set){
        // base case
        if(index == s.length()) return true;
        // check memory
        if(set.contains(index)) return false;
        // recursion
        for(int i = index+1;i <= s.length();i++){
            String t = s.substring(index, i);
            if(dict.contains(t))
                if(dfs(s, i, dict, set))
                    return true;
                else
                    set.add(i);
        }
        set.add(index);
        return false;
    }

 3     bool wordBreak(string s, unordered_set<string>& wordDict) {
 4         for (string word : wordDict) {
 5             minlen = min(minlen, (int)word.length());
 6             maxlen = max(maxlen, (int)word.length());
 7         }
 8         unordered_map<string, bool> records;
 9         return wordBreakDFS(s, 0, wordDict, records);
10     }
11 private:
12     int minlen, maxlen;
13     bool wordBreakDFS(string& s, int idx, unordered_set<string>& wordDict, unordered_map<string, bool>& records) {
14         int n = s.length();
15         if (idx == n) return true;
16         string tail = s.substr(idx, n - idx);
17         if (records.find(tail) != records.end())
18             return records[tail];
19         for (int i = idx + minlen; i <= min(idx + maxlen, n); i++) {
20             string part = s.substr(idx, i - idx);
21             if (wordDict.find(part) != wordDict.end() && wordBreakDFS(s, i, wordDict, records))
22                 return records[part] = true;
23         }
24         return records[tail] = false;
25     }
X. BFS:
https://leetcode.com/problems/word-break/discuss/43797/A-solution-using-BFS
We can use a graph to represent the possible solutions. The vertices of the graph are simply the positions of the first characters of the words and each edge actually represents a word. For example, the input string is "nightmare", there are two ways to break it, "night mare" and "nightmare". The graph would be
0-->5-->9
|__ __ _^
The question is simply to check if there is a path from 0 to 9. The most efficient way is traversing the graph using BFS with the help of a queue and a hash set. The hash set is used to keep track of the visited nodes to avoid repeating the same work.
For this problem, the time complexity is O(n^2) and space complexity is O(n), the same with DP. This idea can be used to solve the problem word break II. We can simple construct the graph using BFS, save it into a map and then find all the paths using DFS
    public boolean wordBreak(String s, List<String> wordDict) {
        Set<String> dict = new HashSet<>(wordDict);
        Set<Integer> visited = new HashSet<>();
        Queue<Integer> queue = new LinkedList<>();
        queue.add(0);
        while (!queue.isEmpty()) {
            int size = queue.size();
            for (int j = 0; j < size; j++) {
                int n = queue.remove();
                if (n == s.length()) {
                    return true;
                }
                if (!visited.contains(n)) {
                    for (int i = n + 1; i <= s.length(); i++) {
                        if (dict.contains(s.substring(n, i))) {
                            queue.add(i);
                            visited.add(n);
                        }
                    }
                }
            }
        }
        return false;
    }
 3     bool wordBreak(string s, unordered_set<string>& wordDict) {
 4         int n = s.length();
 5         for (string word : wordDict) {
 6             minlen = min(minlen, (int)word.length());
 7             maxlen = max(maxlen, (int)word.length());
 8         }
 9         queue<int> toVisit;
10         unordered_set<int> visited;
11         toVisit.push(0);
12         while (!toVisit.empty()) {
13             int start = toVisit.front();
14             toVisit.pop();
15             if (visited.find(start) == visited.end()) {
16                 visited.insert(start);
17                 for (int end = start + minlen - 1; end < min(start + maxlen, n); end++) {
18                     string part = s.substr(start, end - start + 1);
19                     if (wordDict.find(part) != wordDict.end()) {
20                         if (end + 1 == n) return true;
21                         toVisit.push(end + 1);
22                     }
23                 }
24             }
25         }
26         return false;
27     }
http://siddontang.gitbooks.io/leetcode-solution/content/greedy/word_break.html
假设dp(i)表示长度为i的字串能否别切分,dp方程如下:
dp(i) = dp(j) && s[j, i) in dict, 0 <= j < i
    bool wordBreak(string s, unordered_set<string> &dict) {
        int len = (int)s.size();
        vector<bool> dp(len + 1, false);
        dp[0] = true;

        for(int i = 1; i <= len; i++) {
            for(int j = i - 1; j >= 0; j--) {
                if(dp[j] && dict.find(s.substr(j, i - j)) != dict.end()) {
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp[len];
    }
http://segmentfault.com/a/1190000003698693
    public boolean wordBreak(String s, Set<String> wordDict) {
        boolean[] dp = new boolean[s.length()+1];
        Arrays.fill(dp,false);
        dp[s.length()]=true;
        // 外层循环递增长度
        for(int i = s.length()-1; i >=0 ; i--){
            // 内层循环寻找分割点
            for(int j = i; j < s.length(); j++){
                String sub = s.substring(i,j+1);
                if(wordDict.contains(sub) && dp[j+1]){
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp[0];
    }

https://cheonhyangzhang.wordpress.com/2015/10/06/139-leetcode-java-word-break-medium/
A[i] stands for for the substring [0, i], if it is a valid word break string.
A[i] = A[j] && dict.contains(substring(j+1, i))
    public boolean wordBreak(String s, Set<String> wordDict){
        int length = s.length();
        boolean[] valid = new boolean[length];
        for (int i = 0; i < length; i ++ ){
            String word = s.substring(0, i + 1);
            if (wordDict.contains(word))
                valid[i] = true;
            else{
                valid[i] = false;
                for (int j = 0; j < i; j ++){
                    if (valid[j] && wordDict.contains(s.substring(j + 1, i + 1))){
                        valid[i] = true;   
                        break;
                    }
                }
            }//else
        }//for
        return valid[s.length() - 1 ];
    }//wordBreak
http://blog.csdn.net/derrantcm/article/details/47774547
public boolean wordBreak(String s, Set<String> wordDict) { // 参数校验 if (s == null || s.length() < 1 || wordDict == null || wordDict.size() < 1) { return false; } // 标记是否匹配,match[i]表表[0, i-1]都匹配 int length = s.length(); boolean[] match = new boolean[length + 1]; match[0] = true; for (int i = 1; i < length + 1; i++) { for (int j = 0; j < i; j++) { if (match[j] && wordDict.contains(s.substring(j, i))) { // f(0,n) = f(0,i) + f(i,j) + f(j,n) match[i] = true; break; } } } return match[length]; }
http://codesniper.blogspot.com/2015/03/139-word-break-leetcode-java.html
 public boolean wordBreak(String s, Set<String> dict) {  
     if(s==null) return false;  
     if(s.length()==0) return true;  
     boolean[] res=new boolean[s.length()+1];  
     res[0]=true;  
     for(int i=0;i<s.length();i++){  
       for(int j=0;j<=i;j++){  
         if(res[j] && dict.contains(s.substring(j,i+1))){  
           res[i+1]=true;  
           break;  
         }  
       }  
     }  
     return res[s.length()];  
   } 
http://www.acmerblog.com/leetcode-word-break-5984.html
可以使用回溯法求解,每次从dict中查找是否有匹配的单词(下面用java实现的,调用startsWith函数),有的话就将字符串s前面的部分截取掉。从而缩小s的范围,直到s为空说明匹配成功。但是这种方法效率太低,运行超时。
02    public static boolean wordBreak(String s, Set<String> dict) {
03         if(s==null || s.length()==0return false;
04         return find(s,dict);
05      }
06     public static boolean find(String s, Set<String> set){
07         if(s.length() == 0return true;
08         boolean find = false;
09         for(String str:set){
10             if(s.startsWith(str)){
11                 find =  find(s.substring(str.length()), set);
12             }
13             if(find) return true;
14         }
15         return false;
16     }
可以利用记忆化存储的方式改进上面的算法,因为s会有很多重复的字符被重复计算。利用数组dp[][]记录下面字串的匹配结果为 true或false。初始化为0,表示没有计算过,dp[i][j] = 1表示 字串(i,j)的匹配结果为true,dp[i][j] = -1表示匹配结果为false。
01public class WordBreak {
02        static int dp[][];
03        static String string;
04     public static boolean wordBreak(String s, Set<String> dict) {
05         if(s==null || s.length()==0return false;
06         dp = new int[s.length()+1][s.length()+1];
07         string = s;
08         return find(0, s.length() ,dict);
09      }
10     public static boolean find(int s, int e, Set<String> set){
11         if(dp[s][e] == -1return false;
12         if( s == e || dp[s][e] == 1return true;
13         String tmpstr = string.substring(s,e);
14         boolean find = false;
15         for(String str:set){
16             if(tmpstr.startsWith(str)){
17                 find =  find(s+str.length(), e, set);
18             }
19             if(find){
20                dp[s][e] = 1;
21                 return true;
22             }
23         }
24         dp[s][e] = -1;
25         return false;
26     }

突然发现使用二维数组是没有必要的。可以使用一个数组 dp[i]表示 str[0.... i-1]能否成功分词。dp[0]初始化为true.
02    public boolean wordBreak(String s, Set<String> dict) {
03        boolean[] t = new boolean[s.length()+1];
04        t[0] = true//set first to be true, why?
05        //Because we need initial state
06
07        for(int i=0; i<s.length(); i++){
08            //should continue from match position
09            if(!t[i])
10                continue;
11
12            for(String a: dict){
13                int len = a.length();
14                int end = i + len;
15                if(end > s.length())
16                    continue;
17
18                if(t[end]) continue;
19
20                if(s.substring(i, end).equals(a)){
21                    t[end] = true;
22                }
23            }
24        }
25
26        return t[s.length()];
27    }
http://blog.csdn.net/doc_sgl/article/details/12323015
第一种方法:递归(超时)Time Limit Exceeded
思路:从s的第一个字母向后匹配,如果i前面的前缀可以匹配,就看s字符串i以后的后缀是否匹配
  1. bool wordBreak(string s, unordered_set<string> &dict) {  
  2.         // Note: The Solution object is instantiated only once.  
  3.         if(s.length() < 1) return true;  
  4.         bool flag = false;  
  5.         for(int i = 1; i <= s.length(); i++)  
  6.         {  
  7.             string tmpstr = s.substr(0,i);  
  8.             unordered_set<string>::iterator it = dict.find(tmpstr);  
  9.             if(it != dict.end())  
  10.             {  
  11.                 if(tmpstr.length() == s.length())return true;  
  12.                 flag = wordBreak(s.substr(i),dict);  
  13.             }  
  14.             if(flag)return true;  
  15.         }  
  16.         return false;  
  17.     } 
思路:从s的第一个字母向后匹配,如果i前面的前缀可以匹配,就看s字符串i以后的后缀是否匹配,在找后缀是否匹配时添加了记忆功能。
  1. bool wordBreakHelper(string s, unordered_set<string> &dict,set<string> &unmatch) {  
  2.         if(s.length() < 1) return true;  
  3.         bool flag = false;  
  4.         for(int i = 1; i <= s.length(); i++)  
  5.         {  
  6.             string prefixstr = s.substr(0,i);  
  7.             unordered_set<string>::iterator it = dict.find(prefixstr);  
  8.             if(it != dict.end())  
  9.             {  
  10.                 string suffixstr = s.substr(i);  
  11.                 set<string>::iterator its = unmatch.find(suffixstr);  
  12.                 if(its != unmatch.end())continue;  
  13.                 else{  
  14.                     flag = wordBreakHelper(suffixstr,dict,unmatch);  
  15.                     if(flag) return true;  
  16.                     else unmatch.insert(suffixstr);  
  17.                 }  
  18.             }  
  19.         }  
  20.         return false;  
  21.     } 
dp改进:dict中的单词有的长有的短,当prefixstr串小于最短串时就不匹配了,当prefixstr串大于最长的串时也不用匹配了
  1. bool wordBreakHelper(string s,unordered_set<string> &dict,set<string> &unmatched,int mn,int mx) {  
  2.         if(s.size() < 1) return true;  
  3.         int i = mx < s.length() ? mx : s.length();  
  4.         for(; i >= mn ; i--)  
  5.         {  
  6.             string preffixstr = s.substr(0,i);  
  7.             if(dict.find(preffixstr) != dict.end()){  
  8.                 string suffixstr = s.substr(i);  
  9.                 if(unmatched.find(suffixstr) != unmatched.end())  
  10.                     continue;  
  11.                 else  
  12.                     if(wordBreakHelper(suffixstr, dict, unmatched,mn,mx))  
  13.                         return true;  
  14.                     else  
  15.                         unmatched.insert(suffixstr);  
  16.             }  
  17.         }  
  18.         return false;  
  19.     }  
  20.     bool wordBreak(string s, unordered_set<string> &dict) {  
  21.         // Note: The Solution object is instantiated only once.  
  22.         if(s.length() < 1) return true;  
  23.         if(dict.empty()) return false;  
  24.         unordered_set<string>::iterator it = dict.begin();  
  25.         int maxlen=(*it).length(), minlen=(*it).length();  
  26.         for(it++; it != dict.end(); it++)  
  27.             if((*it).length() > maxlen)  
  28.                 maxlen = (*it).length();  
  29.             else if((*it).length() < minlen)  
  30.                 minlen = (*it).length();  
  31.         set<string> unmatched;  
  32.         return wordBreakHelper(s,dict,unmatched,minlen,maxlen);  
  33.     }  
X. top down DP solution
  1. Any breakable string must also be breakable if we divide it into two substrings properly. There are only N - 1 ways of dividing a string into two parts, with N the length of the string. So our task is to find a proper way out of these N - 1 ones. (only one will do since the problem does not ask us to find all ways.)
  2. To avoid repetition, we use a HashMap to store the results for each substring, which are obtained in turn by calling wordBreak() method.
  3. The total string is breakable if its two complementary substrings are breakable. Once we get the results for both parts, we can check this by getting the results from the HashMap.
In total the program will run in O(n^2) time with n the length of the string

private Map<String, Boolean> myMap = new HashMap<>(); public boolean wordBreak(String s, Set<String> dict) { // check if the dictionary contains the whole string if (dict.contains(s)) { return true; } // divide the string and check for each substring for (int i = 1; i < s.length(); i++) { // check for the left half substring if (!myMap.containsKey(s.substring(0,i))) { myMap.put(s.substring(0,i), wordBreak(s.substring(0,i), dict)); } // check for the right half substring if (!myMap.containsKey(s.substring(i, s.length()))) { myMap.put(s.substring(i, s.length()), wordBreak(s.substring(i, s.length()), dict)); } // if both substrings are breakable, then the total string is also breakable if (myMap.get(s.substring(0,i)) && myMap.get(s.substring(i, s.length()))) { return true; // add false to map } } // else the string is not breakable return false; }
Viewing it as a graph, you need to visit each node at most once. The hash table saves you from visiting each node more than once.
So N nodes with O(N) for the loop in each node gives O(N^2)
Here is my similar, but slightly simpler and slightly more efficient code using array instead of hash table:
public boolean wordBreak(String s, Set<String> dict) { boolean []cannotBreak = new boolean[s.length()]; return wordBreak(s,dict,0,cannotBreak); } boolean wordBreak(String s, Set<String>dict, int start,boolean [] cannotBreak){ int n=s.length(); if (start==n) return true; if (cannotBreak[start]) return false; for (int i=start+1;i<=n;i++) if (dict.contains(s.substring(start,i)) && wordBreak(s, dict,i,cannotBreak)) return true; cannotBreak[start] = true; return false; }
I think it's O(n^2).
Without the map cache, your time cost will be T(n) = T(n-1) + T(n-2) + T(n-3) ...... + T(1).
But with cache, each of T(n-2) T(n-3) ....... and T(1) is also computed and cached during T(n-1)'s computing. So each of them costs only constant time.
In this case, T(n) = T(n-1) + (n-2) = O(n^2).
X.  DP with Trie
https://leetcode.com/problems/word-break/discuss/219467/Java-iterative-solution-beats-99-using-Trie-and-DP
    TrieNode root;
    int[] chars;
    boolean dp[];

    public boolean wordBreak(String s, List<String> wordDict) {
        root = buildTrie(wordDict);
        
        chars = new int[s.length()];
        for (int i = 0; i < s.length(); i++) {
            chars[i] = s.charAt(i) - 'a';
        }
        
        dp = new boolean[chars.length + 1];
        dp[0] = true;
        
        for (int i = 0; i < chars.length; i++) {
            if (dp[i]) {
                fillDp(i);
            }
        }

        return dp[chars.length];
    }

    public void fillDp(int start) {
        TrieNode node = root;
        for (; start < chars.length; start++) {
            if (node.word) dp[start] = true;
            int c = chars[start];
            node = node.next[c];
            if (node == null) return;
        }

        if (node.word) dp[start] = true;
    }

    private static class TrieNode {
        TrieNode[] next = new TrieNode[26];
        boolean word;
    }

    private TrieNode buildTrie(List<String> dict) {
        TrieNode root = new TrieNode();
        for (String word : dict) {
            TrieNode node = root;
            for (char c : word.toCharArray()) {
                int i = c - 'a';
                if (node.next[i] == null) node.next[i] = new TrieNode();
                node = node.next[i];
            }
            node.word = true;
        }
        return root;
    }
https://www.techiedelight.com/word-break-problem-using-trie/
这道题目可以使用 Trie 来预处理构建 dictionary,该方法特别适合两种情况:1. dictionary 非常大; 2. 该function调用次数非常多。
https://www.geeksforgeeks.org/word-break-problem-trie-solution/
here I used trie tree search to assist the DP, 4ms 90%:
public class TrieNode { boolean isWord; TrieNode[] c; public TrieNode(){ isWord = false; c = new TrieNode[128]; } } public void addWord(TrieNode t, String w) { for (int i = 0; i < w.length(); i++) { int j = (int)w.charAt(i); if (t.c[j] == null) t.c[j] = new TrieNode(); t = t.c[j]; } t.isWord = true; } public boolean wordBreak(String s, List<String> wordDict) { TrieNode t = new TrieNode(), cur; for (String i : wordDict) addWord(t, i); char[] str = s.toCharArray(); int len = str.length; boolean[] f = new boolean[len + 1]; f[len] = true; for (int i = len - 1; i >= 0; i--) { //System.out.println(str[i]); cur = t; for (int j = i; cur != null && j < len ; j++) { cur = cur.c[(int)str[j]]; if (cur != null && cur.isWord && f[j + 1]) { f[i] = true; break; } } } return f[0]; }
X.DFS
https://discuss.leetcode.com/topic/44941/java-dfs-memorization-solution-beats-92
public boolean wordBreak(String s, Set<String> wordDict) {
        if (s == null || s.length() == 0) {
            return false;
        }
        int maxLength = maxLengthStr(wordDict);
        return dfs(s, 0, wordDict, new boolean[s.length()], maxLength);
    }
    
    private boolean dfs(String s, int idx, Set<String> wordDict, boolean[] unbreakable, int maxLength) {
        if (idx >= s.length()) {
            return true;
        } else if (unbreakable[idx]) {
            return false;
        } else {
            int end = Math.min(idx + maxLength, s.length());
            for (int i = idx + 1; i <= end; i++) {
                if (wordDict.contains(s.substring(idx, i))) {
                    if (dfs(s, i, wordDict, unbreakable, maxLength)) {
                        return true;
                    }
                }
            }
            unbreakable[idx] = true;
            return false;
        } 
    }
https://discuss.leetcode.com/topic/29840/concise-dfs-backtracking-solution/2
public boolean wordBreak(String s, Set<String> wordDict) {
    return dfs(s, wordDict, new HashSet<>());
}

private boolean dfs(String s, Set<String> wordDict, Set<String> checked) {
    if (s.isEmpty()) return true;
    if (checked.contains(s)) return false;
    checked.add(s);
    
    for (String w : wordDict) {
        if (s.startsWith(w) && dfs(s.substring(w.length()), wordDict, checked)) return true;
    }
    return false;
}
https://leetcode.com/discuss/26956/dfs-with-path-memorizing-java-solution
Use a set to record all position that cannot find a match in dict. That cuts down the run time of DFS to O(n^2)
public boolean wordBreak(String s, Set<String> dict) { // DFS Set<Integer> set = new HashSet<Integer>(); return dfs(s, 0, dict, set); } private boolean dfs(String s, int index, Set<String> dict, Set<Integer> set){ // base case if(index == s.length()) return true; // check memory if(set.contains(index)) return false; // recursion for(int i = index+1;i <= s.length();i++){ String t = s.substring(index, i); if(dict.contains(t)) if(dfs(s, i, dict, set)) return true; else set.add(i); } set.add(index); return false; }
X. BFS
https://discuss.leetcode.com/topic/2545/a-solution-using-bfs
The solution I post below using BFS is no better than those. Just to share some new thoughts.
We can use a graph to represent the possible solutions. The vertices of the graph are simply the positions of the first characters of the words and each edge actually represents a word. For example, the input string is "nightmare", there are two ways to break it, "night mare" and "nightmare". The graph would be
0-->5-->9
|__ __ _^
The question is simply to check if there is a path from 0 to 9. The most efficient way is traversing the graph using BFS with the help of a queue and a hash set. The hash set is used to keep track of the visited nodes to avoid repeating the same work.
For this problem, the time complexity is O(n^2) and space complexity is O(n), the same with DP. This idea can be used to solve the problem word break II. We can simple construct the graph using BFS, save it into a map and then find all the paths using DFS.

This BFS is just a optimized DP which avoids irrelevant states.
public boolean wordBreak(String s, Set<String> dict) { if (dict.contains(s)) return true; Queue<Integer> queue = new LinkedList<Integer>(); queue.offer(0); // use a set to record checked index to avoid repeated work. // This is the key to reduce the running time to O(N^2). Set<Integer> visited = new HashSet<Integer>(); visited.add(0); while (!queue.isEmpty()) { int curIdx = queue.poll(); for (int i = curIdx+1; i <= s.length(); i++) { if (visited.contains(i)) continue; if (dict.contains(s.substring(curIdx, i))) { if (i == s.length()) return true; queue.offer(i); visited.add(i); } } } return false; }
The code can still be optimized if we record the maximum and minimum length of the words in wordDict and only traverse the places within the minimum and maximum length.
bool wordBreak(string s, unordered_set<string>& wordDict) { for (string word : wordDict) { minlen = min(minlen, (int)word.length()); maxlen = max(maxlen, (int)word.length()); } queue<int> toVisit; unordered_set<int> isVisited; toVisit.push(0); while (!toVisit.empty()) { int start = toVisit.front(); toVisit.pop(); if (isVisited.find(start) == isVisited.end()) { isVisited.insert(start); for (int i = start + minlen - 1; i < min(start + maxlen, (int)s.length()); i++) { if (wordDict.find(s.substr(start, i - start + 1)) != wordDict.end()) { if (i + 1 == s.length()) return true; toVisit.push(i + 1); } } } } return false; }

http://yucoding.blogspot.com/2013/12/leetcode-question-word-break-i.html
The complexity of this problem is O(n^3), where n is the length of the original string. The code below has been tested and passed all the test cases in the OJ.
bool wordBreak(string s, unordered_set<string> &dict) {
        // IMPORTANT: Please reset any member data you declared, as
        // the same Solution instance will be reused for each test case.
        if (s.empty()){return false;}
        vector<vector<bool> > mp(s.size(),vector<bool>(s.size(),false));
         
        for (int i=0;i<s.size();i++){
            for (int j=i;j<s.size();j++){
                if (dict.find(s.substr(i,j-i+1))!=dict.end()){
                    mp[i][j]=true;
                }
            }
        }
         
        for (int i=0;i<s.size();i++){
            for (int j=i;j<s.size();j++){
                for(int k=i;k<j;k++){
                    if (!mp[i][j]){
                        mp[i][j]=mp[i][k]&&mp[k+1][j];
                    }
                }
            }
        }
         
        return mp[0][s.size()-1];
    }
https://leetcode.com/discuss/24964/o-2-n-or-o-n
http://rleetcode.blogspot.com/2014/01/word-break.html
https://leetcode.com/discuss/17365/java-8-functional-programming-rocks
public class WordBreaker { /** * Creates the word breaker. * * @param dictionary each element of the set corresponds to a word inside the * dictionary. */ public WordBreaker(final HashSet<String> dictionary) { this.dictionary = dictionary; // The size of the largest dictionary word: this.maxSize = dictionary.stream() .mapToInt(String::length) .max() .orElse(0); this.failures = new HashSet<String>(); } /** * Tells if the given word can be de-compounded with the current dictionary. * * @param word The word to be tested. * * @return True on success, false otherwise. */ public boolean wordBreak(String word) { if (word.isEmpty()) { return false; // End of recursion. } else if (failures.contains(word)) { return false; // Failure memoized. } else if (dictionary.contains(word)) { return true; // The whole word belongs to the dictionary. } int splitPos = Integer.min(maxSize, word.length() - 1); /* Smart solution: - splits the world into two non empty parts - if the prefix is in the dictionary - then recursively call wordBreak with the prefix - at the first successful recursion, succeeds. */ if (!IntStream.range(0, splitPos) .mapToObj(pos -> word.substring(0, splitPos - pos)) .filter(prefix -> dictionary.contains(prefix)) .map(prefix -> word.substring(prefix.length())) .anyMatch(suffix -> wordBreak(suffix))) { failures.add(word); // memoizes any failure. return false; } return true; } private final HashSet<String> dictionary; private HashSet<String> failures; private int maxSize; }

http://www.programcreek.com/2012/12/leetcode-solution-word-break/
The problem is equivalent to matching the regular expression (leet|code)*, which means that it can be solved by building a DFA in O(2^m) and executing it in O(n). (Thanks to hdante.) Leetcode online judge does not allow using the Pattern class though.
public static void main(String[] args) {
 HashSet<String> dict = new HashSet<String>();
 dict.add("go");
 dict.add("goal");
 dict.add("goals");
 dict.add("special");
 
 StringBuilder sb = new StringBuilder();
 
 for(String s: dict){
  sb.append(s + "|");
 }
 
 String pattern = sb.toString().substring(0, sb.length()-1);
 pattern = "("+pattern+")*";
 Pattern p = Pattern.compile(pattern);
 Matcher m = p.matcher("goalspecial");
 
 if(m.matches()){
  System.out.println("match");
 }
}

X.
This problem can be solve by using a naive approach, which is trivial. A discussion can always start from that though.
    public boolean wordBreak(String s, Set<String> dict) {
             return wordBreakHelper(s, dict, 0);
    }
 
    public boolean wordBreakHelper(String s, Set<String> dict, int start){
        if(start == s.length()) 
            return true;
 
        for(String a: dict){
            int len = a.length();
            int end = start+len;
 
            //end index sould be <= string length
            if(end > s.length()) 
                continue;
 
            if(s.substring(start, start+len).equals(a))
                if(wordBreakHelper(s, dict, start+len))
                    return true;
        }
 
        return false;
    }
https://leetcode.com/discuss/7181/what-time-complexity-recursive-solution-this-problem-how-get
Most recursive algorithm like this have an exponential time complexity. To see this, we expand T(n-1) to
T(n-1) = k(n-1) + T(n-2) + T(n-3) + ... T(1) <=> T(n-1) - k(n-1) = T(n-2)+...+T(1) ---------- (1)
and we already have
T(n) = kn + T(n-1) + T(n-2) + ... + T(1) <=> T(n) - kn - T(n-1) = T(n-2) +... + T(1) ----------(2)
Combine (1) and (2) we have:
T(n) - kn - T(n-1) = T(n-1) - k(n-1) <=> T(n) = 2T(n-1) + k
Therefore, we have O(n) = 2^n
https://discuss.leetcode.com/topic/3454/hi-anyone-who-knows-the-time-complexity-of-recursion-with-dfs-here-is-the-code/5
bool wordBreakHelper(string s,int len, unordered_set<string> &dict, int pos)
    {
        if (pos == len)
            return true;
        for (int i = 1; i <= len - pos; i++)
        {
            string t = s.substr(pos, i);
            if (dict.find(t) != dict.end())
            {
                if (wordBreakHelper(s, len, dict, pos + i))
                    return true;
            }
        }
        return false;
    }
    bool wordBreak(string s, unordered_set<string> &dict)
    {
        return wordBreakHelper(s, s.length(), dict, 0);
    }
The time complexity should be O(2^n).
Let f(n) be the total loop times, where n is s.length() - pos, i.e. the length of the string.
Given s="aaa…ab", dict=["a","aa","aaa",…"aaa…a"],
  • f(n) = f(n-1) + f(n-2) + … + f(1)
  • f(n-1) = f(n-2) + f(n-3) + … + f(1)
  • f(1) = 1
then substitute f(n-1) in f(n):
f(n) = ( f(n-2) + f(n-3) + … + f(1) ) + ( f(n-2) + f(n-3) + … + f(1) )
     = 2 × ( f(n-2) + f(n-3) + … + f(1) )
     = 2 × 2 × ( f(n-3) + f(n-4) + … + f(1) )
     = 2^(n-1)
And searching in the dict take O(1) time in average. So the total time is O(2^n).
https://www.cnblogs.com/lightwindy/p/8509053.html
Followup: 返回其中的一个解,如果要返回全部解需要用140. Word Break II的方法,一个解要简单很多。F jia
  public String wordBreak(String s, Set<String> dict) {
      if (s == null || s.isEmpty() || dict == null) {
          return "";
      }
      boolean[] dp = new boolean[s.length() + 1];
      String[] words = new String[s.length() + 1];
      dp[0] = true;
      words[0] = "";
      for (int i = 1; i <= s.length(); i++) {
          for (int j = 0; j < i; j++) {
              if (dp[j] && dict.contains(s.substring(j, i))) {
                  dp[i] = true;
                  if (words[j].isEmpty()) {
                      words[i] = s.substring(j, i);
                  else {
                      words[i] = words[j] + " " + s.substring(j, i);
                  }
              }
          }
      }
      if (dp[s.length()]) {
          return words[s.length()];
      else {
          return "";
      }
  }  
https://leetcode.com/problems/word-break/discuss/143310/Possible-Follow-Up-Question
Has anybody thought about the solution when the problem is changed such that each word in the dictionary can be only used once?
My initial thoughts:
• DP:
At each point i where the string is breakable, we need to use a set to remember which words we have used to break the string up to this point. And when we are considering a later point j, based on which previous breakable point we are using as a pivot, we will need to consider the "breakability" of the string at index j with (input set - set stored at index j)
• DFS + Memo:
The memo is not only keyed by the starting index, but should also encode the information of what words are still available in the set, because both of those together represent the state of the DFS (whether the substring starting at index i, with the remaining usable words, is breakable). But I am not exactly sure how to encode a set of strings as a key. (sort the strings and concatenate?)


140
变种题,求最⼩小合理理分割次数
最后⼀一道简单的Dictionary中能不不能组成target word,最后follow up关于
dictionary很⼤大的情况怎么办,⽐比如⽤用cache。
2. 变种 同样⽤用DP 原来维护能否拼成改成 维护最⼩小拼接次数

Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts