LeetCode 140 - Word Break II


Related: LeetCode 139 - Word Break I
[LeetCode] Word Break II | 程序员达达
Given a string s and a dictionary of words dict, add spaces in s to construct a sentence where each word is a valid dictionary word.
Return all such possible sentences.
For example, given
s = "catsanddog",
dict = ["cat", "cats", "and", "sand", "dog"].
A solution is ["cats and dog", "cat sand dog"].
https://stackoverflow.com/questions/39519667/whats-the-time-complexity-of-this-recursive-algorithm
Time complexity is O(2^n * n).
The number of maximal such word breaks is 2^(n-1), with the bijection to binary vectors of length n-1: For each space between two characters, you can either break the word there (1 in the vector) or not (0 in the vector), giving you 2^(n-1) such possible words.
Since creating each string, calculating its hash and adding it to the set is O(n) (the substring's length), you get O(2^n * n).
You get the worst case for:
dict = [a, aa, aaa, aaaa, ....]
s = "aaaaaaaaa...a"
Your map ensures you do not do duplicated work, using memoization. (Without that, complexity would have been factorial of n, but using it you avoid duplicate work like recalculating from scratch the prefix of "aaa...a|a|a" and "aa...a|aa"
http://blog.csdn.net/linhuanmars/article/details/22452163
这道题目要求跟Word Break比较类似,不过返回的结果不仅要知道能不能break,如果可以还要返回所有合法结果。一般来说这种要求会让动态规划的效果减弱很多,因为我们要在过程中记录下所有的合法结果,中间的操作会使得算法的复杂度不再是动态规划的两层循环,因为每次迭代中还需要不是constant的操作,最终复杂度会主要取决于结果的数量,而且还会占用大量的空间,因为不仅要保存最终结果,包括中间的合法结果也要一一保存,否则后面需要历史信息会取不到。所以这道题目我们介绍两种方法,一种是直接brute force用递归解,另一种是跟Word Break思路类似的动态规划。
对于brute force解法,代码比较简单,每次维护一个当前结果集,然后遍历剩下的所有子串,如果子串在字典中出现,则保存一下结果,并放入下一层递归剩下的字符。思路接近于我们在N-Queens这些NP问题中经常用到的套路。
  1. public ArrayList<String> wordBreak(String s, Set<String> dict) {  
  2.     ArrayList<String> res = new ArrayList<String>();  
  3.     if(s==null || s.length()==0)  
  4.         return res;  
  5.     helper(s,dict,0,"",res);  
  6.     return res;  
  7. }  
  8. private void helper(String s, Set<String> dict, int start, String item, ArrayList<String> res)  
  9. {  
  10.     if(start>=s.length())  
  11.     {  
  12.         res.add(item);  
  13.         return;  
  14.     }  
  15.     StringBuilder str = new StringBuilder();  
  16.     for(int i=start;i<s.length();i++)  
  17.     {  
  18.         str.append(s.charAt(i));  
  19.         if(dict.contains(str.toString()))  
  20.         {  
  21.             String newItem = item.length()>0?(item+" "+str.toString()):str.toString();  
  22.             helper(s,dict,i+1,newItem,res);  
  23.         }  
  24.     }  
  25. }
X. DP
https://blog.csdn.net/Zzy_ZhangZeyu_/article/details/80317513
    vector<string> wordBreak(string s, vector<string>& wordDict) {
        vector<vector<string> > dp(s.length() + 1, vector<string>());
        dp[0].push_back(string());

        unordered_set<string> dict;
        for(string str : wordDict)
            dict.insert(str);

        if( !breakable(s, dict) )
            return dp.back();

        for(int i = 1; i <= s.length(); i++){
            for(int j = 0; j < i; j++){
                string ss = s.substr(j, i - j);
                if(dp[j].size() > 0 && dict.find(ss) != dict.end()){
                    for(string prefix : dp[j]){
                        dp[i].push_back(prefix + (prefix == "" ? "" : " ") + ss);
                    }
                }
            }
        }

        return dp.back();
    }

    bool breakable(string s, unordered_set<string> &dict){
        vector<bool> dp(s.length() + 1, false);
        dp[0] = true;

        for(int i = 1; i <= s.length(); i++){
            for(int j = 0; j < i; j++){
                string ss = s.substr(j, i - j);
                if(dp[j] && dict.find(ss) != dict.end()){
                    dp[i] = true;
                    break;
                }
            }
        }
        return dp.back();
    }

X. DFS + cache
https://leetcode.com/problems/word-break-ii/discuss/44167/My-concise-JAVA-solution-based-on-memorized-DFS

X. Fastest
public List<String> wordBreak(String s, List<String> wordDict) { List<String> res = new ArrayList<>(); int n = s.length(); if (n == 0) return res; Set<String> dict = new HashSet<>(); int maxLen = 0; for (String word : wordDict) { dict.add(word); maxLen = Math.max(maxLen, word.length()); } boolean[] dp = new boolean[n + 1]; Arrays.fill(dp, true); dfs(s, new StringBuilder(), res, dp, dict, maxLen, 0); return res; } private void dfs(String s, StringBuilder tmp, List<String> res, boolean[] dp, Set<String> dict, int maxLen, int depth) { if (depth == s.length()) { res.add(tmp.toString()); } else { for (int i = depth + 1; i <= Math.min(s.length(), depth + maxLen); i++) { if (dp[i]) { String sub = s.substring(depth, i); if (dict.contains(sub)) { int oldLen = tmp.length(); if (oldLen != 0) tmp.append(" "); tmp.append(sub); int oldSize = res.size(); dfs(s, tmp, res, dp, dict, maxLen, i); if (res.size() == oldSize) dp[i] = false; tmp.delete(oldLen, tmp.length()); } } } } }


public List<String> wordBreak(String s, List<String> wordDict) { // write your code here if (s == null || s.length() == 0) { return new ArrayList<String>(); } Set<String> wordSet = new HashSet<>(); int maxLen = 0; for (String word : wordDict) { wordSet.add(word); maxLen = Math.max(maxLen, word.length()); } Map<Integer, List<String>> memo = new HashMap<>(); List<String> res = helper(s, wordSet, memo, 0, maxLen); return res; } private List<String> helper(String s, Set<String> wordSet, Map<Integer, List<String>> memo, int startIdx, int maxLen) { if (memo.containsKey(startIdx)) { return memo.get(startIdx); } List<String> result = new ArrayList<>(); for (int i = startIdx; i < s.length() && i < startIdx + maxLen; i++) { String pref = s.substring(startIdx, i + 1); if (wordSet.contains(pref)) { if (i == s.length() - 1) { result.add(pref); return result; } List<String> tmp = helper(s, wordSet, memo, i + 1, maxLen); for (String suff : tmp) { result.add(pref + " " + suff); } } } memo.put(startIdx, result); return result; }
另一种思路是Backtracking。如果[start, end]之间组成的单词在wordDict中,就递归调用dfs(end + 1, ...)来查找以end + 1开头所能组成的句子。再将[start, end]和dfs(end + 1, ...)返回的句子拼接起来,组成完成的句子。(注:若dfs返回为空,则表明不能组成句子)。

一个关键的地方就是使用memory来避免重复计算。比如下图中,黄色高亮部分的dog,可以直接根据之前相同位置的搜索结果来获得。其中memo[7],表示以index为7的位置为起始所能组成的句子。
    vector<string> wordBreak(string s, vector<string>& wordDict) {
        unordered_set<string> dict;
        for(string str : wordDict)
            dict.insert(str);

        vector<string >ret = dfs(s, 0, dict);

        return ret;
    }

    vector<string> dfs(string &s, int start, unordered_set<string> &dict){
        if(memo.find(start) != memo.end())
            return memo[start];

        if(start == s.length()){
            cout << "create memo: " << start << endl;
            memo[start] = vector<string>(1, string());
            return memo[start];
        }

        vector<string> ret;
        for(int i = start; i < s.length(); i++){
            string ss = s.substr(start, i - start + 1);
            if(dict.find(ss) != dict.end()){
                vector<string> suffixes = dfs(s, i + 1, dict);

                for(string suffix : suffixes)
                    ret.push_back(ss + (suffix == "" ? "" : " ") + suffix);
            }
        }

        cout << "create memo: " << start << endl;
        memo[start] = ret;
        return ret;
    }

private:
    map<int, vector<string> > memo;

http://www.jiuzhang.com/solutions/word-break-ii/
https://leetcode.com/discuss/65692/my-concise-java-solution-based-on-memorized-dfs
Using DFS directly will lead to TLE, so I just used HashMap to save the previous results to prune duplicated branches.
public List<String> wordBreak(String s, Set<String> wordDict) { return DFS(s, wordDict, new HashMap<String, LinkedList<String>>()); } // DFS function returns an array including all substrings derived from s. List<String> DFS(String s, Set<String> wordDict, HashMap<String, LinkedList<String>>map) { if (map.containsKey(s)) return map.get(s); LinkedList<String>res = new LinkedList<String>(); if (s.length() == 0) { res.add(""); return res; } for (String word : wordDict) { if (s.startsWith(word)) { List<String>sublist = DFS(s.substring(word.length()), wordDict, map); for (String sub : sublist) res.add(word + (sub.isEmpty() ? "" : " ") + sub); } } map.put(s, res); return res; }
Time complexity is O(len(wordDict) ^ len(s / minWordLenInDict)), because there're len(wordDict) possibilities for each cut
First check whether there is solution.
http://www.danielbit.com/blog/puzzle/leetcode/leetcode-word-break-ii
    public List<String> wordBreak(String s, Set<String> dict) {
        List<String> ret = new ArrayList<String>();
        if (s==null||s.length()==0) return ret;
 
        boolean[] table = new boolean[s.length()+1];
        table[0] = true;
        for (int i=1; i<s.length()+1; ++i)
            for (int k=0; k<i; ++k) {
                table[i] = table[k] && dict.contains(s.substring(k,i));
                if (table[i] == true)
                    break;
            }
        // s can not be divided
        if (table[s.length()]==false) return ret;
 
        StringBuilder sb = new StringBuilder();
        dfs(s, 0, sb, ret, dict);
        return ret;
    }
 
    private void dfs(String s, int start, StringBuilder sb, List<String> ret, Set<String> dict) {
        // termination condition of recursion
        if ( start>= s.length() ) {
            ret.add(new String(sb));
            return;
        }
        for (int i=start+1; i<=s.length(); ++i) {
            String sub = s.substring(start, i); // i is exclusive
            if (dict.contains(sub)) {
                int curLen = sb.length();
                if ( curLen!=0 )
                    sb.append(" ");
                sb.append(sub);
 
                dfs(s, i, sb, ret, dict);
                sb.delete(curLen, sb.length());
            }
        }
 
    }
https://leetcode.com/discuss/91894/java-6ms-simple-solution-beating-88%25
Save it to hashmap to memorize the intermediate results and max length of valid word in dict. 
Can also get the min length. But is it worth? As it need iterate the dict which may be huge.
HashMap<Integer, List<String>> dp = new HashMap<>(); public List<String> wordBreak(String s, Set<String> wordDict) { int maxLength = -1; for(String ss : wordDict) maxLength = Math.max(maxLength, ss.length()); return addSpaces(s, wordDict, 0, maxLength); } private List<String> addSpaces(String s, Set<String> wordDict, int start, int max){ List<String> words = new ArrayList<>(); if(start == s.length()) { words.add(""); return words; } for(int i = start + 1; i <= max + start && i <= s.length(); i++){ String temp = s.substring(start, i); if(wordDict.contains(temp)){ List<String> ll; if(dp.containsKey(i)) ll = dp.get(i); else ll = addSpaces(s, wordDict, i, max); for(String ss : ll) words.add(temp + (ss.equals("") ? "" : " ") + ss); } } dp.put(start, words); return words; }
X. Great solution: 
http://www.jiuzhang.com/solutions/word-break-ii/
    private void search(int index, String s, List<Integer> path,
                   boolean[][] isWord, boolean[] possible,
                   List<String> result) {
        if (!possible[index]) {
            return;
        }
        
        if (index == s.length()) {
            StringBuilder sb = new StringBuilder();
            int lastIndex = 0;
            for (int i = 0; i < path.size(); i++) {
                sb.append(s.substring(lastIndex, path.get(i)));
                if (i != path.size() - 1) sb.append(" ");
                lastIndex = path.get(i);
            }
            result.add(sb.toString());
            return;
        }
        
        for (int i = index; i < s.length(); i++) {
            if (!isWord[index][i]) {
                continue;
            }
            path.add(i + 1);
            search(i + 1, s, path, isWord, possible, result);
            path.remove(path.size() - 1);
        }
    }
    
    public List<String> wordBreak(String s, Set<String> wordDict) {
        ArrayList<String> result = new ArrayList<String>();
        if (s.length() == 0) {
            return result;
        }
        
        boolean[][] isWord = new boolean[s.length()][s.length()];
        for (int i = 0; i < s.length(); i++) {
            for (int j = i; j < s.length(); j++) {
                String word = s.substring(i, j + 1);
                isWord[i][j] = wordDict.contains(word);
            }
        }
        
        boolean[] possible = new boolean[s.length() + 1];
        possible[s.length()] = true;
        for (int i = s.length() - 1; i >= 0; i--) {
            for (int j = i; j < s.length(); j++) {
                if (isWord[i][j] && possible[j + 1]) {
                    possible[i] = true;
                    break;
                }
            }
        }
        
        List<Integer> path = new ArrayList<Integer>();
        search(0, s, path, isWord, possible, result);
        return result;
    }

https://leetcode.com/discuss/11564/my-dp-solution-in-java
Recursively scan through the array from left to right. At each point, perform a for loop from the current index to the end of the string to see if you can form words along the way. If a word is found, update the current index to the position right after the word, and call the recursive function.
The key to minimizing time complexity is by keeping a hash set of indexes that are "unbreakable", meaning you cannot break up the words properly from the current index to the end of the string. When we encounter these kind of situations, we know to return and save time. An additional boolean variable "wordFound" is used to keep track of whether or not the remaining substring is "breakable".
public List<String> wordBreak(String s, Set<String> wordDict) { List<String> result = new LinkedList<String>(); wordBreakHelper(new HashSet(), 0,result, s, "", wordDict); return result; } public boolean wordBreakHelper(HashSet badIndex, int curIndex, List<String> result,
String s, String curSentence, Set<String> wordDict) { boolean foundWord = false; for(int i = curIndex; i < s.length(); i++) { if(wordDict.contains(s.substring(curIndex,i+1))) { String newSentence = curSentence + " " + s.substring(curIndex,i+1); if(i == s.length()-1) { result.add(newSentence.substring(1)); foundWord = true; } else { if(badIndex.contains(i+1)) continue; if(wordBreakHelper(badIndex, i+1, result, s, newSentence, wordDict)) foundWord = true; } } } if(foundWord) return true; else { badIndex.add(curIndex); return false; } }


X. DP
http://www.programcreek.com/2014/03/leetcode-word-break-ii-java/
The following diagram shows the structure of the tracking array.
public static List<String> wordBreak(String s, Set<String> dict) {
    //create an array of ArrayList<String>
    List<String> dp[] = new ArrayList[s.length()+1];
    dp[0] = new ArrayList<String>();
 
    for(int i=0; i<s.length(); i++){
        if( dp[i] == null ) 
            continue; 
 
        for(String word:dict){
            int len = word.length();
            int end = i+len;
            if(end > s.length()) 
                continue;
 
            if(s.substring(i,end).equals(word)){
                if(dp[end] == null){
                    dp[end] = new ArrayList<String>();
                }
                dp[end].add(word);
            }
        }
    }
 
    List<String> result = new LinkedList<String>();
    if(dp[s.length()] == null) 
        return result; 
 
    ArrayList<String> temp = new ArrayList<String>();
    dfs(dp, s.length(), result, temp);
 
    return result;
}
 
public static void dfs(List<String> dp[],int end,List<String> result, ArrayList<String> tmp){
    if(end <= 0){
        String path = tmp.get(tmp.size()-1);
        for(int i=tmp.size()-2; i>=0; i--){
            path += " " + tmp.get(i) ;
        }
 
        result.add(path);
        return;
    }
 
    for(String str : dp[end]){
        tmp.add(str);
        dfs(dp, end-str.length(), result, tmp);
        tmp.remove(tmp.size()-1);
    }
}
http://www.danielbit.com/blog/puzzle/leetcode/leetcode-word-break-ii

    public List<String> wordBreak(String s, Set<String> dict) {
        if (s == null || s.length() == 0)
            return new ArrayList<String>();
        int n = s.length();
        boolean[] wb = new boolean[n+1];
        wb[0] = true;
        List<List<String>> words = new ArrayList<List<String>>();
        for (int i = 0; i <= n; i++)
            words.add(new ArrayList<String>());
        words.get(0).add("");
        for (int i = 1; i <= n; i++) {
            for (int j = 0; j < i; j++) {
                String temp = s.substring(j, i);
                if (wb[j] && dict.contains(temp)) {
                    wb[i] = true;
                    for (String str : words.get(j)) {
                        if (str.equals(""))
                            words.get(i).add(String.format("%s", temp));//?
                        else
                            words.get(i).add(String.format("%s %s", str, temp));//?
                    }
                }
            }
        }
        return words.get(n);
    }
http://fisherlei.blogspot.com/2013/11/leetcode-wordbreak-ii-solution.html
http://www.csdn123.com/html/topnews201408/2/5102.htm
我们可以加一个boolean的数组,b[i]表示从i到len的的字串可不可以进行word break. 如果我们在当前根本没有找到任何的word, 也就表明这一串是不能word break的,记一个false在数组里。这样下次进入dfs这里的时候,直接就返回一个false.通过这个剪枝我们也可以减少复杂度。
     * // 解法3:重新剪枝。
     */
    // 我们用DFS来解决这个问题吧
    public static List<String> wordBreak(final String s, final Set<String> dict) {
        if (s == null || s.length() == 0 || dict == null) {
            return null;
        }

        final List<String> ret = new ArrayList<String>();

        // 记录切割过程中生成的字母
        final List<String> path = new ArrayList<String>();

        final int len = s.length();

        // 注意:一定要分配 Len+1 否则会爆哦.
        final boolean canBreak[] = new boolean[len + 1];
        for (int i = 0; i < len + 1; i++) {
            canBreak[i] = true;
        }

        dfs3(s, dict, path, ret, 0, canBreak);

        return ret;
    }

    // 我们用DFS模板来解决这个问题吧
    public static void dfs3(final String s, final Set<String> dict, final List<String> path, final List<String> ret,
            final int index, final boolean canBreak[]) {
        final int len = s.length();
        if (index == len) {
            // 结束了。index到了末尾
            final StringBuilder sb = new StringBuilder();
            for (final String str : path) {
                sb.append(str);
                sb.append(" ");
            }
            // remove the last " "
            sb.deleteCharAt(sb.length() - 1);
            ret.add(sb.toString());
            return;
        }

        // if can't break, we exit directly.
        if (!canBreak[index]) {
            return;
        }

        for (int i = index; i < len; i++) {
            // 注意这些索引的取值。左字符串的长度从0到len
            final String left = s.substring(index, i + 1);
            if (!dict.contains(left) || !canBreak[i + 1]) {
                // 如果左字符串不在字典中,不需要继续递归
                continue;
            }

            // if can't find any solution, return false, other set it
            // to be true;
            path.add(left);

            final int beforeChange = ret.size();
            dfs3(s, dict, path, ret, i + 1, canBreak);
            // 注意这些剪枝的代码. 关键在于此以减少复杂度
            if (ret.size() == beforeChange) {
                canBreak[i + 1] = false;
            }
            path.remove(path.size() - 1);
        }

    }
http://zjalgorithm.blogspot.com/2014/11/leetcode-in-java-word-break-ii-dp.html
只要在最前面用WB I的dp先检测一下输入的有效性,就能过。

这题有人用像WB I一样的dp,boolean改为存储中间结果,最后TLE。为什么呢,一看test case,因为一个不能break的词,所以过不了。从左往右DP过不了,有人就从右往左写,一下就过了,因为那个超长test case的只管一边,还是自欺欺人。

其实DP是最快的,test case的目的是告诉大家:针对invalid input的优化,程序应该fail fast。而不是傻傻的去算完。


java do not allow generic array:
List<Integer>[] arrayOfLists = new List<Integer>[2];  // compile-time error
public List<String> wordBreak(String s, Set<String> dict) {
    if(!isPossible(s, dict)) return new LinkedList<String>();
    // index: subString(0,index) -> result List<String>
    HashMap<Integer, List<String>> dp = new HashMap<Integer, List<String>>();
    List<String> base = new LinkedList<String>();
    base.add("");
    dp.put(0, base);
    for(int i=0; i<s.length(); i++) {
        List<String> cur = new LinkedList<String>();
        dp.put(i+1, cur);
        for(int j=0; j<=i; j++) {// cache substring result, it's o(n)
            if( dict.contains(s.substring(j,i+1)) ) {    // [0...j-1] && [j...i]; condition: dp[j].size()!=0 simplified
                for( String str : dp.get(j) ) {
                    dp.get(i+1).add(str+ (str.length()==0? "":" ") + s.substring(j, i+1));
                }
            }
        }
    }
    return dp.get(s.length());
}
public boolean isPossible(String s, Set<String> dict) { // exactly same code of with WB I
    int len = s.length();
    boolean[] isSub = new boolean[len+1];
    isSub[0] = true;    // empty str is substring
    for(int i=0; i<len; i++) {
        for(int j=0; j<=i; j++) {
            isSub[i+1] = isSub[j] && dict.contains(s.substring(j,i+1));
            if( isSub[i+1] ) {
                break// break from inner for loop
            }
        }
    }
    return isSub[len];  // return last member
     
}
http://www.acmerblog.com/word-break-ii-6128.html
第一题是用dp[i]表示 字符串s[0....i-1]是否可以分词成功。
为了记录是怎么分词的,我们可以用dp[i]表示 字符串s[0.....i-1]的最后一个分词word,则s[0.....i-1]的倒数第二个分词即为dp[o....i-word.length()-1]。但这种方法只能记录一种分词方法,因此有必要把 dp[i] 定义 List<String>,这样就可以表示多种分词方法,最后使用DFS搜索,找到所有的分词方法。
例如对于上面的例子:  s.length()=10, 则 dp[10] = {“cat”}, dp[7]={“and”,”sand”}, dp[4]={“cats”}, dp[3]={“cat”}, dp[0]={}, 其它的都为null。
dfs搜索的时,可以从后向前,也可以从前向后,只要保证最终结果的顺序即可。下面是用的从后向前搜索
    public static List<String> wordBreak(String s, Set<String> dict) {
03        List<String> dp[] = new ArrayList[s.length()+1];
04        dp[0] = new ArrayList<String>();
05        for(int i=0; i<s.length(); i++){
06            //i是开始位置
07            if( dp[i] == null continue//前面的部分必须是可以匹配的
08            for(String word:dict){
09                int len = word.length();
10                int end = i+len;
11                if(end > s.length()) continue;
12                if(s.substring(i,end).equals(word)){
13                    if(dp[end] == null){
14                        dp[end] = new ArrayList<String>();
15                    }
16                    dp[end].add(word);//记录上一个位置
17                }
18            }
19        }
20
21        List<String> ans = new LinkedList<String>();
22        if(dp[s.length()] == nullreturn ans;
23        ArrayList<String> tmp = new ArrayList<String>();
24        dfsStringList(dp,s.length(),ans, tmp);
25        return ans;
26    }
27
28    public static void dfsStringList(List<String> dp[],int end,List<String> res, ArrayList<String> tmp){
29        if(end <= 0){
30            String ans = tmp.get(tmp.size()-1);
31            for(int i=tmp.size()-2; i>=0; i--)
32                ans += (" " + tmp.get(i) );
33            res.add(ans);
34            return;
35        }
36        for(String str:dp[end]){
37            tmp.add(str);
38            dfsStringList(dp,end-str.length(), res, tmp);
39            tmp.remove(tmp.size()-1);
40        }
41    }
http://yucoding.blogspot.com/2014/01/leetcode-question-word-break-ii.html
For the "Return all" problems, usually DFS or BFS will work well.
In this problem, a DFS with a simple cutting edge condition will pass all the test cases.
It seems that the last test case will cause the "time exceed" error:
Last executed input:"aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaab", ["a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa","aaaaaaaaa","aaaaaaaaaa"]
All the searching rounds in this case are not needed because this string has no solutions (last "b" cannot match any word in dict).

So, according to the idea of the previous question word break I, after get the index array mp, a initial test is provided and if there is no possible solution, no searching is needed and return the empty vector.
要求所有可能的word break的解集,DFS是跑不了的。所以问题的关键在于如何剪枝。
https://stupidcodergoodluck.wordpress.com/2013/11/16/leetcode-word-break-ii/
第一步dp的结果只用来判断一下从0到n-1的字符串能不能用字典里的词组成,不行直接返回(原来就是少了这一步…),行的话就普通的DFS
    public ArrayList<String> wordBreak(String s, Set<String> dict) {
        ArrayList<String> ret = new ArrayList<String>();
        if (s==null || s.length()==0) return ret;
        int n = s.length();
        boolean[] dp = new boolean[n+1];
        dp[0] = true;
        for (int i=1; i<=n; i++) {
            if (dict.contains(s.substring(0, i))) {
                dp[i] = true;
                continue;
            }
            for (int j=0; j<i; j++) {
                if (dp[j] && dict.contains(s.substring(j, i))) {
                    dp[i] = true;
                }
            }
        }
        if (dp[n] == false) return ret; //DP的作用就这一行!!!
        StringBuilder cur = new StringBuilder();
        dfs(s, 0, cur, ret, dict);
        return ret;
    }

    public void dfs(String s, int start, StringBuilder cur, ArrayList<String> ret, Set<String> dict)  {
        int n = s.length();
        if (start >= n) {
            ret.add(new String(cur));
            return;
        }
        for (int i=start+1; i<=n; i++) {
            String sub = s.substring(start, i);
            if (dict.contains(sub)) {
                int oldLen = cur.length();
                if (oldLen!=0) cur.append(" ");
                cur.append(sub);
                dfs(s, i, cur, ret, dict);
                cur.delete(oldLen, cur.length());
            }
        }
    }

https://leetcode.com/discuss/11564/my-dp-solution-in-java
    public List<String> wordBreak(String s, Set<String> dict) {
        Map<Integer, List<String>> validMap = new HashMap<Integer, List<String>>();

        // initialize the valid values
        List<String> l = new ArrayList<String>();
        l.add("");
        validMap.put(s.length(), l);

        // generate solutions from the end
        for(int i = s.length() - 1; i >= 0; i--) {
            List<String> values = new ArrayList<String>();
            for(int j = i + 1; j <= s.length(); j++) {
                if (dict.contains(s.substring(i, j))) {
                    for(String word : validMap.get(j)) {
                        values.add(s.substring(i, j) + (word.isEmpty() ? "" : " ") + word);
                    }
                }
            }
            validMap.put(i, values);
        }
        return validMap.get(0);
    }
Basically my idea is the following:
  1. Scan the the string from the tail
  2. Build possible solution for the current index based on DP results
  3. Return the solution when index==0
The key to minimizing time complexity is by keeping a hash set of indexes that are "unbreakable", meaning you cannot break up the words properly from the current index to the end of the string. When we encounter these kind of situations, we know to return and save time. An additional boolean variable "wordFound" is used to keep track of whether or not the remaining substring is "breakable".
public List<String> wordBreak(String s, Set<String> wordDict) { List<String> result = new LinkedList<String>(); wordBreakHelper(new HashSet(), 0,result, s, "", wordDict); return result; } public boolean wordBreakHelper(HashSet badIndex, int curIndex, List<String> result, String s, String curSentence, Set<String> wordDict) { boolean foundWord = false; for(int i = curIndex; i < s.length(); i++) { if(wordDict.contains(s.substring(curIndex,i+1))) { String newSentence = curSentence + " " + s.substring(curIndex,i+1); if(i == s.length()-1) { result.add(newSentence.substring(1)); foundWord = true; } else { if(badIndex.contains(i+1)) continue; if(wordBreakHelper(badIndex, i+1, result, s, newSentence, wordDict)) foundWord = true; } } } if(foundWord) return true; else { badIndex.add(curIndex); return false; } }

You can't have an O(n^2) algorithm. The worst cases have 2^n solutions. The OJ test cases are just too easy.
Try the input: "aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa", ["a","aa","aaa","aaaa","aaaaa","aaaaaa","aaaaaaa","aaaaaaaa","aaaaaaaaa","aaaaaaaaaa", "aaaaaaaaaaa","aaaaaaaaaaaa","aaaaaaaaaaaaa","aaaaaaaaaaaaaa","aaaaaaaaaaaaaaa",]
https://leetcode.com/discuss/80266/java-dfs-memoization-dfs-and-pruning-solutions-with-analysis
http://www.cnblogs.com/yuzhangcmu/p/4037299.html
http://blog.csdn.net/u014688145/article/details/70702445
原先我们递归结束时,返回true,并且让true一层层向上传递。而此时,递归能结束(说明匹配一定成功),那么递归返回时,把所有结果记录在list中即可。此时,返回的结果不是单纯的true or false,所以用map来存放键值对。
public List<String> wordBreak(String s, List<String> wordDict) { return canForm(s, wordDict, new HashMap<>()); } private List<String> canForm(String s, List<String> wordDict, Map<String, List<String>> map){ if (map.containsKey(s)) return map.get(s); List<String> ans = new ArrayList<>(); if (s.length() == 0){ ans.add(""); return ans; } for (String word : wordDict){ if (s.startsWith(word)){ List<String> subList = canForm(s.substring(word.length()), wordDict, map); for (String sub : subList){ ans.add(word + (sub.isEmpty() ? "" : " ") + sub); } } } map.put(s, ans); return ans; }
Read full article from [LeetCode] Word Break II | 程序员达达

Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts