Related: LeetCode 140 - Word Break II
LeetCode 139 – Word Break
Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
For example, given
s = "leetcode",
dict = ["leet", "code"].
Return true because "leetcode" can be segmented as "leet code".
Time complexity:
https://stackoverflow.com/questions/39519667/whats-the-time-complexity-of-this-recursive-algorithm
X. DP
http://www.programcreek.com/2012/12/leetcode-solution-word-break/
https://leetcode.com/problems/word-break/discuss/43842/Java-dp-solution-without-using-substring-O(N2)
https://leetcode.com/discuss/63212/java-solution-using-dp
public boolean wordBreak(String s, Set<String> wordDict) { if (s == null && wordDict == null) return true; if (s == null || wordDict == null) return false; //dp[i] represents if s.substring(0, i+1) is wordbreakable. boolean[] dp = new boolean[s.length()+1]; dp[0] = true; for (int i = 1; i <= s.length(); i++) { for (int j = i-1; j >=0; j--) { if (dp[j] && wordDict.contains(s.substring(j, i))){ dp[i] = true; break; } } } return dp[s.length()]; }
LeetCode 139 – Word Break
Given a string s and a dictionary of words dict, determine if s can be segmented into a space-separated sequence of one or more dictionary words.
For example, given
s = "leetcode",
dict = ["leet", "code"].
Return true because "leetcode" can be segmented as "leet code".
Time complexity:
https://stackoverflow.com/questions/39519667/whats-the-time-complexity-of-this-recursive-algorithm
Time complexity is
O(2^n * n)
.
The number of maximal such word breaks is
2^(n-1)
, with the bijection to binary vectors of length n-1
: For each space between two characters, you can either break the word there (1 in the vector) or not (0 in the vector), giving you 2^(n-1)
such possible words.
Since creating each string, calculating its hash and adding it to the set is
O(n)
(the substring's length), you get O(2^n * n)
.
You get the worst case for:
dict = [a, aa, aaa, aaaa, ....]
s = "aaaaaaaaa...a"
Your
map
ensures you do not do duplicated work, using memoization. (Without that, complexity would have been factorial of n
, but using it you avoid duplicate work like recalculating from scratch the prefix of "aaa...a|a|a" and "aa...a|aa"X. DP
http://www.programcreek.com/2012/12/leetcode-solution-word-break/
- Define an array t[] such that t[i]==true => 0-(i-1) can be segmented using dictionary
- Initial state t[0] == true
Time: O(string length * dict size)
In Solution 2, if the size of the dictionary is very large, the time is bad. Instead we can solve the problem in O(n^2) time (n is the length of the string).
-Not that efficient
https://www.cnblogs.com/lightwindy/p/8509053.htmlpublic boolean wordBreak(String s, Set<String> dict) { if (s== null || s.length()== 0 ) return true ; boolean [] res = new boolean [s.length()+ 1 ]; res[ 0 ] = true ; for ( int i= 0 ;i<s.length();i++) { StringBuilder str = new StringBuilder(s.substring( 0 ,i+ 1 )); for ( int j= 0 ;j<=i;j++) { if (res[j] && dict.contains(str.toString())) { res[i+ 1 ] = true ; break ; } str.deleteCharAt( 0 ); } } return res[s.length()]; } |
public boolean wordBreak(String s, List<String> wordDict) {
//base case
if(s == null || s.length() == 0) return false;
//use dp and scan from right to left to avoid using substring
boolean[] M = new boolean[s.length() + 1];
M[s.length()] = true;
for(int i = s.length() - 1; i >= 0; --i) {
StringBuilder sb = new StringBuilder();
for(int j = i; j < s.length(); ++j) {
sb.append(s.charAt(j));
//remember convert stringbuilder to string
if(wordDict.contains(sb.toString()) && M[j + 1]) {
M[i] = true;
break;
}
}
}
return M[0];
}
https://leetcode.com/discuss/63212/java-solution-using-dp
Your code can be optimized a bit. In the inner loop, start from the tail to head instead of head to tail to check if the substring exists. We can break immediately when we see a match.
We can do even better if we precalculate the min and max of words' lengths in the dictionary. Just need to check from (i-wordLengthMax) to (i-wordLengthMin),
public boolean wordBreak(String s, Set<String> wordDict) { if (s == null && wordDict == null) return true; if (s == null || wordDict == null) return false; //dp[i] represents if s.substring(0, i+1) is wordbreakable. boolean[] dp = new boolean[s.length()+1]; dp[0] = true; for (int i = 1; i <= s.length(); i++) { for (int j = i-1; j >=0; j--) { if (dp[j] && wordDict.contains(s.substring(j, i))){ dp[i] = true; break; } } } return dp[s.length()]; }
public boolean wordBreak(String s, Set<String> wordDict) {
if (s == null && wordDict == null)
return true;
if (s == null || wordDict == null)
return false;
//dp[i] represents if s.substring(0, i) is wordbreakable.
boolean[] dp = new boolean[s.length()+1];
dp[0] = true;
for (int i = 1; i <= s.length(); i++) {
for (int j = 0; j < i; j++) {
if (dp[j] && wordDict.contains(s.substring(j, i))) {
dp[i] = true;
break;
}
}
}
return dp[s.length()];
}
http://www.jiuzhang.com/solutions/word-break/private int getMaxLength(Set<String> dict) { int maxLength = 0; for (String word : dict) { maxLength = Math.max(maxLength, word.length()); } return maxLength; } public boolean wordBreak(String s, Set<String> dict) { if (s == null || s.length() == 0) { return true; } int maxLength = getMaxLength(dict); boolean[] canSegment = new boolean[s.length() + 1]; canSegment[0] = true; for (int i = 1; i <= s.length(); i++) { canSegment[i] = false; for (int lastWordLength = 1; lastWordLength <= maxLength && lastWordLength <= i; lastWordLength++) { if (!canSegment[i - lastWordLength]) { continue; } String word = s.substring(i - lastWordLength, i); if (dict.contains(word)) { canSegment[i] = true; break; } } } return canSegment[s.length()]; }
Scan the dict to get the max possible length of words first before the loop
public boolean wordBreak(String s, Set<String> wordDict) {
if(s == null || s.length() == 0){
return true;
}
int n = s.length();
boolean [] dp = new boolean[n+1];
dp[0] = true ;
int maxLength = 0;
for(String t : wordDict){
maxLength = Math.max(maxLength, t.length());
}
for(int i = 1; i <= n; i++){
dp[i] = false;
for(int j = i-1; j >= Math.max(0, i - maxLength) && !dp[i]; j--){
if(dp[j]){
if(wordDict.contains(s.substring(j, i))){
dp[i] = true;
}
}
}
}
return dp[n];
}
https://leetcode.com/discuss/8482/a-java-solution-with-similar-dp-idea
public boolean wordBreak(String s, Set<String> dict) { if (s==null||s.length()==0) return false; else if (dict.contains(s)) return true; List<Integer> starts = new LinkedList<Integer>(); starts.add(0); for (int end=1;end<=s.length();end++) for (int i=0;i<starts.size();i++) if (dict.contains(s.substring(starts.get(i),end))){ starts.add(0,end); break; } return (starts.get(0)==s.length()); }
http://bangbingsyb.blogspot.com/2014/11/leetcode-word-break-i-ii.html
https://leetcode.com/problems/word-break/discuss/43819/dfs-with-path-memorizing-java-solution
https://leetcode.com/problems/word-break/discuss/43798/Java-simple-recursing-with-memorization
https://leetcode.com/problems/word-break/discuss/43797/A-solution-using-BFS
https://cheonhyangzhang.wordpress.com/2015/10/06/139-leetcode-java-word-break-medium/
http://blog.csdn.net/derrantcm/article/details/47774547
public boolean wordBreak(String s, Set<String> wordDict) { // 参数校验 if (s == null || s.length() < 1 || wordDict == null || wordDict.size() < 1) { return false; } // 标记是否匹配,match[i]表表[0, i-1]都匹配 int length = s.length(); boolean[] match = new boolean[length + 1]; match[0] = true; for (int i = 1; i < length + 1; i++) { for (int j = 0; j < i; j++) { if (match[j] && wordDict.contains(s.substring(j, i))) { // f(0,n) = f(0,i) + f(i,j) + f(j,n) match[i] = true; break; } } } return match[length]; }
http://codesniper.blogspot.com/2015/03/139-word-break-leetcode-java.html
可以使用回溯法求解,每次从dict中查找是否有匹配的单词(下面用java实现的,调用startsWith函数),有的话就将字符串s前面的部分截取掉。从而缩小s的范围,直到s为空说明匹配成功。但是这种方法效率太低,运行超时。
可以利用记忆化存储的方式改进上面的算法,因为s会有很多重复的字符被重复计算。利用数组dp[][]记录下面字串的匹配结果为 true或false。初始化为0,表示没有计算过,dp[i][j] = 1表示 字串(i,j)的匹配结果为true,dp[i][j] = -1表示匹配结果为false。
突然发现使用二维数组是没有必要的。可以使用一个数组 dp[i]表示 str[0.... i-1]能否成功分词。dp[0]初始化为true.
http://blog.csdn.net/doc_sgl/article/details/12323015
https://leetcode.com/discuss/8482/a-java-solution-with-similar-dp-idea
The idea is pretty similar to other DP solution. 1)keep all positions which could form substring contained in the set in a linkedlist 2) Iterate the target string, check substring between current position and stored positions. If new sub string hits the dictionary,add it the front of linkedlist 3)After iteration, check if the front element of linkedlist equals to the length of string.
It consumes 296ms
This solution is still a time O(n^2) and space O(n) one. It is better if dictionary contains long words.
public boolean wordBreak(String s, Set<String> dict) { if (s==null||s.length()==0) return false; else if (dict.contains(s)) return true; List<Integer> starts = new LinkedList<Integer>(); starts.add(0); for (int end=1;end<=s.length();end++) for (int i=0;i<starts.size();i++) if (dict.contains(s.substring(starts.get(i),end))){ starts.add(0,end); break; } return (starts.get(0)==s.length()); }
http://bangbingsyb.blogspot.com/2014/11/leetcode-word-break-i-ii.html
看到这题第一反应是DFS枚举查找,直到“探”到string尾部则算成功。但题目并不要求给出是如何break的,而只要判断是否能break。对这类判断“是”与“否”的可以用DFS暴力解决的题,可以尝试用DP做book keeping中间判断状态,避免DFS的反复搜索。比如这题,可以用一个数组dp记录S[0:i]是否能break。
dp[i]:S[0:i-1]是否能被break。例如dp[1]表示s[0]是否能被break。
dp[0] = true
dp[i] = true if and only if:
1. 存在一个i-1>=k>=0,使s[k:i-1]存在于dict中。
2. s[0: k-1]可以被break,即dp[k] = true。
http://www.cnblogs.com/jcliBlogger/p/4615620.html3 bool wordBreak(string s, unordered_set<string>& wordDict) { 4 int n = s.length(); 5 vector<bool> dp(n + 1, false); 6 dp[0] = true; 7 for (string word : wordDict) { 8 minlen = min(minlen, (int)word.length()); 9 maxlen = max(maxlen, (int)word.length()); 10 } 11 for (int i = 1; i <= n; i++) { 12 for (int j = i - minlen; j >= max(0, j - maxlen); j--) { 13 if (dp[j] && wordDict.find(s.substr(j, i - j)) != wordDict.end()) { 14 dp[i] = true; 15 break; 16 } 17 } 18 } 19 return dp[n]; 20 }X. DFS + cache
https://leetcode.com/problems/word-break/discuss/43819/dfs-with-path-memorizing-java-solution
https://leetcode.com/problems/word-break/discuss/43798/Java-simple-recursing-with-memorization
3 bool wordBreak(string s, unordered_set<string>& wordDict) { 4 for (string word : wordDict) { 5 minlen = min(minlen, (int)word.length()); 6 maxlen = max(maxlen, (int)word.length()); 7 } 8 unordered_map<string, bool> records; 9 return wordBreakDFS(s, 0, wordDict, records); 10 } 11 private: 12 int minlen, maxlen; 13 bool wordBreakDFS(string& s, int idx, unordered_set<string>& wordDict, unordered_map<string, bool>& records) { 14 int n = s.length(); 15 if (idx == n) return true; 16 string tail = s.substr(idx, n - idx); 17 if (records.find(tail) != records.end()) 18 return records[tail]; 19 for (int i = idx + minlen; i <= min(idx + maxlen, n); i++) { 20 string part = s.substr(idx, i - idx); 21 if (wordDict.find(part) != wordDict.end() && wordBreakDFS(s, i, wordDict, records)) 22 return records[part] = true; 23 } 24 return records[tail] = false; 25 }X. BFS:
https://leetcode.com/problems/word-break/discuss/43797/A-solution-using-BFS
We can use a graph to represent the possible solutions. The vertices of the graph are simply the positions of the first characters of the words and each edge actually represents a word. For example, the input string is "nightmare", there are two ways to break it, "night mare" and "nightmare". The graph would be
0-->5-->9
|__ __ _^
The question is simply to check if there is a path from 0 to 9. The most efficient way is traversing the graph using BFS with the help of a queue and a hash set. The hash set is used to keep track of the visited nodes to avoid repeating the same work.
For this problem, the time complexity is O(n^2) and space complexity is O(n), the same with DP. This idea can be used to solve the problem word break II. We can simple construct the graph using BFS, save it into a map and then find all the paths using DFS
public boolean wordBreak(String s, List<String> wordDict) {
Set<String> dict = new HashSet<>(wordDict);
Set<Integer> visited = new HashSet<>();
Queue<Integer> queue = new LinkedList<>();
queue.add(0);
while (!queue.isEmpty()) {
int size = queue.size();
for (int j = 0; j < size; j++) {
int n = queue.remove();
if (n == s.length()) {
return true;
}
if (!visited.contains(n)) {
for (int i = n + 1; i <= s.length(); i++) {
if (dict.contains(s.substring(n, i))) {
queue.add(i);
visited.add(n);
}
}
}
}
}
return false;
}
3 bool wordBreak(string s, unordered_set<string>& wordDict) { 4 int n = s.length(); 5 for (string word : wordDict) { 6 minlen = min(minlen, (int)word.length()); 7 maxlen = max(maxlen, (int)word.length()); 8 } 9 queue<int> toVisit; 10 unordered_set<int> visited; 11 toVisit.push(0); 12 while (!toVisit.empty()) { 13 int start = toVisit.front(); 14 toVisit.pop(); 15 if (visited.find(start) == visited.end()) { 16 visited.insert(start); 17 for (int end = start + minlen - 1; end < min(start + maxlen, n); end++) { 18 string part = s.substr(start, end - start + 1); 19 if (wordDict.find(part) != wordDict.end()) { 20 if (end + 1 == n) return true; 21 toVisit.push(end + 1); 22 } 23 } 24 } 25 } 26 return false; 27 }http://siddontang.gitbooks.io/leetcode-solution/content/greedy/word_break.html
假设dp(i)表示长度为i的字串能否别切分,dp方程如下:
dp(i) = dp(j) && s[j, i) in dict, 0 <= j < i
bool wordBreak(string s, unordered_set<string> &dict) {
int len = (int)s.size();
vector<bool> dp(len + 1, false);
dp[0] = true;
for(int i = 1; i <= len; i++) {
for(int j = i - 1; j >= 0; j--) {
if(dp[j] && dict.find(s.substr(j, i - j)) != dict.end()) {
dp[i] = true;
break;
}
}
}
return dp[len];
}
http://segmentfault.com/a/1190000003698693 public boolean wordBreak(String s, Set<String> wordDict) {
boolean[] dp = new boolean[s.length()+1];
Arrays.fill(dp,false);
dp[s.length()]=true;
// 外层循环递增长度
for(int i = s.length()-1; i >=0 ; i--){
// 内层循环寻找分割点
for(int j = i; j < s.length(); j++){
String sub = s.substring(i,j+1);
if(wordDict.contains(sub) && dp[j+1]){
dp[i] = true;
break;
}
}
}
return dp[0];
}
https://cheonhyangzhang.wordpress.com/2015/10/06/139-leetcode-java-word-break-medium/
A[i] stands for for the substring [0, i], if it is a valid word break string.
A[i] = A[j] && dict.contains(substring(j+1, i))
public
boolean
wordBreak(String s, Set<String> wordDict){
int
length = s.length();
boolean
[] valid =
new
boolean
[length];
for
(
int
i =
0
; i < length; i ++ ){
String word = s.substring(
0
, i +
1
);
if
(wordDict.contains(word))
valid[i] =
true
;
else
{
valid[i] =
false
;
for
(
int
j =
0
; j < i; j ++){
if
(valid[j] && wordDict.contains(s.substring(j +
1
, i +
1
))){
valid[i] =
true
;
break
;
}
}
}
//else
}
//for
return
valid[s.length() -
1
];
}
//wordBreak
public boolean wordBreak(String s, Set<String> wordDict) { // 参数校验 if (s == null || s.length() < 1 || wordDict == null || wordDict.size() < 1) { return false; } // 标记是否匹配,match[i]表表[0, i-1]都匹配 int length = s.length(); boolean[] match = new boolean[length + 1]; match[0] = true; for (int i = 1; i < length + 1; i++) { for (int j = 0; j < i; j++) { if (match[j] && wordDict.contains(s.substring(j, i))) { // f(0,n) = f(0,i) + f(i,j) + f(j,n) match[i] = true; break; } } } return match[length]; }
http://codesniper.blogspot.com/2015/03/139-word-break-leetcode-java.html
public boolean wordBreak(String s, Set<String> dict) {
if(s==null) return false;
if(s.length()==0) return true;
boolean[] res=new boolean[s.length()+1];
res[0]=true;
for(int i=0;i<s.length();i++){
for(int j=0;j<=i;j++){
if(res[j] && dict.contains(s.substring(j,i+1))){
res[i+1]=true;
break;
}
}
}
return res[s.length()];
}
http://www.acmerblog.com/leetcode-word-break-5984.html可以使用回溯法求解,每次从dict中查找是否有匹配的单词(下面用java实现的,调用startsWith函数),有的话就将字符串s前面的部分截取掉。从而缩小s的范围,直到s为空说明匹配成功。但是这种方法效率太低,运行超时。
02 | public static boolean wordBreak(String s, Set<String> dict) { |
03 | if (s== null || s.length()== 0 ) return false ; |
04 | return find(s,dict); |
05 | } |
06 | public static boolean find(String s, Set<String> set){ |
07 | if (s.length() == 0 ) return true ; |
08 | boolean find = false ; |
09 | for (String str:set){ |
10 | if (s.startsWith(str)){ |
11 | find = find(s.substring(str.length()), set); |
12 | } |
13 | if (find) return true ; |
14 | } |
15 | return false ; |
16 | } |
01 | public class WordBreak { |
02 | static int dp[][]; |
03 | static String string; |
04 | public static boolean wordBreak(String s, Set<String> dict) { |
05 | if (s== null || s.length()== 0 ) return false ; |
06 | dp = new int [s.length()+ 1 ][s.length()+ 1 ]; |
07 | string = s; |
08 | return find( 0 , s.length() ,dict); |
09 | } |
10 | public static boolean find( int s, int e, Set<String> set){ |
11 | if (dp[s][e] == - 1 ) return false ; |
12 | if ( s == e || dp[s][e] == 1 ) return true ; |
13 | String tmpstr = string.substring(s,e); |
14 | boolean find = false ; |
15 | for (String str:set){ |
16 | if (tmpstr.startsWith(str)){ |
17 | find = find(s+str.length(), e, set); |
18 | } |
19 | if (find){ |
20 | dp[s][e] = 1 ; |
21 | return true ; |
22 | } |
23 | } |
24 | dp[s][e] = - 1 ; |
25 | return false ; |
26 | } |
突然发现使用二维数组是没有必要的。可以使用一个数组 dp[i]表示 str[0.... i-1]能否成功分词。dp[0]初始化为true.
02 | public boolean wordBreak(String s, Set<String> dict) { |
03 | boolean [] t = new boolean [s.length()+ 1 ]; |
04 | t[ 0 ] = true ; //set first to be true, why? |
05 | //Because we need initial state |
06 |
07 | for ( int i= 0 ; i<s.length(); i++){ |
08 | //should continue from match position |
09 | if (!t[i]) |
10 | continue ; |
11 |
12 | for (String a: dict){ |
13 | int len = a.length(); |
14 | int end = i + len; |
15 | if (end > s.length()) |
16 | continue ; |
17 |
18 | if (t[end]) continue ; |
19 |
20 | if (s.substring(i, end).equals(a)){ |
21 | t[end] = true ; |
22 | } |
23 | } |
24 | } |
25 |
26 | return t[s.length()]; |
27 | } |
第一种方法:递归(超时)Time Limit Exceeded
思路:从s的第一个字母向后匹配,如果i前面的前缀可以匹配,就看s字符串i以后的后缀是否匹配
- bool wordBreak(string s, unordered_set<string> &dict) {
- // Note: The Solution object is instantiated only once.
- if(s.length() < 1) return true;
- bool flag = false;
- for(int i = 1; i <= s.length(); i++)
- {
- string tmpstr = s.substr(0,i);
- unordered_set<string>::iterator it = dict.find(tmpstr);
- if(it != dict.end())
- {
- if(tmpstr.length() == s.length())return true;
- flag = wordBreak(s.substr(i),dict);
- }
- if(flag)return true;
- }
- return false;
- }
- bool wordBreakHelper(string s, unordered_set<string> &dict,set<string> &unmatch) {
- if(s.length() < 1) return true;
- bool flag = false;
- for(int i = 1; i <= s.length(); i++)
- {
- string prefixstr = s.substr(0,i);
- unordered_set<string>::iterator it = dict.find(prefixstr);
- if(it != dict.end())
- {
- string suffixstr = s.substr(i);
- set<string>::iterator its = unmatch.find(suffixstr);
- if(its != unmatch.end())continue;
- else{
- flag = wordBreakHelper(suffixstr,dict,unmatch);
- if(flag) return true;
- else unmatch.insert(suffixstr);
- }
- }
- }
- return false;
- }
- bool wordBreakHelper(string s,unordered_set<string> &dict,set<string> &unmatched,int mn,int mx) {
- if(s.size() < 1) return true;
- int i = mx < s.length() ? mx : s.length();
- for(; i >= mn ; i--)
- {
- string preffixstr = s.substr(0,i);
- if(dict.find(preffixstr) != dict.end()){
- string suffixstr = s.substr(i);
- if(unmatched.find(suffixstr) != unmatched.end())
- continue;
- else
- if(wordBreakHelper(suffixstr, dict, unmatched,mn,mx))
- return true;
- else
- unmatched.insert(suffixstr);
- }
- }
- return false;
- }
- bool wordBreak(string s, unordered_set<string> &dict) {
- // Note: The Solution object is instantiated only once.
- if(s.length() < 1) return true;
- if(dict.empty()) return false;
- unordered_set<string>::iterator it = dict.begin();
- int maxlen=(*it).length(), minlen=(*it).length();
- for(it++; it != dict.end(); it++)
- if((*it).length() > maxlen)
- maxlen = (*it).length();
- else if((*it).length() < minlen)
- minlen = (*it).length();
- set<string> unmatched;
- return wordBreakHelper(s,dict,unmatched,minlen,maxlen);
- }
- Any breakable string must also be breakable if we divide it into two substrings properly. There are only N - 1 ways of dividing a string into two parts, with N the length of the string. So our task is to find a proper way out of these N - 1 ones. (only one will do since the problem does not ask us to find all ways.)
- To avoid repetition, we use a HashMap to store the results for each substring, which are obtained in turn by calling wordBreak() method.
- The total string is breakable if its two complementary substrings are breakable. Once we get the results for both parts, we can check this by getting the results from the HashMap.
In total the program will run in O(n^2) time with n the length of the string
private Map<String, Boolean> myMap = new HashMap<>();
public boolean wordBreak(String s, Set<String> dict) {
// check if the dictionary contains the whole string
if (dict.contains(s)) {
return true;
}
// divide the string and check for each substring
for (int i = 1; i < s.length(); i++) {
// check for the left half substring
if (!myMap.containsKey(s.substring(0,i))) {
myMap.put(s.substring(0,i), wordBreak(s.substring(0,i), dict));
}
// check for the right half substring
if (!myMap.containsKey(s.substring(i, s.length()))) {
myMap.put(s.substring(i, s.length()), wordBreak(s.substring(i, s.length()), dict));
}
// if both substrings are breakable, then the total string is also breakable
if (myMap.get(s.substring(0,i)) && myMap.get(s.substring(i, s.length()))) {
return true; // add false to map
}
}
// else the string is not breakable
return false;
}
X. DP with Trie
https://leetcode.com/problems/word-break/discuss/219467/Java-iterative-solution-beats-99-using-Trie-and-DP
这道题目可以使用 Trie 来预处理构建 dictionary,该方法特别适合两种情况:1. dictionary 非常大; 2. 该function调用次数非常多。
https://www.geeksforgeeks.org/word-break-problem-trie-solution/
X.DFS
https://discuss.leetcode.com/topic/44941/java-dfs-memorization-solution-beats-92
X. BFS
The code can still be optimized if we record the maximum and minimum length of the words in
bool wordBreak(string s, unordered_set<string>& wordDict) { for (string word : wordDict) { minlen = min(minlen, (int)word.length()); maxlen = max(maxlen, (int)word.length()); } queue<int> toVisit; unordered_set<int> isVisited; toVisit.push(0); while (!toVisit.empty()) { int start = toVisit.front(); toVisit.pop(); if (isVisited.find(start) == isVisited.end()) { isVisited.insert(start); for (int i = start + minlen - 1; i < min(start + maxlen, (int)s.length()); i++) { if (wordDict.find(s.substr(start, i - start + 1)) != wordDict.end()) { if (i + 1 == s.length()) return true; toVisit.push(i + 1); } } } } return false; }
http://yucoding.blogspot.com/2013/12/leetcode-question-word-break-i.html
The complexity of this problem is O(n^3), where n is the length of the original string. The code below has been tested and passed all the test cases in the OJ.
https://leetcode.com/discuss/24964/o-2-n-or-o-n
http://rleetcode.blogspot.com/2014/01/word-break.html
https://leetcode.com/discuss/17365/java-8-functional-programming-rocks
public class WordBreaker { /** * Creates the word breaker. * * @param dictionary each element of the set corresponds to a word inside the * dictionary. */ public WordBreaker(final HashSet<String> dictionary) { this.dictionary = dictionary; // The size of the largest dictionary word: this.maxSize = dictionary.stream() .mapToInt(String::length) .max() .orElse(0); this.failures = new HashSet<String>(); } /** * Tells if the given word can be de-compounded with the current dictionary. * * @param word The word to be tested. * * @return True on success, false otherwise. */ public boolean wordBreak(String word) { if (word.isEmpty()) { return false; // End of recursion. } else if (failures.contains(word)) { return false; // Failure memoized. } else if (dictionary.contains(word)) { return true; // The whole word belongs to the dictionary. } int splitPos = Integer.min(maxSize, word.length() - 1); /* Smart solution: - splits the world into two non empty parts - if the prefix is in the dictionary - then recursively call wordBreak with the prefix - at the first successful recursion, succeeds. */ if (!IntStream.range(0, splitPos) .mapToObj(pos -> word.substring(0, splitPos - pos)) .filter(prefix -> dictionary.contains(prefix)) .map(prefix -> word.substring(prefix.length())) .anyMatch(suffix -> wordBreak(suffix))) { failures.add(word); // memoizes any failure. return false; } return true; } private final HashSet<String> dictionary; private HashSet<String> failures; private int maxSize; }
http://www.programcreek.com/2012/12/leetcode-solution-word-break/
The problem is equivalent to matching the regular expression (leet|code)*, which means that it can be solved by building a DFA in O(2^m) and executing it in O(n). (Thanks to hdante.) Leetcode online judge does not allow using the Pattern class though.
X.
https://leetcode.com/problems/word-break/discuss/143310/Possible-Follow-Up-Question
140
变种题,求最⼩小合理理分割次数
最后⼀一道简单的Dictionary中能不不能组成target word,最后follow up关于
dictionary很⼤大的情况怎么办,⽐比如⽤用cache。
2. 变种 同样⽤用DP 原来维护能否拼成改成 维护最⼩小拼接次数
Viewing it as a graph, you need to visit each node at most once. The hash table saves you from visiting each node more than once.
So N nodes with O(N) for the loop in each node gives O(N^2)
Here is my similar, but slightly simpler and slightly more efficient code using array instead of hash table:
public boolean wordBreak(String s, Set<String> dict) {
boolean []cannotBreak = new boolean[s.length()];
return wordBreak(s,dict,0,cannotBreak);
}
boolean wordBreak(String s, Set<String>dict, int start,boolean [] cannotBreak){
int n=s.length();
if (start==n) return true;
if (cannotBreak[start]) return false;
for (int i=start+1;i<=n;i++)
if (dict.contains(s.substring(start,i)) && wordBreak(s, dict,i,cannotBreak)) return true;
cannotBreak[start] = true;
return false;
}
I think it's O(n^2).
Without the map cache, your time cost will be T(n) = T(n-1) + T(n-2) + T(n-3) ...... + T(1).
But with cache, each of T(n-2) T(n-3) ....... and T(1) is also computed and cached during T(n-1)'s computing. So each of them costs only constant time.
In this case, T(n) = T(n-1) + (n-2) = O(n^2).
https://leetcode.com/problems/word-break/discuss/219467/Java-iterative-solution-beats-99-using-Trie-and-DP
TrieNode root;
int[] chars;
boolean dp[];
public boolean wordBreak(String s, List<String> wordDict) {
root = buildTrie(wordDict);
chars = new int[s.length()];
for (int i = 0; i < s.length(); i++) {
chars[i] = s.charAt(i) - 'a';
}
dp = new boolean[chars.length + 1];
dp[0] = true;
for (int i = 0; i < chars.length; i++) {
if (dp[i]) {
fillDp(i);
}
}
return dp[chars.length];
}
public void fillDp(int start) {
TrieNode node = root;
for (; start < chars.length; start++) {
if (node.word) dp[start] = true;
int c = chars[start];
node = node.next[c];
if (node == null) return;
}
if (node.word) dp[start] = true;
}
private static class TrieNode {
TrieNode[] next = new TrieNode[26];
boolean word;
}
private TrieNode buildTrie(List<String> dict) {
TrieNode root = new TrieNode();
for (String word : dict) {
TrieNode node = root;
for (char c : word.toCharArray()) {
int i = c - 'a';
if (node.next[i] == null) node.next[i] = new TrieNode();
node = node.next[i];
}
node.word = true;
}
return root;
}
https://www.techiedelight.com/word-break-problem-using-trie/这道题目可以使用 Trie 来预处理构建 dictionary,该方法特别适合两种情况:1. dictionary 非常大; 2. 该function调用次数非常多。
https://www.geeksforgeeks.org/word-break-problem-trie-solution/
here I used trie tree search to assist the DP, 4ms 90%:
public class TrieNode {
boolean isWord;
TrieNode[] c;
public TrieNode(){
isWord = false;
c = new TrieNode[128];
}
}
public void addWord(TrieNode t, String w) {
for (int i = 0; i < w.length(); i++) {
int j = (int)w.charAt(i);
if (t.c[j] == null) t.c[j] = new TrieNode();
t = t.c[j];
}
t.isWord = true;
}
public boolean wordBreak(String s, List<String> wordDict) {
TrieNode t = new TrieNode(), cur;
for (String i : wordDict) addWord(t, i);
char[] str = s.toCharArray();
int len = str.length;
boolean[] f = new boolean[len + 1];
f[len] = true;
for (int i = len - 1; i >= 0; i--) {
//System.out.println(str[i]);
cur = t;
for (int j = i; cur != null && j < len ; j++) {
cur = cur.c[(int)str[j]];
if (cur != null && cur.isWord && f[j + 1]) {
f[i] = true;
break;
}
}
}
return f[0];
}https://discuss.leetcode.com/topic/44941/java-dfs-memorization-solution-beats-92
public boolean wordBreak(String s, Set<String> wordDict) {
if (s == null || s.length() == 0) {
return false;
}
int maxLength = maxLengthStr(wordDict);
return dfs(s, 0, wordDict, new boolean[s.length()], maxLength);
}
private boolean dfs(String s, int idx, Set<String> wordDict, boolean[] unbreakable, int maxLength) {
if (idx >= s.length()) {
return true;
} else if (unbreakable[idx]) {
return false;
} else {
int end = Math.min(idx + maxLength, s.length());
for (int i = idx + 1; i <= end; i++) {
if (wordDict.contains(s.substring(idx, i))) {
if (dfs(s, i, wordDict, unbreakable, maxLength)) {
return true;
}
}
}
unbreakable[idx] = true;
return false;
}
}
https://discuss.leetcode.com/topic/29840/concise-dfs-backtracking-solution/2public boolean wordBreak(String s, Set<String> wordDict) {
return dfs(s, wordDict, new HashSet<>());
}
private boolean dfs(String s, Set<String> wordDict, Set<String> checked) {
if (s.isEmpty()) return true;
if (checked.contains(s)) return false;
checked.add(s);
for (String w : wordDict) {
if (s.startsWith(w) && dfs(s.substring(w.length()), wordDict, checked)) return true;
}
return false;
}
https://leetcode.com/discuss/26956/dfs-with-path-memorizing-java-solution
Use a set to record all position that cannot find a match in dict. That cuts down the run time of DFS to O(n^2)
public boolean wordBreak(String s, Set<String> dict) {
// DFS
Set<Integer> set = new HashSet<Integer>();
return dfs(s, 0, dict, set);
}
private boolean dfs(String s, int index, Set<String> dict, Set<Integer> set){
// base case
if(index == s.length()) return true;
// check memory
if(set.contains(index)) return false;
// recursion
for(int i = index+1;i <= s.length();i++){
String t = s.substring(index, i);
if(dict.contains(t))
if(dfs(s, i, dict, set))
return true;
else
set.add(i);
}
set.add(index);
return false;
}X. BFS
https://discuss.leetcode.com/topic/2545/a-solution-using-bfs
The solution I post below using BFS is no better than those. Just to share some new thoughts.
The solution I post below using BFS is no better than those. Just to share some new thoughts.
We can use a graph to represent the possible solutions. The vertices of the graph are simply the positions of the first characters of the words and each edge actually represents a word. For example, the input string is "nightmare", there are two ways to break it, "night mare" and "nightmare". The graph would be
0-->5-->9
|__ __ _^
The question is simply to check if there is a path from 0 to 9. The most efficient way is traversing the graph using BFS with the help of a queue and a hash set. The hash set is used to keep track of the visited nodes to avoid repeating the same work.
For this problem, the time complexity is O(n^2) and space complexity is O(n), the same with DP. This idea can be used to solve the problem word break II. We can simple construct the graph using BFS, save it into a map and then find all the paths using DFS.
This BFS is just a optimized DP which avoids irrelevant states.
public boolean wordBreak(String s, Set<String> dict) {
if (dict.contains(s)) return true;
Queue<Integer> queue = new LinkedList<Integer>();
queue.offer(0);
// use a set to record checked index to avoid repeated work.
// This is the key to reduce the running time to O(N^2).
Set<Integer> visited = new HashSet<Integer>();
visited.add(0);
while (!queue.isEmpty()) {
int curIdx = queue.poll();
for (int i = curIdx+1; i <= s.length(); i++) {
if (visited.contains(i)) continue;
if (dict.contains(s.substring(curIdx, i))) {
if (i == s.length()) return true;
queue.offer(i);
visited.add(i);
}
}
}
return false;
}This BFS is just a optimized DP which avoids irrelevant states.
The code can still be optimized if we record the maximum and minimum length of the words in
wordDict
and only traverse the places within the minimum and maximum length.bool wordBreak(string s, unordered_set<string>& wordDict) { for (string word : wordDict) { minlen = min(minlen, (int)word.length()); maxlen = max(maxlen, (int)word.length()); } queue<int> toVisit; unordered_set<int> isVisited; toVisit.push(0); while (!toVisit.empty()) { int start = toVisit.front(); toVisit.pop(); if (isVisited.find(start) == isVisited.end()) { isVisited.insert(start); for (int i = start + minlen - 1; i < min(start + maxlen, (int)s.length()); i++) { if (wordDict.find(s.substr(start, i - start + 1)) != wordDict.end()) { if (i + 1 == s.length()) return true; toVisit.push(i + 1); } } } } return false; }
http://yucoding.blogspot.com/2013/12/leetcode-question-word-break-i.html
The complexity of this problem is O(n^3), where n is the length of the original string. The code below has been tested and passed all the test cases in the OJ.
bool
wordBreak(string s, unordered_set<string> &dict) {
// IMPORTANT: Please reset any member data you declared, as
// the same Solution instance will be reused for each test case.
if
(s.empty()){
return
false
;}
vector<vector<
bool
> > mp(s.size(),vector<
bool
>(s.size(),
false
));
for
(
int
i=0;i<s.size();i++){
for
(
int
j=i;j<s.size();j++){
if
(dict.find(s.substr(i,j-i+1))!=dict.end()){
mp[i][j]=
true
;
}
}
}
for
(
int
i=0;i<s.size();i++){
for
(
int
j=i;j<s.size();j++){
for
(
int
k=i;k<j;k++){
if
(!mp[i][j]){
mp[i][j]=mp[i][k]&&mp[k+1][j];
}
}
}
}
return
mp[0][s.size()-1];
}
http://rleetcode.blogspot.com/2014/01/word-break.html
https://leetcode.com/discuss/17365/java-8-functional-programming-rocks
public class WordBreaker { /** * Creates the word breaker. * * @param dictionary each element of the set corresponds to a word inside the * dictionary. */ public WordBreaker(final HashSet<String> dictionary) { this.dictionary = dictionary; // The size of the largest dictionary word: this.maxSize = dictionary.stream() .mapToInt(String::length) .max() .orElse(0); this.failures = new HashSet<String>(); } /** * Tells if the given word can be de-compounded with the current dictionary. * * @param word The word to be tested. * * @return True on success, false otherwise. */ public boolean wordBreak(String word) { if (word.isEmpty()) { return false; // End of recursion. } else if (failures.contains(word)) { return false; // Failure memoized. } else if (dictionary.contains(word)) { return true; // The whole word belongs to the dictionary. } int splitPos = Integer.min(maxSize, word.length() - 1); /* Smart solution: - splits the world into two non empty parts - if the prefix is in the dictionary - then recursively call wordBreak with the prefix - at the first successful recursion, succeeds. */ if (!IntStream.range(0, splitPos) .mapToObj(pos -> word.substring(0, splitPos - pos)) .filter(prefix -> dictionary.contains(prefix)) .map(prefix -> word.substring(prefix.length())) .anyMatch(suffix -> wordBreak(suffix))) { failures.add(word); // memoizes any failure. return false; } return true; } private final HashSet<String> dictionary; private HashSet<String> failures; private int maxSize; }
http://www.programcreek.com/2012/12/leetcode-solution-word-break/
The problem is equivalent to matching the regular expression (leet|code)*, which means that it can be solved by building a DFA in O(2^m) and executing it in O(n). (Thanks to hdante.) Leetcode online judge does not allow using the Pattern class though.
public static void main(String[] args) { HashSet<String> dict = new HashSet<String>(); dict.add("go"); dict.add("goal"); dict.add("goals"); dict.add("special"); StringBuilder sb = new StringBuilder(); for(String s: dict){ sb.append(s + "|"); } String pattern = sb.toString().substring(0, sb.length()-1); pattern = "("+pattern+")*"; Pattern p = Pattern.compile(pattern); Matcher m = p.matcher("goalspecial"); if(m.matches()){ System.out.println("match"); } } |
X.
This problem can be solve by using a naive approach, which is trivial. A discussion can always start from that though.
https://leetcode.com/discuss/7181/what-time-complexity-recursive-solution-this-problem-how-get
Most recursive algorithm like this have an exponential time complexity. To see this, we expand T(n-1) to
T(n-1) = k(n-1) + T(n-2) + T(n-3) + ... T(1) <=> T(n-1) - k(n-1) = T(n-2)+...+T(1) ---------- (1)
and we already have
T(n) = kn + T(n-1) + T(n-2) + ... + T(1) <=> T(n) - kn - T(n-1) = T(n-2) +... + T(1) ----------(2)
Combine (1) and (2) we have:
T(n) - kn - T(n-1) = T(n-1) - k(n-1) <=> T(n) = 2T(n-1) + k
Therefore, we have O(n) = 2^n
https://discuss.leetcode.com/topic/3454/hi-anyone-who-knows-the-time-complexity-of-recursion-with-dfs-here-is-the-code/5bool wordBreakHelper(string s,int len, unordered_set<string> &dict, int pos)
{
if (pos == len)
return true;
for (int i = 1; i <= len - pos; i++)
{
string t = s.substr(pos, i);
if (dict.find(t) != dict.end())
{
if (wordBreakHelper(s, len, dict, pos + i))
return true;
}
}
return false;
}
bool wordBreak(string s, unordered_set<string> &dict)
{
return wordBreakHelper(s, s.length(), dict, 0);
}
https://www.cnblogs.com/lightwindy/p/8509053.html
Followup: 返回其中的一个解,如果要返回全部解需要用140. Word Break II的方法,一个解要简单很多。F jia
public
String wordBreak(String s, Set<String> dict) {
if
(s ==
null
|| s.isEmpty() || dict ==
null
) {
return
""
;
}
boolean
[] dp =
new
boolean
[s.length() +
1
];
String[] words =
new
String[s.length() +
1
];
dp[
0
] =
true
;
words[
0
] =
""
;
for
(
int
i =
1
; i <= s.length(); i++) {
for
(
int
j =
0
; j < i; j++) {
if
(dp[j] && dict.contains(s.substring(j, i))) {
dp[i] =
true
;
if
(words[j].isEmpty()) {
words[i] = s.substring(j, i);
}
else
{
words[i] = words[j] +
" "
+ s.substring(j, i);
}
}
}
}
if
(dp[s.length()]) {
return
words[s.length()];
}
else
{
return
""
;
}
}
Has anybody thought about the solution when the problem is changed such that each word in the dictionary can be only used once?
My initial thoughts:
• DP:
At each point i where the string is breakable, we need to use a set to remember which words we have used to break the string up to this point. And when we are considering a later point j, based on which previous breakable point we are using as a pivot, we will need to consider the "breakability" of the string at index j with (input set - set stored at index j)
• DP:
At each point i where the string is breakable, we need to use a set to remember which words we have used to break the string up to this point. And when we are considering a later point j, based on which previous breakable point we are using as a pivot, we will need to consider the "breakability" of the string at index j with (input set - set stored at index j)
• DFS + Memo:
The memo is not only keyed by the starting index, but should also encode the information of what words are still available in the set, because both of those together represent the state of the DFS (whether the substring starting at index i, with the remaining usable words, is breakable). But I am not exactly sure how to encode a set of strings as a key. (sort the strings and concatenate?)
变种题,求最⼩小合理理分割次数
最后⼀一道简单的Dictionary中能不不能组成target word,最后follow up关于
dictionary很⼤大的情况怎么办,⽐比如⽤用cache。
2. 变种 同样⽤用DP 原来维护能否拼成改成 维护最⼩小拼接次数