Related: Leetcode 44 - Wildcard Matching
http://segmentfault.com/a/1190000003904286
时间 O(NM) 空间 O(N)
https://chazyhabit.wordpress.com/2014/08/11/regular-expression-matching-leetcode-46/
The recursion mainly breaks down elegantly to the following two cases:
If the next character of p is NOT ‘*’, then it must match the current character of s. Continue pattern matching with the next character of both s and p.
If the next character of p is ‘*’, then we do a brute force exhaustive matching of 0, 1, or more repeats of current character of p… Until we could not match any more characters.
X. DP
http://cwind.iteye.com/blog/2228723
X. Recursive
Not good: https://discuss.leetcode.com/topic/2818/the-shortest-ac-code/30
http://www.cnblogs.com/yrbbest/p/4430699.html
思路2: 双指针扫描
why the regex pattern “.*” matches the string “ab”. “.*” means repeat thepreceding element 0 or more times. Here, the “preceding” element is the dot character in the pattern, which can match any characters. Therefore, the regex pattern “.*” allows the dot to be repeated any number of times, which matches any string (even an empty string).
A sample diagram of a deterministic finite state automata (DFA). DFAs are useful for doing lexical analysis and pattern matching. An example is UNIX’s grep tool. Please note that this post does not attempt to descibe a solution using DFA.
http://harrifeng.github.io/algo/leetcode/regular-expression-matching.html
(1)p[j+1]不是'*'。情况比较简单,只要判断当前s的i和p的j上的字符是否一样(如果有p在j上的字符是'.',也是相同),如果不同,返回false,否则,递归下一层i+1,j+1;
(2)p[j+1]是'*'。那么此时看从s[i]开始的子串,假设s[i],s[i+1],...s[i+k]都等于p[j]那么意味着这些都有可能是合适的匹配,那么递归对于剩下的(i,j+2),(i+1,j+2),...,(i+k,j+2)都要尝试(j+2是因为跳过当前和下一个'*'字符)。
假设我们维护一个布尔数组res[i][j],代表s的前i个字符和p的前j个字符是否匹配(注意这里res的维度是s.length()+1,p.length()+1)。递推公式跟上面类似,分三种种情况:
(1)p[j+1]不是'*'。情况比较简单,只要判断如果当前s的i和p的j上的字符一样(如果有p在j上的字符是'.',也是相同),并且res[i][j]==true,则res[i+1][j+1]也为true,res[i+1][j+1]=false;
(2)p[j+1]是'*',但是p[j]!='.'。那么只要以下条件有一个满足即可对res[i+1][j+1]赋值为true:
1)res[i+1][j]为真('*'只取前面字符一次);
2)res[i+1][j-1]为真('*'前面字符一次都不取,也就是忽略这两个字符);
3)res[i][j+1] && s[i]==s[i-1] && s[i-1]==p[j-1](这种情况是相当于i从0到s.length()扫过来,如果p[j+1]对应的字符是‘*’那就意味着接下来的串就可以依次匹配下来,如果下面的字符一直重复,并且就是‘*’前面的那个字符)。
(3)p[j+1]是'*',并且p[j]=='.'。因为".*"可以匹配任意字符串,所以在前面的res[i+1][j-1]或者res[i+1][j]中只要有i+1是true,那么剩下的res[i+1][j+1],res[i+2][j+1],...,res[s.length()][j+1]就都是true了。
public static boolean isMatch(String r, String s) {
// Case (2.) : starts with '^'.
if (r.charAt(0) == '^') {
return isMatchHere(r.substring(1), s);
}
for (int i = 0; i <= s.length(); ++i) {
if (isMatchHere(r, s.substring(i))) {
return true;
}
}
return false;
}
private static boolean isMatchHere(String r, String s) {
// Case (1.)
if (r.isEmpty()) {
return true;
}
// Case (2) : ends with '$'.
if ("$".equals(r)) {
return s.isEmpty();
}
// Case (4.)
if (r.length() >= 2 && r.charAt(1) == '*') {
for (int i = 0; i < s.length()
&& (r.charAt(0) == '.' || r.charAt(0) == s.charAt(i)); ++i) {
if (isMatchHere(r.substring(2), s.substring(i + 1))) {
return true;
}
}
return isMatchHere(r.substring(2), s);
}
// Case (3.)
return !s.isEmpty() && (r.charAt(0) == '.' || r.charAt(0) == s.charAt(0))
&& isMatchHere(r.substring(1), s.substring(1));
}
https://discuss.leetcode.com/topic/54908/evolve-from-brute-force-to-dp
TODO: http://www.cnblogs.com/yrbbest/p/4430699.html
我们先来判断p是否为空,若为空则根据s的为空的情况返回结果。当p的第二个字符为*号时,由于*号前面的字符的个数可以任意,可以为0,那么我们先用递归来调用为0的情况,就是直接把这两个字符去掉再比较,或者当s不为空,且第一个字符和p的第一个字符相同时,我们再对去掉首字符的s和p调用递归,注意p不能去掉首字符,因为*号前面的字符可以有无限个;如果第二个字符不为*号,那么我们就老老实实的比较第一个字符,然后对后面的字符串调用递归
https://github.com/mintycc/OnlineJudge-Solutions/blob/master/Leetcode/010_Regular_Expression_Matching.java
X. DFS + cache
https://blog.csdn.net/roufoo/article/details/82659855
1. 不不考虑星号
2. 楼主想⽤用dp,但是他坚持要先写 dfs / recurrsion
3. regex变成“+”,“*”(我的妈,上来我⼀一看题就知道要挂…我⽤用dp写的,
但是天竺⼩小哥说最好⽤用recursion..)
http://segmentfault.com/a/1190000003904286
Implement regular expression matching with support for
'.'
and'*'
.'.'
Matches any single character. '*'
Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).
The function prototype should be:
bool isMatch(const char *s, const char *p)
Some examples:
isMatch("aa","a") → false
isMatch("aa","aa") → true
isMatch("aaa","aa") → false
isMatch("aa", "a*") → true
isMatch("aa", ".*") → true
isMatch("ab", ".*") → true
isMatch("aab", "c*a*b") → true
http://buttercola.blogspot.com/2014/10/leetcode-regular-expression-matching.html
X. DP O(n) space
https://discuss.leetcode.com/topic/31974/java-4ms-dp-solution-with-o-n-2-time-and-o-n-space-beats-95
X. DP, n--
X. DP
https://discuss.leetcode.com/topic/27988/java-solution-o-n-2-dp-with-some-explanations
https://discuss.leetcode.com/topic/40371/easy-dp-java-solution-with-detailed-explanation
https://discuss.leetcode.com/topic/31974/java-4ms-dp-solution-with-o-n-2-time-and-o-n-space-beats-95
public boolean isMatch(String s, String p) {
/**
* This solution is assuming s has no regular expressions.
*
* dp: res[i][j]=is s[0,...,i-1] matched with p[0,...,j-1];
*
* If p[j-1]!='*', res[i][j] = res[i-1][j-1] &&
* (s[i-1]==p[j-1]||p[j-1]=='.'). Otherwise, res[i][j] is true if
* res[i][j-1] or res[i][j-2] or
* res[i-1][j]&&(s[i-1]==p[j-2]||p[j-2]=='.'), and notice the third
* 'or' case includes the first 'or'.
*
*
* Boundaries: res[0][0]=true;//s=p="". res[i][0]=false, i>0.
* res[0][j]=is p[0,...,j-1] empty, j>0, and so res[0][1]=false,
* res[0][j]=p[j-1]=='*'&&res[0][j-2].
*
* O(n) space is enough to store a row of res.
*/
int m = s.length(), n = p.length();
boolean[] res = new boolean[n + 1];
res[0] = true;
int i, j;
for (j = 2; j <= n; j++)
res[j] = res[j - 2] && p.charAt(j - 1) == '*';
char pc, sc, tc;
boolean pre, cur; // pre=res[i - 1][j - 1], cur=res[i-1][j]
for (i = 1; i <= m; i++) {
pre = res[0];
res[0] = false;
sc = s.charAt(i - 1);
for (j = 1; j <= n; j++) {
cur = res[j];
pc = p.charAt(j - 1);
if (pc != '*')
res[j] = pre && (sc == pc || pc == '.');
else {
// pc == '*' then it has a preceding char, i.e. j>1
tc = p.charAt(j - 2);
res[j] = res[j - 2] || (res[j] && (sc == tc || tc == '.'));
}
pre = cur;
}
}
return res[n];
}
https://discuss.leetcode.com/topic/6388/sharing-my-44ms-dp-code-o-mn-time-and-o-n-spaceX. DP, n--
public boolean isMatch(String text, String pattern) {
boolean[][] dp = new boolean[text.length() + 1][pattern.length() + 1];
dp[text.length()][pattern.length()] = true;
for (int i = text.length(); i >= 0; i--) {
for (int j = pattern.length() - 1; j >= 0; j--) {
boolean first_match = (i < text.length() && (pattern.charAt(j) == text.charAt(i) || pattern.charAt(j) == '.'));
if (j + 1 < pattern.length() && pattern.charAt(j + 1) == '*') {
dp[i][j] = dp[i][j + 2] || first_match && dp[i + 1][j];
} else {
dp[i][j] = first_match && dp[i + 1][j + 1];
}
}
}
return dp[0][0];
}
X. DP
https://discuss.leetcode.com/topic/27988/java-solution-o-n-2-dp-with-some-explanations
https://discuss.leetcode.com/topic/40371/easy-dp-java-solution-with-detailed-explanation
1, If p.charAt(j) == s.charAt(i) : dp[i][j] = dp[i-1][j-1];
2, If p.charAt(j) == '.' : dp[i][j] = dp[i-1][j-1];
3, If p.charAt(j) == '*':
here are two sub conditions:
1 if p.charAt(j-1) != s.charAt(i) : dp[i][j] = dp[i][j-2] //in this case, a* only counts as empty
2 if p.charAt(i-1) == s.charAt(i) or p.charAt(i-1) == '.':
dp[i][j] = dp[i-1][j] //in this case, a* counts as multiple a
or dp[i][j] = dp[i][j-1] // in this case, a* counts as single a
or dp[i][j] = dp[i][j-2] // in this case, a* counts as empty
public boolean isMatch(String s, String p) {
if (s == null || p == null) {
return false;
}
boolean[][] dp = new boolean[s.length()+1][p.length()+1];
dp[0][0] = true;
for (int i = 0; i < p.length(); i++) {
if (p.charAt(i) == '*' && dp[0][i-1]) {
dp[0][i+1] = true;
}
}
for (int i = 0 ; i < s.length(); i++) {
for (int j = 0; j < p.length(); j++) {
if (p.charAt(j) == '.') {
dp[i+1][j+1] = dp[i][j];
}
if (p.charAt(j) == s.charAt(i)) {
dp[i+1][j+1] = dp[i][j];
}
if (p.charAt(j) == '*') {
if (p.charAt(j-1) != s.charAt(i) && p.charAt(j-1) != '.') {
dp[i+1][j+1] = dp[i+1][j-1];
} else {
dp[i+1][j+1] = (dp[i+1][j] || dp[i][j+1] || dp[i+1][j-1]);
}
}
}
}
return dp[s.length()][p.length()];
}
2-sequence problem:
-- dp[s.length() + 1][p.length() + 1], where dp[i][j] means the first i characters from string i matches the first j characters in string j.
-- Initial state: dp[0][0] = true, e.g. "" -> "", true.
dp[i][0] = false, i >= 1, any string cannot match a empty string
dp[0][i], if (p.charAt(j) == '*'), dp[0][j] = dp[0][j - 2]
-- Transit function:
-- If p.charAt(j) != '*'. Then IF s.charAt(i - 1) == p.charAt(j - 1) || p.charAt(j - 1) == '.'.
-- dp[i][j] = dp[i - 1][j - 1];
-- Else // p.charAt(j - 1) == "*"
-- If s.charAt(i - 1) != p.charAt(j - 2) && p.charAt(j - 2) != '.'
Then dp[i][j] = dp[i][j - 2] // zero matched, e.g. s = acdd, p = acb*dd.
-- Else
Then dp[i][j] = dp[i][j - 2] || // zero matched
dp[i][j - 1] || // 1 matched
dp[i - 1][j] // 2+ matched
http://bangbingsyb.blogspot.com/2014/11/leetcode-regular-expression-matching.html
-- dp[s.length() + 1][p.length() + 1], where dp[i][j] means the first i characters from string i matches the first j characters in string j.
-- Initial state: dp[0][0] = true, e.g. "" -> "", true.
dp[i][0] = false, i >= 1, any string cannot match a empty string
dp[0][i], if (p.charAt(j) == '*'), dp[0][j] = dp[0][j - 2]
-- Transit function:
-- If p.charAt(j) != '*'. Then IF s.charAt(i - 1) == p.charAt(j - 1) || p.charAt(j - 1) == '.'.
-- dp[i][j] = dp[i - 1][j - 1];
-- Else // p.charAt(j - 1) == "*"
-- If s.charAt(i - 1) != p.charAt(j - 2) && p.charAt(j - 2) != '.'
Then dp[i][j] = dp[i][j - 2] // zero matched, e.g. s = acdd, p = acb*dd.
-- Else
Then dp[i][j] = dp[i][j - 2] || // zero matched
dp[i][j - 1] || // 1 matched
dp[i - 1][j] // 2+ matched
http://bangbingsyb.blogspot.com/2014/11/leetcode-regular-expression-matching.html
关键在于如何处理这个'*'号。
状态:和Mininum Edit Distance这类题目一样。
dp[i][j]表示s[0:i-1]是否能和p[0:j-1]匹配。
递推公式:由于只有p中会含有regular expression,所以以p[j-1]来进行分类。
p[j-1] != '.' && p[j-1] != '*':dp[i][j] = dp[i-1][j-1] && (s[i-1] == p[j-1])
p[j-1] == '.':dp[i][j] = dp[i-1][j-1]
而关键的难点在于 p[j-1] = '*'。由于星号可以匹配0,1,乃至多个p[j-2]。
1. 匹配0个元素,即消去p[j-2],此时p[0: j-1] = p[0: j-3]
dp[i][j] = dp[i][j-2]
2. 匹配1个元素,此时p[0: j-1] = p[0: j-2]
dp[i][j] = dp[i][j-1]
3. 匹配多个元素,此时p[0: j-1] = { p[0: j-2], p[j-2], ... , p[j-2] }
dp[i][j] = dp[i-1][j] && (p[j-2]=='.' || s[i-2]==p[j-2])
bool isMatch(const char *s, const char *p) { int m = strlen(s), n = strlen(p); vector<vector<bool>> dp(m+1, vector<bool>(n+1,false)); dp[0][0] = true; for(int i=0; i<=m; i++) { for(int j=1; j<=n; j++) { if(p[j-1]!='.' && p[j-1]!='*') { if(i>0 && s[i-1]==p[j-1] && dp[i-1][j-1]) dp[i][j] = true; } else if(p[j-1]=='.') { if(i>0 && dp[i-1][j-1]) dp[i][j] = true; } else if(j>1) { //'*' cannot be the 1st element if(dp[i][j-1] || dp[i][j-2]) // match 0 or 1 preceding element dp[i][j] = true; else if(i>0 && (p[j-2]==s[i-1] || p[j-2]=='.') && dp[i-1][j]) // match multiple preceding elements dp[i][j] = true; } } } return dp[m][n]; }http://www.cnblogs.com/jcliBlogger/p/4623211.html
This problem has a typical solution using Dynamic Programming. We define the state P[i][j] to be true if s[0..i)matches p[0..j) and false otherwise. Then the state equations are:
- P[i][j] = P[i - 1][j - 1], if p[j - 1] != '*' && (s[i - 1] == p[j - 1] || p[j - 1] == '.');
- P[i][j] = P[i][j - 2], if p[j - 1] == '*' and the pattern repeats for 0 times;
- P[i][j] = P[i - 1][j] && (s[i - 1] == p[j - 2] || p[j - 2] == '.'), if p[j - 1] == '*' and the pattern repeats for at least 1times.
Putting these together, we will have the following code.
3 bool isMatch(string s, string p) { 4 int m = s.length(), n = p.length(); 5 vector<vector<bool> > dp(m + 1, vector<bool> (n + 1, false)); 6 dp[0][0] = true; 7 for (int i = 0; i <= m; i++) 8 for (int j = 1; j <= n; j++) 9 if (p[j - 1] == '*') 10 dp[i][j] = dp[i][j - 2] || (i > 0 && (s[i - 1] == p[j - 2] || p[j - 2] == '.') && dp[i - 1][j]); 11 else dp[i][j] = i > 0 && dp[i - 1][j - 1] && (s[i - 1] == p[j - 1] || p[j - 1] == '.'); 12 return dp[m][n]; 13 }
public
boolean
isMatch(String s, String p) {
if
(p ==
null
|| p.length() ==
0
) {
return
s ==
null
|| s.length() ==
0
;
}
int
rows = s.length();
int
cols = p.length();
boolean
[][] dp =
new
boolean
[rows +
1
][cols +
1
];
dp[
0
][
0
] =
true
;
for
(
int
j =
1
; j <= cols; j++) {
if
(p.charAt(j -
1
) ==
'*'
) {
dp[
0
][j] = dp[
0
][j -
2
];
}
}
for
(
int
i =
1
; i <= rows; i++) {
for
(
int
j =
1
; j <= cols; j++) {
char
sChar = s.charAt(i -
1
);
char
pChar = p.charAt(j -
1
);
if
(pChar !=
'*'
) {
if
(sChar == pChar || pChar ==
'.'
) {
dp[i][j] = dp[i -
1
][j -
1
];
}
}
else
{
if
(sChar != p.charAt(j -
2
) && p.charAt(j -
2
) !=
'.'
) {
dp[i][j] = dp[i][j -
2
];
}
else
{
dp[i][j] = dp[i][j -
2
] || dp[i -
1
][j] || dp[i][j -
1
];
}
}
}
}
return
dp[rows][cols];
}
时间 O(NM) 空间 O(N)
https://chazyhabit.wordpress.com/2014/08/11/regular-expression-matching-leetcode-46/
解法一其实是一种brute force算法,那么我们如何进行优化呢。注意到字符串的比较我们可以建立一个boolean[][] matrix来记录检查过的信息(matrix[i][j]表示s的前i个字符和p的前j个字符否匹配)。
- p[j+1] != ‘*’:判断当前i、j上字符是否一样(如果有p在j上的字符是’.’,也是相同),有res[i][j] == res[i-1][j-1] && s[i]==p[j
- p[j+1] == ‘*’,但是p[j] != ‘.’,那么只要以下条件有一个满足即可对res[i+1][j+1]赋值为true:
– 1) res[i+1][j]为真(’*’只取前面字符一次);
– 2) res[i+1][j-1]为真(’*’前面字符一次都不取,也就是忽略这两个字符);
– 3) res[i][j+1] && s[i]==s[i-1] && s[i-1]==p[j-1](这种情况是相当于i从0到s.length()扫过来,如果p[j+1]对应的字符是‘*’那就意味着接下来的串就可以依次匹配下来,如果下面的字符一直重复,并且就是‘*’前面的那个字符)。 - p[j+1] == ‘*’,并且p[j] == ‘.’,因为”.*”可以匹配任意字符串,所以在前面的res[i+1][j-1]或者res[i+1][j]中只要有i+1是true,那么剩下的res[i+1][j+1],res[i+2][j+1],…,res[s.length()][j+1]就都是true了。
这道题有个很重要的点,就是实现的时候外层循环应该是p,然后待匹配串s内层循环扫过来。
public
boolean
isMatch(String s, String p) {
if
(s==
null
||p==
null
)
return
false
;
if
(s.length()==
0
&&p.length()==
0
)
return
true
;
if
(p.length()==
0
)
return
false
;
boolean
[][] matrix =
new
boolean
[s.length()+
1
][p.length()+
1
];
matrix[
0
][
0
] =
true
;
for
(
int
j=
0
;j<p.length();j++) {
if
(p.charAt(j)==
'*'
) {
if
(j>
0
&& matrix[
0
][j-
1
]) matrix[
0
][j+
1
]=
true
;
if
(j<
1
)
continue
;
if
(p.charAt(j-
1
)!=
'.'
) {
for
(
int
i=
0
;i<s.length();i++) {
if
(matrix[i+
1
][j] || j>
0
&&matrix[i+
1
][j-
1
] || i>
0
&& j>
0
&& matrix[i][j+
1
]&&s.charAt(i)==s.charAt(i-
1
)&&s.charAt(i-
1
)==p.charAt(j-
1
))
matrix[i+
1
][j+
1
] =
true
;
}
}
else
{
int
i=
0
;
while
(j>
0
&& i<s.length() && !matrix[i+
1
][j-
1
] && !matrix[i+
1
][j])
i++;
for
(;i<s.length();i++)
matrix[i+
1
][j+
1
] =
true
;
}
}
else
{
for
(
int
i=
0
;i<s.length();i++) {
if
(s.charAt(i)==p.charAt(j) || p.charAt(j)==
'.'
)
matrix[i+
1
][j+
1
] = matrix[i][j];
}
}
}
return
matrix[s.length()][p.length()];
}
这道题可以用递归解决,不过时间复杂度是指数级,这里介绍一个用动态规划在平方时间内解决的办法。
解法的核心理念是:从后往前看pattern的每一位,对于pattern的每一位,我们尽可能的把待匹配串string从后往前给匹配上。我们用一个数组
解法的核心理念是:从后往前看pattern的每一位,对于pattern的每一位,我们尽可能的把待匹配串string从后往前给匹配上。我们用一个数组
match[string.length() + 1]
来表示string被匹配的情况,这里如果match[j]
是true,而我们pattern正指向第i位,则说明string从第j位到最后都已经被pattern第i位之前的某些部分给成功匹配了,所以我们不用再操心了。match[i]
为true的条件是match[i + 1]
为true,且string第i个字符也能被成功匹配。
那我们就可以从后往前开始看pattern的每一位能匹配多少string的字符了:
- 如果pattern的这一位是
*
,那我们要用这一位,来从后往前尝试匹配string,因为string后面是已经匹配好的,前面是还没匹配好的,所以从前往后匹配星号可能会导致我们匹配了一些pattern该星号前面的星号应该匹配的部分。而从后往前匹配则不会影响pattern该星号后面星号所匹配的部分,因为已经匹配的部分我们会直接跳过。 - 如果pattern这一位不是
*
,那我们则不能匹配多个字符,我们只能匹配一个字符,这时候要对string从前往后匹配,因为如果后面没被匹配,前面也肯定不会被匹配,所以从前向后能保证我们把pattern的这一位匹配到string当前最后面那个还没匹配的字符。这样如果那个字符能被匹配就通过了。
我们举个例子
match: 0 0 0 1
string: a a b
pattern: a * b
|
这里我们先看pattern最后一位b能匹配到多少,这里因为b不是星号,所以我们从左往右尝试匹配string,第一个a不行,第二个a也不行,然后到b,这里因为
match[3]
是true,b也和b相同,所以匹配成功。match: 0 0 1 1
string: a a b
pattern: a * b
|
然后看pattern的这个星号,我们要从后往前匹配string。因为b已经被匹配了,
match[2]
是true,所以直接跳过。然后到a,发现个pattern中星号前面的字符a相同,所以匹配成功,match[1]
也置为true再看string的第一个a,还是可以匹配成功,这样整个string都被匹配成功了。
这里还有几个情况,首先,无论刚才那pattern中最后一个b有没有匹配到string中任何一个字符,
match[3]
也要置为false。这样才能防止pattern最后字母没有匹配上,而pattern前面的部分反而把string的结尾给匹配了。还有如果pattern中是句号的话,那相当于字符相同。 public boolean isMatch(String s, String p) {
boolean[] match = new boolean[s.length() + 1];
match[s.length()] = true;
for(int i = p.length() - 1; i >=0; i--){
if(p.charAt(i) == '*'){
// 如果是星号,从后往前匹配
for(int j = s.length() - 1; j >= 0; j--){
match[j] = match[j] || (match[j + 1] && (p.charAt(i - 1) == '.' || (p.charAt(i - 1) == s.charAt(j))));
}
// 记得把i多减一,因为星号是和其前面的字符配合使用的
i--;
} else {
// 如果不是星号,从前往后匹配
for(int j = 0; j < s.length(); j++){
match[j] = match[j + 1] && (p.charAt(i) == '.' || (p.charAt(i) == s.charAt(j)));
}
// 只要试过了pattern中最后一个字符,就要把match[s.length()]置为false
match[s.length()] = false;
}
}
return match[0];
}
The recursion mainly breaks down elegantly to the following two cases:
If the next character of p is NOT ‘*’, then it must match the current character of s. Continue pattern matching with the next character of both s and p.
If the next character of p is ‘*’, then we do a brute force exhaustive matching of 0, 1, or more repeats of current character of p… Until we could not match any more characters.
X. DP
http://cwind.iteye.com/blog/2228723
仍然从待检字符串尾部向前扫描,设0≤j<s.length(),考虑对于子串s[j..s.length()-1]能够在正则表达式p找到匹配(match[j])的条件为s[j+1...s.length()-1]匹配且s[j]也能够在pattern中找到匹配。如何判断“s[j]也能够在pattern中找到匹配”呢?需要分两种情况讨论,设i为pattern索引,第一种情况:若p[i]不为'*',则进行单字符判断,当p[i]=='.'或p[i]==s[j]时match[j]成立;第二种情况:p[i]为"*",则match[j]成立的条件为p[i-1]=='.'或p[i-1]==p[j]。另外,在这种情况下若match[j]已经被置为true,就算p[i-1]=='.'||p[i-1]==p[j]不成立也应将其值保持,因为*出现时,其之前元素可以为0个。
- public boolean isMatch(String /* string to check */ s, String /* patterns */ p) {
- boolean[] match = new boolean[s.length()+1];
- for(int i=0; i<match.length; i++){
- match[i] = false;
- }
- match[s.length()] = true;
- for(int i=p.length()-1; i>=0; i--){
- if(p.charAt(i)=='*'){
- for(int j=s.length()-1; j>=0; j--){
- match[j] = match[j]||match[j+1]&&(p.charAt(i-1)=='.'||p.charAt(i-1)==s.charAt(j));
- }
- i--;
- }else {
- for(int j=0; j<s.length(); j++){
- match[j] = match[j+1]&&(p.charAt(i)=='.'||p.charAt(i)==s.charAt(j));
- }
- match[s.length()] = false;
- }
- }
- return match[0];
- }
X. Recursive
Not good: https://discuss.leetcode.com/topic/2818/the-shortest-ac-code/30
http://www.cnblogs.com/yrbbest/p/4430699.html
public boolean isMatch(String s, String p) { if(p == null || s == null) return false; if (p.length() == 0) return s.length() == 0; if (p.length() == 1) return s.length() == 1 && (p.charAt(0) == '.' || p.charAt(0) == s.charAt(0)) ; if (p.charAt(1) == '*') { if (isMatch(s, p.substring(2))) return true; return s.length() > 0 && (p.charAt(0) == '.' || s.charAt(0) == p.charAt(0)) && isMatch(s.substring(1), p); } else { return s.length() > 0 && (p.charAt(0) == '.' || s.charAt(0) == p.charAt(0)) && isMatch(s.substring(1), p.substring(1)); } }http://articles.leetcode.com/regular-expression-matching
思路2: 双指针扫描
why the regex pattern “.*” matches the string “ab”. “.*” means repeat thepreceding element 0 or more times. Here, the “preceding” element is the dot character in the pattern, which can match any characters. Therefore, the regex pattern “.*” allows the dot to be repeated any number of times, which matches any string (even an empty string).
A sample diagram of a deterministic finite state automata (DFA). DFAs are useful for doing lexical analysis and pattern matching. An example is UNIX’s grep tool. Please note that this post does not attempt to descibe a solution using DFA.
s = “abcbcd”, p = “a.*c.*d”
Here, “.*” in p means repeat ‘.’ 0 or more times. Since ‘.’ can match any character, it is not clear how many times ‘.’ should be repeated. Should the ‘c’ in p matches the first or second ‘c’ in s? Unfortunately, there is no way to tell without using some kind of exhaustive search.
We need some kind of backtracking mechanism such that when a matching fails, we return to the last successful matching state and attempt to match more characters in s with ‘*’. This approach leads naturally to recursion.
The recursion mainly breaks down elegantly to the following two cases:
bool isMatch(const char *s, const char *p) {
assert(s && p);
if (*p == '\0') return *s == '\0';
// next char is not '*': must match current character
if (*(p+1) != '*') {
assert(*p != '*');
return ((*p == *s) || (*p == '.' && *s != '\0')) && isMatch(s+1, p+1);
}
// next char is '*'
while ((*p == *s) || (*p == '.' && *s != '\0')) {
if (isMatch(s, p+2)) return true;
s++;
}
return isMatch(s, p+2);
}
Further Thoughts:
Some extra exercises to this problem:
Some extra exercises to this problem:
- 首先要理解题意:
- "a"对应"a", 这种匹配不解释了
- 任意字母对应".", 这也是正则常见
- 0到多个相同字符x,对应"x*", 比起普通正则,这个地方多出来一个前缀x. x代表的是 相同的字符中取一个,比如"aaaab"对应是"a*b"
- "*"还有一个易于疏忽的地方就是它的"贪婪性"要有一个限度.比如"aaa"对应"a*a", 代码逻辑不能一路贪婪到底
- 正则表达式如果期望着一个字符一个字符的匹配,是非常不现实的.而"匹配"这个问题,非 常容易转换成"匹配了一部分",整个匹配不匹配,要看"剩下的匹配"情况.这就很好的把 一个大的问题转换成了规模较小的问题:递归
- 确定了递归以后,使用java来实现这个问题,会遇到很多和c不一样的地方,因为java对字符 的控制不像c语言指针那么灵活charAt一定要确定某个位置存在才可以使用.
- 如果pattern是"x*"类型的话,那么pattern每次要两个两个的减少.否则,就是一个一个 的减少. 无论怎样减少,都要保证pattern有那么多个.比如s.substring(n), 其中n 最大也就是s.length()
public boolean isMatch(String s, String p) { if (p.length() == 0) { return s.length() == 0; } // length == 1 is the case that is easy to forget. // as p is subtracted 2 each time, so if original // p is odd, then finally it will face the length 1 if (p.length() == 1) { return (s.length() == 1) && (p.charAt(0) == s.charAt(0) || p.charAt(0) == '.'); } // next char is not '*': must match current character if (p.charAt(1) != '*') { if (s.length() < 1) { return false; } else { return (s.charAt(0) == p.charAt(0) || p.charAt(0) == '.') && isMatch(s.substring(1), p.substring(1)); } } // next char is * while (s.length() > 0 && (p.charAt(0) == s.charAt(0) || p.charAt(0) == '.')) { if (isMatch(s, p.substring(2))) { return true; } s = s.substring(1); } return isMatch(s, p.substring(2)); }
上面的解法逻辑性不强,不应该用while循环做,而应该全部用递归,如下
The key of the problem is to check if p[j + 1] is a '*', and has two cases:
1. If p[j + 1] is a '.', then this case is simple. Just need to check s.charAt(i) == p.charAt(j) || p.charAt(j) == '.'. If not, return false, else s and p goes to the next character, ie. i + 1, j + 1.
2. If p[j + 1] is a "*", the case is a bit tricky.
Suppose that if s[i], s[i + 1], s[i + 2] .. s[i + k] is equal to p[j], that means all those could be the possible matches. So we need to check the rest of (i, j + 2), (i + 1, j + 2), (i + 2, j + 2), ... (i + k, j + 2).
1. If p[j + 1] is a '.', then this case is simple. Just need to check s.charAt(i) == p.charAt(j) || p.charAt(j) == '.'. If not, return false, else s and p goes to the next character, ie. i + 1, j + 1.
2. If p[j + 1] is a "*", the case is a bit tricky.
Suppose that if s[i], s[i + 1], s[i + 2] .. s[i + k] is equal to p[j], that means all those could be the possible matches. So we need to check the rest of (i, j + 2), (i + 1, j + 2), (i + 2, j + 2), ... (i + k, j + 2).
public boolean isMatch(String s, String p) { // base case if (p.length() == 0) { return s.length() == 0; } // special case if (p.length() == 1) { // if the length of s is 0, return false if (s.length() < 1) { return false; } //if the first does not match, return false else if ((p.charAt(0) != s.charAt(0)) && (p.charAt(0) != '.')) { return false; } // otherwise, compare the rest of the string of s and p. else { return isMatch(s.substring(1), p.substring(1)); } } // case 1: when the second char of p is not '*' if (p.charAt(1) != '*') { if (s.length() < 1) { return false; } if ((p.charAt(0) != s.charAt(0)) && (p.charAt(0) != '.')) { return false; } else { return isMatch(s.substring(1), p.substring(1)); } } // case 2: when the second char of p is '*', complex case. else { //case 2.1: a char & '*' can stand for 0 element if (isMatch(s, p.substring(2))) { return true; } //case 2.2: a char & '*' can stand for 1 or more preceding element, //so try every sub string int i = 0; while (i<s.length() && (s.charAt(i)==p.charAt(0) || p.charAt(0)=='.')){ if (isMatch(s.substring(i + 1), p.substring(2))) { return true; } i++; } return false; } }http://codeganker.blogspot.com/2014/02/regular-expression-matching-leetcode.html
(1)p[j+1]不是'*'。情况比较简单,只要判断当前s的i和p的j上的字符是否一样(如果有p在j上的字符是'.',也是相同),如果不同,返回false,否则,递归下一层i+1,j+1;
(2)p[j+1]是'*'。那么此时看从s[i]开始的子串,假设s[i],s[i+1],...s[i+k]都等于p[j]那么意味着这些都有可能是合适的匹配,那么递归对于剩下的(i,j+2),(i+1,j+2),...,(i+k,j+2)都要尝试(j+2是因为跳过当前和下一个'*'字符)。
public boolean isMatch(String s, String p) { return helper(s,p,0,0); } private boolean helper(String s, String p, int i, int j) { if(j==p.length()) return i==s.length(); if(j==p.length()-1 || p.charAt(j+1)!='*') { if(i==s.length()|| s.charAt(i)!=p.charAt(j) && p.charAt(j)!='.') return false; else return helper(s,p,i+1,j+1); } //p.charAt(j+1)=='*' while(i<s.length() && (p.charAt(j)=='.' || s.charAt(i)==p.charAt(j))) { if(helper(s,p,i,j+2)) return true; i++; } return helper(s,p,i,j+2); }
假设我们维护一个布尔数组res[i][j],代表s的前i个字符和p的前j个字符是否匹配(注意这里res的维度是s.length()+1,p.length()+1)。递推公式跟上面类似,分三种种情况:
(1)p[j+1]不是'*'。情况比较简单,只要判断如果当前s的i和p的j上的字符一样(如果有p在j上的字符是'.',也是相同),并且res[i][j]==true,则res[i+1][j+1]也为true,res[i+1][j+1]=false;
(2)p[j+1]是'*',但是p[j]!='.'。那么只要以下条件有一个满足即可对res[i+1][j+1]赋值为true:
1)res[i+1][j]为真('*'只取前面字符一次);
2)res[i+1][j-1]为真('*'前面字符一次都不取,也就是忽略这两个字符);
3)res[i][j+1] && s[i]==s[i-1] && s[i-1]==p[j-1](这种情况是相当于i从0到s.length()扫过来,如果p[j+1]对应的字符是‘*’那就意味着接下来的串就可以依次匹配下来,如果下面的字符一直重复,并且就是‘*’前面的那个字符)。
(3)p[j+1]是'*',并且p[j]=='.'。因为".*"可以匹配任意字符串,所以在前面的res[i+1][j-1]或者res[i+1][j]中只要有i+1是true,那么剩下的res[i+1][j+1],res[i+2][j+1],...,res[s.length()][j+1]就都是true了。
public boolean isMatch(String s, String p) { if(s.length()==0 && p.length()==0) return true; if(p.length()==0) return false; boolean[][] res = new boolean[s.length()+1][p.length()+1]; res[0][0] = true; for(int j=0;j<p.length();j++) { if(p.charAt(j)=='*') { if(j>0 && res[0][j-1]) res[0][j+1]=true; if(j<1) continue; if(p.charAt(j-1)!='.') { for(int i=0;i<s.length();i++) { if(res[i+1][j] || j>0&&res[i+1][j-1] || i>0 && j>0 && res[i][j+1]&&s.charAt(i)==s.charAt(i-1)&&s.charAt(i-1)==p.charAt(j-1)) res[i+1][j+1] = true; } } else { int i=0; while(j>0 && i<s.length() && !res[i+1][j-1] && !res[i+1][j]) i++; for(;i<s.length();i++) { res[i+1][j+1] = true; } } } else { for(int i=0;i<s.length();i++) { if(s.charAt(i)==p.charAt(j) || p.charAt(j)=='.') res[i+1][j+1] = res[i][j]; } } } return res[s.length()][p.length()]; }Java code from http://www.programcreek.com/2012/12/leetcode-regular-expression-matching-in-java/
The problem should be simplified to handle 2 basic cases:
- the second char of pattern is "*"
- the second char of pattern is not "*"
For the 1st case, if the first char of pattern is not ".", the first char of pattern and string should be the same. Then continue to match the remaining part.
For the 2nd case, if the first char of pattern is "." or first char of pattern == the first i char of string, continue to match the remaining part.
public boolean isMatch(String s, String p) { if(p.length() == 0) return s.length() == 0; //p's length 1 is special case if(p.length() == 1 || p.charAt(1) != '*'){ if(s.length() < 1 || (p.charAt(0) != '.' && s.charAt(0) != p.charAt(0))) return false; return isMatch(s.substring(1), p.substring(1)); }else{ int len = s.length(); int i = -1; while(i<len && (i < 0 || p.charAt(0) == '.' || p.charAt(0) == s.charAt(i))){ if(isMatch(s.substring(i+1), p.substring(2))) return true; i++; } return false; } }
bool isMatch(const char *s, const char *p) {
assert(s && p);
if (*p == '\0') return *s == '\0';
// next char is not '*': must match current character
if (*(p+1) != '*') {
assert(*p != '*');
return ((*p == *s) || (*p == '.' && *s != '\0')) && isMatch(s+1, p+1);
}
// next char is '*'
while ((*p == *s) || (*p == '.' && *s != '\0')) {
if (isMatch(s, p+2)) return true;
s++;
}
return isMatch(s, p+2);
}
EPI Solution:Implement regular expression matching
Regular_expression.cpp RegularExpression.javapublic static boolean isMatch(String r, String s) {
// Case (2.) : starts with '^'.
if (r.charAt(0) == '^') {
return isMatchHere(r.substring(1), s);
}
for (int i = 0; i <= s.length(); ++i) {
if (isMatchHere(r, s.substring(i))) {
return true;
}
}
return false;
}
private static boolean isMatchHere(String r, String s) {
// Case (1.)
if (r.isEmpty()) {
return true;
}
// Case (2) : ends with '$'.
if ("$".equals(r)) {
return s.isEmpty();
}
// Case (4.)
if (r.length() >= 2 && r.charAt(1) == '*') {
for (int i = 0; i < s.length()
&& (r.charAt(0) == '.' || r.charAt(0) == s.charAt(i)); ++i) {
if (isMatchHere(r.substring(2), s.substring(i + 1))) {
return true;
}
}
return isMatchHere(r.substring(2), s);
}
// Case (3.)
return !s.isEmpty() && (r.charAt(0) == '.' || r.charAt(0) == s.charAt(0))
&& isMatchHere(r.substring(1), s.substring(1));
}
https://discuss.leetcode.com/topic/54908/evolve-from-brute-force-to-dp
- Recursion. This is the most straight forward idea.
bool isMatch(string s, string p) {
return isMatch(0,s,0,p);
}
bool isMatch(int i, string& s, int j, string &p) {
int pn=p.size(), sn = s.size();
if(j==pn) return i==sn;
if(p[j+1]=='*') {
if(isMatch(i,s,j+2,p)) return 1;
while(i<sn && (p[j]==s[i]||p[j]=='.')) if(isMatch(++i,s,j+2,p)) return 1;
} else if (i<sn && (p[j]=='.'|| s[i]==p[j]) && isMatch(i+1,s,j+1,p)) return 1;
return 0;
}
Time complexity is roughly O(m+n choose n).
Say n is length of s, m is length of p. In the worst case,
T(n,m) = T(n,m-2)+T(n-1,m-2)+T(n-2,m-2)+...+T(1, m-2)
T(n-1,m) = T(n-1,m-2)+T(n-2,m-2)+...+T(1,m-2)
=> T(n,m) = T(n-1,m) + T(n,m-2)
Initial conditions are T(0,m) = m, T(n,0) = 1
To solve the 2 variable recursion, see here
Say n is length of s, m is length of p. In the worst case,
T(n,m) = T(n,m-2)+T(n-1,m-2)+T(n-2,m-2)+...+T(1, m-2)
T(n-1,m) = T(n-1,m-2)+T(n-2,m-2)+...+T(1,m-2)
=> T(n,m) = T(n-1,m) + T(n,m-2)
Initial conditions are T(0,m) = m, T(n,0) = 1
To solve the 2 variable recursion, see here
TODO: http://www.cnblogs.com/yrbbest/p/4430699.html
我们先来判断p是否为空,若为空则根据s的为空的情况返回结果。当p的第二个字符为*号时,由于*号前面的字符的个数可以任意,可以为0,那么我们先用递归来调用为0的情况,就是直接把这两个字符去掉再比较,或者当s不为空,且第一个字符和p的第一个字符相同时,我们再对去掉首字符的s和p调用递归,注意p不能去掉首字符,因为*号前面的字符可以有无限个;如果第二个字符不为*号,那么我们就老老实实的比较第一个字符,然后对后面的字符串调用递归
bool isMatch(string s, string p) { if (p.empty()) return s.empty(); if (p.size() > 1 && p[1] == '*') { return isMatch(s, p.substr(2)) || (!s.empty() && (s[0] == p[0] || p[0] == '.') && isMatch(s.substr(1), p)); } else { return !s.empty() && (s[0] == p[0] || p[0] == '.') && isMatch(s.substr(1), p.substr(1)); }
public boolean isMatch(String s, String p) {
if (p == null || p.length() == 0)
return (s == null || s.length() == 0);
if (p.length() == 1 || p.charAt(1) != '*')
return (s.length() > 0 && (s.charAt(0) == p.charAt(0) || p.charAt(0) == '.') &&
isMatch(s.substring(1), p.substring(1)));
return (isMatch(s, p.substring(2)) ||
(s.length() > 0 && (s.charAt(0) == p.charAt(0) || p.charAt(0) == '.') && isMatch(s.substring(1), p)));
}
public boolean isMatch(String text, String pattern) {
if (pattern.isEmpty())
return text.isEmpty();
boolean first_match = (!text.isEmpty() && (pattern.charAt(0) == text.charAt(0) || pattern.charAt(0) == '.'));
if (pattern.length() >= 2 && pattern.charAt(1) == '*') {
return (isMatch(text, pattern.substring(2)) || (first_match && isMatch(text.substring(1), pattern)));
} else {
return first_match && isMatch(text.substring(1), pattern.substring(1));
}
}
https://blog.csdn.net/roufoo/article/details/82659855
enum Result {
TRUE, FALSE
}
class Solution {
Result[][] memo;
public boolean isMatch(String text, String pattern) {
memo = new Result[text.length() + 1][pattern.length() + 1];
return dp(0, 0, text, pattern);
}
public boolean dp(int i, int j, String text, String pattern) {
if (memo[i][j] != null) {
return memo[i][j];
}
boolean ans;
if (j == pattern.length()) {
ans = i == text.length();
} else {
boolean first_match = (i < text.length() && (pattern.charAt(j) == text.charAt(i) || pattern.charAt(j) == '.'));
if (j + 1 < pattern.length() && pattern.charAt(j + 1) == '*') {
ans = (dp(i, j + 2, text, pattern) || first_match && dp(i + 1, j, text, pattern));
} else {
ans = first_match && dp(i + 1, j + 1, text, pattern);
}
}
memo[i][j] = ans ? Result.TRUE : Result.FALSE;
return ans;
}
}
- Time Complexity: Let be the lengths of the text and the pattern respectively. In the worst case, a call to
match(text[i:], pattern[2j:])
will be made times, and strings of the order and will be made. Thus, the complexity has the order . With some effort outside the scope of this article, we can show this is bounded by ). - Space Complexity: For every call to
match
, we will create those strings as described above, possibly creating duplicates. If memory is not freed, this will also take a total of ) space, even though there are only order unique suffixes of and that are actually required.
public boolean isMatch(String text, String pattern) {
if (pattern.isEmpty())
return text.isEmpty();
boolean first_match = (!text.isEmpty() && (pattern.charAt(0) == text.charAt(0) || pattern.charAt(0) == '.'));
if (pattern.length() >= 2 && pattern.charAt(1) == '*') {
return (isMatch(text, pattern.substring(2)) || (first_match && isMatch(text.substring(1), pattern)));
} else {
return first_match && isMatch(text.substring(1), pattern.substring(1));
}
}
1. 不不考虑星号
2. 楼主想⽤用dp,但是他坚持要先写 dfs / recurrsion
3. regex变成“+”,“*”(我的妈,上来我⼀一看题就知道要挂…我⽤用dp写的,
但是天竺⼩小哥说最好⽤用recursion..)