0%

10.正则表达式匹配

Given an input string (s) and a pattern (p), implement regular expression matching with support for ‘.’ and ‘*’.

‘.’ Matches any single character.
‘*’ Matches zero or more of the preceding element.
The matching should cover the entire input string (not partial).

Note:

s could be empty and contains only lowercase letters a-z.
p could be empty and contains only lowercase letters a-z, and characters like . or *.
Example 1:

Input:
s = “aa”
p = “a”
Output: false
Explanation: “a” does not match the entire string “aa”.
Example 2:

Input:
s = “aa”
p = “a*”
Output: true
Explanation: ‘*’ means zero or more of the preceding element, ‘a’. Therefore, by repeating ‘a’ once, it becomes “aa”.
Example 3:

Input:
s = “ab”
p = “.
Output: true
Explanation: “.
“ means “zero or more (*) of any character (.)”.
Example 4:

Input:
s = “aab”
p = “cab”
Output: true
Explanation: c can be repeated 0 times, a can be repeated 1 time. Therefore, it matches “aab”.
Example 5:

Input:
s = “mississippi”
p = “misisp*.”
Output: false

来源:力扣(LeetCode)
链接:https://leetcode-cn.com/problems/regular-expression-matching
著作权归领扣网络所有。商业转载请联系官方授权,非商业转载请注明出处。


框架

1
2
3
4
5
6
class Solution {
public:
bool isMatch(string s, string p) {

}
};

1. 直接遍历

O(n)
失败了……
本来觉得还算简单来着,但是这个 *着实气人,它想匹配多少就匹配多少,连 a*a都有。。
下面的代码是错的,没有考虑这种智障输入。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
class Solution {
public:
bool isMatch(string s, string p) {
int ni = s.size();
int nj = p.size();
int i = 0, j = 0;
char pred = ' ';

while(i < ni && j < nj){
if (p[j] == '.') {
if(p[j + 1] == '*'){ // ".*" matches everything
i = ni;
j++;
}
else
i++;
}else if (p[j] == '*') {
while(i < ni && s[i] == pred)
i++;
}else{
if (p[j + 1] == '*'){
while(i < ni && s[i] == p[j])
i++;
j++;
}else{
if (s[i] == p[j])
i++;
else
return false;
}
}

j++;
pred = s[i - 1];
}

while(i == ni && j < nj){
if (p[j] == '.') {
if(p[j + 1] == '*')
j += 2;
else
return false;
}else if (p[j] == '*')
j++;
else
return false;
}

if (i < ni)
return false;

return true;
}
};

2. 动态规划

因为不能确定 *到底匹配多少个前一个字符,所以可以都测试一下。
bool dp[i][j]用来代表 s的前 i个字符和 p的前 j个字符的匹配关系,也就是 s[0 ~ i-1]p[0 ~ j-1]
那么 dp[i][j]就可以依靠前面 dp[][]True/False决定,表示后半部分匹配,结果由前半部分的匹配情况确定。

以下讨论 dp[i][j],即 s[0 ~ i-1]p[0 ~ j-1]的匹配情况。

  • s[i-1] == p[j-1]
    dp[i][j] = dp[i-1][j-1]说明 p中肯定是字母,并且二者在该位上匹配,所以 s[0 ~ i-1], p[0 ~ j-1]这部分是否匹配取决于 dp[i-1][j-1]
  • p[j-1] == '.'
    dp[i][j] = dp[i-1][j-1]``.可以匹配任何字母,因此二者在该位上也是匹配的,所以如上。
  • p[j-1] == '*'
    *可以匹配0次或多(>0)次在 *左侧的那个字符。
    • s[i-1] != p[j-1-1] && p[j-1-1] != '.'dp[i][j] = dp[i][j-2]
      无法匹配,只能匹配0次。
    • else / s[i-1] == p[j-1-1] || p[j-1-1] == '.'
      可以匹配,可能匹配0次或多(>0)次。
      • 匹配0次
        dp[i][j] = dp[i][j-2]即便 s[i-1] == p[j-1-1]也是有可能匹配0次的,比如 s = abc, p = 'abcc*'。匹配0次相当于这个字符和 *都不存在了,因此跳过这两个字符,如 s = ab, p = abc*
      • 匹配多(>0)次
        dp[i][j] = dp[i-1][j]
        匹配 k次依赖于匹配 k-1次,后者又依赖于 k-2次,因此最终就变成了匹配0次,所以步骤就是依赖于 dp[i-1][j]
  • else / s[i-1] != p[j-1] && p[j-1] != '.' && p[j-1] != '*'
    dp[i][j] = false
    不匹配。

初始化:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
class Solution {
public:
bool isMatch(string s, string p) {
int ns = s.size();
int np = p.size();
if(ns == 0){
if(np % 2 == 0){
for(int i = 1; i < np; i += 2)
if(p[i] != '*')
return false;
return true;
}else
return false;
}
if(np == 0 && ns != 0)
return false;

bool **dp = new bool*[ns + 1];
for(int i = 0; i <= ns; i++)
dp[i] = new bool[np + 1];

//initialization
dp[0][0] = true;
dp[0][1] = false;
for(int i = 2; i <= np; i++)
dp[0][i] = dp[0][i-2] && p[i-1] == '*';
for(int i = 1; i <= ns; i++)
dp[i][0] = false;

//dp
for(int i = 1; i <= ns; i++){
for(int j = 1; j <= np; j++){
if(s[i-1] == p[j-1] || p[j-1] == '.')
dp[i][j] = dp[i-1][j-1];
else if(p[j-1] == '*'){
if(s[i-1] != p[j-2] && p[j-2] != '.')
dp[i][j] = dp[i][j-2];
else
dp[i][j] = dp[i][j-2] || dp[i-1][j];
}else
dp[i][j] = false;
}
}

bool ans = dp[ns][np];
for(int i = 0; i < ns + 1; i++)
delete []dp[i];
delete []dp;

return ans;
}
};