leetcode 727

问题

Given strings S and T, find the minimum (contiguous) substring W of S, so that T is a subsequence of W.

If there is no such window in S that covers all characters in T, return the empty string "". If there are multiple such minimum-length windows, return the one with the left-most starting index.

Example 1:

1
2
3
4
5
6
Input: 
S = "abcdebdde", T = "bde"
Output: "bcde"
Explanation:
"bcde" is the answer because it occurs before "bdde" which has the same length.
"deb" is not a smaller window because the elements of T in the window must occur in order.

Note:

  • All the strings in the input will only contain lowercase letters.
  • The length of S will be in the range [1, 20000].
  • The length of T will be in the range [1, 100].

分析1

以公共子序列为参考,初步分析可以使用动态规划的解法。假设 $dp[i][j] = -1$ 时表示 $T[0…i]$ 不是 $S[0…j]$ 的子序列。若 $dp[i][j] \neq 0$ 时,$T[0…i]$ 是 $S[0…j]$ 的子序列,且$dp[i][j] = x$ 表示在 $T[0…i]$ 是 $S[x…j]$ 的子序列的情况下,$x$ 取到的最大值。则有转移方程如下:
$$
dp[i][j] = \left\{
\begin{array}{**lr**}
dp[i-1][j-1], & (T[i] == S[j] \;\&\&\; dp[i-1][j-1] \neq -1) \\
dp[i][j-1], \; &(dp[i][j-1] \neq -1)\\
-1 & else
\end{array}
\right.
$$
为了保证 $dp[i][j]$ 取到尽量大的值,在状态转移时,应该按上上述顺序从上到下的优先级进行状态转移。

目标:

$$
len = \min\limits_{0 \leq j <n }\{ j-dp[m-1][j]+1\}\;(dp[m-1][j] \neq -1) \\
start = dp[m-1][j] \;\;(dp[m-1][j] 使得len取得最小值时) \\
ans = S.substr(start, \;len);
$$

代码1

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
class Solution {
public:
string minWindow(string S, string T) {
int m = T.size();
int n = S.size();
vector<vector<int>> dp(m, vector<int>(n, -1));

if (S[0] == T[0])
dp[0][0] = 0;
for (int i = 1; i < n; ++i) {
dp[0][i] = T[0] == S[i]? i: dp[0][i-1];
}

for (int i = 1; i < m; ++i) {
for (int j = 1; j < n; ++j) {
if (dp[i-1][j-1] != -1 && T[i] == S[j])
dp[i][j] = dp[i-1][j-1];
else if (dp[i][j-1] != -1)
dp[i][j] = dp[i][j-1];
}
}
int len = INT_MAX;
int start = -1;
for (int j = 0; j < n; ++j) {
if (dp[m-1][j] != -1 && len > j - dp[m-1][j] +1) {
start = dp[m-1][j];
len = j - dp[m-1][j] +1;
}
}
return start == -1? "" : S.substr(start, len);
}
};

分析2

一个字符串时候是另一个字符串的子序列的问题解法,时间复杂度是 $O(m+n)$ ,借助这个问题的解法。在此基础上,没进行一次成功的匹配,则将两个字符串指针都进行回退,从而在 $S$ 中找到能够实现此次匹配的最大起始下标,即可以实现本次匹配的最短子串。时间复杂度最坏应该是 $O(mn)$ ,但在实际过程中要比这个值小得多,在leetcode使用cpp测试,直接上上面的 128ms 降到 12ms

代码2

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
class Solution {
public:
string minWindow(string S, string T) {
int n = S.size();
int m = T.size();
int i = 0, j = 0, start = -1, len = n+1;
while (i < n) {
if (S[i] == T[j]) {
if (j == m-1) {
int p = i+1;
while (j >= 0) {
while (S[i] != T[j])
--i;
--i; // note
--j;
}
++i;
++j;
if (len > p-i) {
start = i;
len = p-i;
}
} else {
++j;
}
}
++i;
}
return start==-1? "": S.substr(start, len);
}
};