leetcode 1520

问题

Given a string s of lowercase letters, you need to find the maximum number of non-empty substrings of s that meet the following conditions:

  1. The substrings do not overlap, that is for any two substrings s[i..j] and s[k..l], either j < k or i > l is true.
  2. A substring that contains a certain character c must also contain all occurrences of c.

Find the maximum number of substrings that meet the above conditions. If there are multiple solutions with the same number of substrings, return the one with minimum total length. It can be shown that there exists a unique solution of minimum total length.

Notice that you can return the substrings in any order.

Example 1:

1
2
3
4
5
6
7
8
9
10
11
12
Input: s = "adefaddaccc"
Output: ["e","f","ccc"]
Explanation: The following are all the possible substrings that meet the conditions:
[
"adefaddaccc"
"adefadda",
"ef",
"e",
"f",
"ccc",
]
If we choose the first string, we cannot choose anything else and we'd get only 1. If we choose "adefadda", we are left with "ccc" which is the only one that doesn't overlap, thus obtaining 2 substrings. Notice also, that it's not optimal to choose "ef" since it can be split into two. Therefore, the optimal way is to choose ["e","f","ccc"] which gives us 3 substrings. No other solution of the same number of substrings exist.

Example 2:

1
2
3
Input: s = "abbaccd"
Output: ["d","bb","cc"]
Explanation: Notice that while the set of substrings ["d","abba","cc"] also has length 3, it's considered incorrect since it has larger total length.

Constraints:

  • 1 <= s.length <= 10^5
  • s contains only lowercase English letters.

分析

C++ Greedy O(n))

保存每个字符出现的最左端和最右端的下标。

  • 如果是找到了一段合法的 substring,同时该 substring 中包含了更短的符合要求的 substring,则忽略掉长的substring。
    • 如找到了 “abccba”,其中包含 “bccb”,又包含 “cc”,则保留 “cc” ,忽略其他。
  1. First, collect left (l) and right(r) indices for every character.
  2. Then, go through the string. If the current index i is the left index for the character S[i], it may be a valid substring.
    • We then use a sliding window pattern to find the lendth of the substring.
      • If we find a character that appears before i - we do not have a valid string starting at i.
    • If substring is valid, we remember it, as well as it’s right edge.
    • This logic, repeated, will ensure that we find the valid substring with the smallest right edge.
  3. If the valid substring starts after the previous right edge, there is no overlap. The previous substring should be included into the result.

最快情况下查找26次字串,每一次查找的时间复杂度是 $O(n)$ ,因此整体的时间复杂度是 $O(n)$ 。

空间复杂度,只用额外空间记录了每个字符最左端和最右端的下标,因此时间复杂度是 $O(1)$。

代码

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
class Solution {
public:
int getRight(string &s, int i, vector<int> &l, vector<int> &r) {
int right = r[s[i]-'a'];
for (int j = i+1; j < right; ++j) {
if (l[s[j]-'a'] < i) return -1;
right = max(right, r[s[j]-'a']);
}
return right;
}
vector<string> maxNumOfSubstrings(string s) {
vector<int> l(26, INT_MAX);
vector<int> r(26, INT_MIN);

for (int i = 0; i < s.size(); ++i) {
l[s[i]-'a'] = min(l[s[i]-'a'], i);
r[s[i]-'a'] = max(r[s[i]-'a'], i);
}

vector<string> ans;
int right = -1;
for (int i = 0; i < s.size(); ++i) {
if (i == l[s[i]-'a']) {
int new_right = getRight(s, i, l, r);
if (new_right != -1) {
if (i > right)
ans.push_back("");
right = new_right;
ans.back() = s.substr(i, right-i+1);
}
}
}
return ans;
}
};