LeetCode 828 - Unique Letter String


https://leetcode.com/problems/unique-letter-string/
A character is unique in string S if it occurs exactly once in it.
For example, in string S = "LETTER", the only unique characters are "L" and "R".
Let's define UNIQ(S) as the number of unique characters in string S.
For example, UNIQ("LETTER") =  2.
Given a string S with only uppercases, calculate the sum of UNIQ(substring) over all non-empty substrings of S.
If there are two or more equal substrings at different positions in S, we consider them different.
Since the answer can be very large, return the answer modulo 10 ^ 9 + 7.

Example 1:
Input: "ABC"
Output: 10
Explanation: All possible substrings are: "A","B","C","AB","BC" and "ABC".
Evey substring is composed with only unique letters.
Sum of lengths of all substring is 1 + 1 + 1 + 2 + 2 + 3 = 10
Example 2:
Input: "ABA"
Output: 8
Explanation: The same as example 1, except uni("ABA") = 1.

Note: 0 <= S.length <= 10000
X.
肯定不能一个个substring来做,换个角度,求每个字符的有效substring个数
https://www.cnblogs.com/f91og/p/9710808.html
首先以字符串XAXAXXAX为例,如果将第二个 A 变成唯一的character的话,只可能时某个唯一的substring包含了这个A。比如:
We can take "XA(XAXX)AX" and between "()" is our substring.
利用这种方式,可以使得第二个A成为唯一的character,那么从上可知我们要做的是:
We can see here, to make the second "A" counted as a uniq character, we need to:
  1. insert "(" somewhere between the first and second A
  2. insert ")" somewhere between the second and third A
For step 1 we have "A(XA" and "AX(A", 2 possibility.
For step 2 we have "A)XXA""AX)XA" and "AXX)A", 3 possibilities.
So there are in total 2 * 3 = 6 ways to make the second A a unique character in a substring.
In other words, there are only 6 substring, in which this A contribute 1 point as unique string.
现在逆转下思维,一开始的思维是在原字符串的所有子串中寻找可能的唯一的character,我们现在就直接在S中来计算每个字符,看看对于每个unique char总公有多少种不同的找法。换而言之就是,对于S中的每个字符,如果他作为unique char的话,那么包含他的substring的范围是在前一个相同字符以及后一个相同字符这个区间内找的(参见上面的A),将所有可能的substring数量加起来即可,很有趣的逆向思维。
https://leetcode.com/problems/unique-letter-string/discuss/128952/One-pass-O(N)-Straight-Forward
Let's think about how a character can be found as a unique character.
Think about string "XAXAXXAX" and focus on making the second "A" a unique character.
We can take "XA(XAXX)AX" and between "()" is our substring.
We can see here, to make the second "A" counted as a uniq character, we need to:
  1. insert "(" somewhere between the first and second A
  2. insert ")" somewhere between the second and third A
For step 1 we have "A(XA" and "AX(A", 2 possibility.
For step 2 we have "A)XXA""AX)XA" and "AXX)A", 3 possibilities.
So there are in total 2 * 3 = 6 ways to make the second A a unique character in a substring.
In other words, there are only 6 substring, in which this A contribute 1 point as unique string.
Instead of counting all unique characters and struggling with all possible substrings,
we can count for every char in S, how many ways to be found as a unique char.
We count and sum, and it will be out answer.
Explanation:
  1. index[26][2] record last two occurrence index for every upper characters.
  2. Initialise all values in index to -1.
  3. Loop on string S, for every character c, update its last two occurrence index to index[c].
  4. Count when loop. For example, if "A" appears twice at index 3, 6, 9 seperately, we need to count:
    • For the first "A": (6-3) * (3-(-1))"
    • For the second "A": (9-6) * (6-3)"
    • For the third "A": (N-9) * (9-6)"
Complexity:
One pass, time complexity O(N).
    public int uniqueLetterString(String S) {
        int[][] index = new int[26][2];
        for (int i = 0; i < 26; ++i) Arrays.fill(index[i], -1);
        int res = 0, N = S.length(), mod = (int)Math.pow(10, 9) + 7;
        for (int i = 0; i < N; ++i) {
            int c = S.charAt(i) - 'A';
            res = (res + (i - index[c][1]) * (index[c][1] - index[c][0]) % mod) % mod;
            index[c] = new int[] {index[c][1], i};
        }
        for (int c = 0; c < 26; ++c)
            res = (res + (N - index[c][1]) * (index[c][1] - index[c][0]) % mod) % mod;
        return res;
    }


X.
https://leetcode.com/problems/unique-letter-string/discuss/129021/O(N)-Java-Solution-DP-Clear-and-easy-to-Understand
n each loop, We caculate cur[i], which represent the sum of Uniq() for all substrings whose last char is S.charAt(i).
For example,
S = 'ABCBD'
When i = 2, cur[2] = Uniq('ABC') + Uniq('BC') + Uniq('C')
When i = 3, cur[3] = Uniq('ABCB') + Uniq('BCB') + Uniq('CB') + Uniq('B')
Notice, we append char 'B' into each previous substrings. Only in substrings 'CB' and 'B', the char 'B' can be identified as uniq. The contribution of 'B' from cur[2] to cur[3] is i - showLastPosition['B']. At the same time, in substrings 'ABCB', 'BCB', the char 'B' can‘t’ be identified as uniq any more, the previous contribution of 'B' should be removed.
So we have'cur[i] = cur[i - 1] - contribution[S.charAt(i)] + (i - showLastPosition[S.charAt(i)])
Then the new contribution[S.charAt(i)] = i - showLastPosition[S.charAt(i)]
The final result is the sum of all cur[i].
The showLastPosition[x] represent the last showing positon of char x befor i. (The initial showLastPosition[x] for all chars should be '-1'. For convenience, I shift it to '0'. It may cause some confusion, notice I update "showLastPosition[x] = i + 1" for each loop.)
    public int uniqueLetterString(String S) {
        
        int res = 0;
        if (S == null || S.length() == 0)
            return res;    
        int[] showLastPosition = new int[128];
        int[] contribution = new int[128];
        int cur = 0;
        for (int i = 0; i < S.length(); i++) {
            char x = S.charAt(i);
            cur -= contribution[x]; 
            contribution[x] = (i - (showLastPosition[x] - 1));
            cur += contribution[x]; 
            showLastPosition[x] = i + 1;
            res += cur;
        }   
        return res;
    }

X.
public int uniqueLetterString(String S) { //上次出现位置 int[] lastAppearPos = new int[26]; //上上次出现位置 int[] lastLastAppearPos = new int[26]; long[] res = new long[S.length() + 1]; for (int i = 1; i < res.length; i++) { int letter = S.charAt(i - 1) - 'A'; res[i] = res[i - 1]; res[i] += (i - lastAppearPos[letter]); res[i]-=(lastAppearPos[letter]-lastLastAppearPos[letter]); lastLastAppearPos[letter]=lastAppearPos[letter]; lastAppearPos[letter]=i; if(i>>8==0) { res[i]%=1000000007; } } long ans=0; for(int i=0;i<res.length;i++)ans+=res[i]; return (int) (ans%1000000007); }

https://leetcode.com/problems/unique-letter-string/discuss/158378/Concise-DP-O(n)-solution
Let dp[i] is sum of unique char in all substring ending at i, then the answer is sum(dp[i]), i=[0..n-1]. It's not difficult to get the recursive formula:
dp[i] = dp[i-1] + (i - first_from_i(s[i])) - (first_from_i(s[i]) - second_from_i(s[i]))
Take the below example:
BBBBBBBBBBBBBBBBBOABCDOABCOABC
                 ^    ^   ^
                 s    f   i

dp[i] = dp[i-1] + (i-f) - (f-s)
When extending s[i] to all substrings ending with s[i-1], there are (i-f) more unique char s[i], and (f-s) less unique char because of duplicate of s[i].
The implementation is quite simple. Here is Java version:
    public int uniqueLetterString(String s) {
            final int MOD = 1000000007;
            int res = 0, dp = 0;
            int[] first = new int[26], second = new int[26];

            for (int i = 0; i < s.length(); i++) {
                int ci = s.charAt(i) - 'A';
                dp = dp + 1 + i - first[ci] * 2 + second[ci];
                res = (res + dp) % MOD;

                second[ci] = first[ci];
                first[ci] = i + 1;
            }

            return res;
    }
X. http://bookshadow.com/weblog/2018/05/06/leetcode-unique-letter-string/
字符统计
分别统计每个字母出现的下标

假设字母letter的下标数组为idx,将-1和len(S)插入idx的头部和尾部

则sum((idx[i] - idx[i - 1]) * (idx[i + 1] - idx[i]))为letter出现的总次数
X. https://leetcode.com/articles/unique-letter-string/
Approach #2: Split by Character [Accepted]
As in Approach #1, we have U(x) = \sum_{c \in \mathcal{A}} U_c(x), where \mathcal{A} = \{ \text{&quot;A&quot;}, \text{&quot;B&quot;}, \dots \} is the alphabet, and we only need to answer the following question (26 times, one for each character): how many substrings have exactly one \text{&quot;A&quot;}?
Consider how many substrings have a specific \text{&quot;A&quot;}. For example, let's say S only has three "A"'s, at 'S[10] = S[14] = S[20] = "A"; and we want to know the number of substrings that contain S[14]. The answer is that there are 4 choices for the left boundary of the substring (11, 12, 13, 14), and 6 choices for the right boundary (14, 15, 16, 17, 18, 19). So in total, there are 24 substrings that have S[14] as their unique "A".
Continuing our example, if we wanted to count the number of substrings that have S[10], this would be 10 * 4 - note that when there is no more "A" characters to the left of S[10], we have to count up to the left edge of the string.
We can add up all these possibilities to get our final answer.
Time Complexity: O(N), where N is the length of S

  public int uniqueLetterString(String S) {
    Map<Character, List<Integer>> index = new HashMap();
    for (int i = 0; i < S.length(); ++i) {
      char c = S.charAt(i);
      index.computeIfAbsent(c, x -> new ArrayList<Integer>()).add(i);
    }

    long ans = 0;
    for (List<Integer> A : index.values()) {
      for (int i = 0; i < A.size(); ++i) {
        long prev = i > 0 ? A.get(i - 1) : -1;
        long next = i < A.size() - 1 ? A.get(i + 1) : S.length();
        ans += (A.get(i) - prev) * (next - A.get(i));
      }
    }

    return (int) ans % 1_000_000_007;

  }

https://leetcode.com/problems/unique-letter-string/discuss/172041/Very-simple-O(N)-with-Prev-and-Next
Just save the prev and next occurrence of the char at the current index for all indices and then with simple math just calculate the number of substrings in which the current char is unique.
int uniqueLetterString(string S) {
        int n = S.size();
        unordered_map<char, int> Last; //index of last occurrence of char
        
        vector<int> Next(n, n); //index of next occurrence of S[i]
        vector<int> Prev(n, 0); //index of prev occurrence of S[i]
        
        for (int i = 0; i < n; ++i) {
            char c = S[i];
            if (Last.count(c)) {
                Prev[i] = Last[c] + 1;
                Next[Last[c]] = i;   
            }
            Last[c] = i;
        }
        
        int res = 0;
        for (int i = 0; i < n; ++i) {
        // the following is the number of substrings
        // in which the current char is unique
            res += ((i + 1) - Prev[i]) * (Next[i] - i);
        }
        return res % 1000000007;
    }
X. Left, right array
https://www.acwing.com/solution/leetcode/content/623/
预处理每个位置 i,左边第一个和自己一样字符的位置 left,右边第一个和自己一样字符的位置 right。该位置 i 贡献的答案就是 (i - left) * (right - i)。
对每个位置进行统计。
    int uniqueLetterString(string S) {
        const int mod = 1000000007;

        vector<int> last(26, -1);
        int n = S.length();

        vector<int> left(n, -1), right(n, n);

        for (int i = 0; i < n; i++) {
            left[i] = last[S[i] - 'A'];
            last[S[i] - 'A'] = i;
        }

        last = vector<int>(26, n);

        for (int i = n - 1; i >= 0; i--) {
            right[i] = last[S[i] - 'A'];
            last[S[i] - 'A'] = i;
        }

        int ans = 0;
        for (int i = 0; i < n; i++)
            ans = (ans + (LL)(i - left[i]) * (right[i] - i)) % mod;

        return ans;
    }
X. http://tashi711.xyz/programming/reports/leetcode/leetcode-828/
比较简单的一道题目,考虑每个单独的字母对于最终答案的贡献。如果某个字母a对答案有贡献,那么包含他的子串一定在从这个a往两边延伸到边界或者另一个a之内位置的范围内。那么总共可能的子串有u*v个,其中u、v分别为延伸到边界或者另一个a之内位置的长度。对每个’A’到’Z’的字母a,找到每个a的位置,计算往外延伸的长度,乘起来累加到答案即可。

复杂度分析

  时间复杂度为 O(NM),其中M为字母个数,可以认为是常数26。
  空间复杂度为 O(N)
 int uniqueLetterString(string S) {
  return work(S);
 }

 int work(const string& s) {
  int n = static_cast<int>(s.size());
  long long ans = 0;
  for (char i = 'A'; i <= 'Z'; ++i) {
   for (int j = 0; j < n; ++j) {
    if (s[j] == i) {
     int cnt_pre = j + 1, cnt_next = 0;
     for (int k = j + 1; k < n; ++k) {
      ++cnt_next;
      if (s[k] == i) {
       ans = (ans + cnt_pre * cnt_next) % kM;
       cnt_pre = cnt_next;
       cnt_next = 0;
      }
     }
     ans = (ans + cnt_pre * (cnt_next + 1)) % kM;
     break;
    }
   }
  }
  return ans;
 }

X. https://www.gjxhlan.me/2018/05/09/leetcode-contest-83-solution/
因为只可能有大写字母,用三个指针数组 left, mid, right[26] 来保存某一个字母连续的三个位置,从左到右扫一遍,每次遇到一个字母,更新一次 left = mid,mid = right, right = i,并且将 mid 所在位置的字母的贡献存入 ans,因为 mid 所在位置的字母无法对包含 right 位置右边字符的子串产生影响(unique)。注意最后需要更新一次 ans。
    public int uniqueLetterString(String S) {
        if (S == null || S.length() == 0) return 0; 
        
        long ans = 0;
        final long cons = (long)(1e9 + 7);
        
        int[] left = new int[26], mid = new int[26];
        int[] right = new int[26];
        
        int l = S.length();
        for (int i = 0; i < left.length; i++) left[i] = mid[i] = right[i] = -1;
        
        for (int i = 0; i < l; i++) {
            int c = S.charAt(i) - 'A';
            left[c] = mid[c];
            mid[c] = right[c];
            right[c] = i;
            if (mid[c] != -1) {
                ans = (ans + (mid[c] - left[c]) * (right[c] - mid[c])) % cons;
            }
        }
        
        for (int i = 0; i < 26; i++) {
            if (right[i] != -1) {
                ans = (ans + (l - right[i]) * (right[i] - mid[i])) % cons;
            }
        }
        
        return (int)ans;
    }
X. Two Pointers
https://leetcode.com/problems/unique-letter-string/discuss/129246/Simple-Java-2-Pointer
    int MOD = 1_000_000_007;
    
    public int uniqueLetterString(String s) {
        
        int result = 0;
        
        for(char ch = 'A';ch <= 'Z'; ++ch){
            result = (result + uniqueSubstrings(s,ch))%MOD;
        }
        
        return result;
                
    }
    
    int uniqueSubstrings(String s, char ch){
        int l = 0;
        int prevChIndex = -1;
        int cnt = 0;
        int result = 0;
        
        for(int r=0;r<s.length();++r){
            if(s.charAt(r) == ch){                
                prevChIndex = r;
                cnt++;
            }
            
           
            while(cnt==2){
                if(s.charAt(l)==ch){
                    cnt--;                        
                }
                l++;
            }
        
            
            if(cnt==1){
                result = (result + (prevChIndex + 1 - l)) % MOD;
            }
        }
                
        
        return result;
    }

X. DP
Let dp[i] is sum of unique char in all substring ending at i, then the answer is sum(dp[i]), i=[0..n-1]. It's not difficult to get the recursive formula:
dp[i] = dp[i-1] + (i - first_from_i(s[i])) - (first_from_i(s[i]) - second_from_i(s[i]))
Take the below example:
BBBBBBBBBBBBBBBBBOABCDOABCOABC
                 ^    ^   ^
                 s    f   i

dp[i] = dp[i-1] + (i-f) - (f-s)
When extending s[i] to all substrings ending with s[i-1], there are (i-f) more unique char s[i], and (f-s) less unique char because of duplicate of s[i]
    public int uniqueLetterString(String s) {
            final int MOD = 1000000007;
            int res = 0, dp = 0;
            int[] first = new int[26], second = new int[26];

            for (int i = 0; i < s.length(); i++) {
                int ci = s.charAt(i) - 'A';
                dp = dp + 1 + i - first[ci] * 2 + second[ci];
                res = (res + dp) % MOD;

                second[ci] = first[ci];
                first[ci] = i + 1;
            }

            return res;
    }

Approach #1: Maintain Answer of Suffix [Accepted]


Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts