LeetCode 340 - Longest Substring with At Most K Distinct Characters


(Leetcode) Longest Substring with At Most Two Distinct Characters | Changhaz's Codeplay
Given a string, find the length of the longest substring T that contains at most k distinct characters.
For example, Given s = “eceba” and k = 2,
T is "ece" which its length is 3.
X. https://discuss.leetcode.com/topic/41671/15-lines-java-solution-using-slide-window
    public int lengthOfLongestSubstringKDistinct(String s, int k) {
        int[] count = new int[256];
        int num = 0, i = 0, res = 0;
        for (int j = 0; j < s.length(); j++) {
            if (count[s.charAt(j)]++ == 0) num++;
            if (num > k) {
                while (--count[s.charAt(i++)] > 0);
                num--;
            }
            res = Math.max(res, j - i + 1);
        }
        return res;
    }
A more generic solution as follows, can be solution for Unicode string:
public int lengthOfLongestSubstringKDistinct(String s, int k) {
    Map<Character, Integer> map = new HashMap<>();
    int left = 0;
    int best = 0;
    for(int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        map.put(c, map.getOrDefault(c, 0) + 1);
        while (map.size() > k) {
            char leftChar = s.charAt(left);
            if (map.containsKey(leftChar)) {
                map.put(leftChar, map.get(leftChar) - 1);                     
                if (map.get(leftChar) == 0) { 
                    map.remove(leftChar);
                }
            }
            left++;
        }
        best = Math.max(best, i - left + 1);
    }
    return best;
} 

http://www.danielbit.com/blog/puzzle/leetcode/leetcode-longest-substring-with-at-most-two-distinct-characters
最优的解法应该是维护一个sliding window,指针变量i指向sliding window的起始位置,j指向另个一个字符在sliding window的最后一个,用于定位i的下一个跳转位置。内部逻辑就是
1)如果当前字符跟前一个字符是一样的,直接继续。
2)如果不一样,则需要判断当前字符跟j是不是一样的
a)一样的话sliding window左边不变,右边继续增加,但是j的位置需要调整到k-1。
b)不一样的话,sliding window的左侧变为j的下一个字符(也就是去掉包含j指向的字符的区间),j的位置也需要调整到k-1。
在对i进行调整的时候(1.a),需要更新maxLen。
[注意事项]
1)在最后返回的时候,注意考虑s.length()-i这种情况,也就是字符串读取到最后而没有触发(1.a)
2)讲解清楚sliding window的更新
3)该题目有个follow-up,就是如果是k个distinct characters怎么办。这样的话就只能对所有可能的字符用一个数组去做counting,而且只能假设ASIC字符集256。Unicode太大了。
public int lengthOfLongestSubstringTwoDistinct(String s) {
        int i = 0, j = -1, maxLen = 0;
        for (int k = 1; k < s.length(); k++) {
            if (s.charAt(k) == s.charAt(k - 1)) continue;
            if (j >= 0 && s.charAt(j) != s.charAt(k)) {
                maxLen = Math.max(k - i, maxLen);
                i = j + 1; 
            }
            j = k - 1;  
        }
        return Math.max(s.length() - i, maxLen);
    }

另一种方法是,维护一个sliding window保证这个窗口中的字符串最多只有两种,维护start和end的指针,记录每个字符出现的次数,当第三个字符出现时,加入到map中,并且把map中记录的start位置上的字符串删掉,如果start位置上的字符串有多个,就循环将计数器减一,直到为零,然后删除start位置的字符串
Based on http://sidefaraway.blogspot.com/2014/12/leetcode-longest-substring-with-at-most.html
public static int findMaxKCharSubstring(String s, int k) {
Map<Character, Integer> charNum = new HashMap<Character, Integer>();
int start = 0;
int end = 0;
// int charType = 2; //can be k
int maxLen = 0;
while (end < s.length()) {
char cur = s.charAt(end);
// if this char is in substring already, then increase its number
if (charNum.containsKey(cur)) {
charNum.put(cur, charNum.get(cur) + 1);
} else {
charNum.put(cur, 1);
if (charNum.size() > k) {
// We need eliminate another char in substring to maintain
// the feasibility of the substring.
while (charNum.get(s.charAt(start)) > 0
&& charNum.size() > k) {
int newCount = charNum.get(s.charAt(start)) - 1;
if (newCount > 0) {
charNum.put(s.charAt(start), newCount);
} else {
charNum.remove(s.charAt(start));
}
start++;
}
}
}
if (maxLen < end - start + 1) {
maxLen = end - start + 1;
}
end++;
}
return maxLen;
}

Leetcode Longest Substring with At Most Two Distinct Characters
  1.     int lengthOfLongestSubstringTwoDistinct(string s) {  
  2.         int len = s.length();  
  3.         unordered_map<char,int> existingChars;  
  4.         int cnt = 0, longest = 0, st = 0;  
  5.         for(int ed = 0; ed < len; ed++){  
  6.             if(existingChars[s[ed]]==0){  
  7.                cnt++;  
  8.             }  
  9.             existingChars[s[ed]]++;  
  10.             for(;cnt>2;st++){  
  11.                 existingChars[s[st]]--;  
  12.                 if(existingChars[s[st]]==0){  
  13.                     cnt--;  
  14.                 }  
  15.             }  
  16.             longest = max(longest, ed-st+1);  
  17.         }  
  18.         return longest;  
  19.     } 

Related:
LeetCode] Longest Substring Without Repeating Characters 解题报告
Given a string, find the length of the longest substring without repeating characters. For example, the longest substring without repeating letters for "abcabcbb" is "abc", which the length is 3. For "bbbbb" the longest substring is "b", with the length of 1.
1:       int lengthOfLongestSubstring(string s) {  
2:            int count[26];  
3:            memset(count, -1, 26*sizeof(int));  
4:            int len=0, maxL = 0;  
5:            for(int i =0; i< s.size(); i++,len++)  
6:            {  
7:                 if(count[s[i]-'a']>=0)  
8:                 {  
9:                      maxL = max(len, maxL);  
10:                      len =0;          
11:                      i = count[s[i]-'a']+1;  
12:                      memset(count, -1, 26*sizeof(int));   
13:                 }  
14:                 count[s[i]-'a']=i;  
15:            }      
16:            return max(len, maxL);  
17:       }
https://changhaz.wordpress.com/2014/11/26/leetcode-longest-substring-with-at-most-two-distinct-characters/
Solution 1:
Keep a sliding window of valid substring with at most 2 distinct characters. Keep track of the index of the last appearance of each of the 2 characters within the window. Use -1 as the default value of the index.
ai: the smaller index of the two last appearance indices
bi: the lager index of the two last appearance indices
We always have ai smaller than bi.
For example, in the example eceba, for window ece, the last appearance index for character c is 1, the last appearance for character e is 2. So here ai equals 1 and bi equals 2;
https://wwwx.cs.unc.edu/~duozhao/entry/2014/12/543/
http://blog.csdn.net/whuwangyi/article/details/42451289
 public int lengthOfLongestSubstringTwoDistinct(String s) {
  int start = 0;
  int maxLen = 0;

  // Key: letter; value: the index of the last occurrence.
  Map<Character, Integer> map = new HashMap<Character, Integer>();
  int i;
  for (i = 0; i < s.length(); ++i) {
   char c = s.charAt(i);
   if (map.size() == 2 && !map.containsKey(c)) {
    // Pick the character with the leftmost last occurrence.
    char charEndsMostLeft = ' ';
    int minLast = s.length();
    for (char ch : map.keySet()) {
     int last = map.get(ch);
     if (last < minLast) {
      minLast = last;
      charEndsMostLeft = ch;
     }
    }

    map.remove(charEndsMostLeft);
    start = minLast + 1;
   }
   map.put(c, i);
   maxLen = Math.max(maxLen, i - start + 1);
  }
  return maxLen;
 }

Longest Substring with At Most K Distinct Characters
 public int lengthOfLongestSubstringKDistinct(String s, int k) {
  int start = 0;
  int maxLen = 0;

  // Key: letter; value: the number of occurrences.
  Map<Character, Integer> map = new HashMap<Character, Integer>();
  int i;
  for (i = 0; i < s.length(); ++i) {
   char c = s.charAt(i);
   if (map.containsKey(c)) {
    map.put(c, map.get(c) + 1);
   } else {
    map.put(c, 1);
    while (map.size() > k) {
     char startChar = s.charAt(start++);
     int count = map.get(startChar);
     if (count > 1) {
      map.put(startChar, count - 1);
     } else {
      map.remove(startChar);
     }
    }
   }

   maxLen = Math.max(maxLen, i - start + 1);
  }

  return maxLen;
 }
http://www.programcreek.com/2013/02/longest-substring-which-contains-2-unique-characters/
Time is O(n).
https://discuss.leetcode.com/topic/119/longest-subsequence-of-distinct-characters/2

Return length of the Longest subsequence of distinct characters from a stream of character.
The stream can be very long and saving the whole stream in memory is not an option.//following method is used to receive new char from the stream put(char c)

but the stream makes it a little complicated

I think with a hash-table and a pointer which records the current start point will apply to this problem.
using sliding (rolling) window and hash for each character.

X. follow up
https://discuss.leetcode.com/topic/48827/java-o-nlogk-using-treemap-to-keep-last-occurrence-interview-follow-up-question
 The interviewer may say that the string is given as a steam. In this situation, we can't maintain a "left pointer" as the classical O(n) hashmap solution.

some interviewer may require this solution as a followup. Instead of recording each char's count, we keep track of char's last occurrence. If you consider k as constant, it is also a O(n) algorithm.
inWindow keeps track of each char in window and its last occurrence position
lastOccurrence is used to find the char in window with left most last occurrence. A better idea is to use a PriorityQueue, as it takes O(1) to getMin, However Java's PQ does not support O(logn) update a internal node, it takes O(n). TreeMap takes O(logn) to do both getMin and update.
Every time when the window is full of k distinct chars, we lookup TreeMap to find the one with leftmost last occurrence and set left bound j to be 1 + first to exclude the char to allow new char coming into window.
        public int lengthOfLongestSubstringKDistinct(String str, int k) {
            if (str == null || str.isEmpty() || k == 0) {
                return 0;
            }
            TreeMap<Integer, Character> lastOccurrence = new TreeMap<>();
            Map<Character, Integer> inWindow = new HashMap<>();
            int j = 0;
            int max = 1;
            for (int i = 0; i < str.length(); i++) {
                char in = str.charAt(i);
                while (inWindow.size() == k && !inWindow.containsKey(in)) {
                    int first = lastOccurrence.firstKey();
                    char out = lastOccurrence.get(first);
                    inWindow.remove(out);
                    lastOccurrence.remove(first);
                    j = first + 1;
                }
                //update or add in's position in both maps
                if (inWindow.containsKey(in)) {
                    lastOccurrence.remove(inWindow.get(in));
                }
                inWindow.put(in, i);
                lastOccurrence.put(i, in);
                max = Math.max(max, i - j + 1);
            }
            return max;
        }

Read full article from (Leetcode) Longest Substring with At Most Two Distinct Characters | Changhaz's Codeplay

Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts