Massive Algorithms: Google – Minimum Word Set with Frequency Sum Greater than Target

Google – Minimum Word Set with Frequency Sum Greater than Target
给一篇文章，里面有很多单词，整篇文章可以全部装入内存，再给一个target number, 要求返回一个最小的set，使得set里那些单词的出现频率总和大于等于target number
[Solution]
#1 sort
#2 quick select
这道题quick select的思路的确很巧。但是仔细想一下这道题除了sort之外一时半会儿也想不到其他方法。而凡是能用sort做的题目，都多根经尝试一下quick select。
思路就是每次partition出来一个index, 计算[L, index]这个区间的sum和target的大小关系，来判断递归左边还是右边。区间sum在partition的过程中就可以算出来，所以其实就是一个quick select过程，没有其他额外开销。
[Note]
注意在getResult和getSum之间取index的不同，getSum取的区间应该是[L, index], 而getResult必须要从0开始，取[0, index]。
因为再递归过程中，如果当前sum < target, 递归右边的时候target会变成target – sum，就像kth smallest number那种做法一样，所以在getSum和target比较的时候，就是在当前[L, R]的期间里。
但是到了getResult, 情况就不一样了，必须得从0开始来对应最开始Input的target number.
[Code]
下面的code是quick select的算法，但是偷懒了getSum没有在partition的过程中算，这样的话时间复杂度可能和sort差不多。

// Quick Select, O(m), where m is the number of distinct words

class Solution2 {

  public Set<String> getWords(List<String> file, int target) {

    Set<String> result = new HashSet<>();

    if (file == null || file.isEmpty()) {

      return result;

}

    Map<String, Integer> cntMap = new HashMap<>();

    for (String word : file) {

      cntMap.put(word, cntMap.getOrDefault(word, 0) + 1);

}

    List<Map.Entry<String, Integer>> freq = new ArrayList<>(cntMap.entrySet());

    quickSelect(freq, 0, freq.size() - 1, target);

    return result;

}

  private void quickSelect(List<Map.Entry<String, Integer>> freq, int l, int r, int target, Set<String> result) {

    if (l > r) {

      return;

}

    if (l == r) {

      getResult(freq, 0, l, result);

      return;  

}

    int index = partition(freq, l, r);

    int sum = getSum(freq, l, index);

    if (sum == target) {

      getResult(freq, 0, index, result);

}

    else if (sum < target) {

      quickSelect(freq, index + 1, r, target - sum, result);

}

    else {

      quickSelect(freq, l, index, sum, result);

}

}

  private int partition(List<Map.Entry<String, Integer>> freq, int l, int r) {

    int pivot = freq.get(l).getValue();

    int left = l;

    int right = r + 1;

    while (true) {

      while (++left < r && freq.get(left).getValue() < pivot);

      while (--right > l && freq.get(right).getValue() > pivot);

      if (left < right) {

        swap(freq, left, right);

}

      else {

        break;

}

}

    swap(freq, l, rigth);

    return right;

}

  private int getSum(List<Map.Entry<String, Integer>> freq, int l, int r) {

    int sum = 0;

    for (int i = l; i <= r; i++) {

      sum += freq.get(i).getValue();

}

    return sum;

}

  private void getResult(List<Map.Entry<String, Integer>> freq, int l, int r, Set<String> result) {

    for (int i = l; i <= r; i++) {

      result.add(freq.get(i).getKey());

}

}

}

Read full article from Google – Minimum Word Set with Frequency Sum Greater than Target

Google – Minimum Word Set with Frequency Sum Greater than Target

Labels

Popular Posts