## Wednesday, November 23, 2016

### LeetCode 460 - LFU Cache

LeetCode 146 - LRU Cache
https://leetcode.com/problems/lfu-cache/
Design and implement a data structure for Least Frequently Used (LFU) cache. It should support the following operations: `get`and `set`.
`get(key)` - Get the value (will always be positive) of the key if the key exists in the cache, otherwise return -1.
`set(key, value)` - Set or insert the value if the key is not already present. When the cache reaches its capacity, it should invalidate the least frequently used item before inserting a new item. For the purpose of this problem, when there is a tie (i.e., two or more keys that have the same frequency), the least recently used key would be evicted.
Could you do both operations in O(1) time complexity?
Example:
```LFUCache cache = new LFUCache( 2 /* capacity */ );

cache.set(1, 1);
cache.set(2, 2);
cache.get(1);       // returns 1
cache.set(3, 3);    // evicts key 2
cache.get(3);       // returns 3.
cache.set(4, 4);    // evicts key 1.
cache.get(3);       // returns 3
cache.get(4);       // returns 4
```

```KeyNode中保存key（键），value（值），freq（频度），prev（前驱），next（后继）

FreqNode中保存freq（频度）、prev（前驱）、next（后继）、first（指向最新的KeyNode），last（指向最老的KeyNode）```

```capacity：缓存的容量

keyDict：从key到KeyNode的映射

freqDict：从freq到FreqNode的映射

```head --- FreqNode1 ---- FreqNode2 ---- ... ---- FreqNodeN
|               |                       |
first           first                   first
|               |                       |
KeyNodeA        KeyNodeE                KeyNodeG
|               |                       |
KeyNodeB        KeyNodeF                KeyNodeH
|               |                       |
KeyNodeC         last                   KeyNodeI
|                                       |
KeyNodeD                                 last
|
last```
LFUCache操作实现如下：
set(key, value)：
```如果capacity为0，忽略当前操作，结束

get(key)：
```若keyDict中包含key，则更新节点频度，返回对应的value

```从keyDict中找到对应的KeyNode，然后通过KeyNode的freq值，从freqDict找到对应的FreqNode

class KeyNode(object): def __init__(self, key, value, freq = 1): self.key = key self.value = value self.freq = freq self.prev = self.next = None class FreqNode(object): def __init__(self, freq, prev, next): self.freq = freq self.prev = prev self.next = next self.first = self.last = None class LFUCache(object): def __init__(self, capacity): """ :type capacity: int """ self.capacity = capacity self.keyDict = dict() self.freqDict = dict() self.head = None def get(self, key): """ :type key: int :rtype: int """ if key in self.keyDict: keyNode = self.keyDict[key] value = keyNode.value self.inc(key, value) return value return -1 def set(self, key, value): """ :type key: int :type value: int :rtype: void """ if self.capacity == 0: return if key not in self.keyDict and len(self.keyDict) == self.capacity: self.removeKeyNode(self.head.last) self.inc(key, value) def inc(self, key, value): """ Inserts a new KeyNode<key, value> with freq 1. Or increments the freq of an existing KeyNode<key, value> by 1. :type key: str :rtype: void """ if key in self.keyDict: keyNode = self.keyDict[key] keyNode.value = value freqNode = self.freqDict[keyNode.freq] nextFreqNode = freqNode.next keyNode.freq += 1 if nextFreqNode is None or nextFreqNode.freq > keyNode.freq: nextFreqNode = self.insertFreqNodeAfter(keyNode.freq, freqNode) self.unlinkKey(keyNode, freqNode) self.linkKey(keyNode, nextFreqNode) else: keyNode = self.keyDict[key] = KeyNode(key, value) freqNode = self.freqDict.get(1) if freqNode is None: freqNode = self.freqDict[1] = FreqNode(1, None, self.head) if self.head: self.head.prev = freqNode self.head = freqNode self.linkKey(keyNode, freqNode) def delFreqNode(self, freqNode): """ Delete freqNode. :rtype: void """ prev, next = freqNode.prev, freqNode.next if prev: prev.next = next if next: next.prev = prev if self.head == freqNode: self.head = next del self.freqDict[freqNode.freq] def insertFreqNodeAfter(self, freq, node): """ Insert a new FreqNode(freq) after node. :rtype: FreqNode """ newNode = FreqNode(freq, node, node.next) self.freqDict[freq] = newNode if node.next: node.next.prev = newNode node.next = newNode return newNode def removeKeyNode(self, keyNode): """ Remove keyNode :rtype: void """ self.unlinkKey(keyNode, self.freqDict[keyNode.freq]) del self.keyDict[keyNode.key] def unlinkKey(self, keyNode, freqNode): """ Unlink keyNode from freqNode :rtype: void """ next, prev = keyNode.next, keyNode.prev if prev: prev.next = next if next: next.prev = prev if freqNode.first == keyNode: freqNode.first = next if freqNode.last == keyNode: freqNode.last = prev if freqNode.first is None: self.delFreqNode(freqNode) def linkKey(self, keyNode, freqNode): """ Link keyNode to freqNode :rtype: void """ firstKeyNode = freqNode.first keyNode.prev = None keyNode.next = firstKeyNode if firstKeyNode: firstKeyNode.prev = keyNode freqNode.first = keyNode if freqNode.last is None: freqNode.last = keyNode # Your LFUCache object will be instantiated and called as such: # obj = LFUCache(capacity) # param_1 = obj.get(key) # obj.set(key,value)

Two HashMaps are used, one to store <key, value> pair, another store the <key, node>.
I use double linked list to keep the frequent of each key. In each double linked list node, keys with the same count are saved using java built in LinkedHashSet. This can keep the order.
Every time, one key is referenced, first find the current node corresponding to the key, If the following node exist and the frequent is larger by one, add key to the keys of the following node, else create a new node and add it following the current node.
All operations are guaranteed to be O(1).

``````public class LFUCache {
private int cap = 0;
private HashMap<Integer, Integer> valueHash = null;
private HashMap<Integer, Node> nodeHash = null;

public LFUCache(int capacity) {
this.cap = capacity;
valueHash = new HashMap<Integer, Integer>();
nodeHash = new HashMap<Integer, Node>();
}

public int get(int key) {
if (valueHash.containsKey(key)) {
increaseCount(key);
return valueHash.get(key);
}
return -1;
}

public void set(int key, int value) {
if ( cap == 0 ) return;
if (valueHash.containsKey(key)) {
valueHash.put(key, value);
} else {
if (valueHash.size() < cap) {
valueHash.put(key, value);
} else {
removeOld();
valueHash.put(key, value);
}
}
increaseCount(key);
}

} else if (head.count > 0) {
Node node = new Node(0);
} else {
}
}

private void increaseCount(int key) {
Node node = nodeHash.get(key);
node.keys.remove(key);

if (node.next == null) {
node.next = new Node(node.count+1);
node.next.prev = node;
} else if (node.next.count == node.count+1) {
} else {
Node tmp = new Node(node.count+1);
tmp.prev = node;
tmp.next = node.next;
node.next.prev = tmp;
node.next = tmp;
}

nodeHash.put(key, node.next);
if (node.keys.size() == 0) remove(node);
}

private void removeOld() {
int old = 0;
old = n;
break;
}
nodeHash.remove(old);
valueHash.remove(old);
}

private void remove(Node node) {
if (node.prev == null) {
} else {
node.prev.next = node.next;
}
if (node.next != null) {
node.next.prev = node.prev;
}
}

class Node {
public int count = 0;
public Node prev = null, next = null;

public Node(int count) {
this.count = count;
prev = next = null;
}
}
}``````
X. using 3 HashMaps and LinkedHashSet
``````public class LFUCache {
HashMap<Integer, Integer> vals;
HashMap<Integer, Integer> counts;
int cap;
int min = -1;
public LFUCache(int capacity) {
cap = capacity;
vals = new HashMap<>();
counts = new HashMap<>();
lists = new HashMap<>();
}

public int get(int key) {
if(!vals.containsKey(key))
return -1;
int count = counts.get(key);
counts.put(key, count+1);
lists.get(count).remove(key);
if(count==min && lists.get(count).size()==0)
min++;
if(!lists.containsKey(count+1))
return vals.get(key);
}

public void set(int key, int value) {
if(cap<=0)
return;
if(vals.containsKey(key)) {
vals.put(key, value);
get(key);
return;
}
if(vals.size() >= cap) {
int evit = lists.get(min).iterator().next();
lists.get(min).remove(evit);
vals.remove(evit);
}
vals.put(key, value);
counts.put(key, 1);
min = 1;
}
}``````
https://github.com/laurentluce/lfu-cache/blob/master/lfucache/lfu_cache.py
http://www.laurentluce.com/posts/least-frequently-used-cache-eviction-scheme-with-complexity-o1-in-python/

X. Using TreeSet
https://discuss.leetcode.com/topic/69033/java-solution-using-priorityqueue-with-detailed-explanation/2
We need to implement `get()` and `set()` in average `O(logn)` time, or we will get TLE.
Obviously, we need a hashmap to remember `key-value` pair.
What we need to do, is to remember `(frequency, recentness)` for each key; and sort them to get the smallest one.
So, we need to use `Collection` such as `TreeSet` or `PriorityQueue`.
Now, the only question is, how to update?
It is difficult to update `(frequency, recentness)` in the collection, as we don't know the index.
(Maybe using `binary search` or `hashmap` can do this, I haven't tried it.)
The trick is, just override equals() and hashCode() function, in order to use `remove`.
``````    class Cache implements Comparable<Cache> {
int key, f, r;
public Cache(int k, int f, int r) {key=k;this.f=f;this.r=r;}
public boolean equals(Object object) {return key==((Cache) object).key;}
public int hashCode() {return key;}
public int compareTo(Cache o) {return key==o.key?0:f==o.f?r-o.r:f-o.f;}
}

int capacity,id;
HashMap<Integer, Integer> hashMap;
HashMap<Integer, Cache> caches;
TreeSet<Cache> treeSet;

public LFUCache(int capacity) {
this.capacity=capacity;
id=0;
hashMap=new HashMap<>();
caches=new HashMap<>();
treeSet=new TreeSet<>();
}

public int get(int key) {
id++;
if (hashMap.containsKey(key)) {
update(key);
return hashMap.get(key);
}
return -1;
}

public void set(int key, int value) {
if (capacity==0) return;
id++;
if (hashMap.containsKey(key)) {
update(key);
hashMap.put(key, value);
return;
}
if (hashMap.size()==capacity) {
Cache first=treeSet.pollFirst();
hashMap.remove(first.key);
caches.remove(first.key);
}
hashMap.put(key, value);
Cache cache=new Cache(key, 1, id);
caches.put(key, cache);
}

private void update(int key) {
int f=caches.get(key).f;
treeSet.remove(caches.get(key));
Cache cache=new Cache(key, f+1, id);
caches.put(key, cache);
}``````
1. Map to store key-value pairs
2. Map to store counts/frequency of access
How do we implement evict method? When size of map reaches max capacity, we need to find item with lowest frequency count. There are 2 problems:
1. We have to iterate through all values of frequencies map, find lowest count and remove corresponding key from both maps. This will take O(n) time.
2. Also, what if there are multiple keys with same frequency count? How do we find least recently used? That’s not possible because HashMap does not store the order of insertion.
To solve both of above problems we need to add one more data structure: Sorted map with frequency as map-keys and ‘list of item-keys with same frequency’ as map-values.
1. We can add new item can to the end of the list with frequency 1.
2. We can find the list with lowest frequency in O(1), since map is sorted by frequencies.
3. We can delete the first item of the list (of lowest frequency) since that will be least recently used. Also O(1).
While solving our delete problem, we accidentally increased our access operation time to O(n). How? Note that all of item-keys sharing same frequency are in a list. Now if one of these items is accessed, how do we move it to list of next frequency? We will have to iterate through the list first to find the item, which in worst-case will take O(n) operations.
To solve the problem, we somehow need to jump directly to that item in the list without iteration. If we can do that, it will be easier to delete the item and add it to end of next frequency list.
Unfortunately, this is not possible using our in-built data structures. We need to create a new one (mentioned in the paper).
We need to store each item’s position.
1. We will create a simple class which stores item’s key, value and its position in the list.
2. We will convert the linked list to
We need to store each item’s position.
1. We will create a simple class which stores item’s key, value and its position in the list.
2. We will convert the linked list to