## Sunday, July 3, 2016

### Find Duplicates

https://github.com/niannianli/ScalabilityMemoryLimits/blob/master/src/scalabilitymemory/DuplicateElements4.java
/**
* 10.4 You have an array with all the numbers from 7 to N, where N is at most
* 32,000. The array may have duplicate entries and you do not know what N is.
* With only 4 kilo- bytes of memory available, how would you print all
* duplicate elements in the array?
*/

/*
* we have 4 kilobytes of memory which means we can address up to 8*4*2^10 bits.
* Note that 32*2^10 bits in greater than 32000. We can create a bit vector with
* 32000 bits, where each bit represents one integer.
*/

/*
* using this bit vector, we can then iterate through the array, flagging each
* element v by setting bit v to 1. When we come across a duplicate element, we
* print it.
*/
public class DuplicateElements4 {
public static void checkDuplicates(int[] array) {
BitSet bs = new BitSet(32000);
for (int i = 0; i < array.length; i++) {
int num = array[i];
int num0 = num - 1;// bitset starts at 0, numbers
// start at 1
if (bs.get(num0)) {
System.out.println(num);

} else {
bs.set(num0);
}
}
}

}

/*
* Note that while this isn't an especially difficult problem, it's important to
* implement this cleanly. This is why we defined our own bit vector class to
* hold a large bit vector. If our interviewer lets us (she may or many not), we
* could have of course used Java's built in BitSet class.
*/

class BitSet {
int[] bitset;
public BitSet(int size) {
bitset = new int[(size >> 5) + 1); // divide by 32
}
boolean get(int pos) {
int wordNumber = (pos >> 5); // divide by 32
int bitNumber = (pos & 0x1F); // mod 32
return (bitset[wordNumber] & (1 << bitNumber)) != 0;
}
void set(int pos) {
int wordNumber = (pos >> 5); // divide by 32
int bitNumber = (pos & 0x1F); // mod 32
bitset[wordNumber] I= 1 << bitNumber;
}
}