## Monday, May 16, 2016

### Microsoft Interview Questions: Number of Employees in the factory | Yaozong's Blog

Microsoft Interview Questions: Number of Employees in the factory | Yaozong's Blog
You have a log file containing the entering and leaving time of each employee in the factory. Given a query time (a floating number), write a function to count how many employees in the factory.
File Format:
001 0.14 0.15
002 0.32 0.60
003 0.25 0.80

Question 1: Assume that the log file is too large to fit into memory, write a function with the following signature to return the count.

olution 1: Binary Search
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 struct Data {   vector sorted_starts;   vector sorted_ends; };   void Preprocess(vector>& arr, Data& data) {   for(int i = 0; i < arr.size(); ++i) {     data.sorted_starts.push_back(arr.first);     data.sorted_ends.push_back(arr.second);   }      sort(data.sorted_starts.begin(), data.sorted_starts.end());   sort(data.sorted_ends.begin(), data.sorted_ends.end()); }   int Query(Data& data, float query_time) {   auto it1 = lower_bound(data.sorted_starts.begin(), data.sorted_starts.end(), query_time);   auto it2 = lower_bound(data.sorted_ends.begin(), data.sorted_ends.end(), query_time);   int idx1 = it1 - data.sorted_starts.begin();   int idx2 = it2 - data.sorted_ends.begin();   return idx1 - idx2; }
Solution 2: Store the curve (e.g., the skyline problem)
olution 2: Storing curve
 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 struct Data {   vector times;   vector counts; };   void Preprocess(vector>& arr, Data& data) {   map times;   for(int i = 0; i < arr.size(); ++i) {     times[arr[i].first] ++;     times[arr[i].second] --;   }      int cur = 0;   for(auto elem : times) {     data.times.push_back(elem.first);     cur += elem.second;     data.counts.push_back(cur);   } }   int Query(Data& data, float query_time) {   int idx = lower_bound(data.times.begin(), data.times.end(), query_time);   if(idx == 0)       return 0;   else       return data.counts[idx-1]; }
Followup: what should you do if we need to improve the time complexity of Query function to O(1).
Answer: 1) for all floating numbers (2^32 number), pre-compute their results and save them in the lookup table.
2) quantize the curve into different bins and store the histogram for lookup. It only approximates the solution but significantly saves the space compared to 1).