Minimum Sum of Manhattan Distance | tech::interview


http://blog.csdn.net/yuanhisn/article/details/46118191
In a city there are N persons. A person can walk only horizontally or vertically. Find a point that minimizes the sum of distances all persons walk to the point.

This is called the manhattan distance since a person can walk only horizontally or vertically, like in the city of Manhattan.

Let's assume this is 1-dimensional. Then for 2 persons, the point can be any point on the line connecting the 2 persons. For 3 persons , the point is the middle person. For 4 persons, the point can be any point between the middle 2 persons. For 5 persons, the point is the middle person. In general, for odd number of persons, it's the person in the middle; for even number of persons, it's any point between the middle 2 persons.

This can be extended to 2-dimensional. The answer is the point (x, y), where x and y are the median points taken independently of all the xi and yi for i = 1 to n. [1][2]

The cool thing about the Manhatan distance is that the distance itself comprises of two independent components: the distance on the x and y coordinate. Thus you can solve two simpler tasks and merge the results from them to obtain the desired results.
Minimum Sum of Manhattan Distance | tech::interview
已知平面上有N个点,找到其中某个点P,使得P到其余所有点的Manhattan距离之和最短并求出这个最短距离。
因为是曼哈顿距离,所以可以把x轴和y轴分离开来计算。
  1. 针对x坐标对所有点排序。
  2. 求一个数组,每个元素为对应点到其他所有点的在x轴上的距离和。
    这一步需要O(n)的算法,具体思路是,记录两个数组,leftright:
    left[i] = x[1] + x[2] + ... + x[i-1]
    right[i] = x[i+1] + x[i+2] ... + x[n]
    然后对每个点
    sum[i] = x[i] * i - left[i] + right[i] - (n - 1 - i) * x[i]
  3. 针对y坐标重复1和2的计算。
  4. 求得每个点在x和y上的1D曼哈顿距离和之后,可以遍历所有点,直接找出最小值。
http://www.jiuzhang.com/problem/30/ 
初阶:因为采用曼哈顿距离,所以可以分开考虑x坐标和y坐标。将k个点的x坐标排序,可以知道要求的x坐标一定在这k个点的x坐标上,扫描一遍并统计到各个点的x坐标距离和,找到使得距离和最小的x坐标。这一步只需要O(k)的时间复杂度,而不是O(k^2),怎么优化这里不多赘述。对y坐标做同样的操作,从而得到答案。时间复杂度O(klogk),排序的复杂度。


进阶:通过初阶的算法得到一个最优位置,如果这个位置与k个点重合,则从这个位置开始进行搜索,将这个点周围的点和对应的距离放入到一个堆里,每次从堆中取出最小距离的点,然后将这个点周围的点放入堆中,直到取出的点不与所给k个点重合。时间复杂度klogk,因为最多从堆中取出k+1个点即可找到一个不与所给k个点重合的点。堆每次操作为logk。


面试官角度:
本题的最优算法较难想到。所以如果公司要求不高,答出O(nm)的方法即可。O(nm)的方法是因为假设我们知道在(x,y)这个位置的距离和为S,那么当(x,y)移动到(x+1,y)和(x,y+1)的时候,我们可以在O(1)的时间更新S。方法是预处理每一行上方/下方有多少个k个点中的点,每一列左侧/右侧有多少个k个点中的点。上面的解答基于nm>>klogk,如果k比较大,则还是O(nm)的方法更好。答题时需要答出对于给定参数不同情况下采用不用算法这一点。

#堆#
#矩阵#
#排序#

map<pair<int,int>,int> mht_sum(const vector<Point>& pts, bool get_x) {
int n = (int)pts.size();
vector<int> left(n), right(n);
int sum = 0;
for(int i = 0; i < n; ++i) {
left[i] = sum;
sum += get_x ? pts[i].x : pts[i].y;
}
sum = 0;
for(int i = n - 1; i >= 0; --i) {
right[i] = sum;
sum += get_x ? pts[i].x : pts[i].y;
}
map<pair<int,int>,int> ret;
for(int i = 0; i < n; ++i) {
int p = get_x ? pts[i].x : pts[i].y;
ret[{pts[i].x,pts[i].y}] = p * i - left[i] + right[i] - (n-1-i) * p;
}
return ret;
}
int min_mht_dis(vector<Point> pts) {
sort(pts.begin(), pts.end(), [&](const Point& p0, const Point& p1){
return p0.x < p1.x;
});
auto x_sums = mht_sum(pts, true);
sort(pts.begin(), pts.end(), [&](const Point& p0, const Point& p1){
return p0.y < p1.y;
});
auto y_sums = mht_sum(pts, false);
int ret = INT_MAX;
for(int i = 0; i < pts.size(); ++i) {
ret = min(ret, x_sums[{pts[i].x,pts[i].y}] + y_sums[{pts[i].x,pts[i].y}]);
}
return ret;
}
http://www.haodaima.net/art/1630004
曼哈顿距离最小生成树

http://www.jiuzhang.com/problem/30/
初阶:因为采用曼哈顿距离,所以可以分开考虑x坐标和y坐标。将k个点的x坐标排序,可以知道要求的x坐标一定在这k个点的x坐标上,扫描一遍并统计到各个点的x坐标距离和,找到使得距离和最小的x坐标。这一步只需要O(k)的时间复杂度,而不是O(k^2),怎么优化这里不多赘述。对y坐标做同样的操作,从而得到答案。时间复杂度O(klogk),排序的复杂度。
进阶:通过初阶的算法得到一个最优位置,如果这个位置与k个点重合,则从这个位置开始进行搜索,将这个点周围的点和对应的距离放入到一个堆里,每次从堆中取出最小距离的点,然后将这个点周围的点放入堆中,直到取出的点不与所给k个点重合。时间复杂度klogk,因为最多从堆中取出k+1个点即可找到一个不与所给k个点重合的点。堆每次操作为logk。

O(nm)的方法是因为假设我们知道在(x,y)这个位置的距离和为S,那么当(x,y)移动到(x+1,y)和(x,y+1)的时候,我们可以在O(1)的时间更新S。方法是预处理每一行上方/下方有多少个k个点中的点,每一列左侧/右侧有多少个k个点中的点。上面的解答基于nm>>klogk,如果k比较大,则还是O(nm)的方法更好。答题时需要答出对于给定参数不同情况下采用不用算法这一点。
http://flexaired.blogspot.com/2013/02/minimum-manhattan-distance.html

http://www.mitbbs.com/article_t/Programming/31346575.html
进阶:如果要求这个点与所给的k个点不重合,该怎么办?

II. Geometric median.
The geometric median of a discrete set of sample points in a Euclidean space is the point minimizing the sum of distances to the sample points [3]. This no longer requires the path be horizontal or vertical.

Despite the simple form, the solution is much more complex than the similar problem of finding the center of mass, which minimizes the sum of the squares of distances of the points to the center. There is no simple formula for the solution.

For 2 points, it's any point on the line connecting the 2 points.

For 3 non-collinear points, the problem is known as Fermat's problem. Solution is in [3]. For 4 co-planar points, the solution is also in [3].

For more points, the solution can be approximated by numerical methods such as the Weiszfeld's algorithm.

References:
[1] Shortest distance travel - common meeting point
[2] Algorithm to find point of minimum total distance from locations
[3] Geometric median
Related: [Google] Shortest Manhattan Distance - Shuatiblog.com
Read full article from Minimum Sum of Manhattan Distance | tech::interview

Labels

LeetCode (1432) GeeksforGeeks (1122) LeetCode - Review (1067) Review (882) Algorithm (668) to-do (609) Classic Algorithm (270) Google Interview (237) Classic Interview (222) Dynamic Programming (220) DP (186) Bit Algorithms (145) POJ (141) Math (137) Tree (132) LeetCode - Phone (129) EPI (122) Cracking Coding Interview (119) DFS (115) Difficult Algorithm (115) Lintcode (115) Different Solutions (110) Smart Algorithm (104) Binary Search (96) BFS (91) HackerRank (90) Binary Tree (86) Hard (79) Two Pointers (78) Stack (76) Company-Facebook (75) BST (72) Graph Algorithm (72) Time Complexity (69) Greedy Algorithm (68) Interval (63) Company - Google (62) Geometry Algorithm (61) Interview Corner (61) LeetCode - Extended (61) Union-Find (60) Trie (58) Advanced Data Structure (56) List (56) Priority Queue (53) Codility (52) ComProGuide (50) LeetCode Hard (50) Matrix (50) Bisection (48) Segment Tree (48) Sliding Window (48) USACO (46) Space Optimization (45) Company-Airbnb (41) Greedy (41) Mathematical Algorithm (41) Tree - Post-Order (41) ACM-ICPC (40) Algorithm Interview (40) Data Structure Design (40) Graph (40) Backtracking (39) Data Structure (39) Jobdu (39) Random (39) Codeforces (38) Knapsack (38) LeetCode - DP (38) Recursive Algorithm (38) String Algorithm (38) TopCoder (38) Sort (37) Introduction to Algorithms (36) Pre-Sort (36) Beauty of Programming (35) Must Known (34) Binary Search Tree (33) Follow Up (33) prismoskills (33) Palindrome (32) Permutation (31) Array (30) Google Code Jam (30) HDU (30) Array O(N) (29) Logic Thinking (29) Monotonic Stack (29) Puzzles (29) Code - Detail (27) Company-Zenefits (27) Microsoft 100 - July (27) Queue (27) Binary Indexed Trees (26) TreeMap (26) to-do-must (26) 1point3acres (25) GeeksQuiz (25) Merge Sort (25) Reverse Thinking (25) hihocoder (25) Company - LinkedIn (24) Hash (24) High Frequency (24) Summary (24) Divide and Conquer (23) Proof (23) Game Theory (22) Topological Sort (22) Lintcode - Review (21) Tree - Modification (21) Algorithm Game (20) CareerCup (20) Company - Twitter (20) DFS + Review (20) DP - Relation (20) Brain Teaser (19) DP - Tree (19) Left and Right Array (19) O(N) (19) Sweep Line (19) UVA (19) DP - Bit Masking (18) LeetCode - Thinking (18) KMP (17) LeetCode - TODO (17) Probabilities (17) Simulation (17) String Search (17) Codercareer (16) Company-Uber (16) Iterator (16) Number (16) O(1) Space (16) Shortest Path (16) itint5 (16) DFS+Cache (15) Dijkstra (15) Euclidean GCD (15) Heap (15) LeetCode - Hard (15) Majority (15) Number Theory (15) Rolling Hash (15) Tree Traversal (15) Brute Force (14) Bucket Sort (14) DP - Knapsack (14) DP - Probability (14) Difficult (14) Fast Power Algorithm (14) Pattern (14) Prefix Sum (14) TreeSet (14) Algorithm Videos (13) Amazon Interview (13) Basic Algorithm (13) Codechef (13) Combination (13) Computational Geometry (13) DP - Digit (13) LCA (13) LeetCode - DFS (13) Linked List (13) Long Increasing Sequence(LIS) (13) Math-Divisible (13) Reservoir Sampling (13) mitbbs (13) Algorithm - How To (12) Company - Microsoft (12) DP - Interval (12) DP - Multiple Relation (12) DP - Relation Optimization (12) LeetCode - Classic (12) Level Order Traversal (12) Prime (12) Pruning (12) Reconstruct Tree (12) Thinking (12) X Sum (12) AOJ (11) Bit Mask (11) Company-Snapchat (11) DP - Space Optimization (11) Dequeue (11) Graph DFS (11) MinMax (11) Miscs (11) Princeton (11) Quick Sort (11) Stack - Tree (11) 尺取法 (11) 挑战程序设计竞赛 (11) Coin Change (10) DFS+Backtracking (10) Facebook Hacker Cup (10) Fast Slow Pointers (10) HackerRank Easy (10) Interval Tree (10) Limited Range (10) Matrix - Traverse (10) Monotone Queue (10) SPOJ (10) Starting Point (10) States (10) Stock (10) Theory (10) Tutorialhorizon (10) Kadane - Extended (9) Mathblog (9) Max-Min Flow (9) Maze (9) Median (9) O(32N) (9) Quick Select (9) Stack Overflow (9) System Design (9) Tree - Conversion (9) Use XOR (9) Book Notes (8) Company-Amazon (8) DFS+BFS (8) DP - States (8) Expression (8) Longest Common Subsequence(LCS) (8) One Pass (8) Quadtrees (8) Traversal Once (8) Trie - Suffix (8) 穷竭搜索 (8) Algorithm Problem List (7) All Sub (7) Catalan Number (7) Cycle (7) DP - Cases (7) Facebook Interview (7) Fibonacci Numbers (7) Flood fill (7) Game Nim (7) Graph BFS (7) HackerRank Difficult (7) Hackerearth (7) Inversion (7) Kadane’s Algorithm (7) Manacher (7) Morris Traversal (7) Multiple Data Structures (7) Normalized Key (7) O(XN) (7) Radix Sort (7) Recursion (7) Sampling (7) Suffix Array (7) Tech-Queries (7) Tree - Serialization (7) Tree DP (7) Trie - Bit (7) 蓝桥杯 (7) Algorithm - Brain Teaser (6) BFS - Priority Queue (6) BFS - Unusual (6) Classic Data Structure Impl (6) DP - 2D (6) DP - Monotone Queue (6) DP - Unusual (6) DP-Space Optimization (6) Dutch Flag (6) How To (6) Interviewstreet (6) Knapsack - MultiplePack (6) Local MinMax (6) MST (6) Minimum Spanning Tree (6) Number - Reach (6) Parentheses (6) Pre-Sum (6) Probability (6) Programming Pearls (6) Rabin-Karp (6) Reverse (6) Scan from right (6) Schedule (6) Stream (6) Subset Sum (6) TSP (6) Xpost (6) n00tc0d3r (6) reddit (6) AI (5) Abbreviation (5) Anagram (5) Art Of Programming-July (5) Assumption (5) Bellman Ford (5) Big Data (5) Code - Solid (5) Code Kata (5) Codility-lessons (5) Coding (5) Company - WMware (5) Convex Hull (5) Crazyforcode (5) DFS - Multiple (5) DFS+DP (5) DP - Multi-Dimension (5) DP-Multiple Relation (5) Eulerian Cycle (5) Graph - Unusual (5) Graph Cycle (5) Hash Strategy (5) Immutability (5) Java (5) LogN (5) Manhattan Distance (5) Matrix Chain Multiplication (5) N Queens (5) Pre-Sort: Index (5) Quick Partition (5) Quora (5) Randomized Algorithms (5) Resources (5) Robot (5) SPFA(Shortest Path Faster Algorithm) (5) Shuffle (5) Sieve of Eratosthenes (5) Strongly Connected Components (5) Subarray Sum (5) Sudoku (5) Suffix Tree (5) Swap (5) Threaded (5) Tree - Creation (5) Warshall Floyd (5) Word Search (5) jiuzhang (5)

Popular Posts