Massive Algorithms: Greedy

Greedy - Summary

http://www.cs.cornell.edu/courses/cs482/2004su/handouts/greedy_ahead.pdf
http://www.cs.cornell.edu/courses/cs482/2003su/handouts/greedy_exchange.pdf
https://rafael.do/blog/introduction-greedy-algorithms/

Greedy algorithms build a solution step by step, making the choice that looks best at each step and never revisiting it.

Basic concepts

An optimal solution is a feasible solution with the largest (or smallest) objective function value.
A local optimum can be obtained by finding the optimal solution within a neighboring set of candidate solutions.
A global optimum can be obtained by finding the optimal solution among all possible solutions.

Problem characteristics

Greedy choice property: a global optimum can be obtained by the selection of a local optimum.
Optimal substructure: a global optimum can be obtained by combining optimal solutions of its subproblems.

General strategies

“GREEDY STAYS AHEAD” ARGUMENTS

A “greedy stays ahead” argument shows that after each step, the partial solution built by the greedy algorithm is at least as good as the corresponding part of any optimal solution.

Define your solutions. Define what object will represent the greedy algorithm's solution and what object will represent the optimal solution it is compared against.
Find a measure. Find a measure by which the greedy algorithm's partial solutions can be compared, step by step, with those of the optimal solution.
Prove greedy stays ahead. Inductively show that after every step, the greedy partial solution is at least as good as the optimal solution's under that measure.
Mathematical induction: a means of proving a theorem by showing that if it is true of any particular case, it is true of the next case in a series, and then showing that it is indeed true in one particular case.
Prove optimality. By contradiction, prove that since the greedy algorithm stays ahead under the measure at every step, it must produce an optimal solution.
Mathematical proof by contradiction: assume that a statement is not true and then show that this assumption leads to a contradiction. In the case of trying to prove $P \Rightarrow Q$, this is equivalent to assuming that $P \land \lnot Q$, that is, assuming that $P$ is true and $Q$ is false.

EXCHANGE ARGUMENTS

The greedy exchange strategy proves the correctness of a greedy algorithm by iteratively transforming an optimal solution into the greedy solution without worsening its quality.

Define your solutions. Define an object A that represents the greedy algorithm's solution and an object O that represents an optimal solution.
Compare solutions. Show that if the greedy solution is not the same as the optimal solution, either:
There is an element in O that is not in A.
There are two elements of O that are in a different order in A.
Exchange pieces. Show how to modify O, by swapping in the missing element or reordering the two elements, so that it becomes more like A without worsening its quality.
Iterate. Each exchange decreases the number of differences between A and O, so repeating it turns O into A without worsening the quality of the solution. Inductively, A must be optimal.

https://www.hackerrank.com/challenges/pairs/topics

A greedy algorithm is an algorithm that follows the problem-solving heuristic of making the locally optimal choice at each stage with the hope of finding a global optimum.

A very common problem which gives good insight into this is the Job Scheduling Problem.

You have $n$ jobs numbered $1, 2, \ldots, n$, and you have the start time $s_i$ and the end time $e_i$ for the $i$-th job. Which jobs should you choose such that you get the maximal set of non-overlapping jobs?

The correct solution to this problem is to sort all the jobs on the basis of their end times and then keep on selecting the job with the minimal index in this list which does not overlap with currently selected jobs.
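The selection rule above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function name `max_non_overlapping`, the `(start, end)` tuple representation, and the sample jobs are all assumptions made for the example, as is the convention that a job may start at the exact moment the previous one ends.

```python
from typing import List, Tuple


def max_non_overlapping(jobs: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Return a largest set of pairwise non-overlapping (start, end) jobs."""
    selected = []
    last_end = float("-inf")
    # Consider the jobs in order of increasing end time.
    for start, end in sorted(jobs, key=lambda job: job[1]):
        # Assumption: touching jobs (start == previous end) do not overlap.
        if start >= last_end:
            selected.append((start, end))
            last_end = end
    return selected


# Hypothetical sample data.
jobs = [(1, 4), (3, 5), (0, 6), (5, 7), (3, 9), (5, 9), (6, 10), (8, 11)]
print(max_non_overlapping(jobs))  # [(1, 4), (5, 7), (8, 11)]
```

Because the jobs are sorted once and each is examined once, the running time is dominated by the sort.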

Sounds intuitive, but why does it work?

Well, since each job has equal weight, selecting the one which makes way for newer jobs sooner is optimal. Although a more formal argument can be made for a rigorous proof, the intuition behind it is similar.
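One way to make the argument rigorous is a "greedy stays ahead" proof. The sketch below uses notation introduced only for this illustration: $g_j$ and $o_j$ for the $j$-th job of the greedy and of an optimal solution, and $s(x)$, $f(x)$ for the start and end time of a job $x$. It assumes that a job may start exactly when another one ends.

```latex
Let $g_1, \dots, g_k$ be the jobs chosen by the greedy rule and
$o_1, \dots, o_m$ the jobs of an optimal solution, both listed in order of
end time. Since the optimal solution is at least as large, $m \ge k$.

\textbf{Claim.} $f(g_j) \le f(o_j)$ for every $j \le k$.

\textbf{Base case.} $g_1$ has the smallest end time of all jobs, so
$f(g_1) \le f(o_1)$.

\textbf{Inductive step.} Assume $f(g_{j-1}) \le f(o_{j-1})$. Because $o_j$
does not overlap $o_{j-1}$,
\[
  s(o_j) \ge f(o_{j-1}) \ge f(g_{j-1}),
\]
so $o_j$ was still available when the greedy rule chose $g_j$. The greedy
rule takes the available job with the smallest end time, hence
$f(g_j) \le f(o_j)$.

\textbf{Optimality.} Suppose $m > k$. Then
$s(o_{k+1}) \ge f(o_k) \ge f(g_k)$, so $o_{k+1}$ overlaps none of the greedy
jobs and the greedy rule would have selected one more job, a contradiction.
Therefore $m = k$ and the greedy solution is optimal.
```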

Now, let's consider another problem. You again have $n$ jobs. Each job can be completed at any time; the $i$-th job takes $T_i$ time to complete and has a point value of $P_i$. But with each second, the point value of the $i$-th job decreases by $D_i$. If you have to complete all the jobs, what is the maximum points that you can get?

The problem basically asks us for an order of accomplishing the jobs.

Here, doing the job with higher $D_i$ first makes sense. At the same time, doing the job with lower $T_i$ also sounds good. So how do we decide?

Assume you have just $2$ jobs which, without loss of generality, can be numbered as $1$ and $2$.

Now, if you do job $1$ before job $2$, your net score is:

$$S_{12} = (P_1 - D_1 T_1) + (P_2 - D_2 (T_1 + T_2))$$

Otherwise, it is:

$$S_{21} = (P_2 - D_2 T_2) + (P_1 - D_1 (T_1 + T_2))$$

If $S_{12} \ge S_{21}$, then:

$$D_1 T_2 \ge D_2 T_1$$

In other words, it is optimal to do job $1$ before job $2$ iff

$$\frac{D_1}{T_1} \ge \frac{D_2}{T_2}$$

Notice that this argument can be applied to $n$ jobs as a sorting rule. The job with maximum $D_i / T_i$ value should be done first and so on.

This gives us the optimal ordering and is also in line with our intuition.
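The sorting rule can be checked on a small instance. The sketch below is an illustration under stated assumptions: each job is a hypothetical `(time, points, decay)` tuple (the time it takes, its initial point value, and its per-second decrease), times are positive integers, a job's value keeps decreasing until the job is finished, and values are allowed to go negative. The sample data is made up, and the brute-force search over all orderings is only there to confirm the greedy result.

```python
from fractions import Fraction
from itertools import permutations
from typing import List, Sequence, Tuple

Job = Tuple[int, int, int]  # (time, points, decay): hypothetical field names


def total_score(order: Sequence[Job]) -> int:
    """Score obtained by doing the jobs back to back in the given order."""
    elapsed = 0
    score = 0
    for time, points, decay in order:
        elapsed += time                   # moment at which this job finishes
        score += points - decay * elapsed
    return score


def greedy_order(jobs: List[Job]) -> List[Job]:
    """Sort by decay/time ratio, largest first (exact ratios via Fraction)."""
    return sorted(jobs, key=lambda job: Fraction(job[2], job[0]), reverse=True)


# Hypothetical sample data.
jobs = [(3, 50, 2), (1, 20, 5), (4, 80, 1), (2, 30, 3)]
best = max(total_score(order) for order in permutations(jobs))
print(total_score(greedy_order(jobs)), best)  # 144 144
```

Using `Fraction` rather than floating-point division keeps the comparison of ratios exact, which matters when two ratios are very close.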

Greedy doesn't always work

Greedy solutions are attractive whenever they exist, because they are usually easy to implement and fast to run. However, greedy algorithms don't always work! By this, we don't mean that the greedy algorithm fails to return the correct answer on all inputs. Instead, we mean that the algorithm fails on at least one input.

For example, consider the following problem: You again have $n$ jobs, and the $i$-th job takes $T_i$ time to complete and has a point value of $P_i$. This time, the point values do not decrease over time, and you don't have to finish all jobs. Unfortunately you only have a total of $T$ time to spend. What is the maximum points you can get?

One greedy algorithm that comes to mind is the following: while there is still time remaining, take the job with the largest point value that can be finished within the remaining time. Intuitively, this can be seen to work in some cases. However, it fails when the most valuable job uses up most of the available time: the greedy algorithm takes that job first, then whatever still fits, whereas the global optimum skips it and takes other jobs whose combined point value is much higher.

The next greedy algorithm we can try is to always take the job which takes the shortest amount of time to finish. However, this also fails on some inputs: the shortest jobs may be worth few points, and taking them can leave too little time for a more valuable job.

You can try crafting a more sophisticated greedy algorithm than the ones we described, but it will probably fail on some input. This is because the problem we described is equivalent to the 0/1 knapsack problem, which is NP-hard: no algorithm is currently known that solves it in time polynomial in the size of the input!
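Both greedy rules can be seen failing on a small instance. The sketch below is an illustration only: the `(time, points)` jobs and the time budget of 10 are hypothetical values chosen for the demonstration, and the helper names `greedy` and `optimum` are made up. The exact answer is computed with the standard dynamic program for 0/1 knapsack, whose running time grows with the numeric value of the budget, so it does not count as efficient in the sense above.

```python
from typing import Callable, List, Tuple

Job = Tuple[int, int]  # (time, points): hypothetical representation


def greedy(jobs: List[Job], budget: int, key: Callable[[Job], float]) -> int:
    """Take jobs in order of preference (smallest key first) while they fit."""
    total = 0
    for time, points in sorted(jobs, key=key):
        # A job that does not fit now will never fit later, so skip it.
        if time <= budget:
            budget -= time
            total += points
    return total


def optimum(jobs: List[Job], budget: int) -> int:
    """Exact answer via 0/1 knapsack DP, O(len(jobs) * budget) time."""
    best = [0] * (budget + 1)  # best[t] = max points using at most t time
    for time, points in jobs:
        # Iterate downwards so that each job is used at most once.
        for t in range(budget, time - 1, -1):
            best[t] = max(best[t], best[t - time] + points)
    return best[budget]


# Hypothetical instance: total time available is 10.
jobs = [(9, 10), (1, 1), (5, 8), (5, 8)]
budget = 10
print(greedy(jobs, budget, key=lambda job: -job[1]))  # most points first: 11
print(greedy(jobs, budget, key=lambda job: job[0]))   # shortest first: 9
print(optimum(jobs, budget))                          # true optimum: 16
```

On this instance the most-points-first rule spends 9 of the 10 time units on a single job, and the shortest-first rule wastes time on a job worth 1 point, while the optimum takes the two jobs of length 5.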