Massive Algorithms: Dynamic Programming

Dynamic Programming

https://apps.topcoder.com/forums/?module=Thread&threadID=697369&start=0
The most creative part of inventing dynamic programming solution is defining recurrent relations. The recurrent relations consist of two parts: state domain and transitions. State domain is a set of states (subproblems) in dynamic programming. For each state the subresult will be calculated eventually. Transitions are the relations between different states which help calculate the subresults.

Code of DP solution usually contains an array representing subresults on the state domain. For example, classic knapsack problem solution will be like:

  int maxcost[items+1][space+1];
  memset(maxcost, -63, sizeof(maxcost));   //fill with negative infinity
  maxcost[0][0] = 0;                       //base of DP
  for (int i = 0; i<items; i++)            //iterations over states in proper order
    for (int j = 0; j<=space; j++) {
      int mc = maxcost[i][j];              //we handle two types forward transitions
      int ni, nj, nmc;                     //from state (i,j)->mc to state (ni,nj)->nmc
 
      ni = i + 1;                          //forward transition: do not add i-th item
      nj = j;
      nmc = mc;      
      if (maxcost[ni][nj] < nmc)           //relaxing result for new state
        maxcost[ni][nj] = nmc;
 
      ni = i + 1;                          //forward transition: add i-th item
      nj = j + size[i];
      nmc = mc + cost[i];
      if (nj <= space && maxcost[ni][nj] < nmc)
        maxcost[ni][nj] = nmc;
    }
  int answer = -1000000000;                //getting answer from state results
  for (j = 0; j<=space; j++)
    if (maxcost[items][j] > answer)
      answer = maxcost[items][j];
  return answer;

Here (i,j) is state of DP with result equal to maxcost[i][j]. The result here means the maximal cost of items we can get by taking some of first i items with overall size of exactly j. So the set of (i,j) pairs and concept of maxcost[i][j] here comprise a state domain. The forward transition is adding or not adding the i-th item to the set of items we have already chosen.

The order of iterations through all DP states is important. The code above iterates through states with pairs (i,j) sorted lexicographically. It is correct since any transition goes from set (i,*) to set (i+1,*), so we see that i is increasing by one. Speaking in backward (recurrent) style, the result for each state (i,j) directly depends only on the results for the states (i-1,*).

To determine order or iteration through states we have to define order on state domain. We say that state (i1,j1) is greater than state (i2,j2) if (i1,j1) directly or indirectly (i.e. through several other states) depends on (i2,j2). This is definition of order on the state domain used. In DP solution any state must be considered after all the lesser states. Else the solution would give incorrect result.

Multidimensional array
The knapsack DP solution described above is an example of multidimensional array state domain (with 2 dimensions). A lot of other problems have similar state domains. Generally speaking, in this category states are represented by k parameters: (i1, i2, i3, ..., ik). So in the code we define a multidimensional array for state results like: int Result[N1][N2][N3]...[Nk]. Of course there are some transition rules (recurrent relations). These rules themselves can be complex, but the order of states is usually simple.

In most cases the states can be iterated through in lexicographical order. To do this you have to ensure that if I = (i1, i2, i3, ..., ik) directly depends on J = (j1, j2, j3, ..., jk) then I is lexicographically greater that J. This can be achieved by permuting parameters (like using (j,i) instead of (i,j)) or reversing them. But it is usually easier to change the order and direction of nested loops. Here is general code of lexicographical traversion:

  for (int i1 = 0; i1<N1; i1++)
    for (int i2 = 0; i2<N1; i2++)
      ...
        for (int ik = 0; ik<Nk; ik++) {
          //get some states (j1, j2, j3, ..., jk) -> jres by performing transitions
          //and handle them
        }

Note: changing order of DP parameters in array and order of nested loops can noticably affect performance on modern computers due to CPU cache behavior.

DP over subsets is much more popular.

Consider TSP problem as an example. The set of cities X={0, 1, 2, ..., N-1} is used here. State domain will have two parameters: s and a. The state (s,a)->R means that R is the shortest path from city 0 to city "a" which goes through all the vertices from subset s exactly once. The transition is simply adding one city v to the end of path: (s,a)->R turns into (s+{v},v)->R + M[a,v]. Here M[i,j] is distance between i-th and j-th city. Any hamiltonian cycle is a path which goes through each vertex exactly once plus the edge which closes the cycle, so the answer for TSP problem can be computed as min(R[X,a]+M[a,0]) among all vertices "a".

It is very convenient to encode subsets with binary numbers. Look recipe "Representing sets with bitfields" for detailed explanation.

The state domain of DP over subsets is usually ordered by set inclusion. Each forward transition adds some elements to the current subset, but does not subtract any. So result for each state (s,a) depends only on the results of states (t,b) where t is subset of s. If state domain is ordered like this, then we can iterate through subsets in lexicographical order of binary masks. Since subsets are usually represented with binary integers, we can iterate through all subsets by iterating through all integers from 0 to 2^N - 1. For example in TSP problem solution looks like:

  int res[1<<N][N];
  memset(res, 63, sizeof(res));       //filling results with positive infinity
  res[1<<0][0] = 0;                   //DP base
 
  for (int s = 0; s < (1<<N); s++)    //iterating through all subsets in lexicographical order
    for (int a = 0; a < N; a++) {
      int r = res[s][a];
      for (int v = 0; v < N; v++) {   //looking through all transitions (cities to visit next)
        if (s & (1<<v)) continue;     //we cannot visit cities that are already visited
        int ns = s | (1<<v);          //perform transition
        int na = v;
        int nr = r + matr[a][v];      //by adding edge (a - v) distance
        if (res[ns][na] > nr)         //relax result for state (ns,na) with nr
          res[ns][na] = nr;
      }
    }
  int answer = 1000000000;            //get TSP answer
  for (int a = 0; a < N; a++)
    answer = min(answer, res[(1<<N)-1][a] + matr[a][0]);

Often in DP over subsets you have to iterate through all subsets or supersets of a given set s. The bruteforce implementation will require O(4^N) time for the whole DP, but it can be easily optimized to take O(3^N). Please read recipe "Iterating Over All Subsets of a Set".

Subsets of a given set

The problems of this type has some set X. The number of elements in this set is small: less than 20. The idea of DP solution is to consider all subsets of X as state domain. Often there are additional parameters. So generally we have state domain in form (s,a) where s is a subset of X and "a" represents additional parameters.

https://codeforces.com/blog/entry/43256
It can be rather characterized as an algorithmic technique that is usually based on a starting state of the problem, and a recurrent formula or relation between the successive states. A state of the problem usually represents a sub-solution, i.e. a partial solution or a solution based on a subset of the given input. And the states are built one by one, based on the previously built states.

Given a list of n coins, their weights W1, W2, ..., Wn; and the total sum S. Find the minimum number of coins the overall weight of which is S (we can use as many coins of each type as we want), or report that it is not possible to select coins in such a way so that they sum up to S. This problem is a special case of the famous unbounded knapsack problem. For this problem a state, let’s call it (P) or (P)->k, would represent the solution for a partial sum (P), where P is not greater than S. k is minimal number of coins required to get exact overall weight P. The k value is usually called the result of corresponding state (P).

A dynamic programming solution would thus start with an initial state (0) and then will build the succeeding states based on the previously found ones. In the above problem, a state (Q) that precedes (P) would be the one for which sum Q is lower than P, thus representing a solution for a sum smaller than P. One starts with the trivial state (0), and then builds the state (P1), (P2), (P3), and so on until the final state (S) is built, which actually represents the solution of the problem. One should note that a state can not be processed until all of the preceding states haven’t been processed – this is another important characteristic of DP technique.

/* Recurrent equations for DP:
  {k[0] = 0;
  {k[P] = min_i (k[P-Wi] + 1);   (for Wi <= P)
*/
/* Consider the input data: S=11, n=3, W = {1,3,5}
   The DP results table is:
  P = 0 |1 |2 |3 |4 |5 |6 |7 |8 |9 |10|11
  ------+--+--+--+--+--+--+--+--+--+--+--
  k = 0 |1 |2 |1 |2 |1 |2 |3 |2 |3 |2 |3
*/
// The implementation:
int n, S;                                        //n &mdash; number of coin types, S &mdash; desired overall weight
int wgt[MAXN];                                   //array of coin weights (W); for example: {1, 3, 5};
int mink[MAXW];                                  //array of DP results (k); look above for the example;
 
  mink[0] = 0;                                   //base of DP: 0 weight can be achieved by 0 coins
  for (int P = 1; P<=S; P++) {                   //iterate through all the states
    int minres = 1000000000;                    
    for (int i = 0; i<n; i++) if (wgt[i] <= P) { //suppose that the coin with weight wgt[i] is the last
      int tres = mink[P &mdash; wgt[i]] + 1;           //the number of coins with the coin is greater by one
      if (minres > tres) minres = tres;          //choose the minimal overall number of coins among all cases
    }
    mink[P] = minres;                            //store the result in mink array
  }
  int answer = mink[S];                          //the answer for the whole problem is the result for state (S)

Tutorial (LCS example)

Consider another problem: given two words, find the length of their longest common subsequence. For example, for two words "quetzalcoatl" and "tezcatlipoca" the longest subsequence has length 6, f.i. "ezaloa".



To solve the problem we introduce the set of subproblems: given a prefix of the first word and a prefix of the second word, find their LCS. Let the prefix of the first word has length i and the prefix of the second word has length j. As we see, the DP state is determined by two integer parameters: i and j. The state domain is therefore (i,j)->L, where i is the length of first word prefix, j is the length of second word prefix and L is the length of the longest common subsequence of these prefixes. The idea of solution is to take the solution for basic subproblem (0,0) and then add letters to the prefixes one-by-one until we reach the final state (n1,n2) which represents the problem for the full words.

Now let's derive the recurrent relations for DP results denoted as L[i,j]. Clearly, if one of the prefixes is empty, then the LCS must be zero. This is a base equation: L[i,0] = L[0,j] = 0. When i and j are positive then we have to treat several cases: 1. The last letter in the first word prefix is not used in the LCS. So it can be erased without changing the subsequence. The corresponding formula is L[i,j] = L[i-1,j]. 2. The last letter in the second word prefix is unused. Similarly, the formula for the case is: L[i,j] = L[i,j-1] 3. Otherwise, last letters 's' and 't' of both prefixes are included in the common subsequence. Clearly, these letters must be equal. In such a case erasing both last letters will reduce LCS by exactly one. The corresponding formula is: L[i,j] = L[i-1,j-1] + 1 (only if 's' = 't'). Among all three cases we should choose the case which gives the maximal length of sequence.

Implementation-wise, the DP results are stored in two-dimensional array. The values of this array are calculated in two nested loops. It is important that the states are traversed in such order that parameter values are non-decreasing because the DP result for the state (i,j) depends on the results for states (i-1,j), (i,j-1), (i-1,j-1).
/* Recurrent relations for DP:
  {L[i,0] = L[0,j] = 0;
  |            {L[i-1,j],
  {L[i,j] = max|L[i,j-1],
               {L[i-1,j-1]+1   (only if last symbols are equal)




char str1[1024], str2[1024];                             //input words
int lcs[1024][1024];                                     //DP results array
 
  for (int i = 0; i<=n1; i++)                            //iterate through all states (i,j)
    for (int j = 0; j<=n2; j++) {                        //in lexicographical order
      if (i == 0 || j == 0)
        lcs[i][j] = 0;                                   //the DP base case
      else {
        lcs[i][j] = max(lcs[i-1][j], lcs[i][j-1]);       //handle cases 1 and 2
        if (str1[i-1] == str2[j-1])
          lcs[i][j] = max(lcs[i][j], lcs[i-1][j-1] + 1); //handle case 3
      }
    }
  int answer = lcs[n1][n2];










Comparison with memoization



There is another technique called memoization which is covered in detail by recipe "Optimizing recursive solution". Recursive solution with memoization is very similar to backward-style dynamic programming solution. Both methods solve recurrent equations, which means that they deal with state domain — set of states with some result defined. The results for some states are determined from base of recurrence. The results for other states depend on the results of previous states. The DP solution iterates through the states in some particular order set by coder, while memoization iterates through them in order of depth-first search. DP never calculates the DP result for any state twice, just like the recursive solution with full memoization. The memoization approach does not spend time on unnecessary states — it is a lazy algorithm. Only the states which influence the final answer are processed. Here are the pros and cons of memoization over DP: 1[+]. Sometimes easier to code. 2[+]. Does not require to specify order on states explicitly. 3[+]. Processes only necessary states. 4[-]. Works only in the backward-style DP. 5[-]. Works a bit slower than DP (by constant).


Most of DP problems can be divided into two types: optimization problems and combinatoric problems. The optimization problems require you to choose some feasible solution so that the value of goal function is minimized (or maximized). Combinatoric problems request the number of ways to do something or the probability of some event.








Optimization DP problem



Optimization problem asks to choose the best feasible solution according to some goal function. Both coins and LCS examples are optimization-type. The recurrent equation looks like R[s] = min(F1(R[i], R[j], ..., R[k]), F2(R[u], R[v], ..., R[w]), ..., Fl(R[q], R[p], ..., R[z])), where R is the DP results array. Simply speaking, the result is chosen as the best = minimal among the several candidate cases. For each case the result is calculated from the results of previous DP states. For example in coins problem all the possible last coin cases are considered. Each of them yields one case in the recurrent formula. The result for the state is a minimum among all such cases. In LCS example there were three cases: first word last letter unused, second word last letter unused and both words last letter used.


It is often useful to fill the DP results array with neutral values before calculating anything. The neutral value is a result which does not affect the problem answer for sure. In case of minimization problem the neutral value is positive infinity: since it is greater than any number, all the recurrent formulas would prefer a case with finite value to such a neutral element. In other words, the state with neutral value result can be thought of as an impossible state. Note that for maximization problem negative infinity is a neutral element.

The DP states are often called DP subproblems because they represent some problem for input data which is subset of the whole problem input. For example, in LCS case each subproblem involves two arbitrary prefixes of the original two words. 

The DP method relies on the optimal substructure property: given the optimal solution for the whole problem, its partial solutions must be optimal for the subproblems. 

In the coins case it means that if the solution for whole problem with overall weight S is optimal and it contains coin with weight w, then the solution without w coin must also be optimal for the subproblem with overall weight (S — w).

Optimal substructure property is very important: if it does not hold and the optimal solution has the subsolution which is not optimal, then it would be discarded somewhere in the middle of DP on taking the minimum. Often the DP solution turns out to be theoretically wrong because it lacks the optimal substructure. For example there is a classical travelling salesman problem. Let the DP state domain be (k,l)->D where D is the minimal length of the simple path going through exactly k cities with 0-th city being the first one and l-th city being the last one. The optimal substructure property in such a DP does not hold: given the shortest tour its subpath with fixed last city and overall number of cities is not always the shortest. Therefore the proposed DP would be wrong anyway.

Combinatoric DP problem

The goal of combinatoric DP problem is to find number of ways to do something or the probability that the event happens. Often the number of ways can be big and only the reminder modulo some small number is required. The recurrent equation looks like R[s] = F1(R[i], R[j], ..., R[k]) + F2(R[u], R[v], ..., R[w]) + ... + Fl(R[q], R[p], ..., R[z]). The only difference from optimization case is the sum instead of minimum — and it changes a lot. The summation means that the different ways from F1, F2, ..., Fl cases altogether comprise the all the ways for state (s).

The example of combinatoric case is a modified coins problem: count the number of ways to choose coins so that their overall weight is equal to S. The state domain is the same: (P)->k where k is number of ways to choose coins so that their overall weight is exactly P. The recurrent equations are only a bit different in combinatoric problem.

/* Recurrent equations for DP:
  {k[0] = 1;
  {k[P] = sum_i (k[P-Wi]);   (for Wi <= P)

There is also a neutral value for combinatoric problem. Since combinatoric problem uses summation, the neutral element is zero. The DP results in combinatoric case usually represents number of ways to do smth, so if the result is zero than there is no way to do it. The neutral result value means that the case is impossible. It may be useful to fill DP results array with zero values, though it is usually done automatically. In case of combinatorics it is important that each possible way is counted and that no way is counted more than once. The second condition is sometimes difficult to satisfy.




Forward vs backward DP style

All the DPs described above are done in backward style. The schema is: iterate through all the states and for each of them calculate the result by looking backward and using the already known DP results of previous states. This style can also be called recurrent since it uses recurrent equations directly for calculation. The relations for backward-style DP are obtained by examining the best solution for the state and trying to decompose it to lesser states.

There is also forward-style DP. Surprisingly it is often more convenient to use. The paradigm of this style is to iterate through all the DP states and from each state perform some transitions leading forward to other states. Each transition modifies the currently stored result for some unprocessed states. When the state is considered, its result is already determined completely. The forward formulation does not use recurrent equations, so it is more complex to prove the correctness of solution strictly mathematically. The recurrent relations used in forward-style DP are obtained by considering one partial solution for the state and trying to continue it to larger states. To perform forward-style DP it is necessary to fill the DP results with neutral values before starting the calculation.

The first example will be combinatoric coins problem. Suppose that you have a partial solution with P overall weight. Then you can add arbitrary coin with weight Wi and get overall weight P+Wi. So you get a transition from state (P) to state (P+Wi). When this transition is considered, the result for state (P) is added to the result of state (P+Wi) which means that all the ways to get P weight can be continued to the ways to get P+Wi weight by adding i-th coin. Here is the code.
/* Recurrent relations (transitions) of DP:
  {k[0] = 1;
  {(P)->k ===> (P+Wi)->nk    add k to nk
*/
  //res array is automatically filled with zeroes
  res[0] = 1;                                 //DP base is the same
  for (int p = 0; p<s; p++)                   //iterate through DP states
    for (int i = 0; i<n; i++) {               //iterate through coin to add
      int np = p + wgt[i];                    //the new state is (np)
      if (np > s) continue;                   //so the transition is (p) ==> (np)
      res[np] += res[p];                      //add the DP result of (p) to DP result of (np)
    }
  int answer = res[s];                        //problem answer is the same

The second example is longest common subsequence problem. It is of maximization-type, so we have to fill the results array with negative infinities before calculation. The DP base is state (0,0)->0 which represents the pair of empty prefixes. When we consider partial solution (i,j)->L we try to continue it by three ways: 1. Add the next letter of first word to the prefix, do not change subsequence. 2. Add the next letter of the second word to the prefix, do not change subsequence. 3. Only if the next letters of words are the same: add next letter to both prefixes and include it in the subsequence. For each transition we perform so-called relaxation of the larger DP state result. We look at the currently stored value in that state: if it is worse that the proposed one, then it is replaced with the proposed one, otherwise it is not changed. The implementation code and compact representation of DP relations are given below.
/* Recurrent relations (transitions) of DP:
  {L[0,0] = 0;
  |          /> (i+1,j)->relax(L)
  {(i,j)->L ==> (i,j+1)->relax(L)
             \> (i+1,j+1)->relax(L+1)  (only if next symbols are equal)
*/
void relax(int &a, int b) {                   //relaxation routine
  if (a < b) a = b;
}
 
  memset(lcs, -63, sizeof(lcs));              //fill the DP results array with negative infinity
  lcs[0][0] = 0;                              //set DP base: (0,0)->0
  for (int i = 0; i<=n1; i++)
    for (int j = 0; j<=n2; j++) {             //iterate through all states
      int tres = lcs[i][j];
      relax(lcs[i+1][j], tres);               //try transition of type 1
      relax(lcs[i][j+1], tres);               //try transition of type 2
      if (str1[i] == str2[j])                 //and if next symbols are the same
        relax(lcs[i+1][j+1], tres + 1);       //then try transition of type 3
    }
  int answer = lcs[n1][n2];








Recovering the best solution for optimization problems



The optimization problem asks us to find the feasible solution with the minimal value of goal function, but DP finds only the goal function value itself. It does not produce the best solution along with the numerical answer. In practical usage the answer without a solution is useless, though in topcoder algorithm problems often only the answer is required. Anyway, it is useful to know how to reconstruct the best solution after DP.


In the most common case the each transition is an atomic improvement of some partial solution and recurrent equation is something like: R[s] = min(F1(R[u], u), F2(R[v], v), ..., Fk(R[w], w)). In other words, the result for the state is produced from a single best previous state plus some modification. In such a case the DP solution can be reconstructed trivially from the DP solution path. The DP solution path goes from some base DP state to some final state and consists of the states which represent all the partial solutions of the desired best solution. There are two ways to get this path.

Another way is to store back-links along with the DP result. For each state (s) we save the parameters of the previous state (u) that was continued. When we perform a transition (u) ==> (s) which produces better result than the currently stored in (s) then we set the back-link to (s) to the state (u). To trace the DP solution path we need simply to repeatedly move to back-linked state until the starting state is met. Note that you can store any additional info about the way the DP result was obtained to simplify solution reconstruction.

The first approach has a lot of drawbacks. It is usually slower, it leads to DP code being copy/pasted, it requires backward-style DP for tracing the path. It is good only in the rare case when there is not enough memory to store the back-links required in the second way. The second way is preferred since it is simple to use and supports both backward and forward style DP solutions. If the result for each DP state originates from more than one previous DP state, then you can store the links to all the previous states. The path reconstruction is easiest to implement in the recursive way in such a case.



For example of recovering the solution coins problem is again considered. Note that the DP code is almost the same except that the back-link and item info is set on the relaxation. The later part of tracing the path back is rather simple.




int mink[MAXW];                       //the DP result array
int prev[MAXW], item[MAXW];           //prev &mdash; array for back-links to previous state
int k;                                //item &mdash; stores the last item index
int sol[MAXW];                        //sol[0,1,2,...,k-1] would be the desired solution
 
  memset(mink, 63, sizeof(mink));     //fill the DP results with positive infinity
  mink[0] = 0;                        //set DP base (0)->0
  for (int p = 0; p<s; p++)           //iterate through all states
    for (int i = 0; i<n; i++) {       //try to add one item
      int np = p + wgt[i];            //from (P)->k we get
      int nres = mink[p] + 1;         //to (P+Wi)->k+1
      if (mink[np] > nres) {          //DP results relaxation
        mink[np] = nres;              //in case of success
        prev[np] = p;                 //save the previous state
        item[np] = i;                 //and the used last item
      }
    }
 
  int answer = mink[s];
  int cp = s;                         //start from current state S
  while (cp != 0) {                   //until current state is zero
    int pp = prev[cp];                //get the previous state from back-link
    sol[k++] = item[cp];              //add the known item to solution array
    cp = pp;                          //move to the previous state
  }

https://www.topcoder.com/community/competitive-programming/tutorials/dynamic-programming-from-novice-to-advanced/

What is a dynamic programming, how can it be described? A DP is an algorithmic technique which is usually based on a recurrent formula and one (or some) starting states. A sub-solution of the problem is constructed from previously found ones. DP solutions have a polynomial complexity which assures a much faster running time than other techniques like backtracking, brute-force etc.

Now let’s see the base of DP with the help of an example:

Given a list of N coins, their values (V1, V2, … , VN), and the total sum S. Find the minimum number of coins the sum of which is S (we can use as many coins of one type as we want), or report that it’s not possible to select coins in such a way that they sum up to S.

Now let’s start constructing a DP solution:

First of all we need to find a state for which an optimal solution is found and with the help of which we can find the optimal solution for the next state.

Set Min[i] equal to Infinity for all of i
Min[0]=0

For i = 1 to S
For j = 0 to N - 1
 If (Vj<=i AND Min[i-Vj]+1 < Min[i])
Then Min[i]=Min[i-Vj]+1

Output Min[S]

Additionally, by tracking data about how we got to a certain sum from a previous one, we can find what coins were used in building it. For example: to sum 11 we got by adding the coin with value 1 to a sum of 10. To sum 10 we got from 5. To 5 – from 0. This way we find the coins used: 1, 5 and 5.

Having understood the basic way a DP is used, we may now see a slightly different approach to it. It involves the change (update) of best solution yet found for a sum i, whenever a better solution for this sum was found. In this case the states aren’t calculated consecutively.



Let’s consider the problem above. Start with having a solution of 0 coins for sum 0. Now let’s try to add first coin (with value 1) to all sums already found. If the resulting sum t will be composed of fewer coins than the one previously found – we’ll update the solution for it. Then we do the same thing for the second coin, third coin, and so on for the rest of them. For example, we first add coin 1 to sum 0 and get sum 1. Because we haven’t yet found a possible way to make a sum of 1 – this is the best solution yet found, and we mark S[1]=1. By adding the same coin to sum 1, we’ll get sum 2, thus making S[2]=2. And so on for the first coin. After the first coin is processed, take coin 2 (having a value of 3) and consecutively try to add it to each of the sums already found. Adding it to 0, a sum 3 made up of 1 coin will result. Till now, S[3] has been equal to 3, thus the new solution is better than the previously found one. We update it and mark S[3]=1. After adding the same coin to sum 1, we’ll get a sum 4 composed of 2 coins. Previously we found a sum of 4 composed of 4 coins; having now found a better solution we update S[4] to 2. The same thing is done for next sums – each time a better solution is found, the results are updated.

for harder problems. For that we will introduce a new term called recurrent relation, which makes a connection between a lower and a greater state.

Let’s see how it works:

Given a sequence of N numbers – A[1] , A[2] , …, A[N] . Find the length of the longest non-decreasing sequence.



As described above we must first find how to define a "state" which represents a sub-problem and thus we have to find a solution for it. Note that in most cases the states rely on lower states and are independent from greater states.

Let’s define a state i as being the longest non-decreasing sequence which has its last number A[i] . This state carries only data about the length of this sequence. Note that for i<j the state i is independent from j, i.e. doesn’t change when we calculate state j. Let’s see now how these states are connected to each other. Having found the solutions for all states lower than i, we may now look for state i. At first we initialize it with a solution of 1, which consists only of the i-th number itself. Now for each j<i let’s see if it’s possible to pass from it to state i. This is possible only when A[j]≤A[i] , thus keeping (assuring) the sequence non-decreasing. So if S[j] (the solution found for state j) + 1 (number A[i]added to this sequence which ends with number A[j] ) is better than a solution found for i (ie. S[j]+1>S[i] ), we make S[i]=S[j]+1. This way we consecutively find the best solutions for each i, until last state N.

Practice problem:

Given an undirected graph G having N (1<N<=1000) vertices and positive weights. Find the shortest path from vertex 1 to vertex N, or state that such path doesn’t exist.

Hint: At each step, among the vertices which weren’t yet checked and for which a path from vertex 1 was found, take the one which has the shortest path, from vertex 1 to it, yet found.

Try to solve the following problems from Topcoder competitions:


ZigZag – 2003 TCCC Semifinals 3


BadNeighbors – 2004 TCCC Round 4


FlowerGarden – 2004 TCCC Round 1



Let’s see now how to tackle bi-dimensional DP problems.

Problem:

A table composed of N x M cells, each having a certain quantity of apples, is given. You start from the upper-left corner. At each step you can go down or right one cell. Find the maximum number of apples you can collect.

This problem is solved in the same way as other DP problems; there is almost no difference.

First of all we have to find a state. The first thing that must be observed is that there are at most 2 ways we can come to a cell – from the left (if it’s not situated on the first column) and from the top (if it’s not situated on the most upper row). Thus to find the best solution for that cell, we have to have already found the best solutions for all of the cells from which we can arrive to the current cell.

From above, a recurrent relation can be easily obtained:



S[i][j]=A[i][j] + max(S[i-1][j], if i>0 ; S[i][j-1], if j>0) (where i represents the row and j the column of the table , its left-upper corner having coordinates {0,0} ; and A[i][j] being the number of apples situated in cell i,j).

S[i][j] must be calculated by going first from left to right in each row and process the rows from top to bottom, or by going first from top to bottom in each column and process the columns from left to right.

Pseudocode:

For i = 0 to N - 1
 For j = 0 to M - 1
 S[i][j] = A[i][j] +
  max(S[i][j-1], if j>0 ; S[i-1][j], if i>0 ; 0)

Output S[n-1][m-1]


Here are a few problems, from Topcoder Competitions, for practicing:


AvoidRoads – 2003 TCO Semifinals 4


ChessMetric – 2003 TCCC Round 4

Upper-Intermediate

This section will discuss about dealing DP problems which have an additional condition besides the values that must be calculated.

As a good example would serve the following problem:

Given an undirected graph G having positive weights and N vertices.

You start with having a sum of M money. For passing through a vertex i, you must pay S[i] money. If you don’t have enough money – you can’t pass through that vertex. Find the shortest path from vertex 1 to vertex N, respecting the above conditions; or state that such path doesn’t exist. If there exist more than one path having the same length, then output the cheapest one. Restrictions: 1<N<=100 ; 0<=M<=100 ; for each i, 0<=S[i]<=100.

As we can see, this is the same as the classical Dijkstra problem (finding the shortest path between two vertices), with the exception that it has a condition. In the classical Dijkstra problem we would have used a uni-dimensional array Min[i] , which marks the length of the shortest path found to vertex i. However in this problem we should also keep information about the money we have. Thus it would be reasonable to extend the array to something like Min[i][j] , which represents the length of the shortest path found to vertex i, with j money being left. In this way the problem is reduced to the original path-finding algorithm.

At each step we find the unmarked state (i,j) for which the shortest path was found. We mark it as visited (not to use it later), and for each of its neighbors we look if the shortest path to it may be improved. If so – then update it. We repeat this step until there will remain no unmarked state to which a path was found. The solution will be represented by Min[N-1][j] having the least value (and the greatest j possible among the states having the same value, i.e. the shortest paths to which has the same length).

Pseudocode:

Set states(i,j) as unvisited for all (i,j)
Set Min[i][j] to Infinity for all (i,j)

Min[0][M]=0

While(TRUE)

Among all unvisited states(i,j) find the one for which Min[i][j]
is the smallest. Let this state found be (k,l).

If there wasn't found any state (k,l) for which Min[k][l] is
less than Infinity - exit While loop.

Mark state(k,l) as visited

For All Neighbors p of Vertex k.
 If (l-S[p]>=0 AND
 Min[p][l-S[p]]>Min[k][l]+Dist[k][p])
  Then Min[p][l-S[p]]=Min[k][l]+Dist[k][p]
i.e.
If for state(i,j) there are enough money left for
going to vertex p (l-S[p] represents the money that
will remain after passing to vertex p), and the
shortest path found for state(p,l-S[p]) is bigger
than [the shortest path found for
state(k,l)] + [distance from vertex k to vertex p)],
then set the shortest path for state(i,j) to be equal
to this sum.
End For

End While

Find the smallest number among Min[N-1][j] (for all j, 0<=j<=M);
if there are more than one such states, then take the one with greater
j. If there are no states(N-1,j) with value less than Infinity - then
such a path doesn't exist.

Here are a few Topcoder problems for practicing:

Jewelry – 2003 TCO Online Round 4
StripePainter – SRM 150 Div 1
QuickSums – SRM 197 Div 2
ShortPalindromes – SRM 165 Div 2

MiniPaint – SRM 178 Div 1

Additional Note:

When have read the description of a problem and started to solve it, first look at its restrictions. If a polynomial-time algorithm should be developed, then it’s possible that the solution may be of DP type. In this case try to see if there exist such states (sub-solutions) with the help of which the next states (sub-solutions) may be found. Having found that – think about how to pass from one state to another. If it seems to be a DP problem, but you can’t define such states, then try to reduce the problem to another one (like in the example above, from Section 5).

https://www.geeksforgeeks.org/dynamic-programming/
Dynamic Programming is mainly an optimization over plain recursion. Wherever we see a recursive solution that has repeated calls for same inputs, we can optimize it using Dynamic Programming. The idea is to simply store the results of subproblems, so that we do not have to re-comupute them when needed later. This simple optimization reduces time complexities from exponential to polynomial

1) Overlapping Subproblems
2) Optimal Substructure

Like Divide and Conquer, Dynamic Programming combines solutions to sub-problems. Dynamic Programming is mainly used when solutions of same subproblems are needed again and again. In dynamic programming, computed solutions to subproblems are stored in a table so that these don’t have to be recomputed. So Dynamic Programming is not useful when there are no common (overlapping) subproblems because there is no point storing the solutions if they are not needed again. For example, Binary Search doesn’t have common subproblems.

http://blog.refdash.com/dynamic-programming-tutorial-example/

How to recognize a DP problem
Identify problem variables
Clearly express the recurrence relation
Identify the base cases
Decide if you want to implement it iteratively or recursively
Add memoization
Determine time complexity

1) You’re given a flat runway with a bunch of spikes in it. The runway is represented by a boolean array which indicates if a particular (discrete) spot is clear of spikes. It is True for clear and False for not clear.

Example array representation

2) You’re given a starting speed S. S is a non-negative integer at any given point and it indicates how much you will move forward with the next jump.

3) Every time you land on a spot, you can adjust your speed by up to 1 unit before the next jump.

4) You want to safely stop anywhere along the runway (does not need to be at the end of the array). You stop when your speed becomes 0. However, if you land on a spike at any point, your crazy bouncing ball bursts and it’s a game over.

The output of your function should be a boolean indicating whether we can safely stop anywhere along the runway.

Step 1: How to recognize a Dynamic Programming problem

First, let’s make it clear that DP is essentially just an optimization technique. DP is a method for solving problems by breaking it down into a collection of simpler subproblems, solving each of those subproblems just once, and storing their solutions. The next time the same subproblem occurs, instead of recomputing its solution, one simply looks up the previously computed solution. This saves computation time at the expense of a (hopefully) modest expenditure in storage space.

Recognizing that a problem can be solved using DP is the first and often the most difficult step in solving it. What you want to ask yourself is whether your problem solution can be expressed as a function of solutions to similar smaller problems.

In the case of our example problem, given a point on the runway, a speed, and the runway ahead, we could determine the spots where we could potentially jump next. Furthermore, it seems that whether we can stop from the current point with the current speed depends only on whether we could stop from the point we choose to go to next. That is a great thing because by moving forward we shorten the runway ahead and make our problem smaller. We should be able to repeat this process all the way until we get to a point where it is obvious whether we can stop.

Recognizing a Dynamic Programming problem is often the most difficult step in solving it. Can the problem solution be expressed as a function of solutions to similar smaller problems?

Step 2: Identify problem variables

Now we have established that there is some recursive structure between our subproblems. Next, we need to express the problem in terms of the function parameters and see which of those parameters are changing. Typically in interviews, you will have one or two changing parameters, but technically this could be any number. A classic example of a one-changing-parameter problem is “determine an n-th Fibonacci number”. Such example for a two-changing-parameters problem is “Compute edit distance between strings”.

A way to determine the number of changing parameters is to list examples of several subproblems and compare the parameters. Counting the number of changing parameters is valuable to determine the number of subproblems we have to solve, but it is also important in its own right in helping us strengthen the understanding of the recurrence relation from step 1.

In our example, the two parameters that could change for every subproblem are:

Array position (P)
Speed (S)

One could say that the runway ahead is changing as well, but that would be redundant considering that the entire non-changing runway and the position (P) carry that information already.

Now, with these 2 changing parameters and other static parameters, we have the complete description of our sub-problems.

Identify the changing parameters and determine the number of subproblems.

Step 3: Clearly express the recurrence relation

This is an important step that many rush through in order to get into coding. Expressing the recurrence relation as clearly as possible will strengthen your problem understanding and make everything else significantly easier.

Once you figure out that the recurrence relation exists and you specify the problems in terms of parameters, this should come as a natural step. How do problems relate to each other? In other words, let’s assume that you have computed the subproblems. How would you compute the main problem?

Here is how we think about it in our sample problem:

Because you can adjust your speed by up to 1 before jumping to the next position, there are only 3 possible speeds and therefore 3 spots in which we could be next.

More formally, if our speed is S, position P, we could go from (S, P) to:

1. (S, P + S); # if we do not change the speed
2. (S – 1, P + S – 1); # if we change the speed by -1
3. (S + 1, P + S + 1); # if we change the speed by +1

If we can find a way to stop in any of the subproblems above, then we can also stop from (S, P). This is because we can transition from (S, P) to any of the above three options.

This is typically a fine level of understanding of the problem (plain English explanation), but you sometimes might want to express the relation mathematically as well. Let’s call a function that we’re trying to compute canStop. Then:

canStop(S, P) = canStop(S, P + S) || canStop(S – 1, P + S – 1) || canStop(S + 1, P + S + 1)

Woohoo, it seems like we have our recurrence relation!

Recurrence relation: Assuming you have computed the subproblems, how would you compute the main problem?

Step 4: Identify the base cases

A base case is a subproblem that doesn’t depend on any other subproblem. In order to find such subproblems, you typically want to try a few examples, see how your problem simplifies into smaller subproblems, and at what point it cannot be simplified further.

The reason a problem cannot be simplified further is that one of the parameters would become a value that is not possible given constraints of a problem.

In our example problem, we have two changing parameters, S and P. Let’s think about what possible values of S and P might not be legal:

P should be within the bounds of the given runway
P cannot be such that runway[P] is false because that would mean that we’re standing on a spike
S cannot be negative and a S==0 indicates that we’re done

Sometimes it can be a little challenging to convert assertions that we make about parameters into programmable base cases. This is because, in addition to listing the assertions if you want to make your code look concise and not check for unnecessary conditions, you need to also think about which of these conditions are even possible.

In our example:

P < 0 || P >= length of runway seems like the right thing to do. An alternative could be to consider making P == end of runway a base case. However, it is possible that a problem splits into a subproblem which goes beyond the end of the runway, so we really need to check for inequality.
This seems pretty obvious. We can simply check if runway[P] is false.
Similar to #1, we could simply check for S < 0 and S == 0. However, here we can reason that it is impossible for S to be < 0 because S decreases by at most 1, so it would have to go through S == 0 case beforehand. Therefore S == 0 is a sufficient base case for the S parameter.

Step 5: Decide if you want to implement it iteratively or recursively

To decide whether to go iteratively or recursively, you want to carefully think about the trade-offs.

	Recursive	Iterative
Asymptotic time complexity	Same assuming memoization	Same
Memory usage	Recursive stack, Sparse memoization	Full memoization
Execution speed	Often faster depending on the input	Slower, needs to do same work regardless of the input
Stack overflow	Problem	No issues as long as enough memory for full memoization
More intuitive / easier to implement	Often easier to reason about	most people find it harder to reason through

Stack overflow issues are typically a deal breaker and a reason why you would not want to have recursion in a (backend) production system. However, for the purposes of the interview, as long as you mention the trade-offs, you should typically be fine with either of the implementations. You should feel comfortable implementing both.

def canStopRecursive(runway, initSpeed, startIndex = 0):
# negative base cases need to go first
if (startIndex >= len(runway) or startIndex < 0 or
initSpeed < 0 or not runway[startIndex]):
return False
# base case for a stopping condition
if initSpeed == 0:
return True
# Try all possible paths
for adjustedSpeed in [initSpeed, initSpeed - 1, initSpeed + 1]:
# Recurrence relation: If you can stop from any of the subproblems,
# you can also stop from the main problem
if canStopRecursive(
runway, adjustedSpeed, startIndex + adjustedSpeed):
return True
return False

Step 6: Add memoization

Memoization is a technique that is closely associated with DP. It is used for storing the results of expensive function calls and returning the cached result when the same inputs occur again. Why are we adding memoization to our recursion? We encounter the same subproblems which without memoization are computed repeatedly. Those repetitions very often lead to exponential time complexities.

In recursive solutions, adding memoization should feel straightforward. Let’s see why. Remember that memoization is just a cache of the function results. There are times when you want to deviate from this definition in order to squeeze out some minor optimizations, but treating memoization as a function result cache is the most intuitive way to implement it.

This means that you should:

Store your function result into your memory before every return statement
Look up the memory for the function result before you start doing any other computation

def canStopRecursiveWithMemo(runway, initSpeed, startIndex = 0, memo = None):
# Only done the first time to initialize the memo.
if memo == None:
memo = {}
# First check if the result exists in memo
if startIndex in memo and initSpeed in memo[startIndex]:
return memo[startIndex][initSpeed]
# negative base cases need to go first
if (startIndex >= len(runway) or startIndex < 0 or
initSpeed < 0 or not runway[startIndex]):
insertIntoMemo(memo, startIndex, initSpeed, False)
return False
# base case for a stopping condition
if initSpeed == 0:
insertIntoMemo(memo, startIndex, initSpeed, True)
return True
# Try all possible paths
for adjustedSpeed in [initSpeed, initSpeed - 1, initSpeed + 1]:
# Recurrence relation: If you can stop from any of the subproblems,
# you can also stop from the main problem
if canStopRecursiveWithMemo(
runway, adjustedSpeed, startIndex + adjustedSpeed, memo):
insertIntoMemo(memo, startIndex, initSpeed, True)
return True
insertIntoMemo(memo, startIndex, initSpeed, False)
return False

Step 7: Determine Time complexity

There are some simple rules that can make computing time complexity of a dynamic programming problem much easier. Here are two steps that you need to do:

Count the number of states – this will depend on the number of changing parameters in your problem
Think about the work done per each state. In other words, if everything else but one state has been computed, how much work do you have to do to compute that last state

In our example problem, the number of states is |P| * |S|, where

P is the set of all positions (|P| indicates the number of elements in P)
S is the set of all speeds

The work done per each state is O(1) in this problem because, given all other states, we simply have to look at 3 subproblems to determine the resulting state.

As we noted in the code before, |S| is limited by length of the runway (|P|), so we could say that the number of states is |P|^2 and because work done per each state is O(1), then the total time complexity is O(|P|^2).

However, it seems that |S| can be further limited because if it were really |P| it is very clear that stopping would not be possible because you would have to jump the length of the entire runway on the first move.

So let’s see how we can put a tighter bound on |S|. Let’s call maximum speed S. Assume that we’re starting from position 0. How quickly could we stop if we were trying to stop as soon as possible and if we ignore potential spikes?

In the first iteration, we would have to come at least to the point (S-1), by adjusting our speed at zero by -1. From there we would at a minimum go by (S-2) steps forward, and so on.

For a runway of length L, the following has to hold:

=> (S-1) + (S-2) + (S-3) + ….+ 1 < L

=> S * (S – 1) / 2 < L

=> S^2 – S – 2L < 0

If you find roots of the above function, they will be:

r1 = 1/2 + sqrt(1/4 + 2L) and r2 = 1/2 – sqrt(1/4 + 2L)

We can write our inequality as:

(S – r1) * (S – r2) < 0

Considering that S – r2 > 0 for any S > 0 and L > 0, we need the following:

S – 1/2 – sqrt(1/4 + 2L) < 0

=> S < 1/2 + sqrt(1/4 + 2L)

That is the maximum speed that we could possibly have on a runway of a length L. If we had a speed higher than that, we could not stop even theoretically, irrespective of the position of the spikes.

That means that the total time complexity depends only on the length of the runway L in the following form:

O(L * sqrt(L)) which is better than O(L^2)

O(L * sqrt(L)) is the upper bound on the time complexity

Here are some next steps that you can take:

Extend the sample problem by trying to find a path to a stopping point. We solved a problem that tells you whether you can stop, but what if you wanted to also know the steps to take in order to stop eventually along the runway. How would you modify the existing implementation to do that?
One thing that could be quite useful in solidifying your understanding of memoization and understanding that it is just a function result cache is reading about decorators in python or similar concepts in other languages. Think about how they would allow you to implement memoization in general for any function that you want to memoize.
Work on more DP problems by following the steps we went through. You can always find a bunch of them online (ex. LeetCode or GeeksForGeeks). As you practice, keep in mind one thing: Learn ideas, don’t learn problems. The number of ideas is significantly smaller and it’s an easier space to conquer which will also serve you much better.

http://www.shmoop.com/computer-science/dynamic-programming/when-to-use.html
Whenever a problem talks about optimizing something, dynamic programming could be your solution.

http://www.geeksforgeeks.org/solve-dynamic-programming-problem/

Steps to solve a DP
1) Identify if it is a DP problem
2) Decide a state expression with 
   least parameters
3) Formulate state relationship    
4) Do tabulation (or add memoization)

Step 1 : How to classify a problem as a Dynamic Programming Problem?

Typically, all the problems that require to maximize or minimize certain quantity or counting problems that say to count the arrangements under certain condition or certain probability problems can be solved by using Dynamic Programming.
All dynamic programming problems satisfy the overlapping subproblems property and most of the classic dynamic problems also satisfy the optimal substructure property. Once, we observe these properties in a given problem, be sure that it can be solved using DP.

Step 2 : Deciding the state
DP problems are all about state and their transition. This is the most basic step which must be done very carefully because the state transition depends on the choice of state definition you make. So, let’s see what do we mean by the term “state”.

State A state can be defined as the set of parameters that can uniquely identify a certain position or standing in the given problem. This set of parameters should be as small as possible to reduce state space.

For example: In our famous Knapsack problem, we define our state by two parameters index and weight i.e DP[index][weight]. Here DP[index][weight] tells us the maximum profit it can make by taking items from range 0 to index having the capacity of sack to be weight. Therefore, here the parameters index and weight together can uniquely identify a subproblem for the knapsack problem.

So, our first step will be deciding a state for the problem after identifying that the problem is a DP problem.

As we know DP is all about using calculated results to formulate the final result.
So, our next step will be to find a relation between previous states to reach the current state.

Step 3 : Formulating a relation among the states
This part is the hardest part of for solving a DP problem and requires a lots of intuition, observation and practice. Let’s understand it by considering a sample problem


Given 3 numbers {1, 3, 5}, we need to tell
the total number of ways we can form a number 'N' 
using the sum of the given three numbers.
(allowing repetitions and different arrangements).

Let’s think dynamically for this problem. So, first of all, we decide a state for the given problem. We will take a parameter n to decide state as it can uniquely identify any subproblem. So, our state dp will look like state(n). Here, state(n) means the total number of arrangements to form n by using {1, 3, 5} as elements.

Now, we need to compute state(n).

How to do it?
So here the intuition comes into action. As we can only use 1, 3 or 5 to form a given number. Let us assume that we know the result for n = 1,2,3,4,5,6 ; being termilogistic let us say we know the result for the
state (n = 1), state (n = 2), state (n = 3) ……… state (n = 6)

Now, we wish to know the result of the state (n = 7). See, we can only add 1, 3 and 5. Now we can get a sum total of 7 by the following 3 ways:

Therefore, we can say that result for
state(7) = state (6) + state (4) + state (2)
or
state(7) = state (7-1) + state (7-3) + state (7-5)

In general,
state(n) = state(n-1) + state(n-3) + state(n-5)

So, our code will look like:

// Returns the number of arrangements to 

// form 'n' 

int solve(int n)

{ 

   // base case

   if (n < 0) 

      return 0;

   if (n == 0)  

      return 1;  

   return solve(n-1) + solve(n-3) + solve(n-5);

}

Step 4 : Adding memoization or tabulation for the state
This is the easiest part of a dynamic programming solution. We just need to store the state answer so that next time that state is required, we can directly use it from our memory

Adding memoization to the above code

// initialize to -1

int dp[MAXN];

// this function returns the number of 

// arrangements to form 'n' 

int solve(int n)

{ 

  // base case

  if (n < 0)  

      return 0;

  if (n == 0)  

      return 1;

  // checking if already calculated

  if (dp[n]!=-1) 

      return dp[n];

  // storing the result and returning

  return dp[n] = solve(n-1) + solve(n-3) + solve(n-5);

}

http://www.geeksforgeeks.org/tabulation-vs-memoizatation/