Massive Algorithms: Bitmasking and Dynamic Programming

Bitmasking and Dynamic Programming | Set-2 (TSP)

https://blog.csdn.net/joekwok/article/details/4749713

given a list of N cities, as well as the distances between each pair of cities, what is the shortest possible route that visits each city exactly once and returns back to the city that you started from? (Note: we can start from any city as long as we return to it)

TSP问题（旅行商问题）是指旅行家要旅行n个城市，要求各个城市经历且仅经历一次然后回到出发城市，并要求所走的路程最短。

现在要从城市0出发，最后又回到0，期间1，2，3都必须并且只能经过一次，使代价最小。

3.推导动态规划方程

假设从顶点s出发，令d(i, V’)表示从顶点i出发经过V’(是一个点的集合)中各个顶点一次且仅一次，最后回到出发点s的最短路径长度。

推导：(分情况来讨论)

①当V’为空集，那么d(i, V’)，表示从i不经过任何点就回到s了，如上图的城市3->城市0(0为起点城市)。此时d(i, V’)=Cis(就是城市i 到城市s 的距离)、

②如果V’不为空，那么就是对子问题的最优求解。你必须在V’这个城市集合中，尝试每一个，并求出最优解。

d(i, V’)=min{Cik + d(k, V’-{k})}

注：Cik表示你选择的城市和城市i的距离，d(k, V’-{k})是一个子问题。

综上所述，TSP问题的动态规划方程就出来了：

现在对问题定义中的例子来说明TSP的求解过程。(假设出发城市是 0城市)

①我们要求的最终结果是d(0,{1,2,3}),它表示，从城市0开始，经过{1,2,3}之中的城市并且只有一次，求出最短路径.

②d(0,{1,2,3})是不能一下子求出来的，那么他的值是怎么得出的呢？看上图的第二层，第二层表明了d(0,{1,2,3})所需依赖的值。那么得出：

d(0,{1,2,3})=min {

C01+d(1,{2,3})

C02+d{2,{1,3}}

C03+d{3,{1,2}}

}

③d(1,{2,3})，d(2,{1,3})，d(3,{1,2})同样也不是一步就能求出来的，它们的解一样需要有依赖，就比如说d(1,{2,3})

d(1,{2,3})=min{

C12+d(2,{3})

C13+d(3,{2})

}

d(2,{1,3})，d(3,{1,2})同样需要这么求。

https://jerkwin.github.io/2016/03/17/旅行推销商问题TSP的动态规划解法/

d[i,j]=⎧⎩⎨⎪⎪Ci0,min{k}∈j{Cik+d[k,j∖{k}]}=min{k}∈j{Cik+d[k,j−2k−1]},j=0j≠0

private double tsp(int i, int state, Double[][] memo, Integer[][] prev) {

// Done this tour. Return cost of going back to start node.
if (state == FINISHED_STATE) return distance[i][START_NODE];

// Return cached answer if already computed.
if (memo[i][state] != null) return memo[i][state];

double minCost = Double.POSITIVE_INFINITY;
int index = -1;
for (int next = 0; next < N; next++) {

// Skip if the next node has already been visited.
if ((state & (1 << next)) != 0) continue;

int nextState = state | (1 << next);
double newCost = distance[i][next] + tsp(next, nextState, memo, prev);
if (newCost < minCost) {
minCost = newCost;
index = next;
}
}

prev[i][state] = index;
return memo[i][state] = minCost;
}

https://www.geeksforgeeks.org/bitmasking-dynamic-programming-set-2-tsp/

Given a 2D grid of characters representing 
a town where '*' represents the 
houses, '#' represents the blockage, 
'.' represents the vacant street 
area. Currently you are (0, 0) position.

Our task is to determine the minimum distance 
to be moved to visit all the houses and return
to our initial position at (0, 0). You can 
only move to adjacent cells that share exactly
1 edge with the current cell.

The above problem is the well-known Travelling Salesman Problem.

The first part is to calculate the minimum distance between the two cells. We can do it by simply using a BFS as all the distances are unit distance. To optimize our solution we will be pre-calculating the distances taking the initial location and the location of the houses as the source point for our BFS.

Each BFS traversal takes O(size of grid) time. Therefore, it is O(X * size_of_grid) for overall pre-calculation, where X = number of houses + 1 (initial position)

Now lets’s think of a DP state

So we will be needing to track the visited houses and the last visited house to uniquely identify a state in this problem.

Therefore, we will be taking dp[index][mask] as our DP state.

Here,
index : tells us the location of current house
mask : tells us the houses that are visited ( if ith bit is set in mask then this means that the ith dirty tile is cleaned)

Whereas dp[index][mask] will tell us the minimum distance to visit X(number of set bits in mask) houses corresponding to their order of their occurrence in the mask where the last visited house is house at location index.

State transition relation

So our initial state will be dp[0][0] this tells that we are currently at initial tile that is our initial location and mask is 0 that states that no house is visited till now.

And our final destination state will be dp[any index][LIMIT_MASK], here LIMIT_MASK = (1<<N) – 1
and N = number of houses.

Therefore our DP state transition can be stated as

dp(curr_idx)(curr_mask) = min{
    for idx : off_bits_in_curr_mask
       dp(idx)(cur_mask.set_bit(idx)) + dist[curr_idx][idx]
}

The above relation can be visualized as the minimum distance to visit all the houses by standing at curr_idx house and by already visiting cur_mask houses is equal to min of distance between the curr_idx house and idx house + minimum distance to visit all the houses by standing at idx house and by already
visiting ( cur_mask | (1 <<idx) ) houses.

So, here we iterate over all possible idx values such that cur_mask has ith bit as 0 that tells us that ith house is not visited.

Whenever we have our mask = LIMIT_MASK, this means that we have visited all the houses in the town. So, we will add the distance from the last visited town (i.e the town at cur_idx positon) to the initial position (0, 0).

We have used the initial state to be dp[0][1] because we have pushed the start location at the first position in the container of houses. Hence, our Bit Mask will be 1 as the 0th bit is set i.e we have visited the starting location for our trip.

Time Complexity:

Consider the number of houses to be n. So, there are n * (2n) states and at every state, we are looping over n houses to transit over to next state and because of memoization we are doing this looping transition only once for each state. Therefore, our Time Complexity is O(n2 * 2n).

// stores distance taking souce 

// as every dirty tile 

int dist[MAXR][MAXC][MAXHOUSE]; 

// memoization for dp states 

int dp[MAXHOUSE][MAXMASK]; 

// stores coordinates for 

// dirty tiles 

vector < pair < int, int > > dirty; 

// Directions 

int X[] = {-1, 0, 0, 1}; 

int Y[] = {0, 1, -1, 0}; 

char arr[21][21]; 

// len : number of dirty tiles + 1 

// limit : 2 ^ len -1 

// r, c : number of rows and columns 

int len, limit, r, c; 

// Returns true if current position 

// is safe to visit 

// else returns false 

// Time Complexity : O(1) 

bool safe(int x, int y) 

{ 

    if (x >= r or y>= c or x<0 or y<0) 

       return false; 

    if (arr[x][y] == '#') 

       return false; 

    return true; 

} 

// runs BFS traversal at tile idx 

// calulates distance to every cell 

// in the grid 

// Time Complexity : O(r*c) 

void getDist(int idx){ 

    // visited array to track visited cells 

    bool vis[21][21]; 

    memset(vis, false, sizeof(vis)); 

    // getting current positon 

    int cx = dirty[idx].first; 

    int cy = dirty[idx].second; 

    // initializing queue for bfs 

    queue < pair < int, int > > pq; 

    pq.push({cx, cy}); 

    // initializing the dist to max 

    // because some cells cannot be visited 

    // by taking source cell as idx 

    for (int i = 0;i<= r;i++) 

        for (int j = 0;j<= c;j++) 

            dist[i][j][idx] = INF; 

    // base conditions 

    vis[cx][cy] = true; 

    dist[cx][cy][idx] = 0; 

    while (! pq.empty()) 

    { 

        auto x = pq.front(); 

        pq.pop(); 

        for (int i = 0;i<4;i++) 

        { 

           cx = x.first + X[i]; 

           cy = x.second + Y[i]; 

           if (safe(cx, cy)) 

           { 

               if (vis[cx][cy]) 

                   continue; 

               vis[cx][cy] = true; 

               dist[cx][cy][idx] = dist[x.first][x.second][idx] + 1; 

               pq.push({cx, cy}); 

            } 

         } 

    } 

} 

// Dynamic Programming state transition recursion 

// with memoization. Time Complexity: O(n*n*2 ^ n) 

int solve(int idx, int mask) 

{ 

    // goal state 

    if (mask == limit) 

       return dist[0][0][idx]; 

    // if already visited state 

    if (dp[idx][mask] != -1) 

       return dp[idx][mask]; 

    int ret = INT_MAX; 

    // state transiton relation 

    for (int i = 0;i<len;i++) 

    { 

        if ((mask & (1<<i)) == 0) 

        { 

            int newMask = mask | (1<<i); 

            ret = min( ret, solve(i, newMask) 

                + dist[dirty[i].first][dirty[i].second][idx]); 

        } 

    } 

    // adding memoization and returning 

    return dp[idx][mask] = ret; 

} 

void init() 

{ 

    // initializing containers 

    memset(dp, -1, sizeof(dp)); 

    dirty.clear(); 

    // populating dirty tile positions 

    for (int i = 0;i<r;i++) 

        for (int j = 0;j<c;j++) 

        { 

            if (arr[i][j] == '*') 

               dirty.push_back({i, j}); 

        } 

    // inserting ronot's location at the 

    // begining of the dirty tile 

    dirty.insert(dirty.begin(), {0, 0}); 

    len = dirty.size(); 

    // calculating LIMIT_MASK 

    limit = (1<<len) - 1; 

    // precalculating distances from all 

    // dirty tiles to each cell in the grid 

    for (int i = 0;i<len;i++) 

       getDist(i); 

}

https://github.com/cormacpayne/Data-Structures-and-Algorithms/blob/master/dynamic-programming/bitmask-dynamic-programming/bitmask-dynamic-programming.md

Take I - Permutations

We can create a permutation of the N cities to tell us what order we should visit the cities.

For example, if we have N = 4, we could have the permutation {3, 1, 4, 2}, which tells us to visit the cities in the following order: 3 --> 1 --> 4 --> 2 --> 3

For a given N, there are N! possible permutations (N choices for the first city, N-1 choices for the second city, N-2 choices for the third city, etc.)

However, if N > 12, this algorithm will be too slow, so we need to shoot for something better

Take II - Bitmask DP

Let's use bitmasks to help us with this problem: we'll construct a bitmask of length N.

What does the bit at index k represent in the scope of the problem? What does it mean for the bit to be zero? What about one?

if the bit at index k is zero, then we have not yet visited city k
if the bit at index k is one, then we have visited city k

For a given bitmask, we can see which cities we have left to travel to by checking which bits have a value of zero.

We can construct a recursive function to find the answer to this problem:

// The following function will calculate the shortest route that visits all
// unvisited cities in the bitmask and goes back to the starting city
public double solve( int bitmask )
{
    if ( dp[ bitmask ] != -1 )
        return dp[ bitmask ];
    // Handle the usual case
}

What are some base cases for this problem? How do we know when we are done?

What does the bitmask look like when we have visited every city?

111...111

In this case, we need to return to our starting city, so we'll just return the distance from our current location to the start city, but what is our current location?

Imagine we currently have the bitmask 11111.

This bitmask tells us that we have visited all N = 5 cities and can return home.

However, this bitmasks does not tell us which city we are currently at, so we don't know which distance to return.

This means that in addition to the bitmask, we will need to keep track of the current location (anytime we travel to a city, that city becomes the current location).

// The following function will calculate the shortest route that visits all unvisited
// cities in the bitmask starting from pos, and then goes back to the starting city
public double solve( int bitmask, int pos )
{
    if ( dp[ bitmask ][ pos ] != -1 )
        return dp[ bitmask ][ pos ];
    // Handle the usual case
}

If we know that solve( bitmask, pos ) will give us thge answer to the traveling salesman problem, and we say that we start at city 0, what should our initial function call be?

What is the value of bitmask?

1 (we visited city 0, so our starting bitmask is 000...0001)

What is the value of pos?

So how do we handle the usual case?

We have a bitmask that tells us which cities we have left to visit, and we know our current position, so which city do we go to next?

whichever one will minimize the remaining route

To see which city will minimize the value we want, we need to try all possible remaining cities and see how it affects the future routes.

What does trying to visit a city look like?

Let's say that we are visiting some city at index k:

int newBitmask = bitmask | ( 1 << k );
double newDist = dist[ pos ][ k ] + solve( newBitmask, k );

We need to update our bitmask, marking city k as visited, and then we add the distance from our current loication to city k, and then solve the subproblem where we find the shortest route starting at city k and solving for the rest of the cities.

If we do this for each city that hasn't been visited yet, we want to take the minimum newDist that we find.

// The following function will calculate the shortest route that visits all unvisited
// cities in the bitmask starting from pos, and then goes back to the starting city
public double solve( int bitmask, int pos )
{
    // If we have solved this subproblem previously, return the result that was recorded
    if ( dp[ bitmask ][ pos ] != -1 )
        return dp[ bitmask ][ pos ];
        
    // If the bitmask is all 1s, we need to return home    
    if ( bitmask == ( 1 << N ) - 1 )
        return dist[ pos ][ 0 ];
       
    // Keep track of the minimum distance we have seen when visiting other cities
    double minDistance = 2000000000.0;
    
    // For each city we haven't visited, we are going to try the subproblem that arises from visiting it
    for ( int k = 0; k < N; k++ )
    {
        int res = bitmask & ( 1 << k );
        
        // If we haven't visited the city before, try and visit it
        if ( res == 0 )
        {
            int newBitmask = bitmask | ( 1 << k );
            
            // Get the distance from solving the subproblems
            double newDist = dist[ pos ][ k ] + solve( newBitmask, k );
            
            // If newDist is smaller than the current minimum distance, we will override it here
            minDistance = Math.min( minDistance, newDist );
        }
    }
    
    // Set the optimal value of the current subproblem
    return dp[ bitmask ][ pos ] = minDistance;
}

http://my-zhang.github.io/blog/2014/04/20/dynamic-programming-with-bitmask/

TSP is also a classic NP-Complete problem. Brute force method has the complexity of

O (n!)

, while DP method can reduce it to

O (n^{2} 2^{n})

. The optimal substructure of the problem is,

\begin{aligned} D_{S, v} = min_{u \in S - {v}} (D_{S - {v}, u} + c o s t (u, v)) \end{aligned}

where the

D_{S, v}

denotes the length of the optimal path that visits every node in

S

exactly once and ends at

v

Thus, there’re

n 2^{n}

subproblems. The answer to the problem is

min_{v \in V} (D_{V, v})

, where

V

is the complete set of nodes.

Reference: slides from Standford CS97SI.

http://luoshaochuan.com/2017/07/15/%E5%8A%A8%E6%80%81%E8%A7%84%E5%88%92%E6%80%BB%E7%BB%93(%E5%85%AD%EF%BC%89/

令S表示还未经过的点集，v表示当前所在点，则初始状态为dp[V][0]=0(V为全集)，终止状态为dp[0][0],状态转移公式如下：

1	dp[S][v]=min(dp[S\|(1<<u)][u]+d(u,v))

记忆化搜索


int n;

int d[maxn][maxn];

int dp[1<<maxn][maxn];



int rec(int S,int v)

{

    if(dp[S][v]>=0)return dp[S][v];

    if(S==(1<<n)-1 && v==0) return dp[S][v]=0;

    int res = INF;

    for(int u=0;u<n;++u)

    {

    if(!(S>>u&1))

        res = min(res,rec(S|(1<<u),u)+d[u][v]);

    }

    return dp[S][v]=res;

}



void solve()

{

    memset(dp,-1,sizeof(dp));

    printf("%d\n",rec(0,0));

}

动态规划


int n;

int d[maxn][maxn];

int dp[maxn][maxn];



void solve()

{

    memset(dp,0x3f,sizeof(dp));

    dp[(1<<n)-1][0]=0;

    for(int S=(1<<n)-2;S>=0;--S)

    {

        for(int v=0;v<n;++v)

        {

            for(int u=0;u<n;++u)

            {

                if(!((S>>u)&1))

                dp[S][v]=min(dp[S][v],dp[S|(1<<u)][u]+d[u][v]);

            }

        }

    }

    printf("%d\n",dp[0][0]);

}

Bitmasking and Dynamic Programming | Set-2 (TSP)

Take I - Permutations

Take II - Bitmask DP

记忆化搜索

动态规划

Labels

Popular Posts