Given a sorted array keys[0.. n-1] of search keys and an array freq[0.. n-1] of frequency counts, where freq[i] is the number of searches to keys[i]. Construct a binary search tree of all keys such that the total cost of all the searches is as small as possible.
Read full article from Dynamic Programming | Set 24 (Optimal Binary Search Tree) | GeeksforGeeks
1) Optimal Substructure:
The optimal cost for freq[i..j] can be recursively calculated using following formula.
The optimal cost for freq[i..j] can be recursively calculated using following formula.
We need to calculate optCost(0, n-1) to find the result.
The idea of above formula is simple, we one by one try all nodes as root (r varies from i to j in second term). When we make rth node as root, we recursively calculate optimal cost from i to r-1 and r+1 to j.
We add sum of frequencies from i to j (see first term in the above formula), this is added because every search will go through root and one comparison will be done for every search.
We use an auxiliary array cost[n][n] to store the solutions of subproblems. cost[0][n-1] will hold the final result. The challenge in implementation is, all diagonal values must be filled first, then the values which lie on the line just above the diagonal. In other words, we must first fill all cost[i][i] values, then all cost[i][i+1] values, then all cost[i][i+2] values. So how to fill the 2D array in such manner> The idea used in the implementation is same as Matrix Chain Multiplication problem, we use a variable ‘L’ for chain length and increment ‘L’, one by one. We calculate column number ‘j’ using the values of ‘i’ and ‘L’.
We add sum of frequencies from i to j (see first term in the above formula), this is added because every search will go through root and one comparison will be done for every search.
We use an auxiliary array cost[n][n] to store the solutions of subproblems. cost[0][n-1] will hold the final result. The challenge in implementation is, all diagonal values must be filled first, then the values which lie on the line just above the diagonal. In other words, we must first fill all cost[i][i] values, then all cost[i][i+1] values, then all cost[i][i+2] values. So how to fill the 2D array in such manner> The idea used in the implementation is same as Matrix Chain Multiplication problem, we use a variable ‘L’ for chain length and increment ‘L’, one by one. We calculate column number ‘j’ using the values of ‘i’ and ‘L’.
int
optimalSearchTree(
int
keys[],
int
freq[],
int
n)
{
/* Create an auxiliary 2D matrix to store results of subproblems */
int
cost[n][n];
/* cost[i][j] = Optimal cost of binary search tree that can be
formed from keys[i] to keys[j].
cost[0][n-1] will store the resultant cost */
// For a single key, cost is equal to frequency of the key
for
(
int
i = 0; i < n; i++)
cost[i][i] = freq[i];
// Now we need to consider chains of length 2, 3, ... .
// L is chain length.
for
(
int
L=2; L<=n; L++)
{
// i is row number in cost[][]
for
(
int
i=0; i<=n-L+1; i++)
{
// Get column number j from row number i and chain length L
int
j = i+L-1;
cost[i][j] = INT_MAX;
// Try making all keys in interval keys[i..j] as root
for
(
int
r=i; r<=j; r++)
{
// c = cost when keys[r] becomes root of this subtree
int
c = ((r > i)? cost[i][r-1]:0) +
((r < j)? cost[r+1][j]:0) +
sum(freq, i, j);
if
(c < cost[i][j])
cost[i][j] = c;
}
}
}
return
cost[0][n-1];
}
int
optCost(
int
freq[],
int
i,
int
j)
{
// Base cases
if
(j < i)
// If there are no elements in this subarray
return
0;
if
(j == i)
// If there is one element in this subarray
return
freq[i];
// Get sum of freq[i], freq[i+1], ... freq[j]
int
fsum = sum(freq, i, j);
// Initialize minimum value
int
min = INT_MAX;
// One by one consider all elements as root and recursively find cost
// of the BST, compare the cost with min and update min if needed
for
(
int
r = i; r <= j; ++r)
{
int
cost = optCost(freq, i, r-1) + optCost(freq, r+1, j);
if
(cost < min)
min = cost;
}
// Return minimum value
return
min + fsum;
}
// The main function that calculates minimum cost of a Binary Search Tree.
// It mainly uses optCost() to find the optimal cost.
int
optimalSearchTree(
int
keys[],
int
freq[],
int
n)
{
// Here array keys[] is assumed to be sorted in increasing order.
// If keys[] is not sorted, then add code to sort keys, and rearrange
// freq[] accordingly.
return
optCost(freq, 0, n-1);
}