Puzzles, Maths and Algorithms: String Permutations
Printing all permutations of a string is a very common interview question. We'll discuss this problem and some interesting variations of it. The original problem of string permutation says, "print all permutations of a string". As an example, if the string is "abc" there are 6 permutations {abc, acb, bac, bca, cab, cba}. Assume that string contains no duplicate elements.
Above code performs the swapping operation twice. First time to generate all possible permutations of character at that level and second time to restore to the original string. It is important to restore to the original string (as passed to this call), otherwise the algorithm ends up printing redundant permutations. Additionally, this code assumes that there are no duplicates in the string, else it fails.
def _permute_unique_chars(String, Loc):
# get string length
l = len(String)
# reached the end of the String, nothing more to permute
if l == Loc:
global count
count += 1
print count, ''.join(String)
return
# get charater at position = Loc
c = String[Loc]
# permute the character at position = Loc with characters at position > Loc
for i in xrange(Loc, l):
# swap character at Loc and i
String[Loc] = String[i]
String[i] = c
# recursively permute
_permute_unique_chars(String, Loc+1)
# undo the swap at depth and i
String[i] = String[Loc]
String[Loc] = c
Lets us assume that string can contain duplicates. How do we print all non-redundant permutations of the string. For ex. If string is "abab" the permutations are {baab, abba, abab, baba, bbaa, aabb}
Clearly the solution to first problem fails here. This is due to fact that we generate all permutations of the string after the swapping step. Same character could get swapped to dth spot again, leading to redundant permutations. A very simple trick solves the game. The trick is to pre-sort the string on alphabetical order. Now we need to ensure that while we are swapping, to generate permutation for that particular character, if that matches the last swapped character we ignore generating permutations. Effectively, we are doing a cyclic shift and before the end of the call we reset the cyclic shift. The following code implements the same:
function permute(str, d)
def _permute_chars(String, Loc):
# get string length
l = len(String)
# reached the end of the String, nothing more to permute
if l == Loc:
global count
count += 1
print count, ''.join(String)
return
# get charater to be swapped
c = None
# do a cyclic rotation of String by 1,
# Intuition behind cyclic shift is that it leaves the subarray (Loc+1 to l) sorted
# a simple swap used in _permute_unique_chars does not ensure that
for i in xrange(Loc, l):
# same character, avoid duplicate permutations
if c == String[i]: continue
# swap character at Loc and i
c = String[i]
String[i] = String[Loc]
String[Loc] = c
# recursively permute
_permute_chars(String, Loc+1)
# after the end of above loop, string is cyclically shifted by 1, undo this shift
for i in xrange(Loc,l-1):
String[i] = String[i+1]
String[l-1] = c
def permute_chars(perm_str):
# set global counter to 0
global count
count = 0
# make a array from string (as string in non-mutable)
string = [perm_str[i] for i in xrange(0,len(perm_str))]
# sort string
string = sorted(string)
# permute string array
print "Permuting duplicated chars:", perm_str
_permute_chars(string, 0)
Some other constraints could be added to make the problem more interesting. One such constrain could be of partial ordering of characters. For Ex. 'a' should appear before 'c', and 'b' should appear before 'c' in all permutations. There can exist ordering that leads to no permutation of the string such as 'a' precedes 'b', 'b' precedes 'c', 'c' precedes 'a'. Additionally implementing the permutation algorithm in these cases require checking for ordering violation at each permutation level or just at the leaf of recursion, depending on whether checking for ordering violations is more costly or generating all permutations.
An interesting problem, that talks of special kind of ordering and can be very efficiently computed is as follows: Assume that the string is made up of equal number of '{' and '}'. You need to print all permutations of this string which could satisfy a C compiler i.e. an ending bracket is balanced by an opening bracket. For ex. for '{{}}' the permutations are [{{}}, {}{}].
The hard question is how many such permutations exist. For N open and N close brackets, the number of such permutations is Nth Catalan number = 1/(n+1) * factorial(2*N)/factorial(N)^2. The solution is derived from the problem: Number of paths in rectangular grid.
Permutation problems can be easily solved using recursion. The trick is to formulate all the possible cases at a particular level and code according to the formulation. The steps used below illustrate this approach. Let's formulate the problem in general sense. Assume that at a given step of recursion, we have n open brackets and m closed brackets that are not used so far. Now we can formuate each and every case for a given (n,m) and code the possible substeps.
Constraints:
function permute(str, n, m)
function GenerateCStyleBrackets(N)
Read full article from Puzzles, Maths and Algorithms: String Permutations
Printing all permutations of a string is a very common interview question. We'll discuss this problem and some interesting variations of it. The original problem of string permutation says, "print all permutations of a string". As an example, if the string is "abc" there are 6 permutations {abc, acb, bac, bca, cab, cba}. Assume that string contains no duplicate elements.
function permute(str, d)
if d == length(str)
print str
else
for i <- d to length(str)
swap(str[d] <-> str[i]) // swap character for permutation
permute(str, d + 1)
swap(str[d] <-> str[i]) // undo swap for parent call
Above code performs the swapping operation twice. First time to generate all possible permutations of character at that level and second time to restore to the original string. It is important to restore to the original string (as passed to this call), otherwise the algorithm ends up printing redundant permutations. Additionally, this code assumes that there are no duplicates in the string, else it fails.
def _permute_unique_chars(String, Loc):
# get string length
l = len(String)
# reached the end of the String, nothing more to permute
if l == Loc:
global count
count += 1
print count, ''.join(String)
return
# get charater at position = Loc
c = String[Loc]
# permute the character at position = Loc with characters at position > Loc
for i in xrange(Loc, l):
# swap character at Loc and i
String[Loc] = String[i]
String[i] = c
# recursively permute
_permute_unique_chars(String, Loc+1)
# undo the swap at depth and i
String[i] = String[Loc]
String[Loc] = c
Lets us assume that string can contain duplicates. How do we print all non-redundant permutations of the string. For ex. If string is "abab" the permutations are {baab, abba, abab, baba, bbaa, aabb}
Clearly the solution to first problem fails here. This is due to fact that we generate all permutations of the string after the swapping step. Same character could get swapped to dth spot again, leading to redundant permutations. A very simple trick solves the game. The trick is to pre-sort the string on alphabetical order. Now we need to ensure that while we are swapping, to generate permutation for that particular character, if that matches the last swapped character we ignore generating permutations. Effectively, we are doing a cyclic shift and before the end of the call we reset the cyclic shift. The following code implements the same:
function permute(str, d)
if d == length(str)
print str
else
lastSwap = Nil
for i <- d to length(str)
if lastSwap == str[i]
continue
else
lastSwap = str[i]
swap(str[d] <-> str[i]) // swap character for permutation
permute(str, d + 1)
for i <- d to length(str)-1
str[i] = str[i+1]
str[length(str)] = last
If the string is pre-sorted alphabetically then the above algorithm works magically. Note that in this case, after permute is called, we do not undo the swap. This ensures that we are cyclic shifting and hence the intermediate str from d to l is also sorted. After the for loop call, we reverse the cyclic shift. The reason is that sorting ensures that if there are duplicates then the second occurrence of the duplicate element is picked immediately after we finish generating permutations for the first occurrence of the duplicate element. If sorting is not allowed, then you need to maintain a local hashtable (local to each function call), that maintains characters swapped to dth spot and ensures that duplicate candidates are discarded. Presort and make life simple.# get string length
l = len(String)
# reached the end of the String, nothing more to permute
if l == Loc:
global count
count += 1
print count, ''.join(String)
return
# get charater to be swapped
c = None
# do a cyclic rotation of String by 1,
# Intuition behind cyclic shift is that it leaves the subarray (Loc+1 to l) sorted
# a simple swap used in _permute_unique_chars does not ensure that
for i in xrange(Loc, l):
# same character, avoid duplicate permutations
if c == String[i]: continue
# swap character at Loc and i
c = String[i]
String[i] = String[Loc]
String[Loc] = c
# recursively permute
_permute_chars(String, Loc+1)
# after the end of above loop, string is cyclically shifted by 1, undo this shift
for i in xrange(Loc,l-1):
String[i] = String[i+1]
String[l-1] = c
def permute_chars(perm_str):
# set global counter to 0
global count
count = 0
# make a array from string (as string in non-mutable)
string = [perm_str[i] for i in xrange(0,len(perm_str))]
# sort string
string = sorted(string)
# permute string array
print "Permuting duplicated chars:", perm_str
_permute_chars(string, 0)
Some other constraints could be added to make the problem more interesting. One such constrain could be of partial ordering of characters. For Ex. 'a' should appear before 'c', and 'b' should appear before 'c' in all permutations. There can exist ordering that leads to no permutation of the string such as 'a' precedes 'b', 'b' precedes 'c', 'c' precedes 'a'. Additionally implementing the permutation algorithm in these cases require checking for ordering violation at each permutation level or just at the leaf of recursion, depending on whether checking for ordering violations is more costly or generating all permutations.
An interesting problem, that talks of special kind of ordering and can be very efficiently computed is as follows: Assume that the string is made up of equal number of '{' and '}'. You need to print all permutations of this string which could satisfy a C compiler i.e. an ending bracket is balanced by an opening bracket. For ex. for '{{}}' the permutations are [{{}}, {}{}].
Permutation problems can be easily solved using recursion. The trick is to formulate all the possible cases at a particular level and code according to the formulation. The steps used below illustrate this approach. Let's formulate the problem in general sense. Assume that at a given step of recursion, we have n open brackets and m closed brackets that are not used so far. Now we can formuate each and every case for a given (n,m) and code the possible substeps.
Problem Formulation: (n,m) ---> NOT POSSIBLE, if m < n // we are not actually using this in the code? (n,m) ---> PRINT AND DONE, if m = 0 and n = 0 (n,m) ---> (n-1, m) if n > 0 + (n, m-1) if m > 0 and m > nThe above formulation is generic and takes into account all possible input values of (n,m). If we put following constraint on the initial value of n and m such as following, then actual implementation can omit some of the above cases:
Constraints:
n > 0 m == nThe actual implementation that follows above constraints and problem formuation:
function permute(str, n, m)
if m == 0 print str else if n > 0 permute (str + '{', n - 1, m); if m > n permute (str + '}', n, m - 1);To enforce the above mentioned constraints, permute function should be kept as inner/private function and the actual function that should be exposed is as follows:
function GenerateCStyleBrackets(N)
if N <= 0 return str = new String[2 * N] permute(str, N, N) // str is passed as pointer.
def _print_brackets(N, M, String, Loc):
if N == 0 and M == 0:
global count
count += 1
print count, ''.join(String)
return
if N > 0:
String[Loc] = '{'
_print_brackets(N-1, M, String, Loc+1)
if M > N:
String[Loc] = '}'
_print_brackets(N, M-1, String, Loc+1)
def print_brackets(N):
global count
count = 0
String = ['' for i in xrange(0,N+N)]
_print_brackets(N, N, String, 0)
The memory cost is O(N). The maximum stack depth is also O(N). So total space cost is O(N).Read full article from Puzzles, Maths and Algorithms: String Permutations