A chess knight moves in an L shape: two squares in one direction, then one square perpendicular to it.
This time, we place our chess knight on any numbered key of a standard phone pad, and the knight makes N - 1 hops. Each hop must be from one key to another numbered key.
Each time it lands on a key (including the initial placement of the knight), it presses the number of that key, pressing N digits total.
How many distinct numbers can you dial in this manner?
Since the answer may be large, output the answer modulo 10^9 + 7.
Let f(start, n) be the number of ways to dial an n-digit number, where the knight starts at square start. We can create a recursion, writing this in terms of f(x, n-1)'s.
Algorithm
By hand or otherwise, have a way to query what moves are available at each square. This implies the exact recursion for f. For example, from 1 we can move to 6, 8, so f(1, n) = f(6, n-1) + f(8, n-1).
Afterwards, keep track of dp[start] = f(start, n), updating it for each n = 1, 2, ..., N.
At the end, the answer is f(0, N) + f(1, N) + ... + f(9, N) = sum(dp).
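A minimal bottom-up sketch of this in Python (the function and variable names here are my own; the moves table matches the one used in the solutions below):

MOD = 10**9 + 7
# MOVES[k] lists the keys a knight can reach from key k on a phone pad
MOVES = [[4, 6], [6, 8], [7, 9], [4, 8], [3, 9, 0], [],
         [1, 7, 0], [2, 6], [1, 3], [2, 4]]

def knight_dialer(N):
    dp = [1] * 10  # f(start, 1): a single digit is one valid number
    for _ in range(N - 1):  # apply the recursion N - 1 times
        dp = [sum(dp[nei] for nei in MOVES[k]) % MOD for k in range(10)]
    return sum(dp) % MOD

For N = 1 this returns 10; for N = 2 it returns 20, matching a hand count of the available knight moves.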
The problem can be transformed into:
Traverse a directed graph (each node is labeled with a digit, and the edges are defined by the knight's move rule)
Start from any of the keys 0 to 9
Move N - 1 steps
Return how many distinct paths there are
It's easy to come up with a DFS solution that starts a traversal from each of 0 to 9.
In each recursion, move to one of the current node's neighbors, and the remaining step count becomes N - 1.
Stop the recursion when N == 0.
Optimization:
Observe the recursive problem. The variables are:
Current node
Remaining steps
Therefore, we can key a memo on these two variables to speed up the DFS (making it a top-down DP).
public static final int MOD = 1000000007;

public int knightDialer(int N) {
    // graph[i] lists the keys a knight can jump to from key i
    int[][] graph = new int[][]{{4,6},{6,8},{7,9},{4,8},{3,9,0},{},{1,7,0},{2,6},{1,3},{2,4}};
    int cnt = 0;
    // memo[n][key]: number of n-hop sequences starting from key
    Integer[][] memo = new Integer[N+1][10];
    for (int i = 0; i <= 9; i++)
        cnt = (cnt + helper(N-1, i, graph, memo)) % MOD;
    return cnt;
}

private int helper(int N, int cur, int[][] graph, Integer[][] memo) {
    if (N == 0)
        return 1;
    if (memo[N][cur] != null)
        return memo[N][cur];
    int cnt = 0;
    for (int nei : graph[cur])
        cnt = (cnt + helper(N-1, nei, graph, memo)) % MOD;
    memo[N][cur] = cnt;
    return cnt;
}
def knightDialer(self, N):
    d = {0: [4, 6], 1: [6, 8], 2: [7, 9], 3: [4, 8], 4: [0, 3, 9],
         5: [], 6: [0, 1, 7], 7: [2, 6], 8: [1, 3], 9: [2, 4]}
    dp = [[0] * 10 for _ in range(N)]
    mod = 10**9 + 7
    for i in range(10):
        dp[0][i] = 1
    for i in range(1, N):
        for j in range(10):
            for nex in d[j]:
                dp[i][j] += dp[i-1][nex]
                dp[i][j] %= mod
    return sum(dp[N-1]) % mod
According to the article, four possible solutions are (1) naive recursive number generation, (2) naive recursive counting, (3) recursion + memoization, and (4) dynamic programming. A candidate who coded either the memoization or DP solution is likely to receive a "strong hire" recommendation.
It has a number of solutions, each requiring varying degrees of algorithms and data structures knowledge. Also, a little bit of insight goes a long way.
Each solution can be implemented in relatively few lines, making it perfect for a time-constrained environment.
With that being said, your first action after hearing the question should be stepping up to the whiteboard and solving small instances of the problem by hand. Never dive right into code! Solving small instances lets you spot patterns and edge cases, and it helps crystallize a solution in your head. As an example, suppose you start on 6 and have two hops to make. Your sequences will be…
6–1–8
6–1–6
6–7–2
6–7–6
6–0–4
6–0–6
…for a total of six sequences. If you’re following along, try taking a pencil and paper and deriving these. This doesn’t translate well into a blog post, but trust me when I say there’s something magical about working out a problem by hand that leads to many more insights than just staring at it and thinking quietly.
As for the neighbors function here, given that it never changes you can simply create a map and return the appropriate value:
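Something like this works (a sketch; the exact names are mine, and the later snippets assume a neighbors() helper of this shape):

# Illustrative: each key maps to the keys a knight can reach from it.
NEIGHBORS_MAP = {
    0: (4, 6), 1: (6, 8), 2: (7, 9), 3: (4, 8), 4: (3, 9, 0),
    5: (), 6: (1, 7, 0), 7: (2, 6), 8: (1, 3), 9: (2, 4),
}

def neighbors(position):
    return NEIGHBORS_MAP[position]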
Anyway, on to the solution. Perhaps you’ve already noticed this problem can be solved by enumerating all possible numbers and counting them. You can use recursion to generate these values:
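One way that generator might look (a sketch; it relies on the neighbors() helper above, and sequence accumulates the keys pressed so far):

def yield_sequences(start_position, num_hops, sequence=None):
    # Recursively generate every dialable sequence, one complete number at a time.
    if sequence is None:
        sequence = [start_position]
    if num_hops == 0:
        yield sequence
        return
    for neighbor in neighbors(start_position):
        yield from yield_sequences(neighbor, num_hops - 1, sequence + [neighbor])

Counting is then just a matter of consuming the generator: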
def count_sequences(start_position, num_hops):
    num_sequences = 0
    for sequence in yield_sequences(start_position, num_hops):
        num_sequences += 1
    return num_sequences
This works, and it’s a common starting point I saw in interviews. Notice, however, that we generate the numbers and never actually use them. This problem asks for the count of numbers, not the numbers themselves. Once we count a number we never revisit it. As a general rule of thumb, I recommend paying attention to when your solution computes something it doesn’t use. Often you can cut it out and get a better solution.
Level 2: Counting Without Counting
How can we count phone numbers without generating them? It can be done, but not without an additional insight. Notice how the count of numbers that can be generated from a given starting position in N hops is equal to the sum of the counts of numbers that can be generated starting from each of its neighbors in N-1 hops. Stated mathematically as a recurrence relation, it looks like this:

C(position, 0) = 1
C(position, num_hops) = sum of C(neighbor, num_hops - 1) over all neighbors of position
This is intuitively obvious when you consider what happens with one hop: 6 has 3 neighbors (1, 7, and 0) and in zero hops you can reach one number for each, so you can only dial three numbers.
How does one arrive at this insight, you might ask? If you’ve studied recursion, this should become evident after some exploration on the whiteboard. Many candidates who’ve practiced recursion immediately notice this problem breaks down into smaller subproblems, which is a dead giveaway. If you’re in an interview with me and you can’t seem to arrive at this insight, I will usually give hints to help get you there, up to and including outright giving it away if prodding fails.
Once you have this insight in hand, you can already move forward and solve this problem again. There are a number of implementations that use this fact, but let’s start with the one I see most often in interviews: the naive recursive approach:
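A typical sketch (again assuming the illustrative neighbors() helper from earlier):

def count_sequences(start_position, num_hops):
    # Base case: with zero hops left, the current key completes exactly one number.
    if num_hops == 0:
        return 1
    num_sequences = 0
    for neighbor in neighbors(start_position):
        num_sequences += count_sequences(neighbor, num_hops - 1)
    return num_sequences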
This next question is one you’re going to be hearing a lot from me: what is the Big-O complexity of this solution? For those who don’t know, Big-O complexity is (informally) a sort of shorthand for the rate at which the amount of computation required by a solution grows as a function of the size of the input. For this problem, the size of the input is the number of hops. If you’re interested in the proper mathematical definition, it’s worth looking up the formal treatment of Big-O notation.
For this implementation, every call to count_sequences() recursively calls count_sequences() at least twice, because every key except 5 (which has no knight moves at all) has at least two neighbors. Since we recurse to a depth equal to the desired number of hops, and the number of calls to count_sequences() at least doubles with each level, we’re left with a runtime complexity of at least exponential time.
Level 3: Memoization
Can we do better? Using only the mathematical relation above and nothing else, not really. One of the reasons I love this problem, though, is it features layers of insight that yield more and more efficient solutions. To find the next one, let’s map out the function calls this function makes, taking count_sequences(6, 4) as an example. I use C as the function name for brevity, and expand the tree two levels using the neighbor map above:
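C(6, 4)
├── C(1, 3)
│   ├── C(6, 2) …
│   └── C(8, 2) …
├── C(7, 3)
│   ├── C(2, 2) …
│   └── C(6, 2) …
└── C(0, 3)
    ├── C(4, 2) …
    └── C(6, 2) …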
You may notice something peculiar: the C(6, 2) call appears three times, and each time it performs the same computation and returns the same value. The crucial insight is that these repeated calls are redundant work: once you’ve computed their result, there’s no need to recompute it.
If you’re wondering how you’re supposed to arrive at this, the easiest way is through good old-fashioned whiteboarding: staring at this problem statement in the abstract is nice, but I always encourage candidates to throw a sample solution up on the board. Solving out a problem like this and drawing the tree like I did above will see you writing the subtree for C(6, 2) multiple times, which you’re sure to notice. This is sometimes enough of an insight to allow candidates to bypass solutions 1 and 2 altogether and jump straight to this stage. Needless to say, that’s a huge time save in an interview where you only have 45 minutes to solve a problem.
Armed with this insight, how do we solve this problem? We can use memoization (not memorization), which basically means we record results of function calls we’ve seen before and use those instead of redoing the work. This way, when we encounter a place in the call graph where we would unnecessarily recompute an entire subtree, we instead immediately return the result we already computed. Here’s an implementation:
def count_sequences(start_position, num_hops):
    cache = {}

    def helper(position, num_hops):
        if (position, num_hops) in cache:
            return cache[(position, num_hops)]
        if num_hops == 0:
            return 1
        num_sequences = 0
        for neighbor in neighbors(position):
            num_sequences += helper(neighbor, num_hops - 1)
        cache[(position, num_hops)] = num_sequences
        return num_sequences

    return helper(start_position, num_hops)
Alright, what’s the runtime complexity (Big-O) now? That’s tougher to answer. For the previous implementation, computing the runtime was as simple as counting the number of times the recursive function was called, which was always two or three times per call. This time counting is more complicated because the recursive call is guarded by a conditional. On the face of it there’s no obvious way to count function calls.
We can solve this mystery by instead looking at the cache. Each function call’s result is stored in the cache, and it’s inserted there exactly once. This allows us to reframe the question as “how does the size of the cache grow with the size of the input?” Given that the cache is keyed by position and number of hops, and there are exactly ten positions, we can conclude that the cache grows in direct proportion to the number of requested hops. This follows from the pigeonhole principle: once we have an entry in the cache for every combination of position and jump count, all calls will hit the cache rather than result in a new function call.
So we’re done, right? Well, not quite. This solution has two drawbacks, one major(ish) and one minor. The major-ish drawback is that it’s recursive. Most languages place limits on the maximal size of their call stacks, which means there’s always a maximal number of hops an implementation can support. On my machine it fails after about 1000 hops. This is a major-ish limitation rather than a major one because any recursive function can be re-implemented in an iterative way, but still, it’s a hassle. As for the minor limitation, that actually leads us into the next solution…
Level 4: Dynamic Programming
The minor limitation of the recursive memoizing solution becomes clear when you look back at the recurrence relation above.
Notice that the results for N hops depend only on the results for calls with N-1 hops. Meanwhile, the cache contains entries for every (nonzero) number of hops. I call this a minor issue because it doesn’t actually cause any real problems, given that the cache grows only linearly with the number of hops. Requiring linear space isn’t the end of the world, but still, it’s inefficient.
What gives? Again, the issue is clear when you look at a written-out solution and the code. Notice that the code starts with the largest number of hops and recurses directly down to the smallest.
If you imagine the entire function call graph as a sort of virtual tree, you’ll quickly see we’re performing a depth-first traversal. This is fine, it gives a proper solution, but it doesn’t take advantage of the shallow dependency property I pointed out above.
Can you perform a breadth-first traversal instead, where you start at the top and “visit” function calls for N-1 hops only after you’ve visited those for N hops? Sadly, no. The values of function calls with nonzero hops absolutely require the values from smaller hop counts, so you won’t get any results until you reach the zero-hop layer and start returning numbers rather than additional function calls (note the zero-hop layer isn’t depicted in the call tree above).
You can however, reverse the order: visit layers with N hops only after you’ve visited layers with N-1 hops. Those of you who studied or are studying discrete mathematics will recognize all the necessary ingredients for an induction: we know that the values of zero-hop function calls are always one (the base case). We also know how to combine N-1 hop values to get N hop values, using the recurrence relation (the inductive step). We can start with a base case of zero hops and induce all values greater than zero. Here’s an implementation:
def count_sequences(start_position, num_hops):
    # prior_case[key] holds the counts for the previous hop count;
    # with zero hops, every key yields exactly one number.
    prior_case = [1] * 10
    current_case = [0] * 10
    current_num_hops = 1
    while current_num_hops <= num_hops:
        current_case = [0] * 10
        current_num_hops += 1
        for position in range(0, 10):
            for neighbor in neighbors(position):
                current_case[position] += prior_case[neighbor]
        prior_case = current_case
    # prior_case now holds the answer for num_hops hops
    # (and is still the correct base case when num_hops == 0).
    return prior_case[start_position]
So what’s better about this version than the recursive, depth-first solution? Not a ton, but it has a few benefits. First off, it’s not recursive, meaning it can run for extremely large values without crashing. Second off, it uses constant memory, because it only ever needs two arrays of fixed size rather than the ever-growing cache of the memoization solution. Finally, it’s still linear time: I can compute 200,000 hops in just under twenty seconds.
What about the other solutions, you might ask? Unfortunately it’s impossible to rate an abstract candidate. Interviews are chaotic things; they can start late, people can be nervous, and they often arrive at insights and solutions late in the session, leaving them with little time to code anything. There’s also a conversation happening: I pay attention to how well the candidate communicates their thoughts and incorporates ideas and feedback. I always take these factors into account before making a hire/no-hire recommendation, and you just can’t do that in the abstract.
Instead of potential recommendations, I’ll focus on the things I’d like to be able to say. When assessing algorithms and data structures, I want to say something like “TC (The Candidate) explored the problem and produced a solution that addressed all edge cases, and improved the solution when presented with its shortcomings. In the end, they arrived at an optimal solution.” I also want to be able to say “TC chose appropriate data structures for the solution, and correctly answered questions about the Big-O of their solution’s runtime and space requirements.”
When assessing coding, my ideal statement would be “TC quickly and concisely translated their ideas into code. The code uses standard language constructs and is easy to read. All edge cases are addressed, and TC walked through their code to debug it and verify it’s correct.” For entry-level roles I give bonus points if there’s some sort of testing, but more experienced roles I penalize candidates who don’t at least list relevant test cases.
As for speed of progress, I’d love to be able to say “TC drove the problem solving process: they developed most of their own solution, and were able to identify and address shortcomings without my pointing them out. TC required only minimal hints to get them moving in the right direction.”
Anyone I can say these things about gets a “Strong Hire” in my book. However, “Hire” and “Leaning Hire” are also positive endorsements. If you come up short in one area but shine in another then I can probably still justify a positive recommendation.
Always start by solving a small instance of the problem by hand. In this problem the recurrence relation and the repetition of function calls become much more obvious when you hand-solve a problem.
Pay attention to when your solution is computing things you don’t need, like how the naive counting solution generates the sequences but doesn’t actually use them. Reducing unnecessary computation can often provide simpler solutions, if not open the door to more efficient ones.
Know your recursion. It’s almost useless in most production code because it blasts through the stack, but it’s a very powerful algorithm design tactic. Recursive solutions can often be adapted and improved: the difference between the exponential time naive solution and the linear time nearly-optimal memoization solution is minimal.
Know your Big-O analysis! You’re practically guaranteed to be asked this at some point during the interview process.
Always be on the lookout for opportunities to memoize. If your function is deterministic and you’ll be calling it multiple times with the same inputs, your solution may benefit from memoization (see the sketch after this list).
Find and write out the recurrence relation. In this case writing it out makes it obvious that counts for N hops depend only on counts for N-1 hops.
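To make the memoization point concrete: in Python, functools.lru_cache gives you memoization nearly for free. A sketch, again assuming the illustrative neighbors() helper from earlier:

from functools import lru_cache

@lru_cache(maxsize=None)  # cache results keyed by (start_position, num_hops)
def count_sequences(start_position, num_hops):
    if num_hops == 0:
        return 1
    return sum(count_sequences(neighbor, num_hops - 1)
               for neighbor in neighbors(start_position))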