https://leetcode.com/problems/knight-dialer/
Google Interview Questions Deconstructed: The Knight’s Dialer
Google Interview Questions Deconstructed: The Knight’s Dialer (Logarithmic Time Edition)
A chess knight can move as indicated in the chess diagram below.
This time, we place our chess knight on any numbered key of a phone pad (indicated above), and the knight makes N-1 hops. Each hop must be from one key to another numbered key.
Each time it lands on a key (including the initial placement of the knight), it presses the number of that key, pressing N digits total.
How many distinct numbers can you dial in this manner?
Since the answer may be large, output the answer modulo 10^9 + 7.
Example 1: Input: N = 1, Output: 10
Example 2: Input: N = 2, Output: 20
Example 3: Input: N = 3, Output: 46
Similar to 花花酱 688. Knight Probability in Chessboard.
We can define dp[k][i][j] as the number of ways to dial a number whose last key is the one at row i, column j of the pad, after k steps.
Note: dp[*][3][0] and dp[*][3][2] are always zero for all steps (those cells are the '*' and '#' keys).
Init: dp[0][i][j] = 1
Transition: dp[k][i][j] = sum(dp[k-1][i+dy][j+dx]) over the 8 knight moves from the previous step.
ans = sum(dp[k])
Time complexity: O(k*m*n), i.e. O(k * 12 * 8) = O(k)
Space complexity: O(k*m*n) -> O(m*n), i.e. O(12 * 8) = O(1)
int knightDialer(int N) {
    constexpr int kMod = 1e9 + 7;
    int dirs[8][2] = {{-2, -1}, {-2, 1}, {-1, -2}, {-1, 2}, {1, -2}, {1, 2}, {2, -1}, {2, 1}};
    vector<vector<int>> dp(4, vector<int>(3, 1));
    dp[3][0] = dp[3][2] = 0;  // the '*' and '#' cells are not numbered keys
    for (int k = 1; k < N; ++k) {
        vector<vector<int>> tmp(4, vector<int>(3));
        for (int i = 0; i < 4; ++i)
            for (int j = 0; j < 3; ++j) {
                if (i == 3 && j != 1) continue;
                for (int d = 0; d < 8; ++d) {
                    int tx = j + dirs[d][0];
                    int ty = i + dirs[d][1];
                    if (tx < 0 || ty < 0 || tx >= 3 || ty >= 4) continue;
                    tmp[i][j] = (tmp[i][j] + dp[ty][tx]) % kMod;
                }
            }
        dp.swap(tmp);
    }
    int ans = 0;
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 3; ++j)
            ans = (ans + dp[i][j]) % kMod;
    return ans;
}
V2
define dp[k][i] as the number of ways to dial a number whose last key is i after k steps
init: dp[0][0:10] = 1
transition: dp[k][i] = sum(dp[k-1][j]) over all j that can move to i
ans: sum(dp[k])
Time complexity: O(k * 10) = O(k)
Space complexity: O(k * 10) -> O(10) = O(1)
int knightDialer(int N) {
    constexpr int kMod = 1e9 + 7;
    vector<vector<int>> moves{{4,6},{8,6},{7,9},{4,8},{3,9,0},{},{1,7,0},{2,6},{1,3},{2,4}};
    vector<int> dp(10, 1);
    for (int k = 1; k < N; ++k) {
        vector<int> tmp(10);
        for (int i = 0; i < 10; ++i)
            for (int nxt : moves[i])
                tmp[nxt] = (tmp[nxt] + dp[i]) % kMod;
        dp.swap(tmp);
    }
    int ans = 0;
    for (int i = 0; i < 10; ++i)
        ans = (ans + dp[i]) % kMod;
    return ans;
}
Approach 1: Dynamic Programming
Let f(start, n) be the number of ways to dial an n-digit number, where the knight starts at square start. We can create a recursion, writing this in terms of f(x, n-1)'s.
Algorithm
By hand or otherwise, have a way to query what moves are available at each square. This implies the exact recursion for f. For example, from 1 we can move to 6 or 8, so f(1, n) = f(6, n-1) + f(8, n-1).
After, let's keep track of dp[start] = f(start, n), and update it for each n from 1, 2, ..., N.
At the end, the answer is f(0, N) + f(1, N) + ... + f(9, N) = sum(dp).
public int knightDialer(int N) {
    int MOD = 1_000_000_007;
    int[][] moves = new int[][] { { 4, 6 }, { 6, 8 }, { 7, 9 }, { 4, 8 }, { 3, 9, 0 }, {}, { 1, 7, 0 }, { 2, 6 }, { 1, 3 }, { 2, 4 } };
    int[][] dp = new int[2][10];  // two rows: counts for the previous hop and for the hop being built
    Arrays.fill(dp[0], 1);
    for (int hops = 0; hops < N - 1; ++hops) {
        Arrays.fill(dp[~hops & 1], 0);  // ~hops & 1 flips the parity bit, selecting the row to rebuild
        for (int node = 0; node < 10; ++node)
            for (int nei : moves[node]) {
                dp[~hops & 1][nei] += dp[hops & 1][node];
                dp[~hops & 1][nei] %= MOD;
            }
    }
    long ans = 0;
    for (int x : dp[~N & 1])
        ans += x;
    return (int) (ans % MOD);
}
X. https://leetcode.com/problems/knight-dialer/discuss/189252/O(logN)
The O(log N) solution builds on the fact that the powers of a graph's adjacency matrix reveal the number of walks in an undirected graph.
https://math.stackexchange.com/questions/1890620/finding-path-lengths-by-the-power-of-adjacency-matrix-of-an-undirected-graph
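As a quick sanity check of that fact (a sketch of my own, not taken from the linked thread), raising the keypad's adjacency matrix to the (N-1)-th power and summing its entries reproduces the sample answers:
MOVES = {0: (4, 6), 1: (6, 8), 2: (7, 9), 3: (4, 8), 4: (3, 9, 0),
         5: (), 6: (1, 7, 0), 7: (2, 6), 8: (1, 3), 9: (2, 4)}

def mat_mult(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def count_by_matrix_power(N):
    # entry (i, j) of A^k counts the k-hop walks from key i to key j,
    # so the sum of all entries of A^(N-1) is the number of N-digit dialable numbers
    A = [[1 if j in MOVES[i] else 0 for j in range(10)] for i in range(10)]
    P = [[int(i == j) for j in range(10)] for i in range(10)]  # identity matrix = A^0
    for _ in range(N - 1):
        P = mat_mult(P, A)
    return sum(map(sum, P))

print([count_by_matrix_power(N) for N in (1, 2, 3)])  # [10, 20, 46]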
X. DP
https://leetcode.com/problems/knight-dialer/discuss/189265/Concise-Java-DP-Solution
public int knightDialer(int N) {
    int[][] dirs = new int[][]{{4,6},{6,8},{7,9},{4,8},{3,9,0},{},{1,7,0},{2,6},{1,3},{2,4}};
    int[][] dp = new int[N + 1][10];
    for (int j = 0; j < dp[0].length; j++) {
        dp[1][j] = 1;
    }
    int mod = (int) 1e9 + 7;
    for (int i = 2; i < dp.length; i++)
        for (int j = 0; j < dp[0].length; j++) {
            int[] dir = dirs[j]; // where j comes from
            for (int num : dir) {
                dp[i][j] += dp[i - 1][num];
                dp[i][j] %= mod;
            }
        }
    int count = 0;
    for (int i = 0; i < dp[0].length; i++) {
        count += dp[N][i];
        count %= mod;
    }
    return count;
}
I don't want to call it DP, because it simply simulates the process.
public int knightDialer(int N) {
    if (N == 1) return 10;
    long[] cur = new long[10];
    Arrays.fill(cur, 1);
    cur[5] = 0;
    long res = 0, M = (int) 1e9 + 7;
    while (N-- > 1) {
        long[] next = Arrays.copyOf(cur, 10);
        next[0] = (cur[4] + cur[6]) % M;
        next[1] = (cur[6] + cur[8]) % M;
        next[2] = (cur[7] + cur[9]) % M;
        next[3] = (cur[4] + cur[8]) % M;
        next[4] = (cur[3] + cur[9] + cur[0]) % M;
        next[6] = (cur[1] + cur[7] + cur[0]) % M;
        next[7] = (cur[2] + cur[6]) % M;
        next[8] = (cur[1] + cur[3]) % M;
        next[9] = (cur[2] + cur[4]) % M;
        cur = next;
    }
    for (long n : cur) res = (res + n) % M;
    return (int) res;
}
https://zhanghuimeng.github.io/post/leetcode-935-knight-dialer/
- only the previous step is needed, so a one-dimensional array would suffice
long long int f[10][5005];
public:
    int knightDialer(int N) {
        memset(f, -1, sizeof(f));
        const long long int MOD = 1000000007;
        long long int ans = 0;
        for (int i = 0; i < 10; i++) f[i][0] = 1;
        for (int i = 1; i < N; i++) {
            f[0][i] = (f[4][i-1] + f[6][i-1]) % MOD;
            f[1][i] = (f[6][i-1] + f[8][i-1]) % MOD;
            f[2][i] = (f[7][i-1] + f[9][i-1]) % MOD;
            f[3][i] = (f[4][i-1] + f[8][i-1]) % MOD;
            f[4][i] = (f[0][i-1] + f[3][i-1] + f[9][i-1]) % MOD;
            f[5][i] = 0;
            f[6][i] = (f[0][i-1] + f[1][i-1] + f[7][i-1]) % MOD;
            f[7][i] = (f[2][i-1] + f[6][i-1]) % MOD;
            f[8][i] = (f[1][i-1] + f[3][i-1]) % MOD;
            f[9][i] = (f[2][i-1] + f[4][i-1]) % MOD;
        }
        for (int i = 0; i < 10; i++) ans = (ans + f[i][N-1]) % MOD;
        return ans;
    }
X. DFS + cache
https://leetcode.com/problems/knight-dialer/discuss/189271/Java-Top-Down-Memo-DP-O(N)
The problem can be transformed into:
Traverse a directed graph (each node is labeled with a digit, and the edges are defined by the knight's moving rule).
Start from each of 0 to 9, move N - 1 steps, and return how many ways there are to reach the end.
It is easy to come up with a DFS solution that starts the traversal from each of 0 to 9.
In each recursion, move to one of the current node's neighbors, and the remaining steps become N - 1. Stop the recursion when N == 0.
Optimization:
Observe the recursive problem. The variables are:
- Current node
- Remaining steps
Therefore, we can store these two variables as the memo to speed up the DFS (it then becomes a top-down DP).
public static final int MOD = 1000000007;

public int knightDialer(int N) {
    int[][] graph = new int[][]{{4,6},{6,8},{7,9},{4,8},{3,9,0},{},{1,7,0},{2,6},{1,3},{2,4}};
    int cnt = 0;
    Integer[][] memo = new Integer[N + 1][10];
    for (int i = 0; i <= 9; i++)
        cnt = (cnt + helper(N - 1, i, graph, memo)) % MOD;
    return cnt;
}

private int helper(int N, int cur, int[][] graph, Integer[][] memo) {
    if (N == 0)
        return 1;
    if (memo[N][cur] != null)
        return memo[N][cur];
    int cnt = 0;
    for (int nei : graph[cur])
        cnt = (cnt + helper(N - 1, nei, graph, memo)) % MOD;
    memo[N][cur] = cnt;
    return cnt;
}
X. https://blog.csdn.net/fuxuemingzhu/article/details/83716573
Trading space for time, and exploiting symmetry
This is the code I got accepted in the final minutes of the contest. It initializes all the states up front, which means we no longer have to keep copying the previous board state inside the loop. It also exploits symmetry: we only need to compute the states of 4 positions (1, 2, 4, 0); the remaining states can be obtained directly from symmetry.
Another optimization is to take the modulus at every step. The modulo operation itself costs time, but once the numbers get large, big integers cost both space and time, so take the modulus as you go.
Optimizing the space complexity
With the approach above I kept thinking about how to optimize the time complexity. In fact, each state depends only on the states of the previous step, so it is natural to optimize the space complexity instead.
Use 10 variables, one per key, each holding the number of ways to end on that key, and then simply write out by hand how each state is obtained from the others.
The code below is really concise. Why didn't I think of optimizing the space sooner?! After the optimization the runtime drops to 264 ms, which shows that optimizing space can also dramatically reduce time. If a DP solution is timing out, consider the space first!
Time complexity: O(N), space complexity: O(1). Runtime: 264 ms.
def knightDialer(self, N):
    """
    :type N: int
    :rtype: int
    """
    if N == 1: return 10
    x1 = x2 = x3 = x4 = x5 = x6 = x7 = x8 = x9 = x0 = 1
    MOD = 10 ** 9 + 7
    for i in range(N - 1):
        x1, x2, x3, x4, x5, x6, x7, x8, x9, x0 = \
            (x6 + x8) % MOD, (x7 + x9) % MOD, (x4 + x8) % MOD, (x3 + x9 + x0) % MOD, 0, \
            (x1 + x7 + x0) % MOD, (x2 + x6) % MOD, (x1 + x3) % MOD, (x2 + x4) % MOD, (x4 + x6) % MOD
    return (x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x0) % MOD
If we additionally exploit the symmetry on top of the solution above, the runtime drops further to 160 ms.
Time complexity: O(N), space complexity: O(1). Runtime: 160 ms.
def knightDialer(self, N):
    """
    :type N: int
    :rtype: int
    """
    if N == 1: return 10
    x1 = x2 = x3 = x4 = x5 = x6 = x7 = x8 = x9 = x0 = 1
    MOD = 10 ** 9 + 7
    for i in range(N - 1):
        x1, x2, x4, x0 = (x6 + x8) % MOD, (x7 + x9) % MOD, (x3 + x9 + x0) % MOD, (x4 + x6) % MOD
        x3, x5, x6, x7, x8, x9 = x1, 0, x4, x1, x2, x1  # the remaining keys follow by symmetry
    return (x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x0) % MOD
X. https://buptwc.com/2018/11/05/leetcode-935-Knight-Dialer/
I think this problem becomes quite simple once you start thinking about it as DP, because both the DP definition and the recurrence are obvious.
Define dp[i][j] as the number of distinct phone numbers that can be dialed from position i with j hops remaining. Then:
dp[i][j] = sum(dp[nex][j-1] for nex in (positions reachable from i))
Use a dictionary d to store the positions reachable from each key. Initialize every dp[i][0] = 1, because with no hops left there is exactly one possibility.
def knightDialer(self, N):
    d = {0: [4,6], 1: [6,8], 2: [7,9], 3: [4,8], 4: [0,3,9], 5: [], 6: [0,1,7], 7: [2,6], 8: [1,3], 9: [2,4]}
    dp = [[0] * 10 for _ in range(N)]
    mod = 10**9 + 7
    for i in range(10):
        dp[0][i] = 1
    for i in range(1, N):
        for j in range(10):
            for nex in d[j]:
                dp[i][j] += dp[i-1][nex]
                dp[i][j] %= mod
    return sum(dp[N-1]) % mod
X. DFS
https://medium.com/@lenchen/leetcode-935-knight-dialer-73311d4c7afa
The move subroutine will not stop until N is one, but each call branches into up to 8 further recursive calls, so the number of times move executes grows exponentially with N (roughly in proportion to the number of sequences it enumerates). The recursion depth is at most N, so the space complexity is O(N).
def knightDialer(self, N):
    """
    :type N: int
    :rtype: int
    """
    # approach: use DFS to run all possible hops
    initPosition = [(i, j) for i in range(0, 3) for j in range(0, 3)]
    initPosition.append((1, 3))  # the '0' key sits below the middle column
    count = 0
    for x, y in initPosition:
        count += self.move(x, y, N)
    return count % (10**9 + 7)

def move(self, x, y, N):
    if N == 1:
        return 1
    directions = [(1, 2), (2, 1), (-1, 2), (-2, 1),
                  (-1, -2), (-2, -1), (1, -2), (2, -1)]
    count = 0
    for biasX, biasY in directions:
        newX = x + biasX
        newY = y + biasY
        if (0 <= newX < 3 and 0 <= newY < 3) or (newX == 1 and newY == 3):
            count += self.move(newX, newY, N - 1)
    return count
X. Matrix Multiplication
https://zhanghuimeng.github.io/post/leetcode-935-knight-dialer/
This problem is fairly straightforward: a direct recurrence works. Let f[i][j] denote the number of sequences that end at digit i after j hops (that is, numbers of length j+1). From the keypad we can read off which digits can move to i, which gives the recurrence.
Matrix multiplication
In fact, each step of the recurrence can be written as a matrix multiplication. Since
f[0][i] = (f[4][i-1] + f[6][i-1]) % MOD;
f[1][i] = (f[6][i-1] + f[8][i-1]) % MOD;
f[2][i] = (f[7][i-1] + f[9][i-1]) % MOD;
f[3][i] = (f[4][i-1] + f[8][i-1]) % MOD;
f[4][i] = (f[0][i-1] + f[3][i-1] + f[9][i-1]) % MOD;
f[5][i] = 0;
f[6][i] = (f[0][i-1] + f[1][i-1] + f[7][i-1]) % MOD;
f[7][i] = (f[2][i-1] + f[6][i-1]) % MOD;
f[8][i] = (f[1][i-1] + f[3][i-1]) % MOD;
f[9][i] = (f[2][i-1] + f[4][i-1]) % MOD;
we therefore have
[f[0][i]]   [0 0 0 0 1 0 1 0 0 0]   [f[0][i-1]]
[f[1][i]]   [0 0 0 0 0 0 1 0 1 0]   [f[1][i-1]]
[f[2][i]]   [0 0 0 0 0 0 0 1 0 1]   [f[2][i-1]]
[f[3][i]]   [0 0 0 0 1 0 0 0 1 0]   [f[3][i-1]]
[f[4][i]] = [1 0 0 1 0 0 0 0 0 1] * [f[4][i-1]]
[f[5][i]]   [0 0 0 0 0 0 0 0 0 0]   [f[5][i-1]]
[f[6][i]]   [1 1 0 0 0 0 0 1 0 0]   [f[6][i-1]]
[f[7][i]]   [0 0 1 0 0 0 1 0 0 0]   [f[7][i-1]]
[f[8][i]]   [0 1 0 1 0 0 0 0 0 0]   [f[8][i-1]]
[f[9][i]]   [0 0 1 0 1 0 0 0 0 0]   [f[9][i-1]]
and we can then use fast matrix exponentiation to bring the whole thing down to O(log N). A really elegant idea... [1]
struct Matrix {
    int n, m;
    long long int mat[20][20];
    Matrix(int _n, int _m) {
        n = _n; m = _m;
        memset(mat, 0, sizeof(mat));
    }
    friend Matrix operator * (const Matrix& a, const Matrix& b) {
        int n = a.n, m = b.m;
        Matrix c(n, m);
        for (int i = 0; i < n; i++)
            for (int j = 0; j < a.m; j++)
                for (int k = 0; k < m; k++)
                    c.mat[i][k] = (c.mat[i][k] + a.mat[i][j] * b.mat[j][k]) % MOD;
        return c;
    }
    long long int sum() {
        long long int sum = 0;
        for (int i = 0; i < n; i++)
            for (int j = 0; j < m; j++)
                sum = (sum + mat[i][j]) % MOD;
        return sum;
    }
};
public:
    static const long long MOD = 1000000007;
    int knightDialer(int N) {
        Matrix f(10, 1);
        for (int i = 0; i < 10; i++) f.mat[i][0] = 1;
        Matrix A(10, 10);
        A.mat[0][4] = A.mat[0][6] = 1;
        A.mat[1][6] = A.mat[1][8] = 1;
        A.mat[2][7] = A.mat[2][9] = 1;
        A.mat[3][4] = A.mat[3][8] = 1;
        A.mat[4][0] = A.mat[4][3] = A.mat[4][9] = 1;
        A.mat[6][0] = A.mat[6][1] = A.mat[6][7] = 1;
        A.mat[7][2] = A.mat[7][6] = 1;
        A.mat[8][1] = A.mat[8][3] = 1;
        A.mat[9][2] = A.mat[9][4] = 1;
        Matrix I(10, 10);
        for (int i = 0; i < 10; i++) I.mat[i][i] = 1;
        Matrix pow2 = A, pow = I;
        N--;
        while (N > 0) {
            if (N % 2 != 0) pow = pow * pow2;
            pow2 = pow2 * pow2;
            N >>= 1;
        }
        f = pow * f;
        return f.sum();
    }
This is like matrix multiplication. I believe it is the same technique used to optimize the calculation of the Fibonacci sequence to O(log N): in effect, we apply the pow operation recursively.
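For reference, here is what that trick looks like on Fibonacci (my own sketch of the standard fast-matrix-power technique, not code from the discussion): repeatedly squaring the 2x2 matrix [[1, 1], [1, 0]] yields F(n) in O(log n) multiplications.
def mat_mult_2x2(A, B):
    return [[A[0][0]*B[0][0] + A[0][1]*B[1][0], A[0][0]*B[0][1] + A[0][1]*B[1][1]],
            [A[1][0]*B[0][0] + A[1][1]*B[1][0], A[1][0]*B[0][1] + A[1][1]*B[1][1]]]

def fib(n):
    # [[1, 1], [1, 0]]^n == [[F(n+1), F(n)], [F(n), F(n-1)]]
    result = [[1, 0], [0, 1]]  # identity
    base = [[1, 1], [1, 0]]
    while n > 0:
        if n & 1:
            result = mat_mult_2x2(result, base)
        base = mat_mult_2x2(base, base)
        n >>= 1
    return result[0][1]

print([fib(n) for n in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
The knight dialer speedup below swaps the 2x2 Fibonacci matrix for the 10x10 keypad adjacency matrix.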
This can be optimised to O(log N) time.
Construct a 10 * 10 transformation matrix M, with M[i][j] = 1 if i and j are connected.
If N = 1, return 10.
If N > 1, return the sum of [1,1,1,1,1,1,1,1,1,1] * M ^ (N - 1).
The power of the matrix reveals the number of walks in an undirected graph.
Find more details in this link provided by @shankark:
https://math.stackexchange.com/questions/1890620/finding-path-lengths-by-the-power-of-adjacency-matrix-of-an-undirected-graph
import numpy as np

def knightDialer(self, N):
    mod = 10**9 + 7
    if N == 1: return 10
    # adjacency matrix of the keypad: M[i][j] = 1 if the knight can hop from i to j
    M = np.matrix([[0, 0, 0, 0, 1, 0, 1, 0, 0, 0],
                   [0, 0, 0, 0, 0, 0, 1, 0, 1, 0],
                   [0, 0, 0, 0, 0, 0, 0, 1, 0, 1],
                   [0, 0, 0, 0, 1, 0, 0, 0, 1, 0],
                   [1, 0, 0, 1, 0, 0, 0, 0, 0, 1],
                   [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
                   [1, 1, 0, 0, 0, 0, 0, 1, 0, 0],
                   [0, 0, 1, 0, 0, 0, 1, 0, 0, 0],
                   [0, 1, 0, 1, 0, 0, 0, 0, 0, 0],
                   [0, 0, 1, 0, 1, 0, 0, 0, 0, 0]])
    res, N = 1, N - 1
    # fast exponentiation: accumulate M^(N-1) by repeated squaring
    while N:
        if N % 2: res = res * M % mod
        M = M * M % mod
        N //= 2
    return int(np.sum(res)) % mod
https://leetcode.com/problems/knight-dialer/discuss/189287/O(n)-time-O(1)-space-DP-solution-%2B-Google-interview-question-writeup
According to the article, four possible solutions are (1) naive recursive number generation, (2) naive recursive counting, (3) recursion + memoization, and (4) dynamic programming. A candidate who coded either the memoization or DP solution is likely to receive a "strong hire" recommendation.
https://hackernoon.com/google-interview-questions-deconstructed-the-knights-dialer-f780d516f029
- It’s easy to state and understand.
- It has a number of solutions, each requiring varying degrees of algorithms and data structures knowledge. Also, a little bit of insight goes a long way.
- Each solution can be implemented in relatively few lines, making it perfect for a time-constrained environment.
With that being said, your first action after hearing the question should be stepping up to the whiteboard and solving small instances of the problem by hand. Never dive right into code! Solving small instances lets you spot patterns and edge cases, and also helps crystallize a solution in your head. As an example, suppose you start on 6 and have two hops to make. Your sequences will be…
- 6–1–8
- 6–1–6
- 6–7–2
- 6–7–6
- 6–0–4
- 6–0–6
…for a total of six sequences. If you’re following along, try taking a pencil and paper and deriving these. This doesn’t translate well into a blog post, but trust me when I say there’s something magical about working out a problem by hand that leads to many more insights than just staring at it and thinking quietly.
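If you would rather check the hand-worked example in code, here is a tiny sketch (mine, not part of the original article) that enumerates the same two-hop sequences using the neighbor map the article introduces just below:
NEIGHBORS_MAP = {
    1: (6, 8), 2: (7, 9), 3: (4, 8), 4: (3, 9, 0), 5: tuple(),
    6: (1, 7, 0), 7: (2, 6), 8: (1, 3), 9: (2, 4), 0: (4, 6),
}

def sequences(start, hops):
    # enumerate every sequence of the given hop count starting from `start`
    if hops == 0:
        return [[start]]
    return [[start] + rest
            for nxt in NEIGHBORS_MAP[start]
            for rest in sequences(nxt, hops - 1)]

print(sequences(6, 2))
# [[6, 1, 6], [6, 1, 8], [6, 7, 2], [6, 7, 6], [6, 0, 4], [6, 0, 6]] -- the six sequences above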
As for the neighbors function here, given that it never changes you can simply create a map and return the appropriate value:
NEIGHBORS_MAP = {
    1: (6, 8),
    2: (7, 9),
    3: (4, 8),
    4: (3, 9, 0),
    5: tuple(),  # 5 has no neighbors
    6: (1, 7, 0),
    7: (2, 6),
    8: (1, 3),
    9: (2, 4),
    0: (4, 6),
}

def neighbors(position):
    return NEIGHBORS_MAP[position]
Level 1: Recursively Generating Numbers
Anyway, on to the solution. Perhaps you’ve already noticed this problem can be solved by enumerating all possible numbers and counting them. You can use recursion to generate these values:
def yield_sequences(starting_position, num_hops, sequence=None):
    if sequence is None:
        sequence = [starting_position]
    if num_hops == 0:
        yield sequence
        return
    for neighbor in neighbors(starting_position):
        yield from yield_sequences(neighbor, num_hops - 1, sequence + [neighbor])

def count_sequences(starting_position, num_hops):
    num_sequences = 0
    for sequence in yield_sequences(starting_position, num_hops):
        num_sequences += 1
    return num_sequences
This works, and it’s a common starting point I saw in interviews. Notice, however, that we generate the numbers and never actually use them. This problem asks for the count of numbers, not the numbers themselves. Once we count a number we never revisit it. As a general rule of thumb, I recommend paying attention to when your solution computes something it doesn’t use. Often you can cut it out and get a better solution
Level 2: Counting Without Counting
How can we count phone numbers without generating them? It can be done, but not without an additional insight. Notice how the count of numbers that can be generated from a given starting position in N hops is equal to the sum of the counts of numbers that can be generated starting from each of its neighbors in N-1 hops. Stated mathematically as a recurrence relation, it looks like this:
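C(position, N) = sum of C(neighbor, N - 1) over every neighbor of position, with the base case C(position, 0) = 1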
This is intuitively obvious when you consider what happens with one hop: 6 has 3 neighbors (1, 7, and 0) and in zero hops you can reach one number for each, so you can only dial three numbers.
How does one arrive at this insight, you might ask? If you’ve studied recursion, this should become evident after some exploration on the whiteboard. Many candidates who’ve practiced recursion immediately notice this problem breaks down into smaller subproblems, which is a dead giveaway. If you’re in an interview with me and you can’t seem to arrive at this insight, I will usually give hints to help get you there, up to and including outright giving it away if prodding fails.
Once you have this insight in hand, you can already move forward and solve this problem again. There are a number of implementations that use this fact, but let’s start with the one I see most often in interviews: the naive recursive approach:
from neighbors import neighbors

def count_sequences(start_position, num_hops):
    if num_hops == 0:
        return 1
    num_sequences = 0
    for position in neighbors(start_position):
        num_sequences += count_sequences(position, num_hops - 1)
    return num_sequences
This next question is one you’re going to be hearing a lot from me: what is the Big-O complexity of this solution? For those who don’t know, Big-O complexity is (informally) a sort of shorthand for the rate at which the amount of computation required by a solution grows as a function of the size of the input. For this problem, the size of the input is the number of hops. If you’re interested in the proper mathematical definition, you can read more here.
For this implementation, every call to count_sequences() recursively calls count_sequences() at least twice, because each key has at least two neighbors. Since we recurse a number of times equal to the desired number of hops, and the number of calls to count_sequences() at least doubles with each call, we're left with a runtime complexity of at least exponential time.
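A quick way to see that growth (my own instrumentation sketch, not from the article) is to count the calls directly:
NEIGHBORS_MAP = {
    1: (6, 8), 2: (7, 9), 3: (4, 8), 4: (3, 9, 0), 5: tuple(),
    6: (1, 7, 0), 7: (2, 6), 8: (1, 3), 9: (2, 4), 0: (4, 6),
}
calls = 0

def count_sequences(start_position, num_hops):
    global calls
    calls += 1
    if num_hops == 0:
        return 1
    return sum(count_sequences(p, num_hops - 1) for p in NEIGHBORS_MAP[start_position])

for hops in (5, 10, 15):
    calls = 0
    count_sequences(6, hops)
    print(hops, calls)  # the call count at least doubles with every extra hop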
Level 3: Memoization
Can we do better? Using only the mathematical relation above and nothing else, not really. One of the reasons I love this problem, though, is it features layers of insight that yield more and more efficient solutions. To find the next one, let's map out the function calls this function makes. Let's consider the case of count_sequences(6, 4). Note I use C as the function name for brevity:
You may notice something peculiar: the C(6, 2) call is performed three times, and each time it performs the same computation and returns the same value. The crucial insight here is that these function calls repeat, each time returning the same value. After you compute their result once there's no need to recompute them.
If you're wondering how you're supposed to arrive at this, the easiest way is through good old-fashioned whiteboarding: staring at this problem statement in the abstract is nice, but I always encourage candidates to throw a sample solution up on the board. Solving out a problem like this and drawing the tree like I did above will see you writing the subtree for C(6, 2) multiple times, which you're sure to notice. This is sometimes enough of an insight to allow candidates to bypass solutions 1 and 2 altogether and jump straight to this stage. Needless to say, that's a huge time save in an interview where you only have 45 minutes to solve a problem.
Armed with this insight, how do we solve this problem? We can use memoization (not memorization), which basically means we record results of function calls we’ve seen before and use those instead of redoing the work. This way, when we encounter a place in the call graph where we would unnecessarily recompute an entire subtree, we instead immediately return the result we already computed. Here’s an implementation:
def count_sequences(start_position, num_hops):
    cache = {}

    def helper(position, num_hops):
        if (position, num_hops) in cache:
            return cache[(position, num_hops)]
        if num_hops == 0:
            return 1
        else:
            num_sequences = 0
            for neighbor in neighbors(position):
                num_sequences += helper(neighbor, num_hops - 1)
            cache[(position, num_hops)] = num_sequences
            return num_sequences

    res = helper(start_position, num_hops)
    return res
Alright, what’s the runtime complexity (Big-O) now? That’s tougher to answer. For the previous implementation, computing the runtime was as simple as counting the number of times the recursive function was called, which was always two or three times per call. This time counting is more complicated because the recursive call is guarded by a conditional. On the face of it there’s no obvious way to count function calls.
We can solve this mystery by instead looking at the cache. Each function call’s result is stored in the cache, and it’s inserted there exactly once. This allows us to reframe the question as “how does the size of the cache grow with the size of the input?” Given that the cache is keyed by position and number of hops, and there are exactly ten positions, we can conclude that the cache grows in direct proportion to the number of requested hops. This follows from the pigeonhole principle: once we have an entry in the cache for every combination of position and jump count, all calls will hit the cache rather than result in a new function call.
So we’re done right? Well, not quite. This solution has two drawback, one major(ish) and one minor. The major-ish drawback is that it’s recursive. Most languages place limits on the maximal size of their call stacks, which means there’s always a maximal number of hops an implementation can support. On my machine it fails after about 1000 hops. This is a major-ish limitation rather than a major one because any recursive function can be re-implemented in an iterative way, but still, it’s a hassle. As for the minor limitation, that actually leads us into the next solution…
Level 4: Dynamic Programming
The minor limitation of the recursive memoizing solution is clear when you look at the recurrence relation from above:
Notice that the results for N hops depend only on the results for calls with N-1 hops. Meanwhile, the cache contains entries for every (nonzero) number of hops. I call this a minor issue because it doesn’t actually cause any real problems, given that the cache grows only linearly with the number of hops. Requiring linear space isn’t the end of the world, but still, it’s inefficient.
What gives? Again, the result is clear when you look at a written-out solution and the code. Notice that the code starts with the largest number of hops and recurses directly down to the smallest:
If you imagine the entire function call graph as a sort of virtual tree, you’ll quickly see we’re performing a depth-first traversal. This is fine, it gives a proper solution, but it doesn’t take advantage of the shallow dependency property I pointed out above.
Can you perform a breadth-first traversal instead, where you start at the top and “visit” function calls for N-1 hops only after you’ve visited those for N hops? Sadly, no. The values of function calls with nonzero hops absolutely require the values from smaller hop counts, so you won’t get any results until you reach the zero-hop layer and start returning numbers rather than additional function calls (note the zero-hop layer isn’t depicted here).
You can however, reverse the order: visit layers with N hops only after you’ve visited layers with N-1 hops. Those of you who studied or are studying discrete mathematics will recognize all the necessary ingredients for an induction: we know that the values of zero-hop function calls are always one (the base case). We also know how to combine N-1 hop values to get N hop values, using the recurrence relation (the inductive step). We can start with a base case of zero hops and induce all values greater than zero. Here’s an implementation:
def count_sequences(start_position, num_hops):
    prior_case = [1] * 10
    current_case = [0] * 10
    current_num_hops = 1

    while current_num_hops <= num_hops:
        current_case = [0] * 10
        current_num_hops += 1
        for position in range(0, 10):
            for neighbor in neighbors(position):
                current_case[position] += prior_case[neighbor]
        prior_case = current_case

    # prior_case equals current_case after the loop, and also covers the
    # num_hops == 0 edge case (one "number": the starting key by itself)
    return prior_case[start_position]
So what’s better about this version than the recursive, depth-first solution? Not a ton, but it has a few benefits. First off, it’s not recursive, meaning it can run for extremely large values without crashing. Second off, it uses constant memory, because it only ever needs two arrays of fixed size rather than the ever-growing cache of the memoization solution. Finally, it’s still linear time: I can compute 200,000 hops in just under twenty seconds.
What about the other solutions, you might ask? Unfortunately it’s impossible to rate an abstract candidate. Interviews are chaotic things; they can start late, people can be nervous, and they often arrive at insights and solutions late in the session, leaving them with little time to code anything. There’s also a conversation happening: I pay attention to how well the candidate communicates their thoughts and incorporates ideas and feedback. I always take these factors into account before making a hire/no-hire recommendation, and you just can’t do that in the abstract.
Instead of potential recommendations, I’ll focus on the things I’d like to be able to say. When assessing algorithms and data structures, I want to say something like “TC (The Candidate) explored the problem and produced a solution that addressed all edge cases, and improved the solution when presented with its shortcomings. In the end, they arrived at an optimal solution.” I also want to be able to say “TC chose appropriate data structures for the solution, and correctly answered questions about the Big-O of their solution’s runtime and space requirements.”
When assessing coding, my ideal statement would be "TC quickly and concisely translated their ideas into code. The code uses standard language constructs and is easy to read. All edge cases are addressed, and TC walked through their code to debug it and verify it's correct." For entry-level roles I give bonus points if there's some sort of testing, but for more experienced roles I penalize candidates who don't at least list relevant test cases.
As for speed of progress, I’d love to be able to say “TC drove the problem solving process: they developed most of their own solution, and were able to identify and address shortcomings without my pointing them out. TC required only minimal hints to get them moving in the right direction.”
Anyone I can say these things about gets a “Strong Hire” in my book. However, “Hire” and “Leaning Hire” are also positive endorsements. If you come up short in one area but shine in another then I can probably still justify a positive recommendation.
- Always start by solving a small instance of the problem by hand. In this problem the recurrence relation and the repetition of function calls become much more obvious when you hand-solve a problem.
- Pay attention to when your solution is computing things you don’t need, like how the naive counting solution generates the sequences but doesn’t actually use them. Reducing unnecessary computation can often provide simpler solutions, if not open the door to more efficient ones.
- Know your recursion. It’s almost useless in most production code because it blasts through the stack, but it’s a very powerful algorithm design tactic. Recursive solutions can often be adapted and improved: the difference between the exponential time naive solution and the linear time nearly-optimal memoization solution is minimal.
- Know your Big-O analysis! You’re practically guaranteed to be asked this at some point during the interview process.
- Always be on the lookout for opportunities to memoize. If your function is deterministic and you’ll be calling it multiple times with the same inputs, your solution may benefit from memoization.
- Find and write out the recurrence relation. In this case writing it out makes it obvious that counts for N hops depend only on counts for N-1 hops.