## Sunday, June 5, 2016

### Boggle Princeton Part II - Programming Assignment 4

http://coursera.cs.princeton.edu/algs4/assignments/boggle.html
The Boggle game. Boggle is a word game designed by Allan Turoff and distributed by Hasbro. It involves a board made up of 16 cubic dice, where each die has a letter printed on each of its sides. At the beginning of the game, the 16 dice are shaken and randomly distributed into a 4-by-4 tray, with only the top sides of the dice visible. The players compete to accumulate points by building valid words out of the dice according to the following rules:

• A valid word must be composed by following a sequence of adjacent dice—two dice are adjacent if they are horizontal, vertical, or diagonal neighbors.
• A valid word can use each die at most once.
• A valid word must contain at least 3 letters.
• A valid word must be in the dictionary (which typically does not contain proper nouns).

Scoring. Words are scored according to their length, using this table:

| word length | points |
|-------------|--------|
| 0–2         | 0      |
| 3–4         | 1      |
| 5           | 2      |
| 6           | 3      |
| 7           | 5      |
| 8+          | 11     |

The Qu special case. In the English language, the letter Q is almost always followed by the letter U. Consequently, the side of one die is printed with the two-letter sequence Qu instead of Q (and this two-letter sequence must be used together when forming words). When scoring, Qu counts as two letters; for example, the word QuEUE scores as a 5-letter word even though it is formed by following a sequence of 4 dice.
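
The scoring rule and the Qu case can be sketched in a few lines. This is only an illustration, not the solution code below: `BoggleScore` is a made-up class name, and it uses a plain `HashSet` as the dictionary so the example runs on its own. It assumes words are spelled out in full ("QUEUE"), so the Qu die already contributes two letters to the length.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper (not part of the assignment API): scores a word against
// a plain HashSet dictionary so the example stays self-contained.
final class BoggleScore {
    private final Set<String> dictionary;

    BoggleScore(String[] words) {
        dictionary = new HashSet<>(Arrays.asList(words));
    }

    // Returns the score of the word if it is in the dictionary, zero otherwise.
    int scoreOf(String word) {
        if (!dictionary.contains(word)) {
            return 0;
        }
        int n = word.length(); // "QU" is spelled out, so it counts as two letters
        if (n <= 2) return 0;
        if (n <= 4) return 1;
        if (n == 5) return 2;
        if (n == 6) return 3;
        if (n == 7) return 5;
        return 11;
    }

    public static void main(String[] args) {
        BoggleScore scorer = new BoggleScore(new String[] {"QUEUE", "TEA", "EQUATION"});
        System.out.println(scorer.scoreOf("QUEUE"));    // 5-letter word
        System.out.println(scorer.scoreOf("EQUATION")); // 8-letter word
        System.out.println(scorer.scoreOf("MISSING"));  // not in the dictionary
    }
}
```

In the real solver the dictionary lookup would go through the trie rather than a hash set; only the length-to-points mapping carries over unchanged.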

Your task. Your challenge is to write a Boggle solver that finds all valid words in a given Boggle board, using a given dictionary. Implement an immutable data type BoggleSolver with the following API:
```
public class BoggleSolver
{
    // Initializes the data structure using the given array of strings as the dictionary.
    // (You can assume each word in the dictionary contains only the uppercase letters A through Z.)
    public BoggleSolver(String[] dictionary)

    // Returns the set of all valid words in the given Boggle board, as an Iterable.
    public Iterable<String> getAllValidWords(BoggleBoard board)

    // Returns the score of the given word if it is in the dictionary, zero otherwise.
    // (You can assume the word contains only the uppercase letters A through Z.)
    public int scoreOf(String word)
}

```

https://github.com/nastra/AlgorithmsPartII-Princeton/tree/master/src

https://github.com/nastra/AlgorithmsPartII-Princeton/blob/master/src/BoggleBoard.java

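```
// Imports needed at the top of BoggleSolver.java
// (Queue is the algs4 queue; the package name assumes the packaged algs4.jar).
import java.util.Set;
import java.util.TreeSet;
import edu.princeton.cs.algs4.Queue;

// Everything below is a member of `public class BoggleSolver`.

// Dictionary trie; only the keys matter, the value 1 just marks a word as present.
private final BoggleTrieST<Integer> dict = new BoggleTrieST<>();

public BoggleSolver(String[] dictionary) {
    for (String s : dictionary) {
        dict.put(s, 1);
    }
}
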
private static class BoggleTrieST<Value> {
    private static final int R = 26; // A-Z letters
    private static final int OFFSET = 65; // Offset of letter A in ASCII table

    private Node root = new Node();

    private static class Node {
        private Object val;
        private Node[] next = new Node[R];
    }

    public enum NodeType {
        PREFIX, MATCH, NON_MATCH
    }

    /****************************************************
     * Is the key in the symbol table?
     ****************************************************/
    public boolean contains(String key) {
        return get(key) != null;
    }

    public Value get(String key) {
        Node x = get(root, key, 0);
        if (x == null)
            return null;
        return (Value) x.val;
    }

    private Node get(Node x, String key, int d) {
        if (x == null)
            return null;
        if (d == key.length())
            return x;
        char c = key.charAt(d);
        return get(x.next[c - OFFSET], key, d + 1);
    }

    /****************************************************
     * Insert key-value pair into the symbol table.
     ****************************************************/
    public void put(String key, Value val) {
        root = put(root, key, val, 0);
    }

    private Node put(Node x, String key, Value val, int d) {
        if (x == null)
            x = new Node();
        if (d == key.length()) {
            x.val = val;
            return x;
        }
        char c = key.charAt(d);
        x.next[c - OFFSET] = put(x.next[c - OFFSET], key, val, d + 1);
        return x;
    }

    // find the key that is the longest prefix of the query string
    public String longestPrefixOf(String query) {
        int length = longestPrefixOf(root, query, 0, 0);
        return query.substring(0, length);
    }

    // find the key in the subtrie rooted at x that is the longest
    // prefix of the query string, starting at the dth character
    private int longestPrefixOf(Node x, String query, int d, int length) {
        if (x == null)
            return length;
        if (x.val != null)
            length = d;
        if (d == query.length())
            return length;
        char c = query.charAt(d);
        return longestPrefixOf(x.next[c - OFFSET], query, d + 1, length);
    }

    public Iterable<String> keys() {
        return keysWithPrefix("");
    }

    public Iterable<String> keysWithPrefix(String prefix) {
        Queue<String> queue = new Queue<String>();
        Node x = get(root, prefix, 0);
        collect(x, prefix, queue);
        return queue;
    }

    // true if some key starts with (or equals) the given prefix
    public boolean isPrefix(String prefix) {
        return get(root, prefix, 0) != null;
    }

    public NodeType getNodeType(String key) {
        Node x = get(root, key, 0);
        if (x == null)
            return NodeType.NON_MATCH;
        else if (x.val == null)
            return NodeType.PREFIX;
        else
            return NodeType.MATCH;
    }

    private void collect(Node x, String key, Queue<String> queue) {
        if (x == null)
            return;
        if (x.val != null)
            queue.enqueue(key);
        // c is already a 0-based index into next[]; add OFFSET to recover the letter
        for (int c = 0; c < R; c++)
            collect(x.next[c], key + (char) (c + OFFSET), queue);
    }

    public Iterable<String> keysThatMatch(String pat) {
        Queue<String> q = new Queue<String>();
        collect(root, "", pat, q);
        return q;
    }

    public void collect(Node x, String prefix, String pat, Queue<String> q) {
        if (x == null)
            return;
        if (prefix.length() == pat.length() && x.val != null)
            q.enqueue(prefix);
        if (prefix.length() == pat.length())
            return;
        char next = pat.charAt(prefix.length());
        for (int c = 0; c < R; c++) {
            // c is a 0-based index; compare and append the actual letter
            char letter = (char) (c + OFFSET);
            if (next == '.' || next == letter)
                collect(x.next[c], prefix + letter, pat, q);
        }
    }

    public void delete(String key) {
        root = delete(root, key, 0);
    }

    private Node delete(Node x, String key, int d) {
        if (x == null)
            return null;
        if (d == key.length())
            x.val = null;
        else {
            char c = key.charAt(d);
            x.next[c - OFFSET] = delete(x.next[c - OFFSET], key, d + 1);
        }
        // keep the node if it still holds a value or has any child
        if (x.val != null)
            return x;
        for (int c = 0; c < R; c++)
            if (x.next[c] != null)
                return x;
        return null;
    }
}
```

```
public Iterable<String> getAllValidWords(BoggleBoard board) {
    TreeSet<String> words = new TreeSet<>();
    for (int i = 0; i < board.rows(); i++) {
        for (int j = 0; j < board.cols(); j++) {
            searchWords(board, i, j, words);
        }
    }
    return words;
}

private void searchWords(BoggleBoard board, int i, int j, TreeSet<String> words) {
    boolean[][] visited = new boolean[board.rows()][board.cols()];
    dfs(board, i, j, words, visited, "");
}

private void dfs(BoggleBoard board, int i, int j, Set<String> words, boolean[][] visited, String prefix) {
    if (visited[i][j]) {
        return;
    }

    char letter = board.getLetter(i, j);
    // the Q die stands for the two-letter sequence "QU"
    prefix = prefix + (letter == 'Q' ? "QU" : letter);

    // record the word if it is long enough and in the dictionary
    if (prefix.length() > 2 && dict.contains(prefix)) {
        words.add(prefix);
    }
    // prune: no dictionary word starts with this prefix
    if (!dict.isPrefix(prefix)) {
        return;
    }

    visited[i][j] = true;

    // do a DFS for all adjacent cells
    if (i > 0) {
        dfs(board, i - 1, j, words, visited, prefix);
        if (j > 0) {
            dfs(board, i - 1, j - 1, words, visited, prefix);
        }
        if (j < board.cols() - 1) {
            dfs(board, i - 1, j + 1, words, visited, prefix);
        }
    }
    if (j > 0) {
        dfs(board, i, j - 1, words, visited, prefix);
    }
    if (j < board.cols() - 1) {
        dfs(board, i, j + 1, words, visited, prefix);
    }
    if (i < board.rows() - 1) {
        if (j > 0) {
            dfs(board, i + 1, j - 1, words, visited, prefix);
        }
        if (j < board.cols() - 1) {
            dfs(board, i + 1, j + 1, words, visited, prefix);
        }
        dfs(board, i + 1, j, words, visited, prefix);
    }
    // backtrack so other paths can reuse this die
    visited[i][j] = false;
}
```

https://segmentfault.com/a/1190000005345079

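The checklist is not scripture either; only the habit of thinking independently lets you discover a bigger world. Implementing the non-recursive DFS was clearly laborious: it meant tracking and maintaining many variables at once, and the logic of DFS is better suited to recursion anyway. I treated the non-recursive DFS purely as practice, and switching back to the recursive implementation was the result of bold, independent thinking.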

https://blog.niallconnaughton.com/2015/12/10/solving-boggle-boards-at-scale/
By searching neighbours of neighbours we end up with an exponential algorithm, and we don’t stop trying new letters on the end of the word until we run out of neighbours to try. So, the worst case word candidates will use every letter on the board, in every possible arrangement. It’s very unlikely that any of these candidates will be real words. It’s not even likely that any word longer than 10 or so letters will be valid. Appending a new letter to “AAPYNU” isn’t likely to make anything better.

The real problem here is that the exponential search space through the board is immensely larger than the number of valid words in the dictionary. The brute force approach on a 4×4 grid of letters evaluates 12 million distinct paths, but there are only ~264,000 words in the test dictionary. The vast majority of all those distinct paths are going to be a complete waste of time.
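
To get a feel for how large that search space is, here is a small sketch that counts every simple path (no cell used twice) through a grid where each cell touches its horizontal, vertical and diagonal neighbours. `PathCounter` is a made-up name, and counting single-cell paths as paths is my assumption; the quoted post may define its figure slightly differently.

```java
// Hypothetical helper: counts the simple paths a brute-force solver would walk
// on a rows-by-cols board with 8-way adjacency.
final class PathCounter {
    static long countPaths(int rows, int cols) {
        boolean[][] visited = new boolean[rows][cols];
        long total = 0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                total += countFrom(i, j, visited);
            }
        }
        return total;
    }

    // Number of simple paths starting at (i, j), including the one-cell path itself.
    private static long countFrom(int i, int j, boolean[][] visited) {
        visited[i][j] = true;
        long count = 1;
        for (int di = -1; di <= 1; di++) {
            for (int dj = -1; dj <= 1; dj++) {
                int ni = i + di;
                int nj = j + dj;
                // the cell itself is marked visited, so (di, dj) = (0, 0) is skipped
                if (ni >= 0 && ni < visited.length
                        && nj >= 0 && nj < visited[0].length
                        && !visited[ni][nj]) {
                    count += countFrom(ni, nj, visited);
                }
            }
        }
        visited[i][j] = false; // backtrack
        return count;
    }

    public static void main(String[] args) {
        // On a 2-by-2 board every cell touches every other: 4 + 12 + 24 + 24 = 64 paths.
        System.out.println(countPaths(2, 2));
        // The standard 4-by-4 board; this is the number to compare with the dictionary size.
        System.out.println(countPaths(4, 4));
    }
}
```

Even without a dictionary in the picture, the count grows so fast with board size that checking every path against a hash table is mostly wasted work.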

Earlier I mentioned that we end up tracing through words that start with a prefix that no real word starts with. If we can prune the evaluation tree as soon as we hit such a prefix, we can save a lot of time. But having our set of dictionary words in a hash table doesn’t let us do that kind of a prefix search. We need a data structure better suited to this task – a trie.

The basic trie approach I implemented searches the dictionary for each possible word from the root of the trie each time. So in the above example, if the solver is evaluating the letters “TE”, it searches from the root, through each letter. Then we check “TEA”, and it searches again starting from the first letter. We already know “TE” is a valid prefix, we only need to search the next letter we’re evaluating.

#### Approach #3 – stepping through the board and trie in sync

Instead of checking if the whole current word we’re looking at is a valid prefix each time we add a new letter, we can keep track of where we’re up to in the trie, and check if the next letter on the board is a valid next letter in the dictionary from where we are.
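
A minimal sketch of that idea, as I understand it: the DFS carries the current trie node along, so each new die costs a single child lookup instead of a fresh search from the root. This is not the code from the quoted post. `SyncBoggleSolver` and its members are names I made up, the board is a plain `char[][]` rather than `BoggleBoard`, and the dictionary is assumed to contain only uppercase A to Z.

```java
import java.util.Set;
import java.util.TreeSet;

// Sketch: board DFS and trie descent advance together, one trie step per die.
final class SyncBoggleSolver {
    private static final int R = 26;

    private static final class Node {
        boolean isWord;
        final Node[] next = new Node[R];
    }

    private final Node root = new Node();

    SyncBoggleSolver(String[] dictionary) {
        for (String word : dictionary) {
            Node x = root;
            for (int d = 0; d < word.length(); d++) {
                int c = word.charAt(d) - 'A';
                if (x.next[c] == null) {
                    x.next[c] = new Node();
                }
                x = x.next[c];
            }
            x.isWord = true;
        }
    }

    Set<String> getAllValidWords(char[][] board) {
        Set<String> words = new TreeSet<>();
        boolean[][] visited = new boolean[board.length][board[0].length];
        for (int i = 0; i < board.length; i++) {
            for (int j = 0; j < board[0].length; j++) {
                dfs(board, i, j, root, new StringBuilder(), visited, words);
            }
        }
        return words;
    }

    private void dfs(char[][] board, int i, int j, Node parent,
                     StringBuilder prefix, boolean[][] visited, Set<String> words) {
        char letter = board[i][j];
        // One step down the trie per die; the Q die takes two steps ("QU").
        Node x = parent.next[letter - 'A'];
        if (x != null && letter == 'Q') {
            x = x.next['U' - 'A'];
        }
        if (x == null) {
            return; // no dictionary word continues this way: prune
        }
        int mark = prefix.length();
        prefix.append(letter == 'Q' ? "QU" : String.valueOf(letter));
        if (x.isWord && prefix.length() > 2) {
            words.add(prefix.toString());
        }
        visited[i][j] = true;
        for (int di = -1; di <= 1; di++) {
            for (int dj = -1; dj <= 1; dj++) {
                int ni = i + di;
                int nj = j + dj;
                if (ni >= 0 && ni < board.length && nj >= 0 && nj < board[0].length
                        && !visited[ni][nj]) {
                    dfs(board, ni, nj, x, prefix, visited, words);
                }
            }
        }
        visited[i][j] = false;
        prefix.setLength(mark); // backtrack
    }

    public static void main(String[] args) {
        String[] dictionary = {"TEA", "EAT", "ATE", "TE", "QUA", "TEE"};
        char[][] board = {{'T', 'E'}, {'A', 'Q'}};
        System.out.println(new SyncBoggleSolver(dictionary).getAllValidWords(board));
    }
}
```

Compared with the `dict.contains(prefix)` plus `dict.isPrefix(prefix)` pair in my solution above, this version never re-walks the prefix, and the `StringBuilder` avoids allocating a new string at every step.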

No amount of micro optimisations on the wrong implementation are going to get anywhere near a better algorithm here. A profiler can show you where your code is taking lots of time, but it can’t look at your algorithm and tell you that you’re approaching the problem the wrong way.
