# Intersection-based word suggestion algorithm

**Status**: Approved | Implemented | 2025

**Author**: Victor Ma

**Reviewer**: Jonathan Blandford

## Introduction

The intersection-based [word suggestion algorithm](word-suggestion.md) generates
word suggestions that are constrained by the current filter and the intersecting
filter at the current cell. The algorithm is implemented by the function
`word_list_find_intersection()`.


## The algorithm

The algorithm works like this, where $n$ is the offset of the current cell in
the current clue, and $m$ is the offset of the current cell in the intersecting
clue:
1. **(Phase 1)** Get the set of possible letters for the current cell,
   constrained by the current filter:
    1. Get the set of words that match the current filter.
    1. Return the set of letters that appear as the $n^{\text{th}}$ letter, in
       the set of words.
1. **(Phase 2)** Get the set of possible words for the intersecting clue, and
   the set of possible letters for the current cell---both constrained by the
   intersecting filter and the current filter:
    1. Get the set of words that satisfy both these constraints:
        * The word matches the intersecting filter.
        * The $m^{\text{th}}$ letter of the word is in the set of letters from
          Phase 1.
    1. Get the set of letters that appear as the $m^{\text{th}}$ letter, in the
       set of words.
    1. Return the set of words and the set of letters.
1. **(Phase 3) (optional)** Get the set of possible words for the current clue,
   constrained by the current filter and intersecting filter:
    1. Get the set of words that satisfy both these constraints:
        * The word matches the current filter.
        * The $n^{\text{th}}$ letter of the word is in the set of letters from
          Phase 2.
    1. Return the set of words.
1. **(Return)** Return the set of words from Phase 2 and (optionally) Phase 3.


(limitations)=
## Limitations

This algorithm only considers two constraints:
* The filter for the current clue.
* The filter for the clue that intersects the current clue at the current cell.

This means that the algorithm is unaware of all the other intersecting
clues---and the constraints that they impose on the current clue.

Consider the following grid:
```
+---+---+---+---+
|   |   |   | Z |
+---+---+---+---+
|   |   |   | E |
+---+---+---+---+
|   |   |   | R |
+---+---+---+---+
| W | O | R |   | < current clue
+---+---+---+---+
```
4-Down begins with *ZER*, so the only word it can be is *ZERO*. This constrains
the bottom-right cell to the letter *O*.

4-Across starts with *WOR*. We know that the bottom-right cell must be *O*, so
that means that 4-Across must be *WORO*. But *WORO* is not a word. So, 4-Down
and 4-Across are both unfillable, because no letter fits in the bottom-right
cell that works for both 4-Down and 4-Across.

Now, suppose that the current clue is 4-Across, and the current cell is the
bottom-right cell. Then, the intersection algorithm checks the filters for
4-Across and 4-Down. It detects that the two clues are unfillable, and
so it correctly returns the empty set.

But what if the current cell is one of the first three cells in 4-Across? Then,
the algorithm never checks the filter for 4-Down, and so it doesn't know about
the constraint that it imposes on the current clue. And so, the algorithm
returns all the words that match the filter *WOR?*, like *WORD* and
*WORM*---even though they do not actually fit in the clue, because they turn
4-Down into a nonexistent word.

### Consequences

There are two undesirable consequences of this:
* The intersection algorithm can generate dead-end words. These are frustrating
  for the user. The user might enter a word from the word suggestion list,
  reach a dead end, and then have to undo all their progress since adding that
  suggested word.
* The intersection algorithm's output can vary based on the current cell. This
  is semantically incorrect, because the word suggestions for a clue should not
  change based on the cursor's position.