I am writing a picross solver, and I am going with the “human logic” solving, which attempts to reproduce the reasoning a human might have when confronted to such a puzzle, in an iterative manner.
Having solved a lot of those by hand, I found the human logic to be very algorithmic-ish given how the same kind of reasoning takes place in most situations (which is also what motivated me in writing a solver).
However, I am encountering problems when trying to formalize some of that logic.
What is picross
Also called nonograms, they are puzzles consisting of an empty grid with clues on the top and left borders:
The clues indicate the length of groups of consecutive cells that need to be checked, and groups must be separated by at least one empty (unchecked) cell. Clues are ordered.
For example, clues
7 2 in the first row indicate that the following must take place somewhere in that row:
- 7 consecutive cells must be checked
- at least one cell must be left empty after that 7-long sequence
- another 2 consecutive cells must be checked after the empty cell(s)
Here is what the solved grid looks like:
In order to help in the process of solving, it is also possible to cross cells to indicate that they cannot be checked, as in they must remain empty for sure.
The easiest part of the solving is filling the empty grid with what you can deduce from the clues alone in each row and column. In that regard, some clues are nice, and some clues are not so nice.
For example, given that the previous grid is 10 cells wide, and that clues
7 2 fit in exactly 10 cells (7 checks + 1 space + 2 checks), there is only one possible solution for the first row.
However, by looking at the second row and its clue
6, you will only be able to find that only cells at columns 5 and 6 can be checked with certainty:
|■|■|■|■|■|■| | | | | 1. Check the 6 leftmost cells ↓ | | | | |■|■|■|■|■|■| 2. Slide the group all the way to the right ↓ | | | | |■|■| | | | | 3. Keep cells that are *always* checked during the sliding
You can generalize this reasoning to a group of several clues by calculating the minimum space in which they fit, computing the difference with the actual available space and checking cells at only certain indices depending on those two numbers.
As mentioned, this is the easy part, because it only depend on the clues. It assumes the row or column is empty and thus it doesn’t take its state into account.
Further (iterative) solving
From that point onwards, several other pieces of reasoning can be performed in a loop on each row and column, and that will be sufficient to solve the entire grid in most cases. This paper, from section 2.2.2 (p. 16), identifies all such techniques. There is one problem however: these reasonings take into account the current state of a row or column, they build upon it. What that means is that in order for most of those to be carried out, they need to know which group of checked cells may correspond to which clue.
For example, considering the following row (10-wide with clues
3 4║ | | | | |■| | |■| ║
an algorithm identifying which cell may belong to which clue would tell me the following:
both cells 6 and 9 belong to clue
Reason: if cell 6 belonged to clue
3, a group satisfying clue
4 would not fit in the row (at least while being properly separated by an empty cell, as required). Cell 9 cannot belong to clue
3 for the same reason, as well as because cell 6 is located before and was found to exclusively belong to clue
4, which is after clue
For the following row:
3 4║ | | |■|■| |■| | | ║
the algorithm would tell me:
cells 4 and 5 belong to clue
3, and cell 7 belongs to clue
Reason: if cells 4 and 5 belonged to clue
4, a group satisfying clue
3 would not fit in the row. In the same way, there would not be enough space to satisfy clue
4 if cell 7 belonged to clue
Finally, for the following row:
3 4║ | | | |■| |■| | | ║
the algorithm would tell me: cell 5 belongs to either clue
3 or clue
4, but cell 7 belongs to clue
4 for sure
- if cell 5 belongs to clue
3, there is still enough space for both a group of three ending at cell 5 and a group of four beginning at cell 7. Cell 5 may belong to clue
- if cell 5 belongs to clue
4, there is still enough space for both a group of three starting at cell 1 and a group of four starting at cell 5. Cell 5 may belong to clue
- if cell 7 belongs to clue
4, there is plenty of space for both groups required by clues. Cell 7 may belong to clue
- if cell 7, however, belonged to clue
3, there would not be enough space for a group satisfying clue
4 afterwards. Cell 7 may not belong to clue
And this is exactly that algorithm which I’m having trouble writing down.
It also turns out I was able to come up with a solution while writing this question. See my (rather long) self-answer below.