Configurations and CNF formula for neighboring configurations


A configuration of a Turing machine $ M$ that runs in space $ S(n)$ consists of the state, the head positions, and the contents of the non-blank cells of all its tapes. For $ M$ and an input $ x$ , we define its configuration graph $ G_{M,x}$ as a directed graph whose nodes correspond to all possible configurations, with an edge from a configuration $ C$ to $ C'$ if $ C'$ can be reached from $ C$ in one step.

First question: In Arora-Barak (snapshot below), they say that these nodes can be encoded as binary strings of length $ O(S(n))$ . Such an encoding contains the state, all symbols under the heads, and the non-blank contents of the work tapes, with a special marking to denote the location of each head. I think this is not correct, since such an encoding does not contain the input-head position. If we do not store the input-head position, which requires $ O(\log n)$ bits, then two distinct nodes can map to the same encoding, which seems wrong. Am I right? Storing the input-head position would not increase the asymptotic length of the encoding, though, since the book assumes $ S(n) > \log n$ .
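To make the length argument concrete, here is a small Python sketch (the function name and parameters are my own, purely illustrative) that counts the bits of an encoding once the input-head position is included; since $ S(n) > \log n$ , the extra $ O(\log n)$ bits do not affect the $ O(S(n))$ bound:

```python
import math

def encoding_length(num_states, tape_alphabet_size, n, S):
    """Bit length of a configuration encoding that also stores the
    input-head position, for an S-space machine on an input of length n.
    (Illustrative helper; the breakdown follows the encoding described above.)"""
    state_bits = math.ceil(math.log2(num_states))
    input_head_bits = math.ceil(math.log2(n))   # the extra O(log n) bits
    work_content_bits = S * math.ceil(math.log2(tape_alphabet_size))
    work_head_bits = S                          # unary marker for the work head
    return state_bits + input_head_bits + work_content_bits + work_head_bits

# Example with S >= log2(n): the input-head bits are dominated by the
# O(S(n)) work-tape part of the encoding.
n, S = 1024, 64
length = encoding_length(num_states=8, tape_alphabet_size=4, n=n, S=S)
```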

Second question: Next, Arora-Barak say that we can construct a CNF formula $ \phi_{M,x}$ such that for any two strings $ C$ and $ C'$ , $ \phi_{M,x}(C,C') = 1$ iff $ C$ and $ C'$ encode two neighboring configurations in $ G_{M,x}$ . I am not able to figure out the proof of this claim with the kind of encoding that I have described above, i.e., an encoding that also stores the input-head index. Suppose a configuration $ C$ has $ q_1, q_2, \dots, q_c$ bits for the state, $ h_1,h_2, \dots, h_{O(\log n)}$ bits for the input-head position, $ w_1, w_2, \dots, w_{S(n)}$ bits for the work-tape content, and $ wh_1, wh_2, \dots, wh_{S(n)}$ bits for the work-tape head position, where $ wh_i$ equals 1 if the head is on the $ i$ th cell and 0 otherwise. How, with such an encoding, can we construct a CNF formula as described at the beginning of this question?
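For concreteness, the encoding I have in mind can be written out as follows (a Python sketch; the function and argument names are mine, not Arora-Barak's):

```python
import math

def encode_config(state_bits, input_head, n, work_tape, work_head):
    """Encode a configuration as the bit string described above:
    state bits q_1..q_c, then the input-head index in binary
    (h_1..h_{O(log n)}), then work-tape content bits w_1..w_{S(n)},
    then unary work-head indicators wh_1..wh_{S(n)}
    (wh_i = 1 iff the work head is on cell i)."""
    h_len = max(1, math.ceil(math.log2(n)))
    h_bits = [(input_head >> j) & 1 for j in range(h_len)]
    wh_bits = [1 if i == work_head else 0 for i in range(len(work_tape))]
    return list(state_bits) + h_bits + list(work_tape) + wh_bits
```

Note that the work-tape head is stored in unary while the input head is stored in binary; it is exactly this binary index that I do not see how to handle in the CNF construction.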

[Snapshot of the relevant passage from Arora-Barak]